Commit Graph

49278 Commits

Author SHA1 Message Date
Nabeel Omer 0d41794335 [SLP] Add cost model for `llvm.powi.*` intrinsics (REAPPLIED)
Patch was reverted in 4c5f10a due to buildbot failures, now being
reapplied with updated AArch64 and RISCV tests.

This patch adds handling for the llvm.powi.* intrinsics in
BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization.
Closes #53887.

Differential Revision: https://reviews.llvm.org/D128172
2022-06-24 10:23:19 +00:00
Nikita Popov 54eff7da3c [AA] Export isEscapeSource() API (NFC)
Export API that was previously private to BasicAliasAnalysis and
will be used in D127202.
2022-06-24 11:59:15 +02:00
Fangrui Song 44ee3efb93 [CodeGen] Simplify isVirtualRegister. NFC 2022-06-23 23:26:02 -07:00
Douglas Yung f401dd6f43 Revert "Add support for decoding base64."
This reverts commit 8b987ca5e3.

This change breaks several Windows bots
 - https://lab.llvm.org/buildbot/#/builders/123/builds/11371
 - https://lab.llvm.org/buildbot/#/builders/117/builds/7685
 - https://lab.llvm.org/buildbot/#/builders/42/builds/6077
 - https://lab.llvm.org/buildbot/#/builders/216/builds/6340
2022-06-23 21:47:20 -07:00
Kai Luo 6710b21d46 [PowerPC] Allow llvm.ppc.cfence to accept pointer types
In the context of atomic load, integer, pointer and float point types are allowed, thus we should allow llvm.ppc.cfence to accept any type mentioned.

Fixes https://github.com/llvm/llvm-project/issues/55983.

Reviewed By: shchenz, vchuravy

Differential Revision: https://reviews.llvm.org/D127554
2022-06-24 10:55:32 +08:00
Greg Clayton 8b987ca5e3 Add support for decoding base64.
An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests.

Differential Revision: https://reviews.llvm.org/D126254
2022-06-23 16:13:19 -07:00
Derek Schuff 5a082d9c1c [WebAssembly][Object] Remove requirement that objects must have code sections
When parsing name and linking sections, we currently require that the object
must have a code section (it seems that this was intended to verify section
ordering). However it can be useful for binaries to have their code sections
stripped out (e.g. if we just want the debug info). In that case we need
the rest of the known sections (so e.g. we know how many functions there
are, to verify the name section) but not the actual code.

I've removed the restriction completely. I think this is OK because the
section-parsing code already checks function and global indices in many
places for validity and will return appropriate errors if the relevant sections
are missing. Also we can't just replace the requirement of seeing a code section
with a requirement  that we see a function or global section, because a binary
may just not have any functions or globals.
But there's only an problem if the name or linking section tries to name a
nonexistent function.

Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084

Differential Revision: https://reviews.llvm.org/D128094
2022-06-23 13:56:17 -07:00
Jin Xin Ng 22f1273357
[ThinLTO][ELF] Add --thinlto-emit-index-files option
Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution.

Note, this option does not currently work with:
--thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR).
--thinlto-prefix-replace  support is left unimplemented but can be implemented if needed

Differential Revision: https://reviews.llvm.org/D127777
2022-06-23 12:35:42 -07:00
Med Ismail Bennani 148071fbae
[llvm] Update module map to include the `IR/ConstantFold` header
This should fix the build failure occuring when enabling modules
(LLVM_ENABLE_MODULES=On):

https://green.lab.llvm.org/green/job/lldb-cmake/44785/

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-06-23 11:52:25 -07:00
Philip Reames 0c1326748f [BasicTTI] Avoid crash when costing scalable select expansion
If the target has chosen to expand a scalable vector type, BasicTTI tries to scalarize and we'd crash.  As a minimum, we should return an invalid cost instead.

The added test provide coverage for the moment, but given they show a number of gaps in RISCV costing, they're likely not to cover this code path long term.
2022-06-23 09:14:57 -07:00
Baptiste Saleil 79e77a9f39 [AMDGPU] Flush the vmcnt counter in loop preheaders when necessary
waitcnt vmcnt instructions are currently generated in loop bodies before using
values loaded outside of the loop. In some cases, it is better to flush the
vmcnt counter in a loop preheader before entering the loop body. This patch
detects these cases and generates waitcnt instructions to flush the counter.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D115747
2022-06-23 10:53:21 -04:00
Nikita Popov da34966a5a [llvm-c] Add LLVMGetAggregateElement() function
This adds LLVMGetAggregateElement() as a wrapper for
Constant::getAggregateElement(), which allows fetching a
struct/array/vector element without handling different possible
underlying representations.

As the changed echo test shows, previously you for example had to
treat ConstantArray (use LLVMGetOperand) and ConstantDataArray
(use LLVMGetElementAsConstant) separately, not to mention all the
other possible representations (like PoisonValue).

I've deprecated LLVMGetElementAsConstant() in favor of the new
function, which is strictly more powerful (but I could be convinced
to drop the deprecation).

This is partly motivated by https://reviews.llvm.org/D125795,
which drops LLVMConstExtractValue() because the underlying constant
expression no longer exists. This function could previously be used
as a poor man's getAggregateElement().

Differential Revision: https://reviews.llvm.org/D128417
2022-06-23 14:50:54 +02:00
Nikita Popov 20b5f0c641 [IR] Export ConstantFold.h header (NFC)
This is in preparation for https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
As part of that change, we'll want to invoke some of these constant
folding APIs explicitly, as it won't happen as part of
ConstantExpr::getXYZ() anymore.

Ideally, we'd merge these with the DL-aware constant folding APIs
and only call those, but this is not easily possible for some
current usages (most important IRBuilder, which uses DL-independent
constant folding by default, and some major layering changes would
be needed to change that).

This is basically a reboot of D115035 with different motivation.

Differential Revision: https://reviews.llvm.org/D128213
2022-06-23 11:32:14 +02:00
wangpc 634484885c [TableGen] Add new operator !exists
We can cast a string to a record via !cast, but we have no mechanism
to check if it is valid and TableGen will raise an error if failed to
cast. Besides, we have no semantic null in TableGen (we have `?` but
different backends handle uninitialized value differently), so operator
like `dyn_cast<>` is hard to implement.

In this patch, we add a new operator `!exists<T>(s)` to check whether
a record with type `T` and name `s` exists. Self-references are allowed
just like `!cast`.

By doing these, we can write code like:
```
class dyn_cast_to_record<string name> {
  R value = !if(!exists<R>(name), !cast<R>(name), default_value);
}
defvar v = dyn_cast_to_record<"R0">.value; // R0 or default_value.
```

Reviewed By: tra, nhaehnle

Differential Revision: https://reviews.llvm.org/D127948
2022-06-23 11:11:47 +08:00
Mingming Liu bc856eb3fc [SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information.
Differential Revision: https://reviews.llvm.org/D126833
2022-06-22 17:00:53 -07:00
Florian Mayer 9320a32bb9 [MTE] [HWASan] Use LoopInfo for reachability queries.
The reachability queries default to "reachable" after exploring too many
basic blocks. LoopInfo helps it skip over the whole loop.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127917
2022-06-22 15:28:49 -07:00
Evgenii Stepanov 5011b4ca0e Revert "[Attributor] Ensure to use the proper liveness AA"
Reason: memory leaks

This reverts commit 083010312a.
2022-06-22 13:40:45 -07:00
Guillaume Gomez d0a4450ecd Rename GCCBuiltin into ClangBuiltin
This patch is needed because developers expect "GCCBuiltin" items to be the GCC intrinsics equivalent and not the Clang internals.

Reviewed By: #libc_abi, RKSimon, xbolva00

Differential Revision: https://reviews.llvm.org/D127460
2022-06-22 19:49:20 +01:00
Mingming Liu 67dc8021a1 [Support] Change TrackingStatistic and NoopStatistic to use uint64_t instead of unsigned.
Binary size of `clang` is trivial; namely, numerical value doesn't
change when measured in MiB, and `.data` section increases from 139Ki to
173 Ki.

Differential Revision: https://reviews.llvm.org/D128070
2022-06-22 10:11:40 -07:00
Daniel Thornburgh 8bd078b57c [Symbolize] Parse multi-line markup elements.
This allows registering certain tags as possibly beginning multi-line
elements in the symbolizer markup parser. The parser is kept agnostic to
how lines are delimited; it reports the entire contents, including line
endings, once the end of element marker is reached.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D124798
2022-06-22 10:00:43 -07:00
serge-sans-paille 27fd01d3f8 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since fb67d683db detected a few
regressions, fixing them.

The impact on preprocessed output is negligible: -4k lines.
2022-06-22 18:50:39 +02:00
Guillaume Chatelet 57ffff6db0 Revert "[NFC] Remove dead code"
This reverts commit 8ba2cbff70.
2022-06-22 14:55:47 +00:00
Guillaume Chatelet 8ba2cbff70 [NFC] Remove dead code 2022-06-22 13:33:58 +00:00
Guillaume Chatelet 9803db8c18 [NFC] Remove dead code 2022-06-22 13:13:01 +00:00
David Sherwood aa0a413df8 [AArch64][SME] Add some SME PSTATE setting/query intrinsics
This patch adds support for:

* Querying the PSTATE.SM state with @llvm.aarch64.sme.get.pstatesm
* Reading/writing the TPIDR2 register with new
@llvm.aarch64.sme.get.tpidr2 and @llvm.aarch64.sme.set.tpidr2
intrinsics.

Tests added here:

  CodeGen/AArch64/sme-get-pstatesm.ll
  CodeGen/AArch64/sme-read-write-tpidr2.ll

Differential Revision: https://reviews.llvm.org/D127957
2022-06-22 10:26:45 +01:00
Pavel Samolysov f44bf3805a [DeadArgElim] Reformat the pass in accordance with the code style
The code has been reformatted in accordance with the code style. Some
function comments were extended to the Doxygen ones and reworded a bit
to eliminate the duplication of the function's/class' name in the
comment.

Differential Revision: https://reviews.llvm.org/D128168
2022-06-22 09:13:00 +03:00
Johannes Doerfert 083010312a [Attributor] Ensure to use the proper liveness AA
When determining liveness via Attributor::isAssumedDead(...) we might
end up without a liveness AA or with one pointing into another function.
Neither is helpful and we will avoid both from now on.

Reapplied after fixing the ASAN error which caused the revert:
db68a25ca9
2022-06-21 21:28:26 -05:00
Vasileios Porpodas 7a9ad25769 Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6d6268dcbf.

Review: https://reviews.llvm.org/D125712
2022-06-21 18:35:29 -07:00
Vasileios Porpodas 6d6268dcbf Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6f88acf410.
2022-06-21 17:07:21 -07:00
Vasileios Porpodas 6f88acf410 [SLP][X86] Improve reordering to consider alternate instruction bundles
During the reordering transformation we should try to avoid reordering bundles
like fadd,fsub because this may block them being matched into a single vector
instruction in x86.
We do this by checking if a TreeEntry is such a pattern and adding it to the
list of TreeEntries with orders that need to be considered.

Differential Revision: https://reviews.llvm.org/D125712
2022-06-21 16:44:48 -07:00
Anubhab Ghosh 79fbee3cc5 Re-apply "[JITLink][Orc] Add MemoryMapper interface with InProcess implementation"
[JITLink][Orc] Add MemoryMapper interface with InProcess implementation

MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.

Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).

The original patch landed with commit 6ede652050
It was reverted temporarily in commit 6a4056ab2a

Reviewed By: sgraenitz, lhames

Differential Revision: https://reviews.llvm.org/D127491
2022-06-21 23:53:16 +02:00
Simon Pilgrim 8cecb6be56 [DAG] Remove SelectionDAG::GetDemandedBits DemandedElts variant. NFC.
We're slowly removing SelectionDAG::GetDemandedBits and replacing it with SimplifyMultipleUseDemandedBits, we no longer have any uses for the vector demanded elt variant.
2022-06-21 21:23:10 +01:00
Daniel Bertalan 77b6efbd82 [ADT] [lld-macho] Check for end iterator deref in filter_iterator_base
If ld64.lld was supplied an object file that had a `__debug_abbrev` or
`__debug_str` section, but didn't have any compile unit DIEs in
`__debug_info`, it would dereference an iterator pointing to the empty
array of DIEs. This underlying issue started causing segmentation faults
when parsing for `__debug_info` was addded in D128184. That commit was
reverted, and this one fixes the invalid dereference to allow relanding
it.

This commit adds an assertion to `filter_iterator_base`'s dereference
operators to catch bugs like this one.

Ran check-llvm, check-clang and check-lld.

Differential Revision: https://reviews.llvm.org/D128294
2022-06-21 15:47:45 -04:00
Martin Sebor b19194c032 [InstCombine] handle subobjects of constant aggregates
Remove the known limitation of the library function call folders to only
work with top-level arrays of characters (as per the TODO comment in
the code) and allows them to also fold calls involving subobjects of
constant aggregates such as member arrays.
2022-06-21 11:55:14 -06:00
Nabeel Omer 4c5f10aeeb Revert rGe6ccb57bb3f6b761f2310e97fd6ca99eff42f73e "[SLP] Add cost model for `llvm.powi.*` intrinsics"
This reverts commit e6ccb57bb3.
2022-06-21 15:05:55 +00:00
Nabeel Omer e6ccb57bb3 [SLP] Add cost model for `llvm.powi.*` intrinsics
This patch adds handling for the llvm.powi.* intrinsics in
BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization.
Closes #53887.

Differential Revision: https://reviews.llvm.org/D128172
2022-06-21 14:40:34 +00:00
Jan Svoboda a44c6453fe [llvm][vfs] Implement in-memory symlinks
This patch implements symlinks for the in-memory VFS. Original author: @erik.pilkington.

Depends on D117648 & D117649.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D117650
2022-06-21 16:29:54 +02:00
Jan Svoboda b439a08dfc [llvm][vfs] NFC: Promote `InMemoryDirIterator` to nested class 2022-06-21 16:29:54 +02:00
Jan Svoboda 9e0398da8d [llvm][vfs] NFC: Promote `lookupInMemoryNode()` to member function 2022-06-21 16:29:53 +02:00
Jan Svoboda 1ff5330ea3 [llvm][vfs] NFC: Rename `InMemoryFileSystem::addHardLink()` arguments 2022-06-21 16:29:53 +02:00
Florian Hahn 4ea6891f95
[ConstraintElimination] Remove unneeded StackEntry::Condition (NFC).
The field was only used for debug printing. Print constraint from the
system instead.
2022-06-21 15:57:29 +02:00
Nico Weber 6a4056ab2a Revert "[JITLink][Orc] Add MemoryMapper interface with InProcess implementation"
This reverts commit 6ede652050.
Doesn't build on Windows, see https://reviews.llvm.org/D127491#3598773
2022-06-21 09:56:49 -04:00
Anubhab Ghosh 6ede652050 [JITLink][Orc] Add MemoryMapper interface with InProcess implementation
MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.

Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).

Reviewed By: sgraenitz, lhames

Differential Revision: https://reviews.llvm.org/D127491
2022-06-21 13:44:17 +02:00
Markus Lavin 3815ae29b5 [machinesink] fix debug invariance issue
Do not include debug instructions when comparing block sizes with
thresholds.

Differential Revision: https://reviews.llvm.org/D127208
2022-06-21 08:13:09 +02:00
Kazu Hirata 7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata d66cbc565a Don't use Optional::hasValue (NFC) 2022-06-20 20:26:05 -07:00
Kazu Hirata 0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Philip Reames bbf3fd4af1 [BasicTTI] Return Invalid for scalable vectors reaching getScalarizationOverhead
If we would scalarize a fixed vector, we know we can't do so for a scalable one.  However, there's no need to crash, we can instead simply return a invalid cost which will work its way through the computation (since invalid is sticky), and the client should bail out.

Sorry for the lack of test here.  The particular codepath I saw this reached on was the result of another bug.
2022-06-20 13:19:11 -07:00
Kazu Hirata 5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
David Green c0ecbfa4fd [AArch64] Known bits for AArch64ISD::DUP
An AArch64ISD::DUP is just a splat, where the known bits for each lane
are the same as the input. This teaches that to computeKnownBitsForTargetNode.

Problems arise for constants though, as a constant BUILD_VECTOR can be
lowered to an AArch64ISD::DUP, which SimplifyDemandedBits would then
turn back into a constant BUILD_VECTOR leading to an infinite cycle.
This has been prevented by adding a isTargetCanonicalConstantNode node
to prevent the conversion back into a BUILD_VECTOR.

Differential Revision: https://reviews.llvm.org/D128144
2022-06-20 19:11:57 +01:00
Philip Reames db85345f2d [BasicTTI] Allow generic handling of scalable vector fshr/fshl
This change removes an explicit scalable vector bailout for fshl and fshr. This bailout was added in 60e4698b9a, when sinking a unconditional bailout for all intrinsics into selected cases. Its not clear if the bailout was originally unneeded, or if our cost model infrastructure has simply matured in the meantime. Either way, the generic code appears to handle scalable vectors without issue.

Note that the RISC-V cost model changes here aren't particularly interesting. They do probably better match the current lowering, but the main point is to have coverage of the BasicTTI path and simply show lack of crashing.

AArch64 costing was changed to preserve legacy behavior.  There will most likely be an upcoming change to use the generic costs there too, but I didn't want to make that change not being particularly familiar with the target.

Differential Revision: https://reviews.llvm.org/D127680
2022-06-20 10:38:51 -07:00
Kazu Hirata e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
David Candler d3919a8cc5 [ConstantFolding] Respect denormal handling mode attributes when folding instructions
Depending on the environment, a floating point instruction should
treat denormal inputs as zero, and/or flush a denormal output to zero.
Denormals are not currently accounted for when an instruction gets
folded to a constant, which can lead to differences in output between
a folded and a unfolded instruction when running on the target. The
denormal handling mode can be set by the function level attribute
denormal-fp-math, which this patch uses to determine whether any
denormal inputs to or outputs from folding should be zero, and that
the sign is set appropriately.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D116952
2022-06-20 16:41:46 +01:00
Fraser Cormack 398834f45b Update usage comments in Printable.h. NFC.
The example wouldn't compile, and used an invalid case style for a
function.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D128176
2022-06-20 16:18:10 +01:00
Guillaume Chatelet d3cf49e984 [Alignment] Remove alignTo version taking a MaybeAlign 2022-06-20 15:15:53 +00:00
Jan Svoboda 192a3b33f9 [support][ci] Fix modular build on GreenDragon
This is to fix the following error on https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto:
BranchProbability.h:236:34: error: declaration of 'distance' must be imported from module 'std.iterator.__iterator.distance' before it is required
2022-06-20 16:56:20 +02:00
Guillaume Chatelet 7dbf8cfeb7 [NFC] Implement alignTo with skew in terms of alignTo 2022-06-20 14:10:14 +00:00
David Sherwood 013358632e [AArch64][SME] Add the zero intrinsic
The SME zero instruction takes a mask as an input declaring which
64-bit element tiles should be zeroed. There is a 1:1 mapping
between the zero intrinsic and the instruction, however we also
want to make the register allocator aware that some tile
registers are being written to.

We can actually just use the custom inserter for a pseudo instruction
to correctly mark all the appropriate registers in the mask as
implicitly defined by the operation.

 Differential Revision: https://reviews.llvm.org/D127843
2022-06-20 14:27:59 +01:00
Guillaume Chatelet 03036061c7 [Alignment] Use 'previous()' method instead of scalar division
This is in preparation of integration with D128052.

Differential Revision: https://reviews.llvm.org/D128169
2022-06-20 11:01:43 +00:00
Guillaume Chatelet 01cfc8a05a [NFC][Alignment] Remove dead code 2022-06-20 09:47:18 +00:00
Guillaume Chatelet f1255186c7 [NFC][Alignment] Remove max functions between Align and MaybeAlign
`llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are
not used often enough to be required. They also make the code more opaque.

Differential Revision: https://reviews.llvm.org/D128121
2022-06-20 08:37:48 +00:00
Guillaume Chatelet 009fe0755e [Alignment] Remove multiply by MaybeAlign 2022-06-20 08:37:15 +00:00
Chuanqi Xu 7782e080e8 [Coroutines] Only do symmetric transfer if optimization is on
Symmetric transfer is not a part of C++ standards. So the vendors is not
forced to implement it any way. Given the symmetric transfer nowadays is
an optimization. It makes more sense to enable it only if the
optimization is enabled. It is also helpful for the compilation speed in
O0.
2022-06-20 16:20:36 +08:00
Kazu Hirata c7987d4948 [ADT] Use value instead of getValue() (NFC)
Since Optional<clang::FileEntryRef> uses a custom storage class, this
patch adds value to MapEntryOptionalStorage.
2022-06-19 18:34:33 -07:00
Kazu Hirata 813f487228 [ADT] Use has_value (NFC)
This patch switches to has_value within Optional.

Since Optional<clang::FileEntryRef> uses custom storage class, this
patch adds has_entry to MapEntryOptionalStorage.
2022-06-19 18:10:13 -07:00
Nico Weber 7effcbda49 Rename parallelForEachN to just parallelFor
Patch created by running:

  rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'

No behavior change.

Differential Revision: https://reviews.llvm.org/D128140
2022-06-19 17:49:00 -04:00
Kazu Hirata 5d7e63fb4f [ADT] Rename value to alt (NFC)
This patch renames value to alt so that the parameter won't collide
with member function value().
2022-06-19 12:00:03 -07:00
Simon Pilgrim ba3f2667b6 [DAG] Add MaskedVectorIsZero helper
Equivalent to MaskedValueIsZero, except its checking if all of the demanded vectors elements are known to be zero
2022-06-19 17:56:30 +01:00
Kazu Hirata 129b531c9c [llvm] Use value_or instead of getValueOr (NFC) 2022-06-18 23:07:11 -07:00
Kazu Hirata 3c49576417 [ADT] Add has_value, value, value_or to llvm::Optional
This patch adds has_value, value, value_or to llvm::Optional so that
llvm::Optional looks more like std::optional.

I will keep the existing functions while migrating their callers and
then remove them later.

Differential Revision: https://reviews.llvm.org/D128131
2022-06-18 21:21:33 -07:00
Kazu Hirata 556bcc7821 [ADT] Rename value to val (NFC)
I'd like to introduce functions, such as value, value_or, has_value,
etc to make llvm::Optional look more like std::optional.  Renaming
value to val avoids name conflicts.

Differential Revision: https://reviews.llvm.org/D128125
2022-06-18 20:19:18 -07:00
Kazu Hirata 4271a1ff33 [llvm] Call *set::insert without checking membership first (NFC) 2022-06-18 10:17:22 -07:00
Simon Pilgrim 37185ceac9 [Object] Make IsLittleEndian check constexpr to silence static analyzer dead code warnings.
The "ELFT::TargetEndianness == support::little" check is known at compile time
2022-06-18 17:35:54 +01:00
Guillaume Chatelet 17e68156f6 [NFC][Alignment] Remove dead code 2022-06-18 15:00:55 +00:00
Kazu Hirata 621f58e716 [Target, CodeGen] Use isImm(), isReg(), etc (NFC) 2022-06-18 07:41:04 -07:00
Simon Pilgrim 3ea1422362 [CodeGen] Add back setOperationAction/setLoadExtAction/setLibcallName single opcode variants
The work to add ArrayRef helpers (D122557, D123467 etc.) to the TargetLowering::set* methods resulted in all the single opcode calls to these methods being cast to single element ArrayRef on the fly - resulting in a scary >5x increase in build time (identified with vcperf) on MSVC release builds of most of the TargetLowering/ISelLowering files.

This patch adds the back the single opcode variants to various set*Action calls to avoid this issue for now, and updates the ArrayRef helpers to wrap them - I'm still investigating whether the single element ArrayRef build times can be improved.
2022-06-18 13:02:05 +01:00
Chris Bieneman 3adc908b26 [DirectX][MC] Add MC support for DXContainer
DXContainer files resemble traditional object files in that they are
comprised of parts which resemble sections. Adding DXContainer as an
object file format in the MC layer will allow emitting DXContainer
objects through the normal object emission pipeline.

Differential Revision: https://reviews.llvm.org/D127165
2022-06-17 21:19:32 -05:00
Florian Hahn e9cced2739
Recommit "[LAA] Initial support for runtime checks with pointer selects."
This reverts commit 7aa8a67882.

This version includes fixes to address issues uncovered after
the commit landed and discussed at D11448.

Those include:

* Limit select-traversal to selects inside the loop.
* Freeze pointers resulting from looking through selects to avoid
  branch-on-poison.
2022-06-17 21:06:26 +02:00
Daniel Thornburgh 2040b6df0a [Symbolize] Parser for log symbolizer markup.
This adds a parser for the log symbolizer markup format discussed in
https://discourse.llvm.org/t/rfc-log-symbolizer/61282. The parser
operates in a line-by-line fashion with minimal memory requirements.

This doesn't yet include support for multi-line tags or specific parsing
for ANSI X3.64 SGR control sequences, but it can be extended to do so.
The latter can also be relatively easily handled by examining the
resulting text elements.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D124686
2022-06-17 10:26:24 -07:00
Joe Nash 75378d432f [AMDGPU] NFC. Change comment format on gfx11 interp and ldsdir intrinsics 2022-06-17 12:28:26 -04:00
Guillaume Chatelet 90f96ec7a5 [NFC][Alignment] Remove assumeAligned from MachineFrameInfo ctor 2022-06-17 15:21:17 +00:00
Joe Nash 20d20156f4 [AMDGPU] gfx11 VINTERP intrinsics and ISel support
Depends on D127664

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D127756
2022-06-17 09:16:59 -04:00
Joe Nash 6d5d8b1313 [AMDGPU] gfx11 ldsdir intrinsics and ISel
Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D127664
2022-06-17 09:03:16 -04:00
lorenzo chelini 84519bc5f7 [LLVM][IR] Fix typo in DerivedTypes.h (NFC) 2022-06-17 12:38:23 +02:00
Jennifer Yu bb83f8e70b [OpenMP] Initial parsing and sema for 'parallel masked' construct
Differential Revision: https://reviews.llvm.org/D127454
2022-06-16 18:01:15 -07:00
David Blaikie 61fac2c370 Incomplete attempt to pull DWARFTypePrinter into its own file for reuse
from lldb
2022-06-16 22:28:28 +00:00
Mitch Phillips ed5a349b89 Make setSanitizerMetadata byval.
This fixes a UaF bug in llvm::GlobalObject::copyAttributesFrom, where a
sanitizer metadata object is captured by reference, and passed by
reference to llvm::GlobalValue::setSanitizerMetadata. The reference
comes from the same map that the new value is going to be inserted to,
and the map insertion triggers iterator invalidation - leading to a
use-after-free on the dangling reference.

This patch fixes that bug by making setSanitizerMetadata's argument
byval. This should also systematically prevent the problem from
happening in future, as it's a very easy pattern to have. This shouldn't
be any performance problem, the SanitizerMetadata struct is a bitfield
POD.
2022-06-16 14:47:27 -07:00
Congzhe Cao 4c77d0276b [Delinearization] Refactoring of fixed-size array delinearization
This is a follow-up patch to D122857 where we added delinearization of
fixed-size arrays to loop cache analysis, which resulted in some duplicate
code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and
DependenceAnalysis.cpp. Refactoring is done in this patch.

This patch refactors out the main logic of "tryDelinearizeFixedSize()" as
"tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that
clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would
like to delinearize fixed-size arrays. Currently it has two users, i.e.,
DependenceAnalysis.cpp and LoopCacheCost.cpp.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124745
2022-06-16 16:03:41 -04:00
Joe Nash 2d43de13df [AMDGPU] gfx11 new dot instruction codegen support
Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D127904
2022-06-16 14:19:34 -04:00
Jay Foad 36ec1fcaac [AMDGPU] Add GFX11 llvm.amdgcn.ds.add.gs.reg.rtn / llvm.amdgcn.ds.sub.gs.reg.rtn intrinsics
Differential Revision: https://reviews.llvm.org/D127955
2022-06-16 18:23:14 +01:00
Jay Foad c155a944fb [AMDGPU] GFX11 CodeGen support for MIMG instructions
This includes:
- New llvm.amdgcn.image.msaa.load.* intrinsics
- NSA changes, because MIMG-NSA is now limited to 3 dwords
- Split CD forms of IMAGE_SAMPLE instructions out into separate
  test files since they are no longer supported in GFX11

Differential Revision: https://reviews.llvm.org/D127837
2022-06-16 18:23:14 +01:00
Jay Foad 445a483b41 [AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row
Differential Revision: https://reviews.llvm.org/D127671
2022-06-16 18:23:14 +01:00
Mircea Trofin 7f24e574d4 [MLInliner] Don't inline call sites in unreachable basic blocks
This requires DominatorTree be updated, which we do in the ml inliner
case, but not in the default case, and the cost of doing so is
noticeable to compile time for the latter[1]. So the patch only affects
the ML inliner.

[1] https://llvm-compile-time-tracker.com/compare.php?from=9fc0aa45e3312944431ba7e1ca0cec99c613992b&to=7af461b1ce0d9138211ef5f883f35d5b9ddf47be&stat=wall-time

Differential Revision: https://reviews.llvm.org/D127899
2022-06-16 09:14:22 -07:00
Corentin Jabot b62e3a73e1 Replace to_hexString by touhexstr [NFC]
LLVM had 2 methods to convert a number to an hexa string,
this remove one of them.

Differential Revision: https://reviews.llvm.org/D127958
2022-06-16 17:29:50 +02:00
David Sherwood 6f6fa5aa10 [AArch64][SME] Add SME cntsb/h/w/d intrinsics
These intrinsics return the number of elements in a streaming
vector, for example aarch64.sme.cntsw returns the number of
32-bit elements. When in streaming mode these are equivalent
to aarch64.sve.cntb/h/w/d with an input value of 1.

I have implemented these intrinsics using the rdsvl instruction
and added tests here:

  CodeGen/AArch64/SME/sme-intrinsics-rdsvl.ll

Differential Revision: https://reviews.llvm.org/D127853
2022-06-16 10:50:25 +01:00
Sunho Kim f3e7e4d786 [JITLink][AArch64][NFC] Suppress unused variable error.
Suppress unused variable error when assertion got disabled.

Reviewed By: chapuni

Differential Revision: https://reviews.llvm.org/D127940
2022-06-16 15:30:04 +09:00
Craig Topper 3aa6ec619f [ValueTypes] Add types for nxv16bf16 and nxv32bf16.
This is needed by our downstream and makes bf16 and f16 have the
same set of scalable vector types.

Reviewed By: rui.zhang

Differential Revision: https://reviews.llvm.org/D127877
2022-06-15 23:00:53 -07:00
Jin Xin Ng aaff3fb6d5 [mlgo] Fix accounting for SCC splits
Previously if the inliner split an SCC such that an empty one remained, the MLInlineAdvisor could potentially lose track of the EdgeCount if a subsequent CGSCC pass modified the calls of a function that was initially in the SCC pre-split. Saving the seen nodes in onPassEntry resolves this.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D127693
2022-06-15 10:53:23 -07:00
Joseph Huber 601ec17d54 [Binary] Add iterator to the OffloadBinary string maps
The offload binary contains internally a string map of all the key and
value pairs identified in the binary itself. Normally users query these
values from the `getString` function, but this makes it difficult to
identify which strings are availible. This patch adds a simple const
iterator range to the offload binary allowing users to iterate through
the strings.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D127774
2022-06-15 12:24:26 -04:00
Guillaume Chatelet 412c788ab0 [NFC][Alignment] Use Align in MCAlignFragment 2022-06-15 12:31:00 +00:00
Benjamin Kramer 8bc0bb9564 Add a conversion from double to bf16
This introduces a new compiler-rt function `__truncdfbf2`.
2022-06-15 12:56:31 +02:00
Benjamin Kramer fb34d531af Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are
introduced for casting from and to bf16. Since casting from bf16 is a
simple operation I opted to always directly lower it to integer
arithmetic. The other way round is more complicated if you want to
preserve IEEE semantics, so it's handled by a new __truncsfbf2
compiler-rt builtin.

This is of course very bare bones, but sufficient to get a semi-softened
fadd on x86.

Possible future improvements:
 - Targets with bf16 conversion instructions can now make fp_to_bf16 legal
 - The software conversion to bf16 can be replaced by a trivial
   implementation under fast math.

Differential Revision: https://reviews.llvm.org/D126953
2022-06-15 12:56:31 +02:00
David Sherwood 5fa2416ea0 [AArch64][SME] Add SME read/write intrinsics that map to the mova instruction
This patch adds implementations for the read/write SME ACLE intrinsics:

  @llvm.aarch64.sme.read.horiz
  @llvm.aarch64.sme.read.vert
  @llvm.aarch64.sme.write.horiz
  @llvm.aarch64.sme.write.vert

These all map to the SME mova instruction.

Differential Revision: https://reviews.llvm.org/D127414
2022-06-15 10:31:07 +01:00
Austin Kerbow 48ebc1af29 [AMDGPU] Add more expressive sched_barrier controls
The sched_barrier builtin allow the scheduler's behavior to be shaped by users
when very specific codegen is needed in order to create highly optimized code.
This patch adds more granular control over the types of instructions that are
allowed to be reordered with respect to one or multiple sched_barriers. A mask
is used to specify groups of instructions that should be allowed to be scheduled
around a sched_barrier. The details about this mask may be used can be found in
llvm/include/llvm/IR/IntrinsicsAMDGPU.td.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127123
2022-06-14 22:03:05 -07:00
Fangrui Song 94d1692aa1 [MC] Remove unused MCStreamer::SwitchSection
switchSection should be used instead.
2022-06-14 21:25:56 -07:00
Mircea Trofin 22a1f998f7 FunctionPropertiesAnalysis: handle callsite BBs that lose edges
There could be successors that were reached before but now are only
reachable from elsewhere in the CFG.

Suppose the following diamond CFG (lines are arrows pointing down):
    A
  /   \
 B     C
  \   /
    D
There's a call site in C that is inlined. Upon doing that, it turns out
it expands to:
   call void @llvm.trap()
   unreachable
D isn't reachable from C anymore, but we did discount it when we set up
FunctionPropertiesUpdater, so we need to re-include it here.

The patch also updates loop accounting to use LoopInfo rather than
traverse BBs.

Differential Revision: https://reviews.llvm.org/D127353
2022-06-14 15:19:44 -07:00
Venkata Ramanaiah Nalamothu 340b0ca900 [llvm] Add DW_CC_nocall to function debug metadata when either return values or arguments are removed
Adding the `DW_CC_nocall` calling convention to the function debug metadata is needed when either the return values or the arguments of a function are removed as this helps in informing debugger that it may not be safe to call this function or try to interpret the return value.
This translates to setting `DW_AT_calling_convention` with `DW_CC_nocall` for appropriate DWARF DIEs.

The DWARF5 spec (section 3.3.1.1 Calling Convention Information) says:

If the `DW_AT_calling_convention` attribute is not present, or its value is the constant `DW_CC_normal`, then the subroutine may be safely called by obeying the `standard` calling conventions of the target architecture. If the value of the calling convention attribute is the constant `DW_CC_nocall`, the subroutine does not obey standard calling conventions, and it may not be safe for the debugger to call this subroutine.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D127134
2022-06-15 03:30:15 +05:30
Luboš Luňák 4d27c154a5 remove a duplicated include 2022-06-14 18:55:26 +02:00
Jin Xin Ng 9f2b873a7d [inliner] Add per-SCC-pass InlineAdvisor printing option
Adds option to print the contents of the Inline Advisor after each SCC Inliner pass

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D127689
2022-06-14 08:06:52 -07:00
David Sherwood bd61664167 [AArch64][SME] Add ldr/str (fill/spill) intrinsics
This patch adds implementations for the fill/spill SME ACLE intrinsics:

    @llvm.aarch64.sme.ldr
    @llvm.aarch64.sme.str

Differential Revision: https://reviews.llvm.org/D127317
2022-06-14 13:58:22 +01:00
Guillaume Chatelet b4cf74dc9e [NFC] Remove dead code 2022-06-14 10:56:37 +00:00
Guillaume Chatelet 6725d80640 [NFC][Alignment] Use Align in shouldAlignPointerArgs 2022-06-14 10:56:36 +00:00
Rosie Sumpter 2c4e44752d [AArch64][SME] Add load/store intrinsics
This patch adds implementations for the load/store SME ACLE intrinsics:
  - @llvm.aarch64.sme.ld1*
  - @llvm.aarch64.sme.st1*

Differential Revision: https://reviews.llvm.org/D127210
2022-06-14 11:11:22 +01:00
Chuanqi Xu 735e6c40b5 [Coroutines] Convert coroutine.presplit to enum attr
This is required by @nikic in https://reviews.llvm.org/D127383 to
decrease the cost to check whether a function is a coroutine and this
fixes a FIXME too.

Reviewed By: rjmccall, ezhulenev

Differential Revision: https://reviews.llvm.org/D127471
2022-06-14 14:23:46 +08:00
Kazu Hirata a2232da2a5 [CodeGen] Remove addSEHCatchHandler and addSEHCleanupHandler (NFC)
The last uses of these functions are removed on Oct 9, 2015 in commit
14e773500e.
2022-06-13 23:08:49 -07:00
Kazu Hirata 34ff78c5cf [CodeGen] Remove restrictRef (NFC)
The last use was removed on Apr 14, 2017 in commit
4fe9d6c640.
2022-06-13 23:08:48 -07:00
Sunho Kim 398df667d6 [JITLink][AArch64] Implement MoveWide16 generic edge.
Implements MoveWide16 generic edge kind that can be used to patch MOVZ/MOVK (imm16) instructions.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127584
2022-06-14 13:51:47 +09:00
Sunho Kim 6cc3450a52 [JITLink][AArch64] Lift fixup functions from aarch64.cpp to aarch64.h. (NFC)
Lift fixup functions from aarch64.cpp to aarch64.h so that they have better chance of getting inlined. Also, adds some comments documenting the purpose of functions.

Reviewed By: sgraenitz

Differential Revision: https://reviews.llvm.org/D127559
2022-06-14 13:34:00 +09:00
Sunho Kim db37225803 [JITLink][AArch64] Unify table managers of ELF and MachO.
Unifies GOT/PLT table managers of ELF and MachO on aarch64 architecture. Additionally, it migrates table managers from PerGraphGOTAndPLTStubsBuilder to generic crtp TableManager.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127558
2022-06-14 13:16:03 +09:00
Fangrui Song bf0bac43ff [CodeGen] Initialize ISD after 800d222e53
In the Intrinsic::fptosi_sat branch, ISD was uninitialized when Tys.empty().
2022-06-13 19:52:21 -07:00
Philip Reames 800d222e53 [BasicTTI] Remove unused support for multiple opcodes in getTypeBasedIntrinsicInstrCost [nfc]
ISDs only ever contains a single ISD opcode.  We can simplify the code under this assumption. The code being removed was added back in 2016 in 0f26b0aeb4 to support FMAXNAN/FMINNAN, but at some point since then the motivating case was rewritten not to use the ISDs mechanism.  No reason to keep the false generality around now.
2022-06-13 18:23:39 -07:00
Lang Hames 14b7c108a2 [C-API][ORC] Add C API to suspend lookups during definition generation.
Slow definition generators may suspend lookups to temporarily release the
session lock, allowing unrelated lookups to proceed.

Using this functionality is discouraged: it is best to make definition
generation fast, rather than suspending the lookup. As a last resort where
this is not possible, suspension may be used.
2022-06-13 17:20:07 -07:00
Kazu Hirata 145cc9db2b [CodeGen] Remove futureWeight (NFC)
The last use was removed on Jun 5, 2022 in
commit 5c06f7168f, which itself was
a patch to remove unused code.
2022-06-13 17:10:23 -07:00
Lang Hames 803c770ee0 [C-API][ORC] Add LLVMOrcExecutionSessionLookup -- generic async symbol lookup.
An API to wrap ExecutionSession::lookup, this allows C API clients to use async
lookup.

The immediate motivation for adding this is to simplify upcoming
definition-generator unit tests.

As we're adding more tests that need to convert between C and C++ flag values
this commit adds helper functions to support this. This patch also updates the
CAPIDefinitionGenerator to use these new utilities.
2022-06-13 16:37:35 -07:00
Kazu Hirata 5c41b0f429 [Analysis] Remove getUniqueInstruction (NFC)
The last use was removed on Apr 7, 2022 in commit
5cefe7d9f5.
2022-06-13 14:26:20 -07:00
Lang Hames b425f55693 [C-API][ORC] Fix struct name.
This struct was using the wrong prefix (LLVMJIT... vs LLVMOrc...).
2022-06-13 13:53:51 -07:00
Jay Foad bfcfd53b92 [AMDGPU] Add GFX11 llvm.amdgcn.permlane64 intrinsic
Compared to permlane16, permlane64 has no BC input because it has no
boundary conditions, no fi input because the instruction acts as if FI
were always enabled, and no OLD input because it always writes to every
active lane.

Also use the new intrinsic in the atomic optimizer pass.

Differential Revision: https://reviews.llvm.org/D127662
2022-06-13 21:12:11 +01:00
Guillaume Chatelet 2b89a4dc51 [NFC] Remove dead code 2022-06-13 15:38:27 +00:00
Guillaume Chatelet 8865700f90 [NFC] Remove dead code 2022-06-13 15:38:27 +00:00
Guillaume Chatelet 111b32ecb4 [NFC][Alignment] Use getAlign in Attributor classes 2022-06-13 15:13:05 +00:00
Kazu Hirata 246e83e973 [GlobalISel] Remove buildSequence (NFC)
The last use was removed on Jun 27, 2019 in commit
8138996128.
2022-06-13 06:58:36 -07:00
Jez Ng d4bcb45db7 [MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible
This reverts commit d941d59783.

Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 17:24:19 -04:00
Jez Ng d941d59783 Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible"
This reverts commit ef501bf85d.
2022-06-12 10:47:08 -04:00
Jez Ng ef501bf85d [MC] Omit DWARF unwind info if compact unwind is present where eligible
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
  functions in a TU, then the `__eh_frame` section would be omitted
  entirely. If any one function needed DWARF unwind, then MC would emit
  DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
  long as compact unwind was available for that function.

This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`.  `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.

**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.

**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.

For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK).  This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.

Reviewed By: davide, lhames

Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 10:03:56 -04:00
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Mircea Trofin 7e7021ca1a [mlgo] Update FunctionPropertyCache after invalidating analyses
The update depends on LoopInfo, so we need that refreshed first, not
after.

Differential Revision: https://reviews.llvm.org/D127467
2022-06-10 16:18:14 -07:00
Mitch Phillips 8db981d463 Add sanitizer-specific GlobalValue attributes.
Plan is the migrate the global variable metadata for sanitizers, that's
currently carried around generally in the 'llvm.asan.globals' section,
onto the global variable itself.

This patch adds the attribute and plumbs it through the LLVM IR and
bitcode formats, but is a no-op other than that so far.

Reviewed By: vitalybuka, kstoimenov

Differential Revision: https://reviews.llvm.org/D126100
2022-06-10 12:28:18 -07:00
Shraiysh Vaishay f62baddac0 [OpenMP][IRBuilder] Add final clause to task
This patch adds final clause to OpenMP IR Builder.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D126626
2022-06-11 00:02:18 +05:30
Joe Nash ea3c9a87d3 [AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3
Contributors:
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 21/N for upstreaming of AMDGPU gfx11 architecture

Depends on D127143

Reviewed By: rampitec, #amdgpu, kzhuravl

Differential Revision: https://reviews.llvm.org/D127241
2022-06-10 13:07:14 -04:00
Guillaume Chatelet 95083fa3b8 [NFC] Remove deadcode 2022-06-10 15:13:42 +00:00
Guillaume Chatelet 38637ee477 [clang] Add support for __builtin_memset_inline
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.

The idea is to get support from the compiler to easily write efficient memory function implementations.

This patch could be split in two:
 - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
 - and another one for the Clang part providing the instrinsic as a builtin.

Differential Revision: https://reviews.llvm.org/D126903
2022-06-10 13:13:59 +00:00
David Sherwood 8daaea206b [InstCombine] Use +0.0 instead of -0.0 as the FP identity for some folds
In foldSelectIntoOp we sometimes transform a select of a fadd into a
fadd of a select, where we select between data and an identity value.
For both fadd and fsub the identity is always -0.0, but if the nsz
flag is set on the select instruction we can use +0.0 instead. Doing
so then triggers other optimisations, such as when folding the select
of masked load into a new masked load.

Differential Revision: https://reviews.llvm.org/D126774
2022-06-10 12:42:34 +01:00
Nikita Popov d77f944832 [LoopInfo] Add getOutermostLoop() (NFC)
This is a recurring pattern, add an API function for it.
2022-06-10 11:48:21 +02:00
Jay Foad 6c372daa84 [AMDGPU] New GFX11 intrinsic llvm.amdgcn.s.sendmsg.rtn
Add new intrinsic and codegen support for the s_sendmsg_rtn_b32 and
s_sendmsg_rtn_b64 instructions.

Differential Revision: https://reviews.llvm.org/D127315
2022-06-10 08:15:23 +01:00
Peter S. Housel 1aa71f8679 [ORC][ORC_RT] Integrate ORC platforms with LLJIT and lli
This change enables integrating orc::LLJIT with the ORCv2
platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc
runtime. Changes include:

- Adding SPS wrapper functions for the orc runtime's dlfcn emulation
  functions, allowing initialization and deinitialization to be invoked
  by LLJIT.

- Changing the LLJIT code generation default to add UseInitArray so
  that .init_array constructors are generated for ELF platforms.

- Integrating the ORCv2 Platforms into lli, and adding a
  PlatformSupport implementation to the LLJIT instance used by lli which
  implements initialization and deinitialization by calling the new
  wrapper functions in the runtime.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D126492
2022-06-09 22:47:58 -07:00
Philip Reames b59c2315af [BasicTTI] Return Invalid cost for more scalable vector scalarization cases
Instead of crashing on a cast<FixedVectorType>, we should isntead return Invalid for these cases.  This avoids crashes in assert builds, and potential miscompiles in release builds.
2022-06-09 16:10:51 -07:00
Philip Reames 206f10d3f6 Plumb InstructionCost through unroll costing
Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundemental limitation or an unimplemented cost model case.

Differential Revision: https://reviews.llvm.org/D127305
2022-06-09 15:42:53 -07:00
Philip Reames f85c5079b8 Pipe potentially invalid InstructionCost through CodeMetrics
Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred.

On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost.

I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change.

Differential Revision: https://reviews.llvm.org/D127131
2022-06-09 15:17:24 -07:00
Johannes Doerfert 6555558a80 Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"
This reverts commit da50dab1ae.

Patch broke AMD GPU OpenMP offload buildbots.
https://lab.llvm.org/buildbot/#/builders/193/builds/13246
2022-06-09 17:04:01 +02:00
Simon Moll 746908a038 [NFC] Clang-format PatternMatch.h 2022-06-09 16:51:32 +02:00
Johannes Doerfert da50dab1ae [Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good.

Fixes: https://github.com/llvm/llvm-project/issues/54981
2022-06-09 16:48:53 +02:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Johannes Doerfert 14899bc43d [Attributor] Generalize interface from ConstantInt to Constant
We can use constant to allow undef and there is no need to force
integers in the API anyway. The user can decide if a non integer
constant is fine or not.
2022-06-09 12:00:26 +02:00
Johannes Doerfert 7a07b88f37 [Attributor][FIX] Replace call site argument uses, not values
We need to be careful replacing values as call site arguments
(IRPosition::IRP_CALL_SITE_ARGUMENT) is representing a use and not a
value. This patch replaces the interface to take a IR position instead
making it harder to misuse accidentally. It does not change our tests
right now but a follow up exposed the potential footgun.
2022-06-09 12:00:26 +02:00
Johannes Doerfert 1df6e171c3 [Attributor] Simplify (integer range) state handling
We used to be very conservative when integer states were merged.
Instead of adding the known range (which is large due to uncertainty)
into the assumed range (which is hopefully small), we can also only
allow to merge in both at the same time into their respective
counterpart. This will ensure we keep the invariant that assumed is part
of known.
2022-06-09 12:00:26 +02:00
Johannes Doerfert 481b8f31df [Attributor][NFC] Introduce helper struct
We often use a context associated with a value. For now only one use
case has been changed.
2022-06-09 12:00:26 +02:00
Nicolai Hähnle f971e77fb4 ADT/ArrayRef: Add makeMutableArrayRef overloads
Equivalent overloads already exist for makeArrayRef.

Differential Revision: https://reviews.llvm.org/D126421
2022-06-09 09:59:50 +02:00
Lang Hames 3fcd3669e3 [ORC] Add an output stream operator for SymbolStringPool.
Handy for checking string pool state, e.g. when debugging dangling-pool-entry
errors.
2022-06-08 16:49:51 -07:00
Florian Mayer 0593ce5f0b [MC] Add 'G' to augmentation string for MTE instrumented functions
This was agreed on in
https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html

The thread proposed two options
* add a character to augmentation string and handle in libuwind
* use a separate personality function.

It was determined that this is the simpler and better option.

This is part of ARM's Aarch64 ABI:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22

The next step after this is teaching libunwind to untag when this
augmentation character is set.

Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127007
2022-06-08 12:36:32 -07:00
Hongtao Yu ab34ab2b87 [PseudoProbe] Use callee name as callsite identfier for MCDecodedPseudoProbeInlineTree.
The callsite identifier used in pseudo probe encoding and decoding is consisted of a function name and the callsite probe id. For encoding, i.e., `MCPseudoProbeInlineTree`, the function name is callee function name. However for decoding, i.e., `MCDecodedPseudoProbeInlineTree`, the caller function name is used actually. This results in multiple callees that are inlined at the same callsite, likely via indirect call promotion, sharing the same decoded inline frame. While it is not a problem for profile generation, it confuses probe re-encoding in Bolt.

In Bolt, we decode pseudo probes first and build `MCDecodedPseudoProbeInlineTree`. The decoded tree is used for final re-encoding. Here comes the problem. Two inlinees from the same callsite share the same decoded inline frame. During re-encoding, the frame name (whatever inlinee comes first) will be used and encoded in the bolted binary. This will cause wrong inline contexts  in the profile generated on the bolted binary.

The fix is a no-op to pre-bolt profile generation. Some of the bolt tests are not yet upstreamed, thus I'm not adding a bolt test here.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D126434
2022-06-08 10:54:40 -07:00
Thomas Lively aff679a48c [WebAssembly] Implement remaining relaxed SIMD instructions
Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s,
i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are
the last instructions from the relaxed SIMD proposal[1] that had not been
implemented.

[1]:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.

Differential Revision: https://reviews.llvm.org/D127170
2022-06-08 10:32:10 -07:00
Philip Reames f0d2a55d3a Restore isa<Ty>(X) asserts inside cast<Ty>(X)
PLEASE DO NOT REVERT without careful consideration, and preferably prior
discussion.

cast<Ty>(X) is a "checked cast". Its entire purpose is explicitly documented
(https://llvm.org/docs/ProgrammersManual.html#the-isa-cast-and-dyn-cast
templates) as catching bad casts by asserting that the cast is valid.
Unfortunately, in a recent rewrite of our casting infrastructure about three
months back, these asserts got dropped.

This is discussed in more detail on discourse in https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033.

Differential Revision: https://reviews.llvm.org/D127231
2022-06-08 07:32:37 -07:00
Paul Walker d88354213c [SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST.
Extend the TypeWidenVector case of PromoteIntRes_BITCAST to work
with TypeSize directly rather than silently casting to unsigned.

To accomplish this I've extended TypeSize with an interface that
essentially allows TypeSize division when both operands have the
same number of dimensions.

There still exists combinations of scalable vector bitcasts that
cause compiler crashes. I call these out by adding "is missing"
entries to sve-bitcast.

Depends on D126957.
Fixes: #55114

Differential Revision: https://reviews.llvm.org/D127126
2022-06-08 10:30:07 +01:00
Nathan James 638b0fb4d6
[ADT][NFC] Early bail out for ComputeEditDistance
The minimun bound for number of edits is the size difference between the 2 arrays.
If MaxEditDistance is smaller than this, we can bail out early without needing to traverse any of the arrays.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D127070
2022-06-08 08:20:29 +01:00
Wolfgang Pieb 213eb424e8 Revert "[Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNodes"
This reverts commit e3f6eda8c6.

Failure in unittest on https://lab.llvm.org/buildbot*builders/171/builds/15666
2022-06-07 15:48:31 -07:00
Wolfgang Pieb e3f6eda8c6 [Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNodes
A change to the allocation characteristics of MDNodes, introducing the ability
to add operands one at a time. This functionality is restricted to MDTuples.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D125998
2022-06-07 14:34:38 -07:00
Sunho Kim 9f29916169 [JITLink][AArch64] Refactor isLoadStoreImm12 check out of getPageOffset12Shift.
The separate isLoadStoreImm12 predicate will be used for validating ELF/aarch64
ldst relocation types.

Reviewed By: lhames, sgraenitz

Differential Revision: https://reviews.llvm.org/D126628
2022-06-07 13:18:12 -07:00
Joseph Huber f06731e3c3 [Binary] Make the OffloadingImage type own the memory
Summary:
The OffloadingBinary uses a convenience struct to help manage the memory
that will be serialized using the binary format. This currently uses a
reference to an existing buffer, but this should own the memory instead
so it is easier to work with seeing as its only current use requires
saving the buffer anyway.
2022-06-07 15:56:09 -04:00
Philip Reames 781de11f42 Revert "[LLVM][Casting.h] Add trivial self-cast"
This reverts commit 0809f63826.  The patch appears not to have included corresponding isa<Ty> support.

This was revealed when reintroducing the required isa<Ty> asserts in cast<Ty>.  See https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context.

Here's the template instantiation error:
In file included from /home/preames/llvm-repo/llvm-project/llvm/unittests/Support/Casting.cpp:9:
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h: In instantiation of ‘static bool llvm::isa_impl<To, From, Enabler>::doit(const From&) [with To = llvm::bar*; From = llvm::bar; Enabler = void]’:
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:110:36:   required from ‘static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::bar*; From = llvm::bar]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:137:41:   required from ‘static bool llvm::isa_impl_wrap<To, FromTy, FromTy>::doit(const FromTy&) [with To = llvm::bar*; FromTy = const llvm::bar*]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:129:13:   required from ‘static bool llvm::isa_impl_wrap<To, From, SimpleFrom>::doit(const From&) [with To = llvm::bar*; From = const llvm::bar* const; SimpleFrom = const llvm::bar*]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:263:62:   required from ‘static bool llvm::CastIsPossible<To, From, Enable>::isPossible(const From&) [with To = llvm::bar*; From = const llvm::bar*; Enable = void]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:517:38:   required from ‘static bool llvm::CastInfo<To, From, typename std::enable_if<(! llvm::is_simple_type<From>::value), void>::type>::isPossible(From&) [with To = llvm::bar*; From = llvm::bar* const]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:556:46:   required from ‘bool llvm::isa(const From&) [with To = llvm::bar*; From = llvm::bar*]’
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:585:3:   required from ‘decltype(auto) llvm::cast(From*) [with To = llvm::bar*; From = llvm::bar]’
/home/preames/llvm-repo/llvm-project/llvm/unittests/Support/Casting.cpp:181:27:   required from here
/home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:64:64: error: ‘classof’ is not a member of ‘llvm::bar*’
   64 |   static inline bool doit(const From &Val) { return To::classof(&Val); }
2022-06-07 12:50:40 -07:00
Derek Schuff 2ae385e560 [WebAssembly] Add WASM_SEC_LAST_KNOWN to BinaryFormat section types list [NFC]
There are 3 places where we were using WASM_SEC_TAG as the "last" known
section type, which requires updating (or leaves a bug) when a new known
section type is added. Instead add a "last type" to the enum for this
purpose.

Differential Revision: https://reviews.llvm.org/D127164
2022-06-07 12:05:23 -07:00
Sunho Kim b6553f592a [JITLink][ELF][AArch64] Lift MachO/arm64 edges into aarch64.h, reuse for ELF.
This patch moves the aarch64 fixup logic from the MachO/arm64 backend to
aarch64.h header so that it can be re-used in the ELF/aarch64 backend. This
significantly expands relocation support in the ELF/aarch64 backend.

Reviewed By: lhames, sgraenitz

Differential Revision: https://reviews.llvm.org/D126286
2022-06-07 12:01:43 -07:00
Reid Kleckner 570e76bb6c [config] Remove vestigial LLVM_VERSION_INFO
This has been superseded by the llvm/Support/VCSRevision.h header. So
far as I can tell, nothing in the CMake build sets LLVM_VERSION_INFO. It
was always undefined, and the ifdefs using it were dead. However, CMake
is very flexible, so it's possible that I missed some ways to set this
variable. One could, for example, probably pass -DLLVM_VERSION_INFO=x on
the command line and get that through to configure_file, or set the
variable in an obscure way (`set(${proj}_VERSION_INFO "x")`). I'm
reasonably confident that isn't happening, but I'd like a second
opinion.

Update the Bazel and gn builds accordingly.

Differential Revision: https://reviews.llvm.org/D126977
2022-06-07 11:36:26 -07:00
Reid Kleckner b1c7889f32 [config] Remove RETSIGTYPE from config.h.cmake, NFC
This doesn't need to be configurable. It was hardcoded to void in all
LLVM build systems.
2022-06-07 11:35:25 -07:00
Matt Arsenault cc5a1b3dd9 llvm-reduce: Add cloning of target MachineFunctionInfo
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.

Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
2022-06-07 10:14:48 -04:00
Matt Arsenault 56303223ac llvm-reduce: Don't assert on functions which don't track liveness
Use the query that doesn't assert if TracksLiveness isn't set, which
needs to always be available. We also need to start printing liveins
regardless of TracksLiveness.
2022-06-07 10:00:25 -04:00
Guillaume Chatelet 0788186182 [Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.

Differential Revision: https://reviews.llvm.org/D126910
2022-06-07 13:52:20 +00:00
Jay Foad 1feed6691a [APInt] Remove truncOrSelf, zextOrSelf and sextOrSelf
Differential Revision: https://reviews.llvm.org/D125559
2022-06-07 10:01:49 +01:00
Fangrui Song 15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
luxufan a7b154aa17 [MC][ARM] Reuse symbol value in constant pool
Fix https://github.com/llvm/llvm-project/issues/55816

Before this patch, MCConstantExpr were reused, but MCSymbolExpr were
not. To reuse symbol value, this patch added a DenseMap to record the
symbol value.

Differential Revision: https://reviews.llvm.org/D127113
2022-06-07 13:39:52 +08:00
Chris Bieneman 21c9452305 [DX][ObjYAML] Support for parsing DXIL part
This patch adds support for parsing the DXIL part data into the
ObjectYAML tooling.

The DXIL part has additional headers describing the shader and bitcode
data and stores serialized bitcode after the headers.

Depends on D124945

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D126795
2022-06-06 18:46:19 -05:00
Philip Reames c1fb8bd777 [BasicTTI] Add missing scalable vector handling
BasicTTI needs to return an invalid cost for scalable vectors instead of crash.  Without this, it is impossible to write tests for missing functionality in a target.
2022-06-06 14:21:41 -07:00
Chris Bieneman 352c395fb6 [ObjectYAML][DX] Add dxcontainer2yaml support
This change finishes fleshing out the ObjectYAML tools to support
converting DXContainer files into yaml representations.

Depends on D124944

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D124945
2022-06-06 13:23:29 -05:00
Michael Kitzan b7fcf6632f [GISel] Add new combines for G_ADD
Patch adds new GICombineRules for G_ADD:

G_ADD(x, G_SUB(y, x)) -> y
G_ADD(G_SUB(y, x), x) -> y

Patch additionally adds new combine tests for AArch64 target for
these new rules.

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D87936
2022-06-06 11:19:45 -07:00
Nimish Mishra 6a3c4a40f4 [flang][OpenMP] Added parser support for in_reduction clause
OpenMP 5.0 adds a new clause `in_reduction` on OpenMP directives.
This patch adds parser support for the same.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D124156
2022-06-06 14:55:27 +05:30
Kazu Hirata 7c009d2c31 [PDB] Remove truncate* (NFC)
- truncateQuotedNameFront: The last use was removed on Jul 10, 2017 in
  commit a9d944fd6f.

- truncateQuotedNameBack: The last use was removed on Mar 26, 2018 in
  commit 7b84b678a9.

- truncateStringMiddle: The last use was removed on Mar 26, 2018 in
  commit 7b84b678a9.

- truncateStringBack: The last use is in truncateQuotedNameBack being
  removed above.

- truncateStringFront: The last use is in truncateQuotedNameFront
  being removed above.
2022-06-05 23:33:51 -07:00
Kazu Hirata 43d4585e64 [GlobalISel] Remove widenWithUnmerge (NFC)
The last use was removed on Dec 23, 2021 in commit
29f88b93fd.
2022-06-05 19:58:18 -07:00
Kazu Hirata 61abcb0b37 [GlobalISel] Remove valueIsSplit (NFC)
The last use was removed on Jun 27, 2019 in commit
8138996128.
2022-06-05 19:51:03 -07:00
Kazu Hirata 3b9707dbc0 [llvm] Convert for_each to range-based for loops (NFC) 2022-06-05 12:07:14 -07:00
Alexey Lapshin 501d5b24db [Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef - remove isIndexed().
This patch is extraction from the https://reviews.llvm.org/D126883.
It removes DwarfStringPoolEntryRef::isIndexed() and isIndexed bit
since they are not used.

Differential Revision: https://reviews.llvm.org/D126958
2022-06-05 21:18:31 +03:00
Nathan James a13b61f7f0
[ADT] Add edit_distance_insensitive to StringRef
In some instances its advantageous to calculate edit distances without worrying about casing.
Currently to achieve this both strings need to be converted to the same case first, then edit distance can be calculated.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D126159
2022-06-05 12:03:09 +01:00
Kazu Hirata 4969a6924d Use llvm::less_first (NFC) 2022-06-04 21:23:18 -07:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Reid Kleckner 0a832ba5c2 [config] Remove LLVM_DEFAULT_TARGET_TRILE from config.h
It is redundant with llvm-config.h, which is always included by
config.h.

Port D12660 / d178f4fc89 from config.h to
llvm-config.h.

Update the gn build accordingly.

NFCI
2022-06-03 10:15:46 -07:00
Alvin Wong bb94611d65 [COFF] Check table ptr more thoroughly and ignore empty sections
When loading split debug files for PE/COFF executables (produced with
`objcopy --only-keep-debug`), the tables or directories in such files
may point to data inside sections that may have been stripped.
COFFObjectFile shall detect and gracefully handle this, to allow the
object file be loaded without considering these tables or directories.
This is required for LLDB to load these files for use as debug symbols.

COFFObjectFile shall also check these pointers more carefully to account
for cases in which the section contains less raw data than the size
given by VirtualSize, to prevent going out of bounds.

This commit also changes COFFDump in llvm-objdump to reuse the pointers
that are already range-checked in COFFObjectFile. This fixes a crash
when trying to dump the TLS directory from a stripped file.

Fixes https://github.com/mstorsjo/llvm-mingw/issues/284

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D126898
2022-06-03 18:31:01 +03:00
Serguei Katkov 24e16e4af2 [SSAUpdaterImpl] Do not generate phi node with all the same incoming values
If all available vals to basic block are the same - do not build new phi node and
just use this value.

Reviewed By: sameerds
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D126525
2022-06-03 12:24:33 +07:00
Mingming Liu 8601f269f1 [Inline][Remark][NFC] Optionally provide inline context to inline
advisor.

This patch has no functional change, and merely a preparation patch for
main functional change. The motivating use case is to annotate inline
remark pass name with context information (e.g. prelink or postlink,
CGSCC or always-inliner), see D125495 for more details.

Differential Revision: https://reviews.llvm.org/D126824
2022-06-02 13:14:30 -07:00
Balazs Benics 7d24641f89 [llvm][analyzer][NFC] Introduce SFINAE for specializing FoldingSetTraits
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126803
2022-06-02 19:46:38 +02:00
Liqiang Tao 14e8add939 [llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder
This patch introduces the abstract base class InlinePriority to serve as
the comparison function for the priority queue.  A derived class, such
as SizePriority, may choose to cache the priorities for different
functions for performance reasons.

This design shields the type used for the priority away from classes
outside InlinePriority and classes derived from it.  In turn,
PriorityInlineOrder no longer needs to be a template class.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D126300
2022-06-02 23:40:26 +08:00
Liqiang Tao 5c6ed60c51 Revert "[llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder"
This reverts commit 50de7f1e77.
2022-06-02 23:18:47 +08:00
Liqiang Tao 50de7f1e77 [llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder
This patch introduces the abstract base class InlinePriority to serve as
the comparison function for the priority queue.  A derived class, such
as SizePriority, may choose to cache the priorities for different
functions for performance reasons.

This design shields the type used for the priority away from classes
outside InlinePriority and classes derived from it.  In turn,
PriorityInlineOrder no longer needs to be a template class.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D126300
2022-06-02 22:28:33 +08:00
Joseph Huber 6bdf352ed8 [Binary] Remove OffloadBinary from the Objects enumeration
Summary:
We use the beginning and end of this enumeration to determine what is
and isn't an object format. The enumeration for the OffloadBinary was
put here by mistake which led to it being mistakenly classified as an
Object file.
2022-06-02 09:34:23 -04:00
Paul Walker 1fe4953d89 [SVE] Remove custom lowering of scalable vector MGATHER & MSCATTER operations.
Differential Revision: https://reviews.llvm.org/D126255
2022-06-02 11:19:52 +01:00
Snehasish Kumar 962db7de84 [memprof] Update summary output.
Update the YAML format print out of the profile to include a summary
instead of displaying the headers in the raw file buffer. This allows us
to release the raw buffer early saving memory.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D126834
2022-06-02 02:15:42 +00:00
Matthias Braun 850d53a197 LTO: Decide upfront whether to use opaque/non-opaque pointer types
LTO code may end up mixing bitcode files from various sources varying in
their use of opaque pointer types. The current strategy to decide
between opaque / typed pointers upon the first bitcode file loaded does
not work here, since we could be loading a non-opaque bitcode file first
and would then be unable to load any files with opaque pointer types
later.

So for LTO this:
- Adds an `lto::Config::OpaquePointer` option and enforces an upfront
  decision between the two modes.
- Adds `-opaque-pointers`/`-no-opaque-pointers` options to the gold
  plugin; disabled by default.
- `--opaque-pointers`/`--no-opaque-pointers` options with
  `-plugin-opt=-opaque-pointers`/`-plugin-opt=-no-opaque-pointers`
  aliases to lld; disabled by default.
- Adds an `-lto-opaque-pointers` option to the `llvm-lto2` tool.
- Changes the clang driver to pass `-plugin-opt=-opaque-pointers` to
  the linker in LTO modes when clang was configured with opaque
  pointers enabled by default.

This fixes https://github.com/llvm/llvm-project/issues/55377

Differential Revision: https://reviews.llvm.org/D125847
2022-06-01 18:05:53 -07:00
Hendrik Greving a92ed167f2 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-02 00:49:11 +00:00
Quentin Colombet 1a155ee7de [RegisterClassInfo] Invalidate cached information if ignoreCSRForAllocationOrder changes
Even if CSR list is same between functions, we could have had a different
allocation order if ignoreCSRForAllocationOrder is evaluated differently.
Hence invalidate cached register class information if
ignoreCSRForAllocationOrder changes.

Patch by Srividya Karumuri <srividya_karumuri@apple.com>

Differential Revision: https://reviews.llvm.org/D126565
2022-06-01 17:15:51 -07:00
Shilei Tian eb673be5ac [OMPIRBuilder] Add the support for compare capture
This patch adds the support for `compare capture` in `OMPIRBuilder`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120007
2022-06-01 19:53:43 -04:00
Joseph Huber afd2f7e991 [Binary] Promote OffloadBinary to inherit from Binary
We use the `OffloadBinary` to create binary images of offloading files
and their corresonding metadata. This patch changes this to inherit from
the base `Binary` class. This allows us to create and insepect these
more generically. This patch includes all the necessary glue to
implement this as a new binary format, along with added the magic bytes
we use to distinguish the offloading binary to the `file_magic`
implementation.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D126812
2022-06-01 18:40:57 -04:00
Chris Bieneman 129c056d62 [ObjectYAML][DX] Support yaml2dxcontainer
This patch adds a the first bits of support for a yaml representation
of dxcontainer files.

Since the YAML representation's primary purpose is testing
infrastructure, the yaml representation supports both verbose and a
more friendly format by making computable sizes and offsets optional.
If provided they are validated to be correct, otherwise they are
computed on the fly during emission.

As I expand the format I'll be able to make more size fields optional,
and I will continue to make the format easier to work with.

Depends on D124804

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D124944
2022-06-01 15:34:00 -05:00
Hendrik Greving e9d05cc7d8 Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c302.

Due to failures in Clang tests.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 13:27:49 -07:00
Chris Bieneman 9e3919dac4 [Object][DX] Parse DXContainer Parts
DXContainer files are structured as parts. This patch adds support for
parsing out the file part offsets and file part headers.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D124804
2022-06-01 14:55:36 -05:00
Hendrik Greving 430ac5c302 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 12:48:01 -07:00
Matt Arsenault 0e1c71e4a4 CodeGen: Move getAddressSpaceForPseudoSourceKind into TargetMachine
Avoid the dependency on TargetInstrInfo, which depends on the subtarget
and therefore the individual function.

Currently AMDGPU is constructing PseudoSourceValue instances in MachineFunctionInfo.
In order to facilitate copying MachineFunctionInfo, we need to stop allocating these
there. Alternatively we could allow targets to subclass PseudoSourceValueManager,
and allocate them similarly to MachineFunctionInfo.
2022-06-01 09:45:40 -04:00
Alexander Kornienko 7aa8a67882 Revert "[LAA] Initial support for runtime checks with pointer selects."
This reverts commit 5890b30105 as per discussion
on the review thread: https://reviews.llvm.org/D114487#3547560.
2022-06-01 15:24:27 +02:00
Florian Hahn f68c547158
[LAA] Remove unused RuntimeCheckingPtrGroup constructor (NFC).
The constructor is not used. Remove it.
2022-06-01 13:30:33 +01:00
Sheng 3fd75ce9c4 [NFC] fix typo 2022-06-01 18:48:03 +08:00
Martin Storsjö 298e9cac92 [MC] [Win64EH] Check that the SEH unwind opcodes match the actual instructions
It's a fairly common issue that the generating code incorrectly marks
instructions as narrow or wide; check that the instruction lengths
add up to the expected value, and error out if it doesn't. This allows
catching code generation bugs.

Also check that prologs and epilogs are properly terminated, to
catch other code generation issues.

Differential Revision: https://reviews.llvm.org/D125647
2022-06-01 11:25:49 +03:00
Martin Storsjö 6b75a3523f [ARM] [MC] Add support for writing ARM WinEH unwind info
This includes .seh_* directives for generating it from assembly.
It is designed fairly similarly to the ARM64 handling.

For .seh_handler directives, such as
".seh_handler __C_specific_handler, @except" (which is supported
on x86_64 and aarch64 so far), the "@except" bit doesn't work in
ARM assembly, as '@' is used as a comment character (on all current
platforms).

Allow using '%' instead of '@' for this purpose. This convention
is used by GAS in similar contexts already,
e.g. [1]:

    Note on targets where the @ character is the start of a comment
    (eg ARM) then another character is used instead. For example the
    ARM port uses the % character.

In practice, this unfortunately means that all such .seh_handler
directives will need ifdefs for ARM.

Contrary to ARM64, on ARM, it's quite common that we can't evaluate
e.g. the function length at this point, due to instructions whose
length is finalized later. (Also, inline jump tables end with
a ".p2align 1".)

If unable to to evaluate the function length immediately, emit
it as an MCExpr instead. If we'd implement splitting the unwind
info for a function (which isn't implemented for ARM64 yet either),
we wouldn't know whether we need to split it though.

Avoid calling getFrameIndexOffset() on an unset
FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the
preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once
MSVC exception handling is fully implemented, those changes
can be reverted.)

[1] https://sourceware.org/binutils/docs/as/Section.html#Section

Differential Revision: https://reviews.llvm.org/D125645
2022-06-01 11:25:48 +03:00
Martin Storsjö e71b07e468 [MC] [Win64EH] Wrap the epilog instructions in a struct. NFC.
For ARM SEH, the epilogs will need a little more associated data than
just the plain list of opcodes.

This is a preparatory refactoring for D125645.

Differential Revision: https://reviews.llvm.org/D125879
2022-06-01 11:25:48 +03:00
Mircea Trofin f46dd19b48 [mlgo] Incrementally update FunctionPropertiesInfo during inlining
Re-computing FunctionPropertiesInfo after each inlining may be very time
consuming: in certain cases, e.g. large caller with lots of callsites,
and when the overall IR doesn't increase (thus not tripping a size bloat
threshold).

This patch addresses this by incrementally updating
FunctionPropertiesInfo.

Differential Revision: https://reviews.llvm.org/D125841
2022-05-31 17:27:32 -07:00
Augie Fackler 42861faa8e attributes: introduce allockind attr for describing allocator fn behavior
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.

I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.

Differential Revision: https://reviews.llvm.org/D123088
2022-05-31 10:01:17 -04:00
Simon Moll 18c1ee04de Re-land "[VP] vp intrinsics are not speculatable" with test fix
Update the llvmir-intrinsics.mlir test to account for the modified
attribute sets.

This reverts commit 2e2a8a2d90.
2022-05-30 14:41:15 +02:00
Mehdi Amini 2e2a8a2d90 Revert "[VP] vp intrinsics are not speculatable"
This reverts commit 78a18d2b54.

Break MLIR bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27127
2022-05-30 12:26:16 +00:00
Simon Moll 78a18d2b54 [VP] vp intrinsics are not speculatable
VP intrinsics show UB if the %evl parameter is out of bounds - they must
not carry the speculatable attribute.  The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.

This patch
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
  expansion know whether the VP intrinsic replacement will be
  speculatable.  VP expansion may only discard %evl where this is the
  case.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125296
2022-05-30 12:20:05 +02:00
Yuki Okushi bc08a16d82
Remove `deplibs` keyword completely
D102763 removed the almost support of `deplibs` but it seems `kw_deplibs` was missed.
This patch removes it.

Differential Revision: https://reviews.llvm.org/D126527
2022-05-29 01:16:44 +09:00
eopXD 6a84579243 [LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC
Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D126350
2022-05-27 15:22:23 -07:00
Derek Schuff a205f2904d [WebAssembly] Consolidate sectionTypeToString in BinaryFormat [NFC]
Currently there are 2 duplicate implementation, and I want to add
a use in a 3rd place. Combine them in lib/BinaryFormat so they can
be shared.

Also update toString for symbol and reloc types to use StringRef

Differential Revision: https://reviews.llvm.org/D126553
2022-05-27 09:26:36 -07:00
Balazs Benics a73b50ad06 Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd1398.

Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372

/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
  177 |  { return bool(_M_comp(*__it, __val)); }
2022-05-27 11:19:18 +02:00
Balazs Benics 3988bd1398 [llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling out your own version.
There were a couple other cases where the code was similar, but not
quite the same, such as it might have an assertion in the lambda or other
constructs. Thus, I've not touched any of those, as it might change the
behavior in some way.

As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.

Differential Revision: https://reviews.llvm.org/D126068
2022-05-27 11:15:23 +02:00
Serge Pavlov bdd0093f4d [GlobalISel] Add G_IS_FPCLASS
Add a generic opcode to represent `llvm.is_fpclass` intrinsic.

Differential Revision: https://reviews.llvm.org/D121454
2022-05-27 13:49:47 +07:00
Piggy NL 842e48bd65
[demangler][RISCV] Fix for long double
Summary:
The size of long double in RISCV (both RV32 and RV64) is 16 bytes, thus
the mangled_size shouble be 32.

This patch will fix test case
"_ZN5test01hIfEEvRAcvjplstT_Le4001a000000000000000E_c"
in test_demangle.pass.cpp, which is expected to be invalid but demangler
returned "void test0::h<float>(char (&) [(unsigned int)((sizeof (float))
+ (0x0.000000004001ap-16382L))])" in RISCV environment without this patch.

Reviewed By: urnathan

Differential Revision: https://reviews.llvm.org/D126480
2022-05-27 13:53:20 +08:00
Rahman Lavaee 08cc058518 Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c5.

The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.

Differential Revision: https://reviews.llvm.org/D126518
2022-05-26 19:53:14 -07:00
Enna1 52992f136b Add !nosanitize to FixedMetadataKinds
This patch adds !nosanitize metadata to FixedMetadataKinds.def, !nosanitize indicates that LLVM should not insert any sanitizer instrumentation.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126294
2022-05-27 09:46:13 +08:00
Rahman Lavaee 3aa249329f Revert "[Propeller] Promote functions with propeller profiles to .text.hot."
This reverts commit 4d8d2580c5.
2022-05-26 18:45:40 -07:00
Rahman Lavaee 4d8d2580c5 [Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).

This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.

The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.

Differential Revision: https://reviews.llvm.org/D122930
2022-05-26 16:23:21 -07:00
Arthur Eubanks 36096c2b38 [NFC][JumpThreading] Remove InsertFreezeWhenUnfoldingSelect pass parameter
All callers pass true.

select-unfold-freeze.ll is now a subset of select.ll so delete it.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126501
2022-05-26 16:13:34 -07:00
Adrian Tong 7c13ae6490 Give option to use isCopyInstr to determine which MI is
treated as Copy instruction in MCP.

This is then used in AArch64 to remove copy instructions after taildup
ran in machine block placement

Differential Revision: https://reviews.llvm.org/D125335
2022-05-26 18:43:16 +00:00
Zongwei Lan ad73ce318e [Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
2022-05-26 11:22:41 -07:00
Bruno Cardoso Lopes ce54b22657 [Clang][CoverageMapping] Fix switch counter codegen compile time explosion
C++ generated code with huge amount of switch cases chokes badly while emitting
coverage mapping, in our specific testcase (~72k cases), it won't stop after hours.
After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_`
by 288k when compared to disabling simplification altogether.

There's probably no good way to create a testcase for this, but it's easy to
reproduce, just add thousands of cases in the below switch, and build with
`-fprofile-instr-generate -fcoverage-mapping`.

```
enum type : int {
 FEATURE_INVALID = 0,
 FEATURE_A = 1,
 ...
};

const char *to_string(type e) {
  switch (e) {
  case type::FEATURE_INVALID: return "FEATURE_INVALID";
  case type::FEATURE_A: return "FEATURE_A";}
  ...
  }

```

Differential Revision: https://reviews.llvm.org/D126345
2022-05-26 11:05:15 -07:00
Owen Anderson 939a43461b Revert "Replace the custom linked list in LeaderTableEntry with TinyPtrVector."
This reverts commit 1e91149844.

Pending further discussion.
2022-05-26 09:50:36 -07:00
Krzysztof Parzyszek aee6b8efd0 [ADT] Explicitly delete copy/move constructors and operator= in IntervalMap
The default implementations will perform a shallow copy instead of a deep
copy, causing some internal data structures to be shared between different
objects. Disable these operations so they don't get accidentally used.

Differential Revision: https://reviews.llvm.org/D126401
2022-05-26 07:58:18 -07:00
Paul Robinson 634c8ef69a [PS5] Allow dllimport/dllexport same as PS4 2022-05-26 07:01:30 -07:00
Chen Zheng d79275238f [MachineSink] replace MachineLoop with MachineCycle
reapply 62a9b36fcf and fix module build
failue:
1: remove MachineCycleInfoWrapperPass in MachinePassRegistry.def
   MachineCycleInfoWrapperPass is a anylysis pass, should not be there.
2: move the definition for MachineCycleInfoPrinterPass to cpp file.

Otherwise, there are module conflicit for MachineCycleInfoWrapperPass
in MachinePassRegistry.def and MachineCycleAnalysis.h after
62a9b36fcf.

MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().

This patch tries to use MachineCycle so that we can handle
irreducible loop better.

Reviewed By: sameerds, MatzeB

Differential Revision: https://reviews.llvm.org/D123995
2022-05-26 06:45:23 -04:00
Ivan Kosarev ad1d60c3be [FileCheck] Catch missspelled directives.
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125604
2022-05-26 11:37:19 +01:00
Fangrui Song 9ee15bba47 [MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC 2022-05-26 00:14:08 -07:00
Owen Anderson 1e91149844 Replace the custom linked list in LeaderTableEntry with TinyPtrVector.
The purpose of the custom linked list was to optimize for the case
of a single-element list. It turns out that TinyPtrVector handles
the same basic scenario even better, reducing the size of
LeaderTableEntry by 33%, and requiring only log2(N) allocations
as the size of the list grows. The only downside is that we have
to store the Value's and BasicBlock's in separate vectors, which
is slightly awkward in a few cases. Fortunately that ends up being
entirely encapsulated inside helper functions.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D125205
2022-05-25 23:52:44 -07:00
serge-sans-paille fb67d683db [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since 7030654296 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D126417
2022-05-26 08:12:34 +02:00
Snehasish Kumar ec51971eae [memprof] Keep and display symbol names in the RawMemProfReader.
Extend the Frame struct to hold the symbol name if requested
when a RawMemProfReader object is constructed. This change updates the
tests and removes the need to pass --debug to obtain the mapping from
GUID to symbol names.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D126344
2022-05-25 21:17:44 +00:00
Alexey Bataev 10f41a2147 [SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling.
Need to use all ReductionOps when propagating flags for the reduction
ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw
flags.

Differential Revision: https://reviews.llvm.org/D126371
2022-05-25 13:59:06 -07:00
Maksim Panchenko bed9efed71 [MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand()
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.

For X86, the new interface allows to fix a couple of issues:
  * Correctly adjust the value of PC-relative operands.
  * Set operand size to zero when the operand is specified implicitly.

Differential Revision: https://reviews.llvm.org/D126101
2022-05-25 13:44:32 -07:00
Christian Sigg c4bc416418 [LLVM] Add rcp.approx.ftz.f32 intrinsic
Split out from https://reviews.llvm.org/D126158.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D126369
2022-05-25 21:05:20 +02:00
Aaron Ballman 69da3b6aea Revert "[OpenMP] atomic compare fail : Parser & AST support"
This reverts commit 232bf8189e.

It broke the sanitize buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/24074

It also reproduces on Windows debug builds as a crash.
2022-05-25 13:34:34 -04:00
Zequan Wu a648724921 Reland "[llvm-pdbutil] Add options to only dump symbol record at specified offset and its parents or children with spcified depth."
This reverts commit cfb4e78252.
2022-05-25 09:57:35 -07:00
Takafumi Arakaki 18e6b8234a Allow pointer types for atomicrmw xchg
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.

Differential Revision: https://reviews.llvm.org/D124728
2022-05-25 16:20:26 +00:00
Anubhab Ghosh 9da89651a8 [llvm-objcopy][ObjectYAML][mips] Add MIPS specific ELF section indexes
This fixes https://github.com/llvm/llvm-project/issues/53998
and displays correct information in obj2yaml for SHN_MIPS_*
sections according to
https://refspecs.linuxfoundation.org/elf/mipsabi.pdf

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D123902
2022-05-25 09:01:12 -07:00
Sunil Kuravinakop ca27f3e3b2 [Clang][OpenMP] Support for omp nothing
Patch to support "#pragma omp nothing"

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D123286
2022-05-24 23:59:31 -05:00
Sunil Kuravinakop 232bf8189e [OpenMP] atomic compare fail : Parser & AST support
This is a support for " #pragma omp atomic compare fail ". It has Parser & AST support for now.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D123235
2022-05-24 23:56:42 -05:00
Chen Zheng 80c4910f3d Revert "[MachineSink] replace MachineLoop with MachineCycle"
This reverts commit 62a9b36fcf.
Cause build failure on lldb incremental buildbot:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/43994/changes
2022-05-24 22:43:37 -04:00
Chen Zheng 62a9b36fcf [MachineSink] replace MachineLoop with MachineCycle
MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().

This patch tries to use MachineCycle so that we can handle
irreducible loop better.

Reviewed By: sameerds, MatzeB

Differential Revision: https://reviews.llvm.org/D123995
2022-05-24 01:16:19 -04:00
Shraiysh Vaishay 7604c59bd2 [OpenMP][IRBuilder] `omp task` support
This patch adds basic support for `omp task` to the OpenMPIRBuilder.

The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation.

This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D71989
2022-05-24 10:22:11 +05:30
Hyoun Kyu Cho 6c12ae8163 Exposes interface to free up caching data structure in DWARFDebugLine and DWARFUnit for memory management
This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically:

* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D90006
2022-05-24 03:23:24 +00:00
Wolfgang Pieb ae9489025f [NFC][Metadata] Define move constructor and move assignment operator for MDOperand.
This is a preparatory patch for the MDNode resize functionality.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D125994
2022-05-23 20:04:45 -07:00
Sam Clegg 74f9841977 [lld][WebAssembly] Allow use of statically allocated TLS region.
It turns out we were already allocating static address space for TLS
data along with the non-TLS static data, but this space was going
unused/ignored.

With this change, we include the TLS segment in `__wasm_init_memory`
(which does the work of loading the passive segments into memory when a
module is first loaded).  We also set the `__tls_base` global to point
to the start of this segment.

This means that the runtime can use this static copy of the TLS data for
the first/primary thread if it chooses, rather than doing a runtime
allocation prior to calling `__wasm_init_tls`.

Practically speaking, this will allow emscripten to avoid dynamic
allocation of TLS region on the main thread.

Differential Revision: https://reviews.llvm.org/D126107
2022-05-23 17:27:17 -07:00
Jamie Schmeiser 24239e246c Add new hidden option -print-on-crash that prints out IR that caused opt pipeline to crash
A new hidden option -print-on-crash that prints the IR as it was upon entering
the last pass when there is a crash.

The IR is saved in its print form before each pass is started and a
signal handler is registered.  If the compilation crashes, the signal
handler will print the saved IR to dbgs().  This option
can be modified using -print-module-scope to get the IR for the complete
module.  Note that this option only works with the new pass manager.

Reviewed By: yrouban

Differential Revision: https://reviews.llvm.org/D86657
2022-05-23 15:38:38 -07:00
Mitch Phillips cead4eceb0 [symbolizer] Parse DW_TAG_variable DIs to show line info for globals
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.

This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123538
2022-05-23 13:30:22 -07:00
Sanjay Patel e8c20d995b [IR] add and use pattern match specialization for sqrt intrinsic; NFC
This was included in D126190 originally, but it's
independent and a useful change for readability.
2022-05-23 14:16:30 -04:00
Jingu Kang bb82f74612 Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth""
This reverts commit 42ebfa8269.

The commmit from https://reviews.llvm.org/D125918 has fixed the stage 2 build
failure.

Differential Revision: https://reviews.llvm.org/D118979
2022-05-23 16:15:45 +01:00
Anastasia Stulova 72832efc94 [SPIR-V] Allow setting SPIR-V version via target triple.
Currently added versions are from v1.0 to v1.5, other versions
can be added as needed.

This change also adds documentation about SPIR-V target support
in LLVM.

Differential Revision: https://reviews.llvm.org/D124776
2022-05-23 14:24:00 +01:00
Peter Waller ade47bdc31 [LV] Improve register pressure estimate at high VFs
Previously, `getRegUsageForType` was implemented using
`getTypeLegalizationCost`.  `getRegUsageForType` is used by the loop
vectorizer to estimate the register pressure caused by using a vector
type.  However, `getTypeLegalizationCost` currently only appears to
understand splitting and not scalarization, so significantly
underestimates the register requirements.

Instead, use `getNumRegisters`, which understands when scalarization
can occur (via computeRegisterProperties).

This was discovered while investigating D118979 (Set maximum VF with
shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the
loop vectorizer previously ends up costing an v128i1 as 2 v64i*
registers where it actually occupies 128 i32 registers.

I'm sending this patch early for comment, I'm still doing some sanity checking
with LNT.  I note that getRegisterClassForType appears to return VectorRC even
though the type in question (large vNi1 types) end up occupying scalar
registers. That might be worth fixing too.

Differential Revision: https://reviews.llvm.org/D125918
2022-05-23 07:57:45 +00:00
Sergei Trofimovich 5e9be93566 [Support] Add missing <cstdint> header to Base64.h
Without the change llvm build fails on this week's gcc-13 snapshot as:

    [ 91%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/Base64Test.cpp.o
    In file included from llvm/unittests/Support/Base64Test.cpp:14:
    llvm/include/llvm/Support/Base64.h: In function 'std::string llvm::encodeBase64(const InputBytes&)':
    llvm/include/llvm/Support/Base64.h:29:5: error: 'uint32_t' was not declared in this scope
       29 |     uint32_t x = ((unsigned char)Bytes[i] << 16) |
          |     ^~~~~~~~
2022-05-23 08:48:14 +01:00
Sergei Trofimovich ff1681ddb3 [Support] Add missing <cstdint> header to Signals.h
Without the change llvm build fails on this week's gcc-13 snapshot as:

    [  0%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.o
    In file included from llvm/lib/Support/Signals.cpp:14:
    llvm/include/llvm/Support/Signals.h:119:8: error: variable or field 'CleanupOnSignal' declared void
      119 |   void CleanupOnSignal(uintptr_t Context);
          |        ^~~~~~~~~~~~~~~
2022-05-23 08:48:14 +01:00
NAKAMURA Takumi cd5f3241c3 ADT::GenericCycleInfo: Hide validateTree() in -Asserts.
validateTree() is instantiated with __FILE__.
It will be pruned at link time due to -ffunction-sections but be left in
object files.
Its user is only GenericCycleInfo::compute() with assert(validateTree());
Therefore I think validateTree() may be hidden with NDEBUG.

This is a fixup for https://reviews.llvm.org/D112696
2022-05-23 01:15:02 +09:00
Paul Walker 258dac43d6 [SVE] Enable use of 32bit gather/scatter indices for fixed length vectors
Differential Revision: https://reviews.llvm.org/D125193
2022-05-22 12:32:30 +01:00
Lang Hames 55e8f721d4 [ORC] Allow FailedToMaterialize errors to outlive ExecutionSessions.
Idiomatic llvm::Error usage can result in a FailedToMaterialize error tearing
down an ExecutionSession instance. Since the FailedToMaterialize error holds
SymbolStringPtrs and JITDylib references this leads to crashes when accessing
or logging the error.

This patch modifies FailedToMaterialize to retain the SymbolStringPool and
JITDylibs involved in the failure so that we can safely report an error message
to the client, even if the error tears down the session.

The contract for JITDylibs allows the getName method to be used even after the
session has been torn down, but no other JITDylib fields should be accessed via
the FailedToMaterialize error if the ssesion has been torn down. Logging the
error is guaranteed to be safe in all cases.
2022-05-21 13:51:02 -07:00
Lang Hames f3428dafdc [ORC] Add a ~ExectionSession destructor to verify that endSession was called.
Clients are required to call ExecutionSession::endSession before destroying the
ExecutionSession. Failure to do so can lead to memory leaks and other difficult
to debug issues. Enforcing this requirement by assertion makes it easy to spot
or debug situations where the contract was not followed.
2022-05-21 09:02:01 -07:00
Benjamin Kramer c312f02594 [STLExtras] Make indexed_accessor_range operator== compatible with C++20
This would be ambigious with itself when C++20 tries to lookup the
reversed form. I didn't find a use in LLVM, but MLIR does a lot of
comparisons of ranges of different types.
2022-05-21 13:00:30 +02:00
Nikita Popov 6f0ca6fd23 [JumpThreading] Insert freeze when unfolding select
JumpThreading may convert selects into branch instructions,
in which case the condition needs to be frozen (as branch on
poison is immediate undefined behavior, unlike select on poison).

The necessary code for this is already in place, this just enables
the option.

Differential Revision: https://reviews.llvm.org/D125869
2022-05-21 11:24:27 +02:00
Ahmed Bougacha 362b4066f0 [ObjCARC] Drop nullary clang.arc.attachedcall bundles in autoupgrade.
In certain use-cases, these can be emitted by old compilers, but the
operand is now always required.  These are only used for optimizations,
so it's safe to drop them if they happen to have the now-invalid format.
The semantically-required call is already a separate instruction.

Differential Revision: https://reviews.llvm.org/D123811
2022-05-20 15:27:29 -07:00
Shilei Tian ff60a0a364 [LLVM] Add a check if should cast atomic operations to integer type
Currently for atomic load, store, and rmw instructions, as long as the
operand is floating-point value, they are casted to integer. Nowadays many
targets can actually support part of atomic operations with floating-point
operands. For example, NVPTX supports atomic load and store of floating-point
values. This patch adds a series interface functions `shouldCastAtomicXXXInIR`,
and the default implementations are same as what we currently do. Later for
targets can have their specialization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125652
2022-05-20 17:23:53 -04:00
Jay Foad 9ece051847 [AMDGPU] Mark s_get_waveid_in_workgroup as not reading memory
It is already marked as having side effects, at least in MIR. It does
not interact with anything else that is modelled as a memory access
either in IR or MachineIR.

Differential Revision: https://reviews.llvm.org/D125985
2022-05-19 21:25:46 +01:00
Jay Foad 86b55edab6 [AMDGPU] Mark s_getreg as having side effects instead of reading memory
s_getreg does not interact with anything else that is modelled as a
memory access either in IR or MachineIR.

Differential Revision: https://reviews.llvm.org/D125968
2022-05-19 21:25:46 +01:00
Jennifer Yu 7aa9c39381 [Clang][[OpenMP5.1] Initial parser/sema for default(private) clause
This implements the default(private) clause as defined in OMP5.1

Differential Revision: https://reviews.llvm.org/D125912
2022-05-19 12:43:13 -07:00
Keith Smiley 066243057f
[Object] Fix updating darwin archives
When creating an archive, llvm-ar looks at the host to determine the
archive format to use, on Apple platforms this means it uses the
K_DARWIN format. K_DARWIN is _virtually_ equivalent to K_BSD, expect for
some very slight differences around padding, timestamps in deterministic
mode, and 64 bit formats. When updating an archive using llvm-ar, or
llvm-objcopy, Archive would try to determine the kind, but it was not
possible to get K_DARWIN in the initialization of the archive, because
they're virtually inciting usable from K_BSD, especially since the
slight differences only apply in very specific cases. This leads to
linker failures when the alignment workaround is not applied to an
archive copied with llvm-objcopy. This change teaches Archive to infer
the K_DARWIN type in the cases where it's possible and the first object
in the archive is a macho object. This avoids using the host triple to
determine this to not affect cross compiling.

Ideally we would eliminate the separate K_DARWIN type entirely since
it's not a truly separate archive type, but then we'd have to force the
macho workarounds on the BSD format generally. This might be acceptable
but then it would be unclear how to handle this case without forcing the
K_DARWIN64 format on all BSD users:

```
if (LastOffset >= Sym64Threshold) {
  if (Kind == object::Archive::K_DARWIN)
    Kind = object::Archive::K_DARWIN64;
  else
    Kind = object::Archive::K_GNU64;
}
```

The logic used to determine if the object is macho is derived from the
logic llvm-ar uses.

Previous context:

- 111cd669e9
- 23a76be5ad

Differential Revision: https://reviews.llvm.org/D124895
2022-05-19 10:56:26 -07:00
Lang Hames 4bb18a89c4 [ORC] Add missing std::moves, pass SymbolLookupSet by value.
Avoids some unnecessary SymbolStringPtr copies.
2022-05-19 10:51:20 -07:00
Paul Walker d640442518 [NFC] Fix a couple of whitespace issues. 2022-05-19 17:27:09 +00:00
Sotiris Apostolakis ca7c307d18 [SelectOpti][1/5] Setup new select-optimize pass
This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.

Initially, this new pass is expected to be enabled only for instrumentation-based PGO.

RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D120230
2022-05-19 16:31:10 +00:00
Joseph Huber dbffa4073c [NVVM] Update intrinsic defintions to include the `nocallback` attribute
This patch adds the `nocallback` attribute to the NVVM intrinsics that
did not use the `DefaultAttrsIntrinsic` method that includes it already.
The `nocallback` attribute states that the intrinsic function cannot
enter back into the caller's translation-unit. This allows as to
determine that a function calling a `nocallback` function can have the
`norecurse` attribute.  This should be safe for all the NVVM intrinsics
because they do not call other functions within the translation unit.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125937
2022-05-19 12:30:35 -04:00
Amy Kwan c35ca3a1c7 [PowerPC] Implement XL compat __fnabs and __fnabss builtins.
This patch implements the following floating point negative absolute value
builtins that required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```

These builtins will emit :
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.

Differential Revision: https://reviews.llvm.org/D125506
2022-05-19 11:28:40 -05:00
Jay Foad 4e432f1b7c [APInt] Deprecate truncOrSelf, zextOrSelf and sextOrSelf
Differential Revision: https://reviews.llvm.org/D125558
2022-05-19 11:23:13 +01:00
Michael Kruse 797fabaab2 [Analysis] Avoid virtual dtor. NFC.
Replace virtual destructor by a protected non-virtual one. Additionally also making derived structs as virtual avoids the warning from reappearing.

Also see the mailing list discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220516/1038290.html

Reviewed By: dblaikie, YangKeao

Differential Revision: https://reviews.llvm.org/D125830
2022-05-18 17:41:17 -05:00
Yusra Syeda 1e14b1a797 [SystemZ][z/OS] Add missing include to llvm/include/llvm/BinaryFormat/GOFF.h
Differential Revision: https://reviews.llvm.org/D125921
2022-05-18 15:57:05 -04:00
Michael Kitzan 29bebb0237 [GISel] Add new combines for G_FMINNUM/MAXNUM and G_FMINIMUM/MAXIMUM
I noticed https://reviews.llvm.org/D87415 added SDAG combines to fold
FMIN/MAX instrs with NaNs.

The patch implements the same NaN combines for GISel GMIR FMIN/MAX opcodes:
G_FMINNUM(X, NaN) -> X
G_FMAXNUM(X, NaN) -> X
G_FMINIMUM(X, NaN) -> NaN
G_FMAXIMUM(X, NaN) -> NaN

The patch adds AArch64 tests for these combines as well.

Reviewed by: arsenm

Differential revision: https://reviews.llvm.org/D125819
2022-05-18 12:08:53 -07:00
Martin Storsjö d4257fbbba [llvm-readobj] Improve printing of Windows ARM packed unwind info
Fix a couple minor details in the existing logic for calculating
saved registers and stack adjustment.

Synthesize the corresponding prologues and epilogues and print them.
(This supersedes the previous printout of one single list of stored
registers; as there's lots of minor nuance differences in how
registers are pushed/popped in various corner cases, it's better to
print the full prologue/epilogue instead of trying to condense it
into one single list.)

Print the raw values of the fields Reg, R, L (LinkRegister) and C
(Chaining) instead of only printing the derived values.

Differential Revision: https://reviews.llvm.org/D125644
2022-05-18 21:33:08 +03:00
Yusra Syeda 5ac411aea8 [SystemZ][z/OS] Add the PPA1 to SystemZAsmPrinter
Differential Revision: https://reviews.llvm.org/D125725
2022-05-18 14:13:17 -04:00
Nikita Popov e1d47d86d8 [IR] Report whether replaceUsesOfWith() changed something (NFC)
With change reporting in transformation passes in mind.
2022-05-18 11:46:28 +02:00
Nikita Popov e9a1c82d69 [SCEVExpander] Expand umin_seq using freeze
%x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y).
This patch changes the expansion to umin(%x, freeze %y) instead
(https://alive2.llvm.org/ce/z/wujUhp).

The motivation for this change are the test cases affected by
D124910, where the freeze expansion ultimately produces better
optimization results. This is largely because
`(%x umin_seq %y) == %x` is a common expansion pattern, which
reliably optimizes in freeze representation, but only sometimes
with the zero comparison (in particular, if %x == 0 can fold to
something else, we generally won't be able to cover reasonable
code from this.)

Differential Revision: https://reviews.llvm.org/D125372
2022-05-18 09:53:07 +02:00
Stanislav Mekhanoshin dee3190293 [AMDGPU] Add llvm.amdgcn.global.load.lds intrinsic
Differential Revision: https://reviews.llvm.org/D125279
2022-05-17 12:35:27 -07:00
Stanislav Mekhanoshin 791ec1c68e [AMDGPU] Add intrinsics llvm.amdgcn.{raw|struct}.buffer.load.lds
Differential Revision: https://reviews.llvm.org/D124884
2022-05-17 10:32:13 -07:00
Ruobing Han e1cf702a02 fix typo error in DivergenceAnalysis.h
Fix a typo error in the comment in DivergenceAnalysis.h

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D125808
2022-05-17 17:20:17 +00:00
Michael Kruse bd93df937a [Polly] Mark classes as final by default. NFC.
This make is obivious that a class was not intended to be derived from.

NPM analysis pass can unfortunately not marked as final because they are
derived from a llvm::Checker<T> template internally by the NPM.

Also normalize the use of classes/structs
 * NPM passes are structs
 * Legacy passes are classes
 * structs that have methods and are not a visitor pattern are classes
 * structs have public inheritance by default, remove "public" keyword
 * Use typedef'ed type instead of inline forward declaration
2022-05-17 12:05:39 -05:00
Nikita Popov 5df22e507b [IRBuilder] Move insertvalue/extractvalue to fold infrastructure
Move from the old CreateXYZ() to the new FoldXYZ() mechanism.

This change is likely NFC in practice, because I don't think that
the places using InstSimplifyFolder use insertvalue/extractvalue.
2022-05-17 16:04:55 +02:00
Jay Foad 77480556c4 [RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness
Add a new TargetRegisterInfo hook to allow targets to tweak the
priority of live ranges, so that AllocationPriority of the register
class will be treated as more important than whether the range is local
to a basic block or global. This is determined per-MachineFunction.

Differential Revision: https://reviews.llvm.org/D125102
2022-05-17 12:35:21 +01:00
Alexey Lapshin 4d9c083437 [DWARFLinker][NFC] Add None value to the DwarfLinkerAccelTableKind enum.
this review is extracted from D86539.

1. Rename AccelTableKind to DwarfLinkerAccelTableKind
   (to differentiate from AccelTableKind from CodeGen/AsmPrinter/DwarfDebug.h)

2. Add None value to the DwarfLinkerAccelTableKind.

3. added 'None' value for 'accelerator' option of dsymutil.

Differential Revision: https://reviews.llvm.org/D125474
2022-05-17 12:32:32 +03:00
Nikita Popov a694546f7c [KnownBits] Add operator==
Checking whether two KnownBits are the same is somewhat common,
mainly in test code.

I don't think there is a lot of room for confusion with "determine
what the KnownBits for an icmp eq would be", as that has a
different result type (this is what the eq() method implements,
which returns Optional<bool>).

Differential Revision: https://reviews.llvm.org/D125692
2022-05-17 09:38:13 +02:00
luxufan 63c81b23be [RISCV] Support getHostCpuName for sifive-u74
Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D123978
2022-05-17 14:06:59 +08:00
Yang Keao 7dce9eb6e5 [DomPrinter] Migrate -dot-dom to the new pass manager.
In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D124904
2022-05-16 15:07:16 -05:00
Paul Walker 7dd05ba9ed [SelectionDAG] Remove duplicate "is scaled" information from gather/scatter SDNodes.
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:

* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)

Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.

I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used as I
think being explicitly looks better.

Depends On D123347

Differential Revision: https://reviews.llvm.org/D123381
2022-05-16 20:47:52 +01:00
zhijian 52c615553c [AIX] fixed llvm-ar can not read empty big archive correctly.
Summary:

llvm-ar can not read empty big archive correctly. it output error as
error: unable to load 'empty.a': truncated or malformed archive (characters in size field in archive member header are not all decimal numbers: '<bigaf>'

Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D124017
2022-05-16 14:29:37 -04:00
Rahman Lavaee 5f7ef65245 [llvm-objdump] Let --symbolize-operands symbolize basic block addresses based on the SHT_LLVM_BB_ADDR_MAP section.
`--symbolize-operands` already symbolizes branch targets based on the disassembly. When the object file is created with `-fbasic-block-sections=labels` (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case `llvm-objdump` can annotate the disassembly based on labels inferred on this section.

In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks).

The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D124560
2022-05-16 10:11:11 -07:00
Adrian Prantl 9c7c8be4a3 Remove stale file from modulemap 2022-05-16 10:05:38 -07:00
Hongtao Yu acfd0a3456 [llvm-profgen] Update callsite body samples by summing up all call target samples.
Current profile generation caculcates callsite body samples and call target samples separately. The former is done based on LBR range samples while the latter is done based on branch samples. Note that there's a subtle difference. LBR ranges is formed from two consecutive branch samples. Therefore the last entry in a LBR record will not be counted towards body samples while there's still a chance for it to be counted towards call targets if it is a function call. I'm making sense of the call body samples by updating it to the aggregation of call targets.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D122609
2022-05-16 09:13:37 -07:00
Ellis Hoag 6e23cd2bf0 [InstrProf][NFC] Save profile bias to function map
Add a map from functions to load instructions that compute the profile bias. Previously we assumed that if the first instruction in the function was a load instruction, then it must be computing the bias. This was likely to work out because functions usually start with the `llvm.instrprof.increment` instruction, but optimizations could change this. For example, inlining into a non-profiled function.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D114319
2022-05-16 08:32:31 -07:00
Sanjay Patel be7f09f7b2 [IR] create and use helper functions that test the signbit; NFCI 2022-05-16 11:26:23 -04:00
Philip Reames 55e2df7285 [LiveIntervals] Add range accessors for value numbers [nfc] 2022-05-16 08:23:12 -07:00
Florian Hahn b7315ffc3c
[LAA,LV] Add initial support for pointer-diff memory checks.
This patch adds initial support for a pointer diff based runtime check
scheme for vectorization. This scheme requires fewer computations and
checks than the existing full overlap checking, if it is applicable.

The main idea is to only check if source and sink of a dependency are
far enough apart so the accesses won't overlap in the vector loop. To do
so, it is sufficient to compute the difference and compare it to the
`VF * UF * AccessSize`. It is sufficient to check
`(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards
dependence in the vector loop with the given VF and UF. If Src >=u Sink,
there is not dependence preventing vectorization, hence the overflow
should not matter and using the ULT should be sufficient.

Note that the initial version is restricted in multiple ways:

1. Pointers must only either be read or written, by a single
   instruction (this allows re-constructing source/sink for
   dependences with the available information)
 2. Source and sink pointers must be add-recs, with matching steps
 3. The step must be a constant.
 3. abs(step) == AccessSize.

Most of those restrictions can be relaxed in the future.

See https://github.com/llvm/llvm-project/issues/53590.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D119078
2022-05-16 15:27:22 +01:00
Nikita Popov 8ab819ad90 [ConstantRange] Add toKnownBits() method
Add toKnownBits() method to mirror fromKnownBits(). We know the
top bits that are constant between min and max.

The return value for an empty range is chosen to be conservative.
2022-05-16 16:12:25 +02:00
Liqin.Weng ff3f4988ed [CodeGen] Use ArrayRef in TargetLowering functions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123656
2022-05-16 13:30:58 +00:00
Sheng aab5bd180a [ADT] Adopt the new casting infrastructure for PointerUnion
Reviewed By: lattner, bzcheeseman

Differential Revision: https://reviews.llvm.org/D125609
2022-05-16 18:40:05 +08:00
Abinav Puthan Purayil 485dd0b752 [GlobalISel] Handle constant splat in funnel shift combine
This change adds the constant splat versions of m_ICst() (by using
getBuildVectorConstantSplat()) and uses it in
matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is
shortened to getIConstantSplatVal() so that the *SExtVal() version would
have a more compact name.

Differential Revision: https://reviews.llvm.org/D125516
2022-05-16 16:03:30 +05:30
Nicolas Abram Lujan 436bbce765 [llvm-c] Add functions for enabling and creating opaque pointers
This is based on https://reviews.llvm.org/D125168 which adds a
wrapper to allow use of opaque pointers from the C API.

I added an opaque pointer mode test to echo.ll, and to fix assertions
that forbid the use of mixed typed and opaque pointers that were
triggering in it I had to also add wrappers for setOpaquePointers()
and isOpaquePointer().

I also changed echo.ll to remove a bitcast i32* %x to i8*, because
passing it through llvm-as and llvm-dis was generating a
%0 = bitcast ptr %x to ptr, but when building that same bitcast in
echo.cpp it was getting elided by IRBuilderBase::CreateCast
(08ac661248/llvm/include/llvm/IR/IRBuilder.h (L1998-L1999)).

Differential Revision: https://reviews.llvm.org/D125183
2022-05-16 10:53:46 +02:00
stk 9902a0945d Add ThreadPriority::Low, and use QoS class Utility on Mac
On Apple Silicon Macs, using a Darwin thread priority of PRIO_DARWIN_BG seems to
map directly to the QoS class Background. With this priority, the thread is
confined to efficiency cores only, which makes background indexing take forever.

Introduce a new ThreadPriority "Low" that sits in the middle between Background
and Default, and maps to QoS class "Utility" on Mac. Make this new priority the
default for indexing. This makes the thread run on all cores, but still lowers
priority enough to keep the machine responsive, and not interfere with
user-initiated actions.

I didn't change the implementations for Windows and Linux; on these systems,
both ThreadPriority::Background and ThreadPriority::Low map to the same thread
priority. This could be changed as a followup (e.g. by using SCHED_BATCH for Low
on Linux).

See also https://github.com/clangd/clangd/issues/1119.

Reviewed By: sammccall, dgoldman

Differential Revision: https://reviews.llvm.org/D124715
2022-05-16 10:01:49 +02:00
bzcheeseman 0809f63826 [LLVM][Casting.h] Add trivial self-cast
Casting from a type to itself should always be possible. Make this simple for all users, and add tests to ensure we keep being able to do this. Ref: https://reviews.llvm.org/D125543

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125590
2022-05-15 22:22:16 -07:00
Alexey Lapshin fdae8641ad [DWARFLinker][NFC] cleanup AddressManager interface.
this review is extracted from D86539

1. delete areRelocationsResolved() method.
2. rename hasLiveMemoryLocation() -> isLiveVariable()
          hasLiveAddressRange() -> isLiveSubprogram().

Differential Revision: https://reviews.llvm.org/D125492
2022-05-15 22:47:04 +03:00
Sheng c644488a8b Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`
The name `MCFixedLenDisassembler.h` is out of date after D120958.

Rename it as `MCDecoderOps.h` to reflect the change.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D124987
2022-05-15 08:44:58 +08:00
Alex Brachet a74d9e74e5 [ifs] Add --strip-size flag
st_size may not be of importance to the abi if you are not using
copy relocations. This is helpful when you want to check the abi
of a shared object both when instrumented and not because asan
will increase the size of objects to include the redzone.

Differential revision: https://reviews.llvm.org/D124792
2022-05-14 18:50:20 +00:00
Alex Brachet 1f61260847 Revert "[ifs] Add --strip-size flag"
This reverts commit b6b0fd6a94.
2022-05-14 17:33:27 +00:00
Alex Brachet b6b0fd6a94 [ifs] Add --strip-size flag
st_size may not be of importance to the abi if you are not using
copy relocations. This is helpful when you want to check the abi
of a shared object both when instrumented and not because asan
will increase the size of objects to include the redzone.

Differential revision: https://reviews.llvm.org/D124792
2022-05-14 17:25:50 +00:00
Xiaodong Liu 7ff7001ba9 [llvm] Fix comment nits in Module class, NFC.
There is no member called "GlobalValRefMap" in Module class.
It has been changed to "GlobalList".

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125187
2022-05-14 17:41:37 +08:00
Jay Foad 169ae6db69 [APInt] Allow extending and truncating to the same width
Allow zext, sext, trunc, truncUSat and truncSSat to extend or truncate
to the same bit width, which is a no-op.

Disallowing this forced clients to use workarounds like using
zextOrTrunc (even though they never wanted truncation) or zextOrSelf
(even though they did not want its strange behaviour of allowing a
*smaller* bit width, which is also treated as a no-op).

Differential Revision: https://reviews.llvm.org/D125556
2022-05-14 09:54:24 +01:00
Wolfgang Pieb 2740c1875d [NFC][Metadata] Refactor allocation, initalization and deletion of MDNodes.
This patch is refactoring the allocation, initialization and deletion
of MDNodes. It is intended as a preparatory patch for the upcoming
addition of dynamic resizability of MDNodes. It is fundamentally NFC,
but removes the necessity for suppressing the memory sanitizer for
MDNode's operator delete.

Reviewers: dexonsmith

Differential Revision: https://reviews.llvm.org/D125489
2022-05-13 16:05:29 -07:00
bzcheeseman c758708018 [LLVM][Casting.h] Add ForwardToPointerCast trait
Addresses use cases in Clang/MLIR that need pointer-to-pointer, reference-to-reference, and value-to-value casts from/to the same types. This should reduce boilerplate by allowing the user to simply specify the pointer cast and forward the reference cast directly to the pointer cast.

This cast trait DOES NOT implement `castFailed` and `doCastIfPossible` because in the general case doing so could result in a nullptr dereference. Users can use `NullableValueCastFailed` and `DefaultDoCastIfPossible` as desired for those cases where `nullptr` is acceptable.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125576
2022-05-13 18:48:50 -04:00
bzcheeseman bc65fc8bb3 [LLVM][Casting.h] Remove CastInfo pointer partial specialization.
Since cast_convert_val now has pointer specializations, we don't need the pointer partial specialization for CastInfo. We want to trim these down when possible to avoid future ambiguous partial specialization errors.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125578
2022-05-13 18:31:10 -04:00
Alexander Shaposhnikov badd088c57 [GlobalOpt] Enable optimization of constructors with different priorities
Adjust `optimizeGlobalCtorsList` to handle the case of different priorities.
This addresses the issue https://github.com/llvm/llvm-project/issues/55083.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D125278
2022-05-13 22:19:29 +00:00
Amara Emerson 41fef10449 [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef.
Differential Revision: https://reviews.llvm.org/D125041
2022-05-13 12:20:34 -07:00
Hongtao Yu 1662cfa4be [CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.
When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are treated with double amount of samples. The other is the inlined callsites are reconsidered for subsequent PGO ICP.

I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that callsite target sum and the number of body samples for that callsite could be mismatched. {D122609} has an explanation and a fix for that on llvm-profgen side. For now I'm tolerating it in this change.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D125266
2022-05-13 09:19:32 -07:00
zhijian fe3b621f05 [AIX] support write operation of big archive.
SUMMARY

1. Enable supporting the write operation of big archive.
2. the first commit come from https://reviews.llvm.org/D104367
3. refactor the first commit and implement writing symbol table.
4. fixed the bugs and add more test cases in the second commit.

Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D123949
2022-05-13 10:40:15 -04:00
Jay Foad fcbf617dcc [APInt] Fix documentation of *OrSelf methods
Document that truncOrSelf, zextOrSelf and sextOrSelf only enforce
an upper or lower bound on the bitwidth of the result.
2022-05-13 15:26:41 +01:00
Jonas Paulsson eaa78035c6 [SystemZ] Patchset for expanding memcpy/memset using at most two stores.
* Set MaxStoresPerMemcpy and MaxStoresPerMemset to 2.

* Optimize stores of replicated values in SystemZ::combineSTORE(). This
  handles the now expanded memory operations and as well some other
  pre-existing cases.

* Reject a big displacement in isLegalAddressingMode() for a vector type.

* Return true from shouldConsiderGEPOffsetSplit().

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122105
2022-05-13 15:31:09 +02:00
Nikita Popov ed1cb01baf [IRBuilder] Add IsInBounds parameter to CreateGEP()
We commonly want to create either an inbounds or non-inbounds GEP
based on a boolean value, e.g. when preserving inbounds from
existing GEPs. Directly accept such a boolean in the API, rather
than requiring a ternary between CreateGEP and CreateInBoundsGEP.

This change is not entirely NFC, because we now preserve an
inbounds flag in a constant expression edge-case in InstCombine.
2022-05-13 14:30:55 +02:00
Nathan Sidwell 562ce15924 [demangler] Avoid special-subst code duplication
We need to expand special substitutions in four different ways.  This
refactors to only have one conversion from enum to string, and derive
the other 3 needs off that.

The SpecialSubstitution node is derived from the
ExpandedSpecialSubstitution.  While this may seem unintuitive, it
works out quite well, as SpecialSubstitution can then use the former's
getBaseName and remove an unneeded 'basic_' prefix, for those
substitutions that are instantiations (to known typedef).  Similarly
all those instantiations use the same set of template arguments (with
'basic_string', getting an additional 'allocator' arg).

Expansion tests were added in D123134, and remain unchanged.

Reviewed By: MaskRay, dblaikie

Differential Revision: https://reviews.llvm.org/D125257
2022-05-13 04:35:29 -07:00
Nikita Popov 0485211dd0 [IRBuilder] Remove redundant createGEP() overloads (NFC)
ArrayRef<Value *> also accepts a single Value *, there's no need
to create separate overloads for this.
2022-05-13 12:43:21 +02:00
Zakk Chen 7dfc56c107 [RISCV] Add the passthru operand for RVV unmasked segment load IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125323
2022-05-13 02:16:40 -07:00
Jay Foad 26e1ebd3ea [GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt
Previously it built MIR for the results and returned a Register.

This avoids building constants for earlier elements of the vector if
later elements will fail to fold, and allows CSEMIRBuilder::buildInstr
to avoid unconditionally building a copy from the result.

Use a new helper function MachineIRBuilder::buildBuildVectorConstant
to build a G_BUILD_VECTOR of G_CONSTANTs.

Differential Revision: https://reviews.llvm.org/D117758
2022-05-13 09:33:07 +01:00
Alex Bradbury cb778e9328 [WebAssembly] Implement ref.is_null MC layer support and codegen
Custom type-checking (in WebAssemblyAsmTypeCheck.cpp) is used to
workaround the fact that separate variants of the instruction are
defined for externref and funcref.

Based on an initial patch by Paulo Matos <pmatos@igalia.com>.

Differential Revision: https://reviews.llvm.org/D123484
2022-05-13 07:08:10 +01:00
bzcheeseman 0be41ed5bb [LLVM][Casting.h] Don't create a temporary while casting.
C-style casting can create a temporary when compiled by a C++ compiler, which was emitting a warning casting a reference to another reference. We can't use C++-style casting directly because it doesn't always work with incomplete types. In order to support the current use-cases, for references we switch to pointer space to perform the cast.

Reviewed By: qiongsiwu1

Differential Revision: https://reviews.llvm.org/D125482
2022-05-12 23:11:02 -04:00
Chen Zheng 0ca2b93cc2 [NFC] add the missing //@}
address code review comments for D123995
2022-05-12 22:43:35 -04:00
Florian Hahn 5890b30105
[LAA] Initial support for runtime checks with pointer selects.
Scaffolding support for generating runtime checks for multiple SCEV expressions
per pointer. The initial version just adds support for looking through
a single pointer select.

The more sophisticated logic for analyzing forks is in D108699

Reviewed By: huntergr

Differential Revision: https://reviews.llvm.org/D114487
2022-05-12 19:33:48 +01:00
Pedro Olsen Ferreira 28a0b94d22 Rename and fix ValueMap::resize to reserve
The underlying map type (DenseMap) has had its resize() function
renamed to reserve() as part of
c04fc7a60f (SVN 264026).

This is only visible when the member function is called, as it is
template type name dependent.

Differential Revision: https://reviews.llvm.org/D125387
2022-05-12 13:44:47 +01:00
Dmitry Vassiliev 7b22cf12ef [Intrinsics] Fix `nvvm_prmt` intrinsic attributes
`nvvm_prmt` doesn't seem to be `commutative`. nvvm also sets `IntrSpeculatable` for it.
Here is the doc https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-prmt

Reviewed By: tra, jchlanda

Differential Revision: https://reviews.llvm.org/D125423
2022-05-12 10:46:03 +02:00
bzcheeseman f156b51aec [LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed.
This patch expands the expressive capability of the casting utilities in LLVM by introducing several levels of configurability. By creating modular CastInfo classes we can enable projects like MLIR that need more fine-grained control over how a cast is actually performed to retain that control, while making it easy to express the easy cases (like a checked pointer to pointer cast).

The current implementation of Casting.h doesn't make it clear where the entry points for customizing the cast behavior are, so part of the motivation for this patch is adding that documentation. Another part of the motivation is to support using LLVM RTTI with a wider set of use cases, such as nullable value to value casts, or pointer to value casts (as in MLIR).

Reviewed By: lattner, rriddle

Differential Revision: https://reviews.llvm.org/D123901
2022-05-12 00:15:09 -04:00
Austin Kerbow 2db700215a [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If
there is a need to have highly optimized codegen and kernel developers have
knowledge of inter-wave runtime behavior which is unknown to the compiler this
builtin can be used to tune scheduling.

This intrinsic creates a barrier between scheduling regions. The immediate
parameter is a mask to determine the types of instructions that should be
prevented from crossing the sched_barrier. In this initial patch, there are only
two variations. A mask of 0 means that no instructions may be scheduled across
the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing
instructions may cross the sched_barrier.

Note that this intrinsic is only meant to work with the scheduling passes. Any
other transformations that may move code will not be impacted in the ways
described above.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D124700
2022-05-11 13:22:51 -07:00
River Riddle 5a9a438a54 [TableGen] Refactor TableGenParseFile to no longer use a callback
Now that TableGen no longer relies on global Record state, we can allow
for the client to own the RecordKeeper and SourceMgr. Given that TableGen
internally still relies on the global llvm::SrcMgr, this method unfortunately
still isn't thread-safe.

Differential Revision: https://reviews.llvm.org/D125277
2022-05-11 11:55:33 -07:00
River Riddle 2ac3cd20ca [TableGen] Remove the use of global Record state
This commits removes TableGens reliance on managed static global record state
by moving the RecordContext into the RecordKeeper. The RecordKeeper is now
treated similarly to a (LLVM|MLIR|etc)Context object and is passed to static
construction functions. This is an important step forward in removing TableGens
reliance on global state, and in a followup will allow for users that parse tablegen
to parse multiple tablegen files without worrying about Record lifetime.

Differential Revision: https://reviews.llvm.org/D125276
2022-05-11 11:55:33 -07:00
Joseph Huber e7858a9fab [Cuda] Add initial support for wrapping CUDA images in the new driver.
This patch adds the initial support for wrapping CUDA images. This
requires changing some of the logic for how we bundle images. We now
need to copy the image for all kinds that are active for the
architecture. Then we need to run a separate wrapping job if the Kind is
Cuda. For cuda wrapping we need to use the `fatbinary` program from the
CUDA SDK to bundle all the binaries together. This is then passed to a
new function to perfom the actual module code generation that will be
implemented in a later patch.

Depends on D120273 D123471

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D123810
2022-05-11 07:30:23 -04:00
Nikita Popov c1bb4a881e [SCEVExpander] Deduplicate min/max expansion code (NFC) 2022-05-11 12:11:11 +02:00
Fraser Cormack c1d48b35d8 [SelectionDAG][VP] Rename VP sext/zext/trunc ISD opcodes
Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having
VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP
counterparts.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125298
2022-05-11 10:25:51 +01:00
Arthur Eubanks 7e0802aeb5 [BasicAA] Fix order in which we pass MemoryLocations to alias()
D98718 caused the order of Values/MemoryLocations we pass to alias() to
be significant due to storing the offset in the PartialAlias case. But
some callers weren't audited and were still passing swapped arguments,
causing the returned PartialAlias offset to be negative in some
cases. For example, the newly added unittests would return -1
instead of 1.

Fixes #55343, a miscompile.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D125328
2022-05-10 12:05:38 -07:00
Matthias Braun cd19af74c0 Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to
give targets a chance to control the type used in
`CodeGenPrepare::optimizeSwitchInst`.

Implement callback for X86 to avoid i8 and i16 types where possible as
they often incur extra zero-extensions.

This is NFC for non-X86 targets.

Differential Revision: https://reviews.llvm.org/D124894
2022-05-10 10:00:10 -07:00
Nicolai Hähnle 38bb46523f GlobalISel: Trivial documentation and comment fixes
Differential Revision: https://reviews.llvm.org/D124808
2022-05-10 07:48:56 -05:00
Craig Topper 39e63bd2d8 [IR][CostModel] A scalable vector shuffle can't be an identity or reverse shuffle.
Even if the minimum number of elements is 1 and the length doesn't change,
we don't know what vscale is so we can't classify it as identity mask. Instead it
is a zero element splat.

For reverse, we shouldn't classify it as a reverse unless there are at least 2 elements
in the mask. This applies to both fixed and scalable vectors. For fixed vectors, a single
element would be an identity shuffle. For scalable vector it's a zero elt splat.

Reviewed By: sdesmalen, liaolucy

Differential Revision: https://reviews.llvm.org/D124655
2022-05-09 21:37:25 -07:00
Mircea Trofin c35ad9ee4f [mlgo] Support exposing more features than those supported by models
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers has 2 roles: one, keep the compiler code simple. Second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.

For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.

Differential Revision: https://reviews.llvm.org/D124565
2022-05-09 18:01:21 -07:00
Michael Kruse 588ffdaf37 [polly] Fix compiler warning. NFC.
Fix the warning

   warning: 'polly::ScopViewer' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor]

and for several other classes by inserting virtual destructors.
2022-05-09 14:04:40 -05:00
Michael Kruse 6b3b87376b [polly] migrate -polly-show to the new pass manager
Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D123678
2022-05-09 14:04:29 -05:00
Michael Kruse a6b399ad79 [PassManager] Implement DOTGraphTraitsViewer under NPM
Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D123677
2022-05-09 14:04:28 -05:00
Kazu Hirata 3f64f03289 [CodeGen] Clarify the semantics of ADDCARRY/SUBCARRY
This patch clarifies the semantics of ADDCARRY/SUBCARRY, specifically
stating that both the incoming and outgoing carries are active high.

Differential Revision: https://reviews.llvm.org/D125130
2022-05-09 10:17:00 -07:00
Alexey Bataev 9dc4ced204 [SLP]Try partial store vectorization if supported by target.
We can try to vectorize number of stores less than MinVecRegSize
/ scalar_value_size, if it is allowed by target. Gives an extra
opportunity for the vectorization.

Fixes PR54985.

Differential Revision: https://reviews.llvm.org/D124284
2022-05-09 09:48:15 -07:00
Nathan Sidwell bc150a07f1 [demangler] No need to space adjacent template closings
With the demangler parenthesizing 'a >> b' inside template parameters,
because C++11 parsing of >> there, we don't really need to add spaces
between adjacent template arg closing '>' chars.  In 2022, that just
looks odd.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D123134
2022-05-09 06:14:44 -07:00
Nathan Sidwell e48cd7088b [demangler] Buffer peeking needs buffer
The output buffer has a 'back' member, which returns NUL when you try
it with an empty buffer.  But there are no use cases that need that
additional functionality.  This makes the 'back' member behave more
like STL containers' back members.  (It still returns a value, not a
reference.)

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123201
2022-05-09 04:17:22 -07:00
Philipp Tomsich 91b24b0180 [AArch64] Ampere1 does not support MTE
The initial support for the Ampere1 mistakenly signalled support for
the MTE feature.  However, the core does not include the optional MTE
functionality.

Update the target parser to not include MTE for Ampere1.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D125191
2022-05-09 11:29:42 +02:00
Sam McCall d44ffd631c [Bitstream] Only consider flushing to file on block boundaries
The goal of flushing to disk is to keep a reasonable bound on peak memory usage.
With a a default threshold of 512MB (and most BitstreamWriters having no backing
file at all), checking after every byte whether to flush seems excessive.

This change makes clangd's unittests run 5% faster (in opt), so it's not
actually free even in the case with no backing file. Likely there are more
important workloads where it makes some difference.

Differential Revision: https://reviews.llvm.org/D125145
2022-05-07 16:57:03 +02:00
Amaury Séchet c2c259224b const char* for LLVMTargetMachineEmitToFile's argument
The `LLVMTargetMachineEmitToFile` takes a `char* Filename` right now, but it doesn't modify it.
This is annoying to use in the case where you want to pass a const string, because you either have to remove the const, or copy it somewhere else and pass that. Either way, it's not very nice.

I added a const and clang formatted it. This shouldn't break any ABI in my opinion.
I'm sorry but I didn't know whom to put as reviewer for this, so I chose someone with a lot of commits from the .cpp file.

Reviewed By: deadalnix

Differential Revision: https://reviews.llvm.org/D124453
2022-05-07 14:40:55 +00:00
Serge Pavlov eb28da89a6 [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed in runtime. It happened because constrained
intrinsics usually have side effect, it is used to model the interaction
with floating-point environment. In some cases side effect is actually
absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it actually absents.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-07 19:04:11 +07:00
Paul Walker 702c4ade22 [ISD::IndexType] Helper functions for common queries.
Add helper functions to query the signed and scaled properties
of ISD::IndexType along with functions to change them.

Remove setIndexType from MaskedGatherSDNode because it only has
one usage and typically should only be changed alongside its
index operand.

Minimise the direct use of the enum values to lay the groundwork
for more refactoring.

Differential Revision: https://reviews.llvm.org/D123347
2022-05-07 11:23:42 +01:00
Sam McCall 0a83ff83af [FuzzMutate] Move LLVM module (de)serialization from FuzzerCLI -> IRMutator. NFC
These are not directly related to the CLI, and are mostly (always?) used when
mutating the modules as part of fuzzing.

Motivation: split FuzzerCLI into its own library that does not depend on IR.
Subprojects that don't use IR should be be fuzzed without the dependency.

Differential Revision: https://reviews.llvm.org/D125080
2022-05-07 12:09:49 +02:00
Peter S. Housel 981523b2e4 [ORC-RT][ORC] Handle dynamic unwind registration for libunwind
This changes the ELFNix platform Orc runtime to use, when available,
the __unw_add_dynamic_eh_frame_section interface provided by libunwind
for registering .eh_frame sections loaded by JITLink. When libunwind
is not being used for unwinding, the ELFNix platform detects this and
defaults to the __register_frame interface provided by libgcc_s.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D114961
2022-05-06 14:00:29 -07:00
Craig Topper 2ca78d2bdf [SelectionDAG] Improve asserts in SelectionDAG::getSelect.
The VT passed in must match the type of LHS and RHS.
Previously we only checked that the vectorness matched.
2022-05-06 09:41:11 -07:00
Marco Elver 9ae87b5973 [Instrumentation] Share InstrumentationIRBuilder between TSan and SanCov
Factor our InstrumentationIRBuilder and share it between ThreadSanitizer
and SanitizerCoverage. Simplify its usage at the same time (use function
of passed Instruction or BasicBlock).

This class may be used in other instrumentation passes in future.

NFCI.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125038
2022-05-06 09:15:17 +02:00
Sam McCall ba0d50ad7e [Support] Fix UB in BumpPtrAllocator when first allocation is zero.
BumpPtrAllocator::Allocate() is marked __attribute__((returns_nonnull)) when the
compiler supports it, which makes it UB to return null.

When there have been no allocations yet, the current slab is [nullptr, nullptr).
A zero-sized allocation fits in this range, and so Allocate(0, 1) returns null.

There's no explicit docs whether Allocate(0) is valid. I think we have to assume
that it is:
 - the implementation tries to support it (e.g. >= tests instead of >)
 - malloc(0) is allowed
 - requiring each callsite to do a check is bug-prone
 - I found real LLVM code that makes zero-sized allocations

Differential Revision: https://reviews.llvm.org/D125040
2022-05-06 08:57:27 +02:00
Ilia Diachkov 0098f2aebb [SPIRV] Add SPIR-V specific intrinsics, two passes and tests
The patch adds SPIR-V specific intrinsics required to keep information
critical to SPIR-V consistency (types, constants, etc.) during translation
from IR to MIR.

Two related passes (SPIRVEmitIntrinsics and SPIRVPreLegalizer) and several
LIT tests (passed with this change) have also been added.

It also fixes the issue with opaque pointers in SPIRVGlobalRegistry.cpp
and the mismatch of the data layout between the SPIR-V backend and clang
(Issue #55122).

Differential Revision: https://reviews.llvm.org/D124416

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-05-06 03:02:00 +03:00
Lang Hames 16dcbb53dc [ORC] Return ExecutorAddrs rather than JITEvaluatedSymbols from LLJIT::lookup.
Clients don't care about linkage, and ExecutorAddr is much more ergonomic.
2022-05-05 13:56:00 -07:00
Lang Hames 98616cfc02 [ORC] Add an ExecutorAddr::toPtr overload for function types.
In the common case of converting an ExecutorAddr to a function pointer type,
this eliminates the need for the '(*)' boilerplate to explicitly specify a
function pointer. E.g.:

auto *F = A.toPtr<int(*)()>();

can now be written as

auto *F = A.toPtr<int()>();
2022-05-05 12:37:23 -07:00
David Blaikie 0d8cb8b399 DWARFVerifier: Verify CU/TU index overlap issues
Discovered in a large object that would need a 64 bit index (but the
cu/tu index format doesn't include a 64 bit offset/length mode in
DWARF64 - a spec bug) but instead binutils dwp overflowed the offsets
causing overlapping regions.
2022-05-05 18:18:53 +00:00
Serge Pavlov e1554ac63a Revert "[InstCombine] Remove side effect of replaced constrained intrinsics"
This reverts commit 83914ee96f.
The change caused discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220502/1034841.html
2022-05-06 01:09:16 +07:00
Brian Tracy 87a55137e2 Fix "the the" typo in documentation and user facing strings
There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users.

Reviewed By: #libc, Mordante, martong, ldionne

Differential Revision: https://reviews.llvm.org/D124708
2022-05-05 17:52:08 +02:00
Xing Xue e5926906eb [XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections
Summary:
When -ffunction-sections is on, this patch makes the compiler to generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused function.

Reviewed by: MaskRay, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D124855
2022-05-05 09:01:36 -04:00
Nikita Popov 9678936f18 [DAGCombine] Fold (X & ~Y) | Y with truncated not
This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is
a truncated not (when taking into account the mask X). This is
done by exporting the infrastructure added in D124856 and reusing
it here.

I've retained the old value of AllowUndefs=false, though probably
this can be switched to true with extra test coverage.

Differential Revision: https://reviews.llvm.org/D124930
2022-05-05 11:10:11 +02:00
Chuanqi Xu 405bf90235 [NFC] [Pipelines] Hoist CoroCleanup as Module Pass
This is similar to previous patch https://reviews.llvm.org/D123925. It
could also reduce the time we call declaresCoroCleanupIntrinsics. And it
is helpful for further changes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124362
2022-05-05 15:15:09 +08:00
Serge Pavlov 83914ee96f [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed in runtime. It happened because constrained
intrinsics usually have side effect, it is used to model the interaction
with floating-point environment. In some cases it is correct behavior
but often the side effect is actually absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it actually absents.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-05 12:02:42 +07:00
Junfeng Dong a0fb387941 [DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.
llvm-profgen gives error message when the input binary contains premature terminator in .debug_aranges section. These zero length items point to some rodata with zero size type in embed Rust Library. Considering Zero-Sized Types are a valid feature in Rust. They are not real error. This change makes the "error:" message into a warning to avoid misleading.

Why do we still want a warning on such case? because it doesn't follow dwarf standard.  https://bugs.llvm.org/show_bug.cgi?id=46805 contains early discussion.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D124121
2022-05-04 15:21:58 -07:00
serge-sans-paille 7030654296 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since fa5a4e1b95 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D124847
2022-05-04 08:32:38 +02:00
Luboš Luňák 8ef5710e63 [ThreadPool] add ability to group tasks into separate groups
This is needed for parallelizing of loading modules symbols in LLDB
(D122975). Currently LLDB can parallelize indexing symbols
when loading a module, but modules are loaded sequentially. If LLDB
index cache is enabled, this means that the cache loading is not
parallelized, even though it could. However doing that creates
a threadpool-within-threadpool situation, so the number of threads
would not be properly limited.

This change adds ThreadPoolTaskGroup as a simple type that can be
used with ThreadPool calls to put tasks into groups that can be
independently waited for (even recursively from within a task)
but still run in the same thread pool.

Differential Revision: https://reviews.llvm.org/D123225
2022-05-04 06:16:55 +02:00
Alex Borcan afaa56df7a Implement support for __llvm_addrsig for MachO in llvm-mc
The __llvm_addrsig section is a section that the linker needs for safe icf.
This was not yet implemented for MachO - this is the implementation.
It has been tested with a safe deduplication implementation inside lld.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D123751
2022-05-03 18:19:18 -04:00
Philipp Tomsich 64816e68f4 [AArch64] Support for Ampere1 core
Add support for the Ampere Computing Ampere1 core.
Ampere1 implements the AArch64 state and is compatible with ARMv8.6-A.

Differential Revision: https://reviews.llvm.org/D117112
2022-05-03 15:54:02 +01:00
Nathan Sidwell ed2d4da732 [demangler] Fold expressions of .* and ->*
(Exitingly) a fold expression's operators include .* and ->*, but we
failed to demangle them as we categorize those as MemberExprs, not
BinaryExprs.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123305
2022-05-03 06:45:25 -07:00
Simon Tatham 32814df442 [Windows] Fix handling of \" in program name on cmd line.
Bugzilla #47579: if you invoke clang on Windows via a pathname in
which a quoted section closes just after a backslash, e.g.

  "C:\Program Files\Whatever\"clang.exe

then cmd.exe and CreateProcess will correctly find the binary, because
when they parse the program name at the start of the command line,
they don't regard the \ before the " as having any kind of escaping
effect. This is different from the behaviour of the Windows standard C
library when it parses the rest of the command line, which would
consider that \" not to close the quoted string.

But this confuses windows::GetCommandLineArguments, because the
Windows API function GetCommandLineW() will return a command line
containing that \" sequence, and cl::TokenizeWindowsCommandLine will
tokenize the whole string according to the C library's rules. So it
will misidentify where the program name stops and the arguments start.

To fix this, I've introduced a new variant function
cl::TokenizeWindowsCommandLineFull(), intended to be applied to the
string returned from GetCommandLineW(). It parses the first word of
the command line according to CreateProcess's rules, considering \ to
never be an escaping character; thereafter, it switches over to the C
library rules for the rest of the command line.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D122914
2022-05-03 11:57:50 +01:00
Igor Kirillov 4e5e042d9a [LoopVectorize] Support reductions that store intermediary result
Adds ability to vectorize loops containing a store to a loop-invariant
address as part of a reduction that isn't converted to SSA form due to
lack of aliasing info. Runtime checks are generated to ensure the store
does not alias any other accesses in the loop.

Ordered fadd reductions are not yet supported.

Differential Revision: https://reviews.llvm.org/D110235
2022-05-03 10:12:30 +01:00
Jeremy Morse 1d712c3818 [DebugInfo][InstrRef] Don't generate redundant DBG_PHIs
In SelectionDAG, DBG_PHI instructions are created to "read" physreg values
and give them an instruction number, when they can't be traced back to a
defining instruction. The most common scenario if arguments to a function.
Unfortunately, if you have 100 inlined methods, each of which has the same
"this" pointer, then the 100 dbg.value instructions become 100
DBG_INSTR_REFs plus 100 DBG_PHIs, where only one DBG_PHI would suffice.

This patch adds a vreg cache for MachienFunction::salvageCopySSA, if we've
already traced a value back to the start of a block and created a DBG_PHI
then it allows us to re-use the DBG_PHI, as well as reducing work.

Differential Revision: https://reviews.llvm.org/D124517
2022-05-03 09:56:12 +01:00
Markus Lavin dd8cf372c5 [NFC] Minimal refactor of TTI to avoid clangsa complaint
Differential Revision: https://reviews.llvm.org/D124754
2022-05-03 10:43:48 +02:00
David Green 6f81903e89 [LV][SLP] Mark fptosi_sat as vectorizable
This adds fptosi_sat and fptoui_sat to the list of trivially
vectorizable functions, mainly so that the loop vectorizer can vectorize
the instruction. Marking them as trivially vectorizable also allows them
to be SLP vectorized, and Scalarized.

The signature of a fptosi_sat requires two type overrides
(@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only
take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd
to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first
operand of the intrinsic as a overloaded (but not scalar) operand.

Differential Revision: https://reviews.llvm.org/D124358
2022-05-03 09:32:34 +01:00
Chris Bieneman 966c40aea6 [Object][DX] Identify DXBC file magic
This adds support to llvm::identify_magic to detect DXBC and classify
it as the dxcontainer format.
2022-05-02 16:24:36 -05:00
Bardia Mahjour 363b3a645a fix warning caused by ef4ecc3cef 2022-05-02 17:06:27 -04:00
Bardia Mahjour ef4ecc3cef [LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost
Reviewed By: congzhe, etiotto

Differential Revision: https://reviews.llvm.org/D123400
2022-05-02 16:49:10 -04:00
Chris Bieneman 55e13a6bc0 [NFC] Fix warning reported on bots 2022-05-02 15:02:44 -05:00
Chris Bieneman 4070aa0156 [Object][DX] Initial DXContainer parsing support
This patch begins adding DXContainer parsing support to libObject.
Following the pattern used by ELFFile my goal here is to write a
standalone DXContainer parser and later write an adapter interface to
support a subset of the ObjectFile interfaces so that we can add
limited objdump support. I will also be adding ObjectYAML support to
help drive testing of the object tools and MC-level object writers as
those come together.

DXContainer is a slightly odd format. It is arranged in "parts" that
are semantically similar to sections, but it doesn't support symbol
listing.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D124643
2022-05-02 13:56:33 -05:00
Jonas Paulsson 304378fd09 Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building
libcalls." (was 0f8c626). This reverts commit 14d9390.

The patch previously failed to recognize cases where user had defined a
function alias with an identical name as that of the library
function. Module::getFunction() would then return nullptr which is what the
sanitizer discovered.

In this updated version a new function isLibFuncEmittable() has as well been
introduced which is now used instead of TLI->has() anytime a library function
is to be emitted . It additionally also makes sure there is e.g. no function
alias with the same name in the module.

Reviewed By: Eli Friedman

Differential Revision: https://reviews.llvm.org/D123198
2022-05-02 19:37:00 +02:00
Raghu Maddhipatla c685f82126 [mlir][OpenMP] Add omp.cancel and omp.cancellationpoint.
Reviewed By: kiranchandramohan, peixin, shraiysh

Differential Revision: https://reviews.llvm.org/D123828
2022-05-02 12:23:38 -05:00
Nikita Popov 95fedfab6c [InstCombine] Handle non-canonical GEP index in indexed compare fold (PR55228)
Normally the index type will already be canonicalized here, but
this is not guaranteed depending on visitation order. The code
was already accounting for a potentially needed sext, but a trunc
may also be needed.

Add a ConstantExpr::getSExtOrTrunc() helper method to make this
simpler. This matches the corresponding IRBuilder method in behavior.

Fixes https://github.com/llvm/llvm-project/issues/55228.
2022-05-02 17:56:01 +02:00
Phoebe Wang 7c04454227 [ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
X86 codegen uses function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect user's requirement when they passing or returning vector arguments. So Clang front-end will iterate the vector arguments and set `min-legal-vector-width` to the width of the maximum for both caller and callee.

It is assumed any middle end optimizations won't care of the attribute expect inlining and argument promotion.
- For inlining, we will propagate the attribute of inlined functions because the inlining functions become the newer caller.
- For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.

The problem comes from the optimizations' combination, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees `bar` and `baz`. When doing argument promotion, both `foo` and `bar` has the same `min-legal-vector-width`. So the argument was promoted to vector. Then the inlining inlines `baz` to `foo` and updates `min-legal-vector-width`, which results in ABI mismatch between `foo` and `bar`.

This patch fixes the problem by expanding the concept of `min-legal-vector-width` to indicator of functions arguments. That says, any passes touch functions arguments have to set `min-legal-vector-width` to the value reflects the width of vector arguments. It makes sense to me because any arguments modifications are ABI related and should response for the ABI compatibility.

Differential Revision: https://reviews.llvm.org/D123284
2022-05-02 14:13:05 +08:00
Congzhe Cao 3d6fe7ace8 [LoopCacheAnalysis] Use stable_sort() to avoid non-deterministic print output
The print output of loop cache analysis sometimes has a non-deterministic order
and therefore we have been using `CHECK-DAG` in its lit tests. This patch changes
the sorting of LoopCosts to llvm::stable_sort() where we compare loop cost numbers
and sort the loops. In case of the same loop cost numbers, llvm::stable_sort() now
would output a deterministic loop order.

Reviewed By: Meinersbur, fhahn, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124725
2022-05-02 00:49:45 -04:00
Matt Arsenault 3939e99aae llvm-reduce: Add pass to reduce IR references from MIR
This is typically the first thing I do when reducing a new testcase
until the IR section can be deleted.
2022-05-01 17:40:53 -04:00
Jack Andersen 09325d3606 [CAPI] Expose CastInst::getCastOpcode in C API
Reviewed By: deadalnix

Differential Revision: https://reviews.llvm.org/D91514
2022-04-30 18:40:04 -04:00
Hongtao Yu e36786d15f [CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlat
To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles.  `ProfileIsCSNested` is now renamed to `ProfileIsPreInlined` and is extended to be applicable for CS flat profiles too. More specifically, `ProfileIsPreInlined` is for any kind of profiles (flat or nested) that contain 'ShouldBeInlined' contexts. The flag is encoded in the profile summary section for extbinary profiles and is computed on-the-fly for text profiles.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D122602
2022-04-29 17:03:52 -07:00
James Y Knight 02aa795785 Revert "[JumpThreading][NFC][CompileTime] Do not recompute BPI/BFI analyzes"
This change has caused non-reproducibility of a self-build of Clang
when using NewPM and providing profile data.

This reverts commit 35f38583d2.
2022-04-29 21:15:47 +00:00
Congzhe Cao c428a3d2a0 [LoopCacheAnalysis] Enable delinearization of fixed sized arrays
Currently loop cache cost (LCC) cannot analyze fix-sized arrays
since it cannot delinearize them. This patch adds the capability
to delinearize fix-sized arrays to LCC. Most of the code is ported
from DependenceAnalysis.cpp and some refactoring will be done in a
next patch.

Reviewed By: #loopoptwg, Meinersbur

Differential Revision: https://reviews.llvm.org/D122857
2022-04-29 16:01:27 -04:00
David Penry dcb77643e3 Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM
Fixed "private field is not used" warning when compiled
with clang.

original commit: 28d09bbbc3
reverted in: fa49021c68

------

This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-29 10:54:39 -07:00
Joe Nash 813e521e55 [AMDGPU] Add gfx11 subtarget ELF definition
This is the first patch of a series to upstream support for the new
subtarget.

Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 1/N for upstreaming AMDGPU gfx11 architectures.

Reviewed By: foad, kzhuravl, #amdgpu

Differential Revision: https://reviews.llvm.org/D124536
2022-04-29 12:27:17 -04:00
Joseph Huber 643c9b22ef [OpenMP] Make generating offloading entries more generic
This patch moves the logic for generating the offloading entries to the
OpenMPIRBuilder. This makes it easier to re-use in other places, such as
for OpenMP support in Flang or using the same method for generating
offloading entires for other languages like Cuda.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D123460
2022-04-29 09:14:31 -04:00
NAKAMURA Takumi 2e6657b340 llvm/Support/Debug.h: Suppress warnings with -Asserts. [-Wunused-variable]
Re. setCurrentDebugTypes(X,N), the only user is llvm-ml.cpp (exc. DebugTests)
since llvmorg-15-init-8355-g82ecf9a0b1b3.

FIXME: X and N are evaluated regardless of NDEBUG.
Could we avoid evaluating (but w/o warnings) with NDEBUG?
2022-04-29 21:01:47 +09:00
Florian Hahn fb4113ef0c
[Passes] Remove legacy LoopUnswitch pass.
The legacy LoopUnswitch pass is only used in the legacy pass manager
pipeline, which is deprecated.

The NewPM replacement is SimpleLoopUnswitch and I think it is time to
remove the legacy LoopUnswitch code.

Fixes #31000.

Reviewed By: aeubanks, Meinersbur, asbirlea

Differential Revision: https://reviews.llvm.org/D124376
2022-04-29 10:30:49 +01:00
Mircea Trofin 49942d595f [NFC] remove const from FunctionPropertiesAnalysis::run, keep on Result
The goal in 75881d8b02 was just modifying what `Result` is, didn't
need to also modify ::run.
2022-04-28 15:10:21 -07:00
David Penry fa49021c68 Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM"
This reverts commit 28d09bbbc3
while I investigate a buildbot failure.
2022-04-28 13:29:27 -07:00
David Penry 28d09bbbc3 [CodeGen][ARM] Enable Swing Module Scheduling for ARM
This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-28 13:01:18 -07:00
Mircea Trofin 75881d8b02 [NFC] const-ed the return type of FunctionPropertiesAnalysis
The result is a data bag, this makes sure it's signaled to a user that
the data can't be mutated when, for example, doing something like:

auto &R = FAM.getResult<FunctionPropertiesAnalysis>(F)
...
R.Uses++
2022-04-28 12:42:16 -07:00
David Tenty 8042699a30 [LLVM] Add exported visibility style for XCOFF
For the AIX linker, under default options, global or weak symbols which
have no visibility bits set to zero (i.e. no visibility, similar to ELF
default) are only exported if specified on an export list provided to
the linker. So AIX has an additional visibility style called
"exported" which indicates to the linker that the symbol should
be explicitly globally exported.

This change maps "dllexport" in the LLVM IR to correspond to XCOFF
exported as we feel this best models the intended semantic (discussion
on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853)
and allows us to enable writing this visibility for the AIX target
in the assembly path.

Reviewed By: DiggerLin

Differential Revision: https://reviews.llvm.org/D123951
2022-04-28 14:56:00 -04:00
Alexey Bataev 75e1cf4a6a [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-28 10:04:41 -07:00
Pavel Samolysov 6b825e50f7 [ArgPromotion] Change the condition to check the promotion limit
The condition should be 'ArgParts.size() > MaxElements', so that if we
have exactly 3 elements in the 'ArgParts' vector, the promotion should
be allowed because the 'MaxElement' threshold is not exceeded yet.

The default value for 'MaxElement' has been decreased to 2 in order
to avoid an actual change in argument promoting behavior. However,
this changes byval argument transformation behavior by allowing
adding not more than 2 arguments to the function instead of 3 allowed
before.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124178
2022-04-28 09:42:58 -07:00
Alexey Bataev 9861ca0c23 Revert "[COST]Improve cost model for shuffles in SLP."
This reverts commit 29a470e380 to fix
a crash reported in https://reviews.llvm.org/D100486#3479989.
2022-04-28 08:11:56 -07:00
gbreynoo 5420834aad [demangler] Fix demangling a template argument which happens to be a null pointer
As seen in https://github.com/llvm/llvm-project/issues/51854
llvm-cxxfilt was having trouble demangling the case "_Z1fIDnLDn0EEvv".
We handled the "LDNE" case and "LPi0E" but not "LDn0E". This change adds
that handling.

Differential Revision: https://reviews.llvm.org/D124010
2022-04-28 15:55:26 +01:00
Pavel Samolysov 744a837838 [ArgPromotion] Rename variables according to the code style. NFC
Some loop counters ('i', 'e') and variables ('type') were named not
in accordance with the code style and clang-tidy issues warnings
about the using of such variables. This patch renames the variables
and fixes some typos in the comments within the source file.

Differential Revision: https://reviews.llvm.org/D123662
2022-04-28 15:32:05 +02:00
Chris Jackson c792884589 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Reland 3f2b76ec90 with the test corrected
to require x86-registered-target.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-28 14:21:56 +01:00
Chris Jackson cd5f9efc4d Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]"
This reverts commit 3f2b76ec90.
2022-04-28 14:07:31 +01:00
Chris Jackson 3f2b76ec90 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Reland commit 74273d575f following a fix
for a memory leak. The DVIRecoveryRecord vectors now use unique_ptr.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-28 13:55:49 +01:00
Michael Forster cfb4e78252 Revert "[llvm-pdbutil] Add options to only dump symbol record at specified offset and its parents or children with spcified depth."
This reverts commit a3b7cb015f.

symbol-offset.test fails under MSAN:

[  1] ; RUN: llvm-pdbutil yaml2pdb %p/Inputs/symbol-offset.yaml --pdb=%t.pdb [FAIL]
llvm-pdbutil yaml2pdb <REDACTED>/llvm/test/tools/llvm-pdbutil/Inputs/symbol-offset.yaml --pdb=<REDACTED>/tmp/symbol-offset.test/symbol-offset.test.tmp.pdb
==9283==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55f975e5eb91 in __libcpp_tls_set <REDACTED>/include/c++/v1/__threading_support:428:12
    #1 0x55f975e5eb91 in set_pointer <REDACTED>/include/c++/v1/thread:196:5
    #2 0x55f975e5eb91 in void* std::__msan::__thread_proxy<std::__msan::tuple<std::__msan::unique_ptr<std::__msan::__thread_struct, std::__msan::default_delete<std::__msan::__thread_struct> >, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) <REDACTED>/include/c++/v1/thread:285:27
    #3 0x7f74a1e55b54 in start_thread (<REDACTED>/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13)
    #4 0x7f74a1dc9f7e in clone (<REDACTED>/libc.so.6+0x13cf7e) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)
2022-04-28 12:42:31 +02:00
Ties Stuij 051deb2d9d [ARM] add Armv9 build attribute
The build attribute number can be found in the Arm ABI addenda32 document:
https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#335target-related-attributes

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D124090
2022-04-28 10:48:26 +01:00
Jay Foad 515f890033 [CodeGen] Remove an outdated comment in MachinePointerInfo
This comment has been untrue since D39758 changed
MachinePointerInfo to store AddrSpace separately from V.
2022-04-28 09:06:36 +01:00
Nikita Popov b9dc565147 [GVN] Encode GEPs in offset representation
When using opaque pointers, convert GEPs into offset representation
of the form P + V1 * Scale1 + V2 * Scale2 + ... + ConstantOffset.
This allows us to recognize equivalent address calculations even if
the GEPs don't use the same source element type.

This fixes an opaque pointer codegen regression seen in rustc.

Differential Revision: https://reviews.llvm.org/D124527
2022-04-28 09:32:05 +02:00
Max Kazantsev 35f38583d2 [JumpThreading][NFC][CompileTime] Do not recompute BPI/BFI analyzes
They can already be available, and even if not, DT/LI can be available.
We should not recompute them. Old PM is unchanged because it would
require changing dependencies, and we don't care enough about it.

Differential Revision: https://reviews.llvm.org/D124439
Reviewed By: nikic, aeubanks
2022-04-28 10:46:08 +07:00
Fangrui Song c74a706893 [LegacyPM] Remove ThreadSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove ThreadSanitizerLegacyPass.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124209
2022-04-27 16:25:41 -07:00
Kirill Stoimenov 761366e6ae Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]"
This reverts commit 74273d575f.

Buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/22795
Failing with memory leak.
2022-04-27 23:11:48 +00:00
Matt Arsenault 7c2db66632 llvm-reduce: Support multiple MachineFunctions
The current testcase I'm trying to reduce only reproduces with IPRA
enabled and requires handling multiple functions.

The only real difference vs. the IR is the extra indirect to look for
the underlying MachineFunction, so treat the ReduceWorkItem as the
module instead of the function.

The ugliest piece of this is really the ugliness of
MachineModuleInfo. It not only tracks actual module state, but has a
number of transient fields used for isel and/or the asm printer. These
shouldn't do any harm for the use here, though they should be
separated out.
2022-04-27 18:11:59 -04:00
Zequan Wu a3b7cb015f [llvm-pdbutil] Add options to only dump symbol record at specified offset and its parents or children with spcified depth.
Right now, if we want to dump symbol at specified offset, we need to use `grep`.
And it can only show surrounding symbols in layout (not in lexical scope sense).

This adds similar options to `dump` command as `llvm-dwarfdump` to allow users
to dump symbol record at specified offset and its parents or children with
spcified depth.

`--symbol-offset=` must be used with `--modi` to dump only one symbol at given
offset.

`--show-parents`/`--show-children` must be used with `--symbol-offset` to
dump all symbols that are parents/children of the symbol at given offset.

`--parent-recurse-depth`/`--children-recurse-depth` must be used with
`--show-parents`/`--show-children` to specify the max up/down depth.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D124317
2022-04-27 14:37:35 -07:00
Alexey Bataev 29a470e380 [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-27 10:56:26 -07:00
Denis Antrushin 4059770af5 [StatepointLowering] Only export STATEPOINT results if used in nonlocal blocks.
Cuurently we always export STATEPOINT results (GC pointers lowered via VRegs)
to virtual registers. When processing gc.relocate instructions we have to
generate CopyFromRegs node and then export it to VReg again if gc.relocate
is used in other basic blocks. This results in generation of extra COPY MIR
instruction if statepoint and its gc.relocate are in the same BB, but gc.relocate
result is used in other blocks.

This patch changes this behavior to export statepoint results only if used
in other basic blocks. For local uses StatepointLoweringState.(get|set)Location()
API is used to communicate appropriate statepoint result from `LowerStatepoint()`
to `visitGCRelocate()`

This is NFC and is purely compile time optimization. On big methids it can improve
codegen compile time up to 10%.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124444
2022-04-27 15:53:24 +03:00
Chris Jackson 74273d575f [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
This relands commit 8f550368b1.

The test is amended with REQUIRES: x86-registered-target, in line with
the other debuginfo-scev-salvage tests.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-27 13:10:30 +01:00
Chris Jackson 855752e563 Revert [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics[2/2]
This reverts commit 8f550368b1.
2022-04-27 13:06:03 +01:00
Chris Jackson 8f550368b1 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Second of two patches to extend SCEV-based salvaging to dbg.value
intrinsics that have multiple location ops pre-LSR. This second patch
adds the core implementation.

Reviewers: @StephenTozer, @djtodoro

Differential Revision: https://reviews.llvm.org/D120169
2022-04-27 12:47:35 +01:00
David Green 4a8c13a6f4 [CostModel] Add basic fptoi_sat costs
This adds some basic fptosi_sat and fptoui_sat target independent cost
modelling. The fptosi_sat is modelled as a fmin/fmax to saturate the
value, followed by a fp convert. The signed values then have an
additional fcmp+select for handling Nan correctly.

The AArch64/Arm costs may be more incorrect, as the instruction exist
natively. This can be fixed with target specific cost updates.

Differential Revision: https://reviews.llvm.org/D124269
2022-04-27 09:30:00 +01:00
Nikita Popov 86c770346c [AsmParser] Automatically declare and lex attribute keywords (NFC)
Rather than listing these by hand, include all enum attribute
keywords from Attributes.inc. This reduces the number of places
one has to update whenever an enum attribute is added.

Differential Revision: https://reviews.llvm.org/D124465
2022-04-27 09:27:26 +02:00
River Riddle 09af7fefc8 [mlir][PDLL] Add document link and hover support to mlir-pdll-lsp-server
This allows for navigating to included files on click, and also provides hover
information about the include file (similarly to clangd).

Differential Revision: https://reviews.llvm.org/D124077
2022-04-26 18:33:17 -07:00
Alexander Shaposhnikov 6beb2db7d1 [Support] Factor out isCrash from throwIfCrash
This diff factors out the check "isCrash" from the static method "throwIfCrash".
This is a helper function that can be useful in debugging / analysis, in particular,
I'm planning to use it in the future patches for lld-fuzzer.

Test plan:
1/ ninja check-all
2/ export LLD_IN_TEST=5 ninja check-lld

Differential revision: https://reviews.llvm.org/D124414
2022-04-27 00:52:53 +00:00
David Tenty f6d209b3ec [AIX][XCOFF] error on emit symbol visibility for XCOFF object file
This is a follow on to the revert of D84265 to add an error if we'd need
to write a non-zero visibility type in the xcoff object file. We can't
currently do that because we lack the auxilary header to interpret the
bits in XCOFF32. This is important because visibility is being enabled
in the assembly writing path, and without this error the visibility
could be silently ignored.

Differential Revision: https://reviews.llvm.org/D124392
2022-04-26 19:22:44 -04:00
Michael Kruse ff289feeba [OpenMPIRBuilder] Remove ContinuationBB argument from Body callback.
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:

 1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.

 2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.

 3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.

 4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.

With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.

Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.

Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D118409
2022-04-26 16:35:01 -05:00
Vasileios Porpodas fa8a9fea47 Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
Kirill Stoimenov aabeb5eb7f Revert "[demangler] Simplify OutputBuffer initialization"
Reverting due to a bot failure:
https://lab.llvm.org/buildbot/#/builders/5/builds/22738

This reverts commit 5b3ca24a35.
2022-04-26 20:24:06 +00:00
Martin Sebor 25febbd155 [InstCombine] Fold strnlen with a bound of zero and one.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123816
2022-04-26 14:02:50 -06:00
Martin Sebor 2807c420cd [InstCombine] add a strnlen handler stub.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123815
2022-04-26 14:02:49 -06:00
Andrew Savonichev 0a27622a1d [NVPTX] Disable DWARF .file directory for PTX
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:

    $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
    $ ptxas debug-file-loc.s
    ptxas debug-file-loc.s, line 42; fatal   : Parsing error near
    '"foo.h"': syntax error

Added a new field to MCAsmInfo to control default value of
UseDwarfDirectory. This value is used if -dwarf-directory command line
option is not specified.

Differential Revision: https://reviews.llvm.org/D121299
2022-04-26 21:40:36 +03:00
Vasileios Porpodas 6a9bbd9f20 Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 55ce296d6f.
2022-04-26 11:25:26 -07:00
Vasileios Porpodas 55ce296d6f [SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`
Before this patch `Args` was used to pass a broadcat's arguments by SLP.
This patch changes this. `Args` is now used for passing the operands of
the shuffle.

Differential Revision: https://reviews.llvm.org/D124202
2022-04-26 11:11:29 -07:00
Augie Fackler a907d36cfe Attributes: add a new `allocptr` attribute
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.

Differential Revision: https://reviews.llvm.org/D123083
2022-04-26 13:57:11 -04:00
Nathan Sidwell 5b3ca24a35 [demangler] Simplify OutputBuffer initialization
Every non-testcase use of OutputBuffer contains code to allocate an
initial buffer (using either 128 or 1024 as initial guesses). There's
now no need to do that, given recent changes to the buffer extension
heuristics -- it allocates a 1k(ish) buffer on first need.

Just pass in a buffer (if any) to the constructor.  Thus the
OutputBuffer's ownership of the buffer starts at its own lifetime
start. We can reduce the lifetime of this object in several cases.

That new constructor takes a 'size_t *' for the size argument, as all
uses with a non-null buffer are passing through a malloc'd buffer from
their own caller in this manner.

The buffer reset member function is never used, and is deleted.

The original buffer initialization code would return a failure code if
that first malloc failed.  Existing code either ignored that, called
std::terminate with a FIXME, or returned an error code.

But that's not foolproof anyway, as a subsequent buffer extension
failure ends up calling std::terminate. I am working on addressing
that unfortunate failure mode in a manner more consistent with the C++
ABI design.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122604
2022-04-26 04:23:12 -07:00
Alexey Lapshin 854c33946f [llvm-gsymutil][NFC] refactor AddressRange&AddresRanges structures.
llvm-gsymutil has an implementation of AddressRange and AddressRanges
classes. That implementation might be reused in other parts of llvm.
This patch moves AddressRange and AddressRanges classes into llvm/ADT.

Differential Revision: https://reviews.llvm.org/D124350
2022-04-26 12:00:43 +03:00
Serge Pavlov 170a903144 Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks
if the provided floating-point number belongs to any of the the specified
value classes. The intrinsic implements the checks made by C standard
library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`,
`issignaling` and corresponding IEEE-754 operations.

The primary motivation for this intrinsic is the support of strict FP
mode. In this mode using compare instructions or other FP operations is
not possible, because if the value is a signaling NaN, floating-point
exception `Invalid` is raised, but the aforementioned functions must
never raise exceptions.

Currently there are two solutions for this problem, both are
implemented partially. One of them is using integer operations to
implement the check. It was implemented in https://reviews.llvm.org/D95948
for `isnan`. It solves the problem of exceptions, but offers one
solution for all targets, although some can do the check in more
efficient way.

The other, implemented in https://reviews.llvm.org/D96568, introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target
specific code into IR to implement `isnan` and some other functions. It is
convenient for targets that have dedicated instruction to determine FP data
class. However using target-specific intrinsic complicates analysis and can
prevent some optimizations.

A special intrinsic for value class checks allows representing data class
tests with enough flexibility. During IR transformations it represents the
check in target-independent way and saves it from undesired transformations.
In the instruction selector it allows efficient lowering depending on the
used target and mode.

This implementation is an extended variant of `llvm.isnan` introduced
in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic
support. Target-specific treatment will be implemented in separate
patches.

Differential Revision: https://reviews.llvm.org/D112025
2022-04-26 13:09:16 +07:00
Mircea Trofin b1fa5ac3ba [mlgo] Factor out TensorSpec
This is a simple datatype with a few JSON utilities, and is independent
of the underlying executor. The main motivation is to allow taking a
dependency on it on the AOT side, and allow us build a correctly-sized
buffer in the cases when the requested feature isn't supported by the
model. This, in turn, allows us to grow the feature set supported by the
compiler in a backward-compatible way; and also collect traces exposing
the new features, but starting off the older model, and continue
training from those new traces.

Differential Revision: https://reviews.llvm.org/D124417
2022-04-25 18:35:46 -07:00
YASHASVI KHATAVKAR e83543f8c2 Don't replace Undef with null value for Constants Differential Revision:https://reviews.llvm.org/D124098 2022-04-25 20:50:00 -04:00
Chris Bieneman e6f44a3cd2 Add PointerType analysis for DirectX backend
As implemented this patch assumes that Typed pointer support remains in
the llvm::PointerType class, however this could be modified to use a
different subclass of llvm::Type that could be disallowed from use in
other contexts.

This does not rely on inserting typed pointers into the Module, it just
uses the llvm::PointerType class to track and unique types.

Fixes #54918

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D122268
2022-04-25 17:49:43 -05:00
Frederik Gossen 8fbf9acc8c Add missing comparison operators to SmallVector
Differential Revision: https://reviews.llvm.org/D124407
2022-04-25 18:18:14 -04:00
Fangrui Song 39e23bb059 [LegacyPM] Remove HWAsanSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove AddressSanitizerLegacyPass...

...,
ModuleAddressSanitizerLegacyPass, and ASanGlobalsMetadataWrapperPass.

MemorySanitizerLegacyPass was removed in D123894.
AddressSanitizerLegacyPass was removed in D124216.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124337
2022-04-25 10:21:26 -07:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Nathan Sidwell c47bcf9af6 [demangler][NFC] OperatorInfo table unit test
Placing a run-once test inside the operator lookup function caused
problems with the thread sanitizer. See D122975.

Break out the operator table into a member variable, and move the test
to the unit test machinery.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123390
2022-04-25 10:02:08 -07:00
Zakk Chen ffe03ff75c [RISCV] Fix incorrect policy implement for unmasked vslidedown and vslideup.
vslideup works by leaving elements 0<i<OFFSET undisturbed.
so it need the destination operand as input for correctness
regardless of policy. Add a operand to indicate policy.

We also add policy operand for unmaksed vslidedown to keep the interface consistent with vslideup
because vslidedown have only undisturbed at 0<i<vstart but user have no way to control of vstart.

Reviewed By: rogfer01, craig.topper

Differential Revision: https://reviews.llvm.org/D124186
2022-04-25 09:18:41 -07:00
Vitaly Buka 4b4437c084 [asan] Enable detect_stack_use_after_return=1 by default
By default -fsanitize=address already compiles with this check,
why not use it.
For compatibly it can be disabled with env ASAN_OPTIONS=detect_stack_use_after_return=0.

Reviewed By: eugenis, kda, #sanitizers, hans

Differential Revision: https://reviews.llvm.org/D124057
2022-04-22 15:31:43 -07:00
Alexey Lapshin 79c1991010 [llvm-objcopy][NFC] refactor restoreStatOnFile out of llvm-objcopy.
Functionality of restoreStatOnFile may be reused. Move it into
FileUtilities.cpp. Create helper class FilePermissionsApplier
to store and apply permissions.

Differential Revision: https://reviews.llvm.org/D123821
2022-04-22 20:06:01 +03:00
Matt Arsenault 9c122537cd MIR: Serialize FunctionContextIdx in MachineFrameInfo 2022-04-22 11:07:41 -04:00
Fangrui Song 16a4d3a85c [LegacyPM] Remove AddressSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove AddressSanitizerLegacyPass,
ModuleAddressSanitizerLegacyPass, and ASanGlobalsMetadataWrapperPass.

MemorySanitizerLegacyPass was removed in D123894.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124216
2022-04-21 19:25:57 -07:00
Nico Weber 0e0759f441 Revert "[LegacyPM] Remove AddressSanitizerLegacyPass"
This reverts commit e68c589e53.
Breaks check-llvm, see comments on https://reviews.llvm.org/D124216
2022-04-21 22:14:36 -04:00
Fangrui Song e68c589e53 [LegacyPM] Remove AddressSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove AddressSanitizerLegacyPass,
ModuleAddressSanitizerLegacyPass, and ASanGlobalsMetadataWrapperPass.

MemorySanitizerLegacyPass was removed in D123894.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124216
2022-04-21 18:18:39 -07:00
Mircea Trofin e4794ff5c6 [mlgo][nfc] Decouple TensorSpec from tensorflow.
The motivation is twofold:

1) Allow plugging in a different training-time evaluator, e.g.
   TFLite-based, etc.

2) Allow using TensorSpec for AOT, too, to support evolution: we start
   by extracting a superset of the features currently supported by a
   model. For the tensors the model does not support, we just return a
   valid, but useless, buffer. This makes using a 'smaller' model (less
   supported tensors) transparent to the compiler. The key is to
   dimension the buffer appropriately, and we already have TensorSpec
   modeling that info.

The only coupling was due to the reliance of a TF internal API for
getting the element size, but for the types we are interested in,
`sizeof` is sufficient.

A subsequent change will yank out TensorSpec in its own module.

Differential Revision: https://reviews.llvm.org/D124045
2022-04-21 15:37:01 -07:00
Alexander Yermolovich c87d405b22 [DWARF] Add API to get data from MCDwarfLineStr
This API will be used in D121876, to get finalized string data for
.debug_line_str.

Reviewed By: dblaikie, rafauler

Differential Revision: https://reviews.llvm.org/D124052
2022-04-21 14:08:20 -07:00
Fangrui Song 409eb5dc3e [LegacyPM] Remove GCOVProfilerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove GCOVProfilerLegacyPass.

I have checked many LLVM users and only llvm-hs[1] uses the legacy gcov pass.

[1]: https://github.com/llvm-hs/llvm-hs/issues/392

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123829
2022-04-21 10:59:30 -07:00
Fangrui Song d133538b8b [LegacyPM] Remove MemorySanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove MemorySanitizerLegacyPass.

Differential Revision: https://reviews.llvm.org/D123894
2022-04-21 10:21:46 -07:00
Vasileios Porpodas 889588ee97 [SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`.
Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`.
This helps reduce the diff of a future SLP patch for AArch64.
2022-04-21 10:19:00 -07:00
Chuanqi Xu 483efc9ad0 [Pipelines] Remove Legacy Passes in Coroutines
The legacy passes are deprecated now and would be removed in near
future. This patch tries to remove legacy passes in coroutines.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D123918
2022-04-21 10:59:11 +08:00
Pengxuan Zheng 38612fbc89 Reland "[COFF, ARM64] Add __break intrinsic"
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reland after fixing the test failure. The failure was due to conflict with a
change (D122983) which was merged right before this patch.

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 13:01:30 -07:00
Pengxuan Zheng bff8356b19 Revert "[COFF, ARM64] Add __break intrinsic"
This reverts commit 8a9b4fb4aa.
2022-04-20 11:57:49 -07:00
Paul Kirth 61e36e87df [safestack] Support safestack in stack size diagnostics
Current stack size diagnostics ignore the size of the unsafe stack.
This patch attaches the size of the static portion of the unsafe stack
to the function as metadata, which can be used by the backend to emit
diagnostics regarding stack usage.

Reviewed By: phosek, mcgrathr

Differential Revision: https://reviews.llvm.org/D119996
2022-04-20 18:29:40 +00:00
Pengxuan Zheng 8a9b4fb4aa [COFF, ARM64] Add __break intrinsic
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 11:20:26 -07:00
Alexey Bataev 2cca53c815 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 09:37:16 -07:00
Matt Arsenault 9209a51918 MachineModuleInfo: Move AddrLabelSymbols to AsmPrinter
This was tracking global state only used by the AsmPrinter, which can
store its own module global state.
2022-04-20 11:21:40 -04:00
Matt Arsenault 3659780d58 MachineModuleInfo: Remove UsesMorestackAddr
This is x86 specific, and adds statefulness to
MachineModuleInfo. Instead of explicitly tracking this, infer if we
need to declare the symbol based on the reference previously inserted.

This produces a small change in the output due to the move from
AsmPrinter::doFinalization to X86's emitEndOfAsmFile. This will now be
moved relative to other end of file fields, which I'm assuming doesn't
matter (e.g. the __morestack_addr declaration is now after the
.note.GNU-split-stack part)

This also produces another small change in code if the module happened
to define/declare __morestack_addr, but I assume that's invalid and
doesn't really matter.
2022-04-20 11:10:20 -04:00
Matt Arsenault d7938b1a81 MachineModuleInfo: Move HasSplitStack handling to AsmPrinter
This is used to emit one field in doFinalization for the module. We
can accumulate this when emitting all individual functions directly in
the AsmPrinter, rather than accumulating additional state in
MachineModuleInfo.

Move the special case behavior predicate into MachineFrameInfo to
share it. This now promotes it to generic behavior. I'm assuming this
is fine because no other target implements adjustForSegmentedStacks,
or has tests using the split-stack attribute.
2022-04-20 10:54:29 -04:00
Matt Arsenault 53d88581f1 llvm-reduce: Clone properties of blocks
getSuccProbability was private for some reason, saying to go through
MachineBranchProbabilityInfo. There doesn't seem to be much point to
that, as that wrapper directly calls this.

Like other areas, some of these fields aren't handled by the MIR
printer/parser so aren't tested.
2022-04-20 09:47:45 -04:00
Alexey Bataev 5f7ac15912 Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."
This reverts commit 2f49163b33 to fix
a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284
2022-04-20 06:35:55 -07:00
Alexey Bataev 2f49163b33 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 05:32:56 -07:00
Nikita Popov dcab8e60c5 [Support] Remove unused LLVM_PTR_SIZE macro
This was used for LLVM_ALIGNAS() arguments in the past, but has
since been superseded by plain alignas() which also accepts a type.
2022-04-20 12:27:37 +02:00
Nikita Popov 903c30f4d1 [Support] Remove LLVM_ATTRIBUTE_DEPRECATED
The guidance since D94219 is to use [[deprecated]] directly. Now
that all historical uses of the macro have been removed, drop the
macro itself.
2022-04-20 12:16:41 +02:00
Nikita Popov f767a7d115 [DomTreeUpdater] Remove deprecated methods
Remove the insertEdge(), insertEdgeRelaxed(), deleteEdge() and
deleteEdgeRelaxed() methods, which have been deprecated three
years ago.
2022-04-20 12:14:29 +02:00
Nikita Popov 9b9bd995c5 [IRBuilder] Remove deprecated CreateShuffleVector() method
This method has been deprecated for two years.
2022-04-20 12:11:03 +02:00
Nikita Popov c99424f765 [IR] Deprecate Type::getPointerElementType() (NFC)
There are no more in-tree users of this method, outside the
experimental SPIRV backend.
2022-04-20 11:55:40 +02:00
Fangrui Song 14d9390721 Revert D123198 "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls."
test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build.
```
lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function'
```
`Function &F = *M->getFunction(Name);`

This reverts commit 0f8c626723.
2022-04-19 22:26:10 -07:00
Matt Arsenault 9592e88f59 MachineModuleInfo: Don't allow dynamically setting DbgInfoAvailable
This can be set up front, and used only as a cache. This avoids a
field that looks like it requires MIR serialization.

I believe this fixes 2 bugs for CodeView. First, this addresses a
FIXME that the flag -diable-debug-info-print only works with
DWARF. Second, it fixes emitting debug info with emissionKind NoDebug.
2022-04-19 21:08:37 -04:00
Matt Arsenault 9a519179d9 ValueMap: Fix typo 2022-04-19 21:07:54 -04:00
Matt Arsenault 8591328e15 Intrinsics: Mark llvm.eh.sjlj.callsite argument as immarg
The assert in SelectionDAG implies that it is
2022-04-19 21:04:33 -04:00
Matt Arsenault 507259820a GlobalISel: Add LegalizeMutations to help use More/FewerElements 2022-04-19 21:04:32 -04:00
Matt Arsenault 12d79b1514 GlobalISel: Add LLT helper to multiply vector sizes 2022-04-19 21:04:32 -04:00
Ilia Diachkov 6c69427e88 [SPIR-V](3/6) Add MC layer, object file support, and InstPrinter
The patch adds SPIRV-specific MC layer implementation, SPIRV object
file support and SPIRVInstPrinter.

Differential Revision: https://reviews.llvm.org/D116462

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-04-20 01:10:25 +02:00
Paul Kirth bac6cd5bf8 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-04-19 21:23:48 +00:00
Jonas Paulsson 0f8c626723 [BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls.
A new set of overloaded functions named getOrInsertLibFunc() are now supposed
to be used instead of getOrInsertFunction() when building a libcall from
within an LLVM optimizer(). The idea is that this new function also makes
sure that any mandatory argument attributes are added to the function
prototype (after calling getOrInsertFunction()).

inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it
only adds attributes that are not necessary for correctness but merely
helping with later optimizations.

Generally, the front end is responsible for building a correct function
prototype with the needed argument attributes. If the middle end however is
the one creating the call, e.g. when replacing one libcall with another, it
then must take this responsibility.

This continues the work of properly handling argument extension if required
by the target ABI when building a lib call. getOrInsertLibFunc() now does
this for all libcalls currently built by any LLVM optimizer. It is expected
that when in the future a new optimization builds a new libcall with an
integer argument it is to be added to getOrInsertLibFunc() with the proper
handling. Note that not all targets have it in their ABI to sign/zero extend
integer arguments to the full register width, but this will be done
selectively as determined by getExtAttrForI32Param().

Review: Eli Friedman, Nikita Popov, Dávid Bolvanský

Differential Revision: https://reviews.llvm.org/D123198
2022-04-19 21:22:07 +02:00
Jonas Paulsson 4aa5dc15f0 [SystemZ] Handle SystemZ specific inline assembly address operands.
Handle ZQ, ZR, ZS and ZT inline assembly operand constraints.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110267
2022-04-19 16:55:45 +02:00
Chuanqi Xu f9bee35689 [Pipelines] Hoist CoroEarly as a module pass
This change could reduce the time we call `declaresCoroEarlyIntrinsics`.
And it is helpful for future changes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D123925
2022-04-19 11:04:24 +08:00
Michael Kruse 2d92ee97f1 Reapply "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit af0285122f.

The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
2022-04-18 21:56:47 -05:00
Michael Kruse af0285122f Revert "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit 9ec501da76.

It may have caused the openmp-gcc-x86_64-linux-debian buildbot to fail.
https://lab.llvm.org/buildbot/#/builders/4/builds/20377
2022-04-18 14:38:31 -05:00
Michael Kruse 9ec501da76 [OpenMP] Refactor OMPScheduleType enum.
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered.

This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function.

While this is mostly a NFC refactor, it still applies the following functional changes:
 * The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
 * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
 * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight.

The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D123403
2022-04-18 14:03:17 -05:00
chenglin.bi 222adf338a [Arch64][SelectionDAG] Add target-specific implementation of srem
1. X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first.
2. Add AArch64 faster path for SREM only pow2 case.

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-19 02:49:42 +08:00
Arthur Eubanks 2e6ac54cf4 [LegacyPM] Remove ThinLTO/LTO pipelines
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove the (Thin)LTO pipelines.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D123882
2022-04-18 10:09:41 -07:00
Arthur Eubanks a7e20a8a7a [CallPrinter] Port CallPrinter passes to new pass manager
Port the legacy CallGraphViewer and CallGraphDOTPrinter to work with the new pass manager.

Addresses issue https://github.com/llvm/llvm-project/issues/54323
Adds back related tests that were removed in commits d53a4e7b4a and 9e9d9aba14

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D122989
2022-04-18 10:02:18 -07:00
Qiu Chaofan 1e23175df6 [PowerPC] Mark side effects of Power9 darn instruction
This fixes CVE-2019-15847, preventing random number generation from
being merged.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D122783
2022-04-18 13:21:40 +08:00
chenglin.bi acfc025a72 Revert "[Arch64][SelectionDAG] Add target-specific implementation of srem"
This reverts commit 9d9eddd3dd.
2022-04-18 10:35:09 +08:00
Johannes Doerfert e87f10a771 [Attributor] CGSCC pass should not recompute results outside the SCC (reapply)
When we run the CGSCC pass we should only invest time on the SCC. We can
initialize AAs with information from the module slice but we should not
update those AAs. We make an exception for are call site of the SCC as
they are helpful providing information for the SCC.

Minor modifications to pointer privatization allow us to perform it even
in the CGSCC pass, similar to ArgumentPromotion.
2022-04-17 12:48:49 -05:00
Andrew Savonichev 52053aa94f [NVPTX] Disable parens for identifiers starting with '$'
ptxas fails to parse such syntax:

    mov.u64 %rd1, ($str);
    fatal   : Parsing error near '$str': syntax error

A new MCAsmInfo option was added because InParens parameter of
MCExpr::print is not sufficient to disable parens
completely. MCExpr::print resets it to false for a recursive call in
case of unary or binary expressions.

Targets that require parens around identifiers that start with '$'
should always pass MCAsmInfo to MCExpr::print.
Therefore 'operator<<(raw_ostream &, MCExpr&)' should be avoided
because it calls MCExpr::print with nullptr MAI.

Differential Revision: https://reviews.llvm.org/D123702
2022-04-17 18:02:33 +03:00
Lang Hames 42614062e2 [JITLink] Error instead of asserting on unrecognized edge kinds.
It's idiomatic to require that plugins (especially platform plugins) be
installed to handle special edge kinds. If the plugins are not installed and an
object is loaded that uses one of the special edge kinds then we want to error
out rather than asserting.
2022-04-16 18:52:27 -07:00
Lang Hames a7bceb3f83 [ORC] Make IRSpeculationLayer::BaseLayer an IRLayer.
BaseLayer was originally written as an IRCompileLayer, but there was no need for
this restriction. Using IRLayer gives clients more flexibility in choosing the
underlying layer.
2022-04-16 14:10:49 -07:00
Andrew Litteken d7c56a076e [IROutliner] Ensure that phi values that are passed in as arguments are remapped as arguments
Issue: https://github.com/llvm/llvm-project/issues/54430

For incoming values of phi nodes added to an outlined function to accommodate different exit paths in the function, when a value is a constant that is passed into the outlined function as an argument, we find the corresponding value in the first extracted function used to fill the overall outlined function. When this value is an argument, the corresponding value used will be the old value, prior to outlining. This patch maintains a mapping from these values to arguments, and uses this mapping to update the added phi node accordingly.

Reviewers: paquette

Recommit of d6eb480afb

Differential Revision: https://reviews.llvm.org/D122206
2022-04-16 15:47:52 -05:00
chenglin.bi 9d9eddd3dd [Arch64][SelectionDAG] Add target-specific implementation of srem
X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. Add AArch64 faster path for SREM only pow2 case.

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-16 12:29:11 +08:00
Joseph Huber 984a0dc386 [OpenMP] Use new offloading binary when embedding offloading images
The previous patch introduced the offloading binary format so we can
store some metada along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.

Differential Revision: https://reviews.llvm.org/D122683
2022-04-15 20:35:26 -04:00
Matt Arsenault 193fde7509 llvm-reduce: Clone some of the easy function properties
Error on some of these other fields, since tracking down test cases
for all of these at once is exhausting.
2022-04-15 20:31:07 -04:00
Matt Arsenault b8033de063 MIR: Serialize a few bool function fields 2022-04-15 20:31:07 -04:00
Johannes Doerfert 3be3b40188 [Attributor][NFCI] Introduce AttributorConfig to bundle all options
Instead of lengthy constructors we can now set the members of a
read-only struct before the Attributor is created. Should make it
clearer what is configurable and also help introducing new options in
the future. This actually added IsModulePass and avoids deduction
through the Function set size. No functional change was intended.
2022-04-15 18:17:19 -05:00
Chih-Ping Chen eab6e94f91 [DebugInfo] Add a TargetFuncName field in DISubprogram for
specifying DW_AT_trampoline as a string. Also update the signature
of DIBuilder::createFunction to reflect this addition.

Differential Revision: https://reviews.llvm.org/D123697
2022-04-15 16:38:23 -04:00
Johannes Doerfert 39a68cc016 Revert "[Attributor] CGSCC pass should not recompute results outside the SCC"
This reverts commit 0d7f81e313, it caused
the AMDGPU tests that use the Attributor to fail.
2022-04-15 15:29:51 -05:00
Johannes Doerfert 04f3a224bc [Attributor][NFC] Introduce a flag to distinguish the scope of a query 2022-04-15 14:56:10 -05:00
Johannes Doerfert 0d7f81e313 [Attributor] CGSCC pass should not recompute results outside the SCC
When we run the CGSCC pass we should only invest time on the SCC. We can
initialize AAs with information from the module slice but we should not
update those AAs.
2022-04-15 14:56:09 -05:00
Johannes Doerfert bd72acf4d8 [Attributor][NFC] Code cleanup to minimize follow up changes 2022-04-15 14:56:09 -05:00
Johannes Doerfert 2d8e7834b0 [Attributor][NFC] Rename AAPotentialValues to AAPotentialConstantValues 2022-04-15 14:56:09 -05:00
Fangrui Song 04e094a336 [PGO] Remove legacy PM passes
Legacy PM for optimization pipeline was deprecated in 13.0.0 and Clang dropped
legacy PM support in D123609. This change removes legacy PM passes for PGO so
that downstream projects won't be able to use it. It seems appropriate to start
removing such "add-on" features like instrumentations, before we remove more
stuff after 15.x is branched.

I have checked many LLVM users and only ldc[1] uses the legacy PGO pass.

[1]: https://github.com/ldc-developers/ldc/issues/3961

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D123834
2022-04-15 10:26:43 -07:00
Fraser Cormack eafe182fdc [VP] Rename ISD::VP_FPROUND and ISD::VP_FPEXT
Rename them to be more closely related to their non-VP counterparts.

Reviewed By: jacquesguan, ym1813382441

Differential Revision: https://reviews.llvm.org/D123847
2022-04-15 16:16:45 +01:00
Matt Arsenault f163106f39 llvm-reduce: Handle cloning MachineFrameInfo and stack objects
This didn't work at all before, and would assert on any frame
index. Also copy the other fields, which I believe should cover
everything. There are a few that are untested since MIR serialization
is apparently still missing them (isStatepointSpillSlot,
ObjectSSPLayout, and ObjectSExt/ObjectZExt).
2022-04-14 21:25:06 -04:00
Matt Arsenault 4975c3a949 MachineFunction: Remove unused field 2022-04-14 20:21:18 -04:00
Andrew Savonichev 5193f2a558 Revert "[NVPTX] Disable parens for identifiers starting with '$'"
This reverts commit 78d70a1c97.

Failed on Mips32:
https://lab.llvm.org/buildbot#builders/109/builds/36628

   # CHECK: # fixup A - offset: 0, value: ($tmp0), kind: fixup_Mips_26
   <stdin>:580:2: note: possible intended match here
   # fixup A - offset: 0, value: $tmp0, kind: fixup_Mips_26
2022-04-14 21:25:31 +03:00
Andrew Savonichev 78d70a1c97 [NVPTX] Disable parens for identifiers starting with '$'
ptxas fails to parse such syntax:

    mov.u64 %rd1, ($str);
    fatal   : Parsing error near '$str': syntax error

A new MCAsmInfo option was added because InParens parameter of
MCExpr::print is not sufficient to disable parens
completely. MCExpr::print resets it to false for a recursive call in
case of unary or binary expressions.

Differential Revision: https://reviews.llvm.org/D123702
2022-04-14 21:07:43 +03:00
Andrew Litteken 6f8eba06c2 Revert "[IROutliner] Ensure that phi values that are passed in as arguments are remapped as arguments"
Failing test due to typo

This reverts commit d6eb480afb.
2022-04-14 12:23:33 -05:00
Andrew Litteken d6eb480afb [IROutliner] Ensure that phi values that are passed in as arguments are remapped as arguments
Issue: https://github.com/llvm/llvm-project/issues/54430

For incoming values of phi nodes added to an outlined function to accommodate different exit paths in the function, when a value is a constant that is passed into the outlined function as an argument, we find the corresponding value in the first extracted function used to fill the overall outlined function. When this value is an argument, the corresponding value used will be the old value, prior to outlining. This patch maintains a mapping from these values to arguments, and uses this mapping to update the added phi node accordingly.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D122206
2022-04-14 12:16:23 -05:00
Joseph Huber e471ba3d01 [Object] Add binary format for bundling offloading metadata
We need to embed certain metadata along with a binary image when we wish
to perform a device-linking job on it. Currently this metadata was
embedded in the section name of the data itself. This worked, but made
adding new metadata very difficult and didn't work if the user did any
sort of section linking.

This patch introduces a custom binary format for bundling offloading
metadata with a device object file. This binary format is fundamentally
a simple string map table with some additional data and an embedded
image. I decided to use a custom format rather than using an existing
format (ELF, JSON, etc) because of the specialty use-case of this. We
need a simple binary format that can be concatenated without requiring
other external dependencies.

This extension will make it easier to extend the linker wrapper's
capabilties with whatever data is necessary. Eventually this will allow
us to remove all the external arguments passed to the linker wrapper and
embed it directly in the host's linker so device linking behaves exactly
like host linking.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D122069
2022-04-14 10:50:52 -04:00
Joseph Huber 11f47b791f [OpenMP] Make offloading sections have the SHF_EXCLUDE flag
Offloading sections can be embedded in the host during codegen via a
section. This section was originally marked as metadata to prevent it
from being loaded, but these sections are completely unused at runtime
so the linker should automatically drop them from the final executable
or shard library. This flag adds support for the SHF_EXCLUDE flag in
target lowering and uses it.

Reviewed By: JonChesterfield, MaskRay

Differential Revision: https://reviews.llvm.org/D122987
2022-04-14 10:50:49 -04:00
Paul Walker 0c44115e51 [SVE] Add support for non-element-type sized scaling when lowering MGATHER/MSCATTER.
The lowering code did not use the scale operand of MGATHER/MSCATTER
nodes, but instead assumed scaled indices were always scaled based
on the element type of the memory type. This patch adds the missing
support by rewritting the nodes as unscaled variants.

Differential Revision: https://reviews.llvm.org/D123670
2022-04-14 11:54:46 +01:00
Corentin Jabot d8d793f29b Fix compatibility with retroactive C++23 change [NFC]
Referring to capture in parameter list is now ill-formed.
This change is made to prepare for https://reviews.llvm.org/D119136
2022-04-13 22:57:39 +02:00
serge-sans-paille fa5a4e1b95 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since a96638e50e detected a few
regressions, fixing them.
2022-04-13 20:53:19 +02:00
serge-sans-paille 262eba01b3 Revert "[ValueTracking] Make getStringLenth aware of strdup"
This reverts commit e810d55809.

The commit was not taken into account the fact that strduped string could be
modified. Checking if such modification happens would make the function very
costly, without a test case in mind it's not worth the effort.
2022-04-13 19:17:28 +02:00
Nathan Sidwell 201c4b9cc4 [demangler] Rust demangler buffer return
The rust demangler has some odd buffer handling code, which will copy
the demangled string into the provided buffer, if it will fit.
Otherwise it uses the allocated buffer it made.  But the length of the
incoming buffer will have come from a previous call, which was the
length of the demangled string -- not the buffer size.  And of course,
we're unconditionally allocating a temporary buffer in the first
place.  So we don't actually get buffer reuse, and we get a memcpy in
somecases.

However, nothing in LLVM ever passes in a non-null pointer.  Neither
does anything pass in a status pointer that is then made use of.  The
only exercise these have is in the test suite.

So let's just make the rust demangler have the same API as the dlang
demangler.

Reviewed By: tmiasko

Differential Revision: https://reviews.llvm.org/D123420
2022-04-13 08:50:04 -07:00
Jonas Paulsson 46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Nikita Popov 0d86fc65ba [LTO] Remove legacy PM support
We don't have any places setting NewPM=false anymore, so drop the
support code in LTOBackend.
2022-04-13 10:48:08 +02:00
Daniel Kiss b0343a38a5 Support the min of module flags when linking, use for AArch64 BTI/PAC-RET
LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker.
Such a setup is allowed in the normal build with this change that is possible.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D123493
2022-04-13 09:31:51 +02:00
Muhammad Omair Javaid 42ebfa8269 Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
This reverts commit 64b6192e81.

This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:

https://lab.llvm.org/buildbot/#/builders/176/builds/1515

llvm-tblgen crashes after applying this patch.
2022-04-13 04:53:07 +05:00
Nick Desaulniers 23ec5782c3 [Bitcode] materialize Functions early when BlockAddress taken
IRLinker builds a work list of functions to materialize, then moves them
from a source module to a destination module one at a time.

This is a problem for blockaddress Constants, since they need not refer
to the function they are used in; IPSCCP is quite good at sinking these
constants deep into other functions when passed as arguments.

This would lead to curious errors during LTO:
  ld.lld: error: Never resolved function from blockaddress ...
based on the ordering of function definitions in IR.

The problem was that IRLinker would basically do:

  for function f in worklist:
    materialize f
    splice f from source module to destination module

in one pass, with Functions being lazily added to the running worklist.
This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.
This causes BitcodeReader to insert Functions into its BasicBlockFwdRefs
list incorrectly, as it will never re-materialize an already
materialized (but spliced out) function.

Because of the possibility that blockaddress Constants may appear in
Functions other than the ones they reference, this patch adds a new
bitcode function code FUNC_CODE_BLOCKADDR_USERS that is a simple list of
Functions that contain BlockAddress Constants that refer back to this
Function, rather then the Function they are scoped in. We then
materialize those functions when materializing `f` from the example loop
above. This might over-materialize Functions should the user of
BitcodeReader ultimately decide not to link those Functions, but we can
at least now we can avoid this ordering related issue with blockaddresses.

Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1215

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120781
2022-04-12 11:38:35 -07:00
Harald van Dijk 3337f50625
[X86] Fix handling of maskmovdqu in x32 differently
This reverts the functional changes of D103427 but keeps its tests, and
and reimplements the functionality by reusing the existing 32-bit
MASKMOVDQU and VMASKMOVDQU instructions as suggested by skan in review.
These instructions were previously predicated on Not64BitMode. This
reimplementation restores the disassembly of a class of instructions,
which will see a test added in followup patch D122449.

These instructions are in 64-bit mode special cased in
X86MCInstLower::Lower, because we use flags with one meaning for subtly
different things: we have an AdSize32 class which indicates both that
the instruction needs a 0x67 prefix and that the text form of the
instruction implies a 0x67 prefix. These instructions are special in
needing a 0x67 prefix but having a text form that does *not* imply a
0x67 prefix, so we encode this in MCInst as an instruction that has an
explicit address size override.

Note that originally VMASKMOVDQU64 was special cased to be excluded from
disassembly, as we cannot distinguish between VMASKMOVDQU and
VMASKMOVDQU64 and rely on the fact that these are indistinguishable, or
close enough to it, at the MCInst level that it does not matter which we
use. Because VMASKMOVDQU now receives special casing, even though it
does not make a difference in the current implementation, as a
precaution VMASKMOVDQU is excluded from disassembly rather than
VMASKMOVDQU64.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D122540
2022-04-12 18:32:14 +01:00
Shao-Ce SUN e90110e696 [NFC][CodeGen] Use ArrayRef in TargetLowering functions
This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123467
2022-04-13 00:46:05 +08:00
serge-sans-paille e810d55809 [ValueTracking] Make getStringLenth aware of strdup
During strlen compile-time evaluation, make it possible to track size of
strduped strings.

Differential Revision: https://reviews.llvm.org/D123497
2022-04-12 14:47:29 +02:00
Carlos Alberto Enciso e758b77161 [llvm-pdbutil] Fix broken '-modi' option after change D122226.
The change described by:

https://reviews.llvm.org/D122226

Moved some llvm-pdbutil functionality to the debug PDB library.

This patch addresses a broken '-modi' argument handling, which
causes an assertion if its value is other than '0' or '1'.

In addition, it moves the assertion for the number of occurrences
of the '-modi' argument from the PDB library into the llvm-pdbutil
driver.

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D123483
2022-04-12 06:31:12 +01:00
Matt Arsenault d1f97a3419 GlobalISel: Add memSizeNotByteSizePow2 legality helper
This is really a replacement for memSizeInBytesNotPow2 that actually
does what most every target wants. In particular, since s1 rounds to 1
byte, it wasn't lowered by this predicate. This results in targets
needing to think harder and add more matchers to catch all the
degenerate cases.

Also small bug fix that prevented the correct insertion of
G_ASSERT_ZEXT in the AArch64 use case.
2022-04-11 19:43:37 -04:00
Ben Barham fe2478d44e [VFS] RedirectingFileSystem only replace path if not already mapped
If the `ExternalFS` has already remapped to an external path then
`RedirectingFileSystem` should not change it to the originally provided
path. This fixes the original path always being used if multiple VFS
overlays were provided and the path wasn't found in the highest (ie.
first in the chain).

For now this is accomplished through the use of a new
`ExposesExternalVFSPath` field on `vfs::Status`. This flag is true when
the `Status` has an external path that's different from its virtual
path, ie. the contained path is the external path. See the plan in
`FileManager::getFileRef` for where this is going - eventually we won't
need `IsVFSMapped` any more and all returned paths should be virtual.

Resolves rdar://90578880 and llvm-project#53306.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D123398
2022-04-11 14:52:48 -07:00
Fangrui Song a8ef1647aa [CMake][gn][Bazel] Remove HAVE_PTHREAD_GETSPECIFIC
The only user was removed by d351f54a07.
2022-04-11 14:44:45 -07:00
Craig Topper 2ce2562876 [RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64.
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.

This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122951
2022-04-11 14:38:39 -07:00
Fangrui Song aefa4b60ce [Driver] Simplify hasFlag pattern with addOptInFlag/addOptOutFlag helpers
Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D123468
2022-04-11 12:29:25 -07:00
Fraser Cormack cab1ecf251 [TableGen][NFC] Reflow Record accessor comments 2022-04-11 18:33:26 +01:00
Fraser Cormack 74dd95face [TableGen][NFC] Fix copy/paste error in comment 2022-04-11 18:29:29 +01:00
Momchil Velikov b4ad28da19 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CGA
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints taht LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-11 13:27:26 +01:00
Nikita Popov ceadf6ee61 [ThinLTOCodeGenerator] Remove support for legacy PM
All users of NewPM=false for the (legacy) ThinLTOCodeGenerator
have been removed, so we can remove this functionality entirely.
2022-04-11 11:30:50 +02:00
Nikita Popov 2121dc5b15 [llvm-lto] Remove support for legacy pass manager
This removes support for the legacy pass manager in llvm-lto and
llvm-lto2. In this case I've dropped the use-new-pm option entirely,
as I don't think this is considered part of the public interface.

This also makes -debug-pass-manager work with llvm-lto, because
that was needed to migrate some tests to NewPM.

Differential Revision: https://reviews.llvm.org/D123376
2022-04-11 09:40:17 +02:00
Alex Fan acb408fbbc [ORC] add lazy jit support for riscv64
This adds resolver, indirection and trampoline stubs for riscv64,
allowing lazy compilation to work.

It assumes hard float extension exists. I don't know the proper way to detect it as Triple doesn't provide the interface to check riscv +f +d abi.

I am also not sure if orclazy tests should be enabled because lli needs an additional -codemodel=melany for tests to pass.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D122543
2022-04-10 18:44:50 +08:00
Florian Hahn d5e66c16c0
[IRBuilder] Remove commented out include.
Looks like this was left over during some include optimizations. Remove
it.
2022-04-09 21:33:39 +02:00
Jennifer Yu 187ccc66fa [clang][OpenMP5.1] Initial parsing/sema for has_device_addr
Added basic parsing/sema/ support for the 'has_device_addr' clause.

Differential Revision: https://reviews.llvm.org/D123402
2022-04-08 21:19:38 -07:00
Matt Arsenault 9fdd25848a Transforms: Fix code duplication between LowerAtomic and AtomicExpand 2022-04-08 19:06:36 -04:00
Arthur Eubanks b22ffc7b98 [CaptureTracking] Ignore ephemeral values in EarliestEscapeInfo
And thread DSE's ephemeral values to EarliestEscapeInfo.

This allows more precise analysis in DSEState::isReadClobber() via BatchAA.

Followup to D123162.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123342
2022-04-08 10:07:26 -07:00
Snehasish Kumar 6dd6a6161f [memprof] Deduplicate and outline frame storage in the memprof profile.
The current implementation of memprof information in the indexed profile
format stores the representation of each calling context fram inline.
This patch uses an interned representation where the frame contents are
stored in a separate on-disk hash table. The table is indexed via a hash
of the contents of the frame. With this patch, the compressed size of a
large memprof profile reduces by ~22%.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D123094
2022-04-08 09:15:20 -07:00
Carlos Alberto Enciso 10c11f5c43 [llvm-pdbutil] Move global state (Filters) inside LinePrinter class.
The changes described by:

https://reviews.llvm.org/D121801
https://reviews.llvm.org/D122226

Moved some llvm-pdbutil functionality to the debug PDB library.

This patch addresses one outstanding issue concerning the global
state (Filters) created in the PDB library.

- Move 'Filters' inside the 'LinePrinter' class.
- Omit 'Optional' and just pass 'PrintScope &HeaderScope' everywhere.

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D122887
2022-04-08 14:54:55 +01:00
Fraser Cormack 18106b99f0 [VP] Explicitly map from VP intrinsic to ISD opcode
This patch aims to overcome an issue in these mappings where, when an ISD
node was registered with BEGIN_REGISTER_VP_SDNODE but outwidth the scope
of a pair of BEGIN_REGISTER_VP_INTRINSIC/END_REGISTER_VP_INTRINSIC
macros, the switch cases fell apart. This in particular happened with
VP_SETCC, where we'd end up with something along the lines of:

  case Intrinsic::vp_fcmp:
    break;
  case Intrinsic::vp_icmp:
    break;
    ResOpc = ISD::VP_SETCC;
  case Intrinsic::vp_store:
    ...

To remedy this, we introduce a special-purpose mapping macro which can
map any number of VP intrinsic opcodes to an ISD opcode.

As a result, we no longer need to special-case the mapping from vp.icmp
and vp.fcmp to VP_SETCC, as the new helper macro does it for us.

Thanks to @craig.topper for noticing this and to @rogfer01 for the idea.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123324
2022-04-08 12:30:22 +01:00
Kito Cheng f922dbb792 Revert "Reland "[RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support""
This reverts commit fc2d8326ae.
2022-04-08 16:20:19 +08:00
Nikita Popov c8c6362560 [LICM] Pass MemorySSAUpdater by referene (NFC)
Make it clearer that this is a required dependency.
2022-04-08 10:08:57 +02:00
Nikita Popov 5cefe7d9f5 [LoopSink] Require MemorySSA
This makes MemorySSA in LoopSink required, and removes the AST-based
implementation, as well as the related support code in LICM.

Differential Revision: https://reviews.llvm.org/D123288
2022-04-08 09:49:44 +02:00
serge-sans-paille aa15ea47e2 [builtin_object_size] Basic support for posix_memalign
It actually implements support for seeing through loads, using alias analysis to
refine the result.

This is rather limited, but I didn't want to rely on more than available
analysis at that point (to be gentle with compilation time), and it does seem to
catch common scenario, as showcased by the included tests.

Differential Revision: https://reviews.llvm.org/D122431
2022-04-08 09:31:11 +02:00
Kito Cheng fc2d8326ae Reland "[RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support"
Reland Note: We've resolve the circular dependency issue on llvm/lib/Support and
llvm/TableGen.

Differential Revision: https://reviews.llvm.org/D121984
2022-04-08 15:09:03 +08:00
Senran Zhang a23652f6f9 [demangler] Support C23 _BitInt type
Reviewed By: #libc_abi, aaron.ballman, urnathan

Differential Revision: https://reviews.llvm.org/D122530
2022-04-08 12:20:45 +08:00
Evgeniy Brevnov da41214d65 Add support for atomic memory copy lowering
Currently, the utility supports lowering of non atomic memory transfer routines only. This patch adds support for atomic version of memcopy. This may be useful for targets not supporting atomic memcopy.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118443
2022-04-08 10:41:31 +07:00
Arthur Eubanks 17fdaccccf [CaptureTracking] Ignore ephemeral values when determining pointer escapeness
Ephemeral values cannot cause a pointer to escape.

No change in compile time:
https://llvm-compile-time-tracker.com/compare.php?from=4371710085ba1c376a094948b806ddd3b88319de&to=c5ddbcc4866f38026737762ee8d7b9b00395d4f4&stat=instructions

This partially fixes some regressions caused by more calls to `__builtin_assume` (D122397).

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D123162
2022-04-07 10:11:14 -07:00
Augie Fackler 5f09498a11 MemoryBuiltins: also check function definition for allocalign
This got changed to use hasAttrSomewhere() during review, and I didn't
notice until today when I was writing some tests for another part of
this system that using hasAttrSomewhere only checked the callsite for
allocalign, rather than both the callsite and the definition. This fixes
that by introducing a helper method.

Differential Revision: https://reviews.llvm.org/D121641
2022-04-07 12:38:44 -04:00
Alexey Bataev 8e066b86a7 Add missing template keywords
GCC 14 warns about these.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D121047
2022-04-07 07:08:44 -07:00
Antonio Frighetto 7c3d8c8977 Fix warnings when `-Wdeprecated-enum-enum-conversion` is enabled
clang may throw the following warning:
include/clang/AST/DeclarationName.h:210:52: error: arithmetic between
different enumeration types ('clang::DeclarationName::StoredNameKind'
and 'clang::detail::DeclarationNameExtra::ExtraKind') is deprecated
when flags -Werror,-Wdeprecated-enum-enum-conversion are on.

This adds the `addEnumValues()` helper function to STLExtras.h to hide
the details of adding enumeration values together from two different
enumerations.
2022-04-07 08:20:54 -04:00
Chih-Ping Chen c226a5c4d7 [DebugInfo] Use DW_ATE_signed encoding when creating a Fortran
array index type.
2022-04-07 07:00:56 -04:00
Mehdi Chinoune 3031fa88f0 [lldb] Fix building standalone LLDB on Windows.
It was broken since https://reviews.llvm.org/D110172

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D122523
2022-04-07 12:30:33 +03:00
Fraser Cormack 8216255c9f [RISCV][VP] Add basic RVV codegen for vp.fcmp
This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.

Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.

There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123051
2022-04-07 09:16:07 +01:00
Matt Arsenault 7f14a1d46b AtomicExpand: Add NotAtomic lowering strategy
Currently LowerAtomics exists as a separate pass which blindly
replaces all atomics. Add a new lowering strategy option to eliminate
the atomics which the target can control on a per-instruction level.
2022-04-06 22:34:35 -04:00
Matt Arsenault c4ea925f50 AtomicExpand: Change return type for shouldExpandAtomicStoreInIR
Use the same enum as the other atomic instructions for consistency, in
preparation for addition of another strategy.

Introduce a new "Expand" option, since the store expansion does not
use cmpxchg. Alternatively, the existing CmpXChg strategy could be
renamed to Expand.
2022-04-06 22:34:04 -04:00
Matt Arsenault 39f1568633 Transforms: Split LowerAtomics into separate Utils and pass
This will allow code sharing from AtomicExpandPass. Not entirely sure
why these exist as separate passes though.
2022-04-06 20:54:45 -04:00
River Riddle ea64828a10 [mlir:PDL] Expand how native constraint/rewrite functions can be defined
This commit refactors the expected form of native constraint and rewrite
functions, and greatly reduces the necessary user complexity required when
defining a native function. Namely, this commit adds in automatic processing
of the necessary PDLValue glue code, and allows for users to define
constraint/rewrite functions using the C++ types that they actually want to
use.

As an example, lets see a simple example rewrite defined today:

```
static void rewriteFn(PatternRewriter &rewriter, PDLResultList &results,
                      ArrayRef<PDLValue> args) {
  ValueRange operandValues = args[0].cast<ValueRange>();
  TypeRange typeValues = args[1].cast<TypeRange>();
  ...
  // Create an operation at some point and pass it back to PDL.
  Operation *op = rewriter.create<SomeOp>(...);
  results.push_back(op);
}
```

After this commit, that same rewrite could be defined as:

```
static Operation *rewriteFn(PatternRewriter &rewriter ValueRange operandValues,
                            TypeRange typeValues) {
  ...
  // Create an operation at some point and pass it back to PDL.
  return rewriter.create<SomeOp>(...);
}
```

Differential Revision: https://reviews.llvm.org/D122086
2022-04-06 17:41:59 -07:00
Nathan Sidwell 51f6caf2fb [demangler][NFC] Rename SwapAndRestore to ScopedOverride
The demangler has a utility class 'SwapAndRestore'. That name is
confusing. It's not swapping anything, and the restore part happens at
the object's destruction. What it's actually doing is allowing a
override of some value that is dynamically accessible within the
lifetime of a lexical scope. Thus rename it to ScopedOverride, and
tweak it's member variable names.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122606
2022-04-06 12:55:35 -07:00
Daniil Kovalev 28cb9081f8 [NFC][CodeGen] Add comments for SDNode debug ID
Normally, we place fields serving for debug purpose declarations
under `#if LLVM_ENABLE_ABI_BREAKING_CHECKS`. For `SDNode::PersistentId` and
`SelectionDAG::NextPersistentId`, we do not want to do so because it adds
unneeded complexity without noticeable benefits (see discussion with @thakis
in D120714). This patch adds comments describing why we don't place those
fields under `#if` not to confuse anyone more.

Differential Revision: https://reviews.llvm.org/D123238
2022-04-06 21:05:04 +03:00
Daniil Kovalev 62a983ebc5 Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0.

As discussed in D120714 with @thakis, the patch added unneeded complexity
without noticeable benefits.
2022-04-06 20:32:53 +03:00
Nathan Sidwell df4522feb7 [demangler] Fix undocumented Local encoding
GCC emits [some] static symbols with an 'L' mangling, which we attempt
to demangle.  But the module mangling changes have exposed that we
were doing so at the wrong level.  Such manglings are outside of the
ABI as they are internal-linkage, so a bit of reverse engineering was
needed.  This adjusts the demangler along the same lines as the
existing gcc demangler (which is not yet module-aware).  'L' is part
of an unqualified name.  As before we merely parse the 'L', and then
ignore it.

Reviewed By: iains

Differential Revision: https://reviews.llvm.org/D123138
2022-04-06 10:12:36 -07:00
Fraser Cormack 6be5e875be [RISCV][VP] Add basic RVV codegen for vp.icmp
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.

Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.

In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.

Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.

Portions of this code were taken from the VP reference patches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122743
2022-04-06 16:51:22 +01:00
Daniil Kovalev 83a798d4b0 [CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.

Differential Revision: https://reviews.llvm.org/D120714
2022-04-06 14:09:32 +03:00
Jeremy Morse fb6596f1ec [DebugInfo][InstrRef] Avoid a crash from mixed variable location modes
Variable locations now come in two modes, instruction referencing and
DBG_VALUE. At -O0 we pick DBG_VALUE to allow fast construction of variable
information. Unfortunately, SelectionDAG edits the optimisation level in
the presence of opt-bisect-limit, meaning different passes have different
views of what variable location mode we should use. That causes assertions
when they're mixed.

This patch plumbs through a boolean in SelectionDAG from start to
instruction emission, so that we don't rely on the current optimisation
level for correctness.

Differential Revision: https://reviews.llvm.org/D123033
2022-04-06 11:55:38 +01:00
Nikita Popov ed4e6e0398 [cmake] Remove LLVM_ENABLE_NEW_PASS_MANAGER cmake option
Or rather, error out if it is set to something other than ON. This
removes the ability to enable the legacy pass manager by default,
but does not remove the ability to explicitly enable it through
various flags like -flegacy-pass-manager or -enable-new-pm=0.

I checked, and our test suite definitely doesn't pass with
LLVM_ENABLE_NEW_PASS_MANAGER=OFF anymore.

Differential Revision: https://reviews.llvm.org/D123126
2022-04-06 09:52:21 +02:00
Argyrios Kyrtzidis 330268ba34 [Support/Hash functions] Change the `final()` and `result()` of the hashing functions to return an array of bytes
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:

* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`

As part of this patch also:

* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.

Differential Revision: https://reviews.llvm.org/D123100
2022-04-05 21:38:06 -07:00
Evgeniy Brevnov acfc785c0e Preserve aliasing info during memory intrinsics lowering
By specification, source and destination of llvm.memcpy.* must either be equal or non-overlapping. This semantics is hard or impossible to figure out once lowered. This patch explicitly marks loads from source and stores to destination as not aliasing if source and destination is known to be not equal.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118441
2022-04-06 11:33:54 +07:00
Johannes Doerfert af30de7788 [Attributor] Introduce AAInstanceInfo
The Attributor, as many other parts in LLVM, uses pointer equivalence
for `llvm::Value`s. This only works as long as `llvm::Value`s are
dynamically unique, or, to be exact, we will never end up with the same
`llvm::Value` representing two dynamic instances. We already provided a
helper to check the former, namely `AA::isDynamicallyUnique`, however we
could not check the latter. In this patch we move the logic into a
separate AA which helps with the growing complexity and use cases. We
also extend the interface to answer the second question rather than the
first. So we do not determine dynamically uniqueness but if we might end
up with the `llvm::Value` describing a different dynamic instance. Note
that the latter is very much tied to the Attributor capabilities to look
through memory, recursion, etc. so we need to update the logic as we go.
2022-04-05 23:07:13 -05:00
Johannes Doerfert c42aa1be74 [Attributor] Keep loads feeding in `llvm.assume` if stores stays
If a load is only used by an `llvm.assume` and the stores feeding into
the load are not removable, keep the load.
2022-04-05 23:07:12 -05:00
Ting Wang b389354b28 [Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend
Add support for builtin_[max|min] which has below prototype:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally use SelectCC to get the result.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D122478
2022-04-05 22:43:48 -04:00
Lang Hames 9a62d9db2e [JITLink][MachO] Fix alignment bug in the c-string literal section graphifier.
This function had been assuming a 1-byte alignment, which isn't always correct.
This commit updates it to take the alignment from the __cstring section.

The key change is to the createContentBlock call, but the surrounding code is
updated with clearer debugging output to support the testcase (and any future
debugging work).
2022-04-05 17:38:54 -07:00
Johannes Doerfert 3e8c4366e2 [Attributor] Visit droppable uses in AAIsDead
If we ignore droppable users everything only used in llvm.assume (among
other things) is going to be deleted as dead. This is not helpful.
Instead we want to only delete things we actually don't need anymore. A
follow up will deal with loads in a smarter way.
2022-04-05 18:20:45 -05:00
William Woodruff d81b014469 [NFC][Bitstream] Improve the dumpability of bitstream/bitcode headers
The `LLVMBitCodes.h` header contains various enums that are updated whenever LLVM's bitcode fundamentally changes. It would be nice to track these changes in a semi-automated way, so that external tools that attempt to parse LLVM's bitstream and bitcode can remain in sync.

Before this change, `LLVMBitCodes.h` had a single dependency -- it needed the `FIRST_APPLICATION_BLOCKID` enum value from `BitCodes.h`. `BitCodes.h`, in turn, had a whole tree of include dependencies that boiled down to `llvm-config.h`, meaning that it was impossible to dump the AST of either file without having a partial or full LLVM build tree already present.

To eliminate that requirement, this patch introduces a new leaf-only header, `BitCodeEnums.h`, which includes the "core" enums originally in `BitCodes.h`. `LLVMBitCodes.h` and `BitCodes.h` both include this new header in turn, preserving the current header relationships while allowing `LLVMBitCodes.h` to be dumped fully independently with a command like this (run from the repository root):

```
clang -fsyntax-only -x c++ -Illvm/include -Xclang -ast-dump=json -Xclang -ast-dump-filter -Xclang llvm::bitc::BlockIDs llvm/include/llvm/Bitcode/LLVMBitCodes.h
```

I recognize that this is a pretty unusual change and perhaps not a guarantee that the LLVM authors would like to make in the general case (i.e., that individual files within LLVM can have their AST dumped with minimal dependencies). However, I believe the criticality/limited scope of the file(s) in this patch warrants an exception. Please let me know if there's any other information I can provide, or anything else I can do to improve this patch!

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D108438
2022-04-05 15:10:49 -07:00
Ben Barham f65b0b5dcf Revert "[VFS] RedirectingFileSystem only replace path if not already mapped"
This reverts commit 3fda0edc51, which
breaks crash reproducers in very specific circumstances. Specifically,
since crash reproducers have `UseExternalNames` set to false, the
`File->getFileEntry().getDir()->getName()` call in `DoFrameworkLookup`
would use the *cached* directory name instead of the directory of the
looked-up file.

The plan is to re-commit this patch but to *add*
`ExposesExternalVFSPath` rather than replace `IsVFSMapped`.

Differential Revision: https://reviews.llvm.org/D123103
2022-04-05 14:24:40 -07:00
Artur Pilipenko 857d699667 Move BasicBlock::getTerminator definition to the header
This way it can be inlined to its caller. This method
shows up in the profile and it is essentially a fancy
getter. It would benefit from inlining into its callers.

NFC.
2022-04-05 13:11:38 -07:00
Paul Robinson 077f90315b [PS5] Add PS5 as a legal triple component 2022-04-05 12:55:12 -07:00
Matt Devereau 2c3f66519c [SVE] Extend support for folding select + masked gathers
Extend the work done in D106376 to include masked gathers

Differential Revision: https://reviews.llvm.org/D122896
2022-04-05 16:27:11 +00:00
Jingu Kang 64b6192e81 [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth
Set the maximum VF of AArch64 with 128 / the size of smallest type in loop.

Differential Revision: https://reviews.llvm.org/D118979
2022-04-05 13:16:52 +01:00
Nikita Popov 46cfbe561b [LLVMContext] Replace enableOpaquePointers() with setOpaquePointers()
This allows both explicitly enabling and explicitly disabling
opaque pointers, in anticipation of the default switching at some
point.

This also slightly changes the rules by allowing calls if either
the opaque pointer mode has not yet been set (explicitly or
implicitly) or if the value remains unchanged.
2022-04-05 12:02:48 +02:00
Pavel Labath dbb158ebf4 Remove top-level using directives from Transforms/IPO headers
These directives pollute the namespace of all files which include the
header.
2022-04-05 11:22:37 +02:00
Muhammad Omair Javaid 0320115c16 Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd2.

This commit had failing tests with clang crashing across various
AArch64/Linux buildots.

https://lab.llvm.org/buildbot/#/builders/179/builds/3346

Differential Revision: https://reviews.llvm.org/D114545
2022-04-05 13:12:30 +05:00
Ilia Diachkov 28a681316f Fix nulltpr typo in comment. NFC
The patch fixes the typo "nulltpr", accidentally found in comments.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D122993
2022-04-04 09:05:06 -07:00
Momchil Velikov 980c3e6dd2 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

In order to accommodate backends which do not generate unwind info in
epilogues we compute an additional property "strong no call frame on
entry" which is set for the entry point of the function and for every
block reachable from the entry along a path that does not execute the
prologue. If this property holds, it takes precedence over the "has a
call frame" property.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-04 14:38:22 +01:00
Nathan Sidwell ee6ec9e861 [demangler] Parenthesize >> inside template args
Both > and >> expressions need to be parenthesized inside template
argument lists.

Reviewed By: dblaikie, rjmccall

Differential Revision: https://reviews.llvm.org/D122474
2022-04-04 06:35:32 -07:00
Nikita Popov c0cc98251a [Float2Int] Make sure dependent ranges are calculated first (PR54669)
The range calculation in walkForwards() assumes that the ranges of
the operands have already been calculated. With the used visit
order, this is not necessarily the case when there are multiple
roots. (There is nothing guaranteeing that instructions are visited
in topological order.)

Fix this by queuing instructions for reprocessing if the operand
ranges haven't been calculated yet.

Fixes https://github.com/llvm/llvm-project/issues/54669.

Differential Revision: https://reviews.llvm.org/D122817
2022-04-04 10:18:39 +02:00
Augie Fackler e90bce8f91 CallBase: fix getFnAttr so it also checks the function
Prior to this change, CallBase::hasFnAttr checked the called function to
see if it had an attribute if it wasn't set on the CallBase, but
getFnAttr didn't do the same delegation, which led to very confusing
behavior. This patch fixes the issue by making CallBase::getFnAttr also
check the function under the same circumstances.

Test changes look (to me) like they're cleaning up redundant attributes
which no longer get specified both on the callee and call. We also clean
up the one ad-hoc implementation of this getter over in InlineCost.cpp.

Differential Revision: https://reviews.llvm.org/D122821
2022-04-03 23:19:23 -04:00
Philip Reames 7c51669c21 [memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time
The search for the clobbering call is fairly expensive if uses are not optimized at construction.  Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point.  (e.g. If the source pointer is not an alloca, we can't do callslotopt.)

On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.
2022-04-03 20:16:20 -07:00
Simon Pilgrim 76cd11f303 [DAG] Add llvm::isMinSignedConstant helper. NFC
Pulled out of D122754
2022-04-01 17:47:34 +01:00
Nathan Sidwell abffdd8876 [demangler] Fix node matchers
* Add instantiation tests to ItaniumDemangleTest, to make sure all
  match functions provide constructor arguments to the provided functor.

* Fix the Node constructors that lost const qualification on arguments.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122665
2022-04-01 05:19:34 -07:00
Nathan Sidwell 369337e3c2 [demangler][NFC] Use def file for node names
In order to add a unit test, we need to expose the node names beyond
ItaniumDemangle.h.  This breaks them out into a def file.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122739
2022-04-01 05:03:34 -07:00
Peixin-Qiao 3e7415a0ff [OMPIRBuilder] Support ordered clause specified without parameter
This patch supports ordered clause specified without parameter in
worksharing-loop directive in the OpenMPIRBuilder and lowering MLIR to
LLVM IR.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D114940
2022-04-01 16:17:29 +08:00
Jake Vossen 4b82bb6d82 Fix Typo in SmallVector doc
Replace forward slash with backward slash.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D120930
2022-04-01 03:44:37 +00:00
yanming a7c0b7504c [VP] Add more cast VPintrinsic and docs.
Add vp.fptoui, vp.uitofp, vp.fptrunc, vp.fpext, vp.trunc, vp.zext, vp.sext, vp.ptrtoint, vp.inttoptr intrinsic and docs.

Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D122291
2022-04-01 09:16:10 +08:00
Jorge Gorbe Moya fc7573f29c Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307.
2022-03-31 14:54:41 -07:00
Luboš Luňák de4bcdc2ba include stddef.h for size_t
Needed at least for -DLLVM_ENABLE_MODULES=On.
2022-03-31 23:51:07 +02:00
Paul Kirth 46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Wenju He 0bda12b5bc [NewPM] Add OptimizerEarly module extension point
VectorizerStart extension is module callback in old PM, but is function
callback in new PM. We lack a module extension point between end of
buildModuleSimplificationPipeline and the function optimization
(including vectorizer) pipeline. So this patch adds a new module
extension point before the function optimization pipeline.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D122296
2022-03-31 08:22:27 -07:00
Nikita Popov f66975555f [Float2Int] Extract calcRange() method (NFC)
This avoids the awkward "Abort" flag, because we can simply
early-return instead.
2022-03-31 16:13:13 +02:00
Jay Foad aa4c055e25 [AMDGPU] Document the intended semantics of llvm.amdgcn.s.buffer.load
Differential Revision: https://reviews.llvm.org/D122653
2022-03-31 09:13:30 +01:00
Serge Pavlov 881350a92d Mapping of FP operations to constrained intrinsics
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.

This is recommit of 115b3ace36, reverted in
8160dd582b.

Differential Revision: https://reviews.llvm.org/D69562
2022-03-31 11:07:47 +07:00
David Blaikie 6f5ecd089f Demangle: Fix crash-on-invalid demangling of a module name with no underlying entity 2022-03-30 20:26:32 +00:00
Amir Ayupov c31af7cfe3 [MC][BOLT] Add setter for AllowAtInName
Use the setter in BOLT to allow printing names with variant kind in the name
(e.g. "func@PLT").
Fixes BOLT buildbot tests that broke after D122516:
https://lab.llvm.org/buildbot/#/builders/215/builds/3595

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122694
2022-03-30 13:04:28 -07:00
Eli Friedman 72517e27c1 [AArch64] Fix AArch64TargetParser.def to match AArch64.td.
Currently, we have two different lists of features each CPU supports...
and those lists aren't consistent. This patch assumes AArch64.td is
right, and tries to fix AArch64TargetParser to match.

It's hard to find documentation for the right features, but reviewers
have confirmed these changes.

Probably we should try to unify the two lists at some point, but
synchronizing them seems like a prerequisite to that anyway.

Differential Revision: https://reviews.llvm.org/D122274
2022-03-30 12:15:39 -07:00
Ben Barham 3fda0edc51 [VFS] RedirectingFileSystem only replace path if not already mapped
If the `ExternalFS` has already remapped a path then the
`RedirectingFileSystem` should not change it to the originally provided
path. This fixes the original path always being used if multiple VFS
overlays were provided and the path wasn't found in the highest (ie.
first in the chain).

This also renames `IsVFSMapped` to `ExposesExternalVFSPath` and only
sets it if `UseExternalName` is true. This flag then represents that the
`Status` has an external path that's different from its virtual path.
Right now the contained path is still the external path, but further PRs
will change this to *always* be the virtual path. Clients that need the
external can then request it specifically.

Note that even though `ExposesExternalVFSPath` isn't set for all
VFS-mapped paths, `IsVFSMapped` was only being used by a hack in
`FileManager` that was specific to module searching. In that case
`UseExternalNames` is always `true` and so that hack still applies.

Resolves rdar://90578880 and llvm-project#53306.

Differential Revision: https://reviews.llvm.org/D122549
2022-03-30 11:52:41 -07:00
Fraser Cormack 73244e8f85 [VP] Add vp.icmp comparison intrinsic and docs
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122729
2022-03-30 17:05:11 +01:00
Sanjay Patel 436b875e49 [SDAG] avoid libcalls to fmin/fmax for soft-float targets
This is an extension of D70965 to avoid creating a mathlib
call where it did not exist in the original source. Also see
D70852 for discussion about an alternative proposal that was
abandoned.

In the motivating bug report:
https://github.com/llvm/llvm-project/issues/54554
...we also have a more general issue about handling "no-builtin" options.

Differential Revision: https://reviews.llvm.org/D122610
2022-03-30 11:22:03 -04:00
Fraser Cormack da6131f20a [VP] Add vp.fcmp comparison intrinsic and docs
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121292
2022-03-30 14:39:18 +01:00
Serge Pavlov 8160dd582b Revert "Mapping of FP operations to constrained intrinsics"
This reverts commit 115b3ace36.
Starting from this commit the buildbot sanitizer-x86_64-linux-bootstrap-msan
starts failing (build 10071). Reverted for investigation.
2022-03-30 16:46:43 +07:00
Vitaly Buka 15972e37ba [CodeGen] Avoid access after runtime
Insts must be destroyd before xParent
or it can read it with stack like this:
   0 in llvm::MachineInstr::getMF() const MachineInstr.cpp:637:3
   1 in getMF MachineInstr.h:302:50
   2 in removeNodeFromList MachineBasicBlock.cpp:163:32
2022-03-30 02:08:13 -07:00
Luboš Luňák a60e09509c add missing include for -DLLVM_ENABLE_MODULES=On 2022-03-30 10:31:59 +02:00
Serge Pavlov 115b3ace36 Mapping of FP operations to constrained intrinsics
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.

Differential Revision: https://reviews.llvm.org/D69562
2022-03-30 12:21:30 +07:00
Zakk Chen 10b2760da0 Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR"
This reverts commit 10fd2822b7.

I have a better implementation for those operations without the
additional policy operand.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could
assume undef maskedoff is mask agnostic.

Differential Revision: https://reviews.llvm.org/D122455
2022-03-29 18:05:33 -07:00
Chris Bieneman 9130e471fe Add DXContainer
DXIL is wrapped in a container format defined by the DirectX 11
specification. Codebases differ in calling this format either DXBC or
DXILContainer.

Since eventually we want to add support for DXBC as a target
architecture and the format is used by DXBC and DXIL, I've termed it
DXContainer here.

Most of the changes in this patch are just adding cases to switch
statements to address warnings.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122062
2022-03-29 14:34:23 -05:00
Chris Bieneman b39f437757 [ADT] add initializer list specialization for is_contained
Adding an initializer list specialization for is_contained allows for
compile-time evaluation when called with a constant or runtime
evaluation for non-constant values.

This patch doesn't add any uses of this template, but that is coming in
a subsequent patch.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122079
2022-03-29 12:39:39 -05:00
Chris Bieneman 5b6207f3cd [ADT] Flesh out HLSL raytracing environments
Fleshing this out now allows me to rely on enum math to translate
values rather than having to translate the off cases.

I should have added this in the first pass, but wasn't thinking about
it.
2022-03-29 09:43:03 -05:00
Nathan Sidwell c204cee642 [demangler] Update node match calls
Each demangler node's match function needs to call the provided
functor with constructor arguments.  That was omitted from D120905.
This adds the new Precedence argument where necessary (and a missing
boolean for a module node).

The two visitors need updating with a printer for that type, and this
adds a stub to cxa_demangle's version.  blaikie added one to llvm's.
I'll fill out those printers in a followup, rather than wait, so that
downstream consumers are unbroken.
2022-03-29 05:32:36 -07:00
Thomas Preud'homme f1d8e46258 Clarify invariants of software pipelining hooks
PowerPC backend relies on each pair of prologue/epilogue of a software
pipelined loop to correspond to a single iteration a the loop through
its use of the BDZ instruction to skip inner prologues/epilogues and
loop kernel. However the interface does not make it clear that it is a
valid way to check that the trip count is big enough to execute inner
prologues/epilogues and kernel loop.

The API also does not specify in which order of prologues the
createTripCountGreaterCondition() hook is being called. Knowing that it
starts with the last/innermost prologues can help recording some
information when createTripCountGreaterCondition() is first executed and
reuse it in setPreheader() or adjustTripCount().

This commit documents both aspects.

Reviewed By: jmolloy

Differential Revision: https://reviews.llvm.org/D122642
2022-03-29 11:44:10 +01:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Paul Kirth 90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd97.
2022-03-29 06:20:30 +00:00
Johannes Doerfert 7df2eba7fa [Attributor][OpenMP] Add assumption for non-call assembly instructions
Inline assembly is scary but we need to support it for the OpenMP GPU
device runtime. The new assumption expresses the fact that it may not
have call semantics, that is, it will not call another function but
simply perform an operation or side-effect. This is important for
reachability in the presence of inline assembly.

Differential Revision: https://reviews.llvm.org/D109986
2022-03-28 20:57:52 -05:00
Johannes Doerfert bb0b23174e [InstCombineCalls] Optimize call of bitcast even w/ parameter attributes
Before we gave up if a call through bitcast had parameter attributes.
Interestingly, we allowed attributes for the return value already. We
now handle both the same way, namely, we drop the ones that are
incompatible with the new type and keep the rest. This cannot cause
"more UB" than initially present.

Differential Revision: https://reviews.llvm.org/D119967
2022-03-28 20:57:52 -05:00
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Paul Kirth 2add3fbd97 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-28 23:30:04 +00:00
David Blaikie 1d1cf9b6c4 ItaniumDemangler: Update BinaryExpr::match to match the ctor
Not sure if this could use more testing, but hopefully this is adequate.
2022-03-28 21:51:27 +00:00
zhijian 39772da5fd [AIX][XCOFF] address post-commit review comments of patch https://reviews.llvm.org/D82549
Summary:
Address post-commit review comments in the https://reviews.llvm.org/D82549, including

changed file name from llvm/test/tools/llvm-readobj/XCOFF/xcoff-auxiliary-header.test --> llvm/test/tools/llvm-readobj/XCOFF/auxiliary-header.test
replaced macro define by using lambda function.
added a helper function to reduce the duplicated check and print error code.

Reviewer : James Henderson
Differential Revision: https://reviews.llvm.org/D116220
2022-03-28 15:05:41 -04:00
Nathan Sidwell 1066e397fa [demangler] Add StringView conversion operator
The OutputBuffer class tries to present a NUL-terminated string API to
consumers.  But several of them would prefer a StringView.  In
particular the Microsoft demangler, juggles between NUL-terminated and
StringView, which is confusing.

This adds a StringView conversion, and adjusts the Demanglers that can
benefit from that.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D120990
2022-03-28 11:19:55 -07:00
Jyotsna Verma 65a2f6ad9c [Hexagon] Create an intrinsic to profile using a custom handler
The intrinsic is lowered into a hexagon pseudo instruction which
after register allocation is expanded into A2_tfrsi and J2_call.
2022-03-28 10:31:41 -05:00
Nathan Sidwell b3b4113a23 [demangler] Add operator precedence
The demangler had no concept of operator precendence, and would
parenthesize many more subexpressions than necessary.  In particular
it would parenthesize primary-expressions, such as '4', which just
looks strange.  It would also parenthesize '>' expressions, just in
case they were inside a template parameter list.

This patch fixes both issues.

* Add operator precedence to the OpInfo structure, and add a
  subexpression helper that will parenthesize a lower precedence
  subexpression.

* Add a 'greater-than is greater-than' indicator to the output buffer,
  so the expression printer knows whether it is immediately inside a
  template parameter list (and must therefore parenthesize 'expr >
  expr').  This is a counter, so that ...

* Add open and close printers to the output buffer, that increment and
  decrement the gt-is-gt indicator.

* Parenthesize comma operators inside comma-separated lists. (probably
  a rare case, but still).

This dramatically reduces the extraneous parentheses being printed.

Reviewed By: dblaikie, bruno

Differential Revision: https://reviews.llvm.org/D120905
2022-03-28 06:17:57 -07:00
Pavel Labath ec6d621050 Remove a top-level using-directive from EPCDebugObjectRegistrar.h
The directive pollutes the namespace of all files which include the
header.

Use alternate ways to reference the namespace constituents instead.
2022-03-28 15:14:20 +02:00
Alexandros Lamprineas 8045bf9d0d [FuncSpec] Support function specialization across multiple arguments.
The current implementation of Function Specialization does not allow
specializing more than one arguments per function call, which is a
limitation I am lifting with this patch.

My main challenge was to choose the most suitable ADT for storing the
specializations. We need an associative container for binding all the
actual arguments of a specialization to the function call. We also
need a consistent iteration order across executions. Lastly we want
to be able to sort the entries by Gain and reject the least profitable
ones.

MapVector fits the bill but not quite; erasing elements is expensive
and using stable_sort messes up the indices to the underlying vector.
I am therefore using the underlying vector directly after calculating
the Gain.

Differential Revision: https://reviews.llvm.org/D119880
2022-03-28 12:01:53 +01:00
Fangrui Song c0eb9b4cde Revert D121984 "[RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support"
This reverts commit ad57e10dbc and 1967fd8d5e

llvm/lib/Support/RISCVVIntrinsicUtils.cpp introduced llvm/TableGen includes,
a circular dependency https://llvm.org/docs/CodingStandards.html#library-layering
I think this particular instance is serious and should be reverted.
2022-03-28 01:17:37 -07:00
Fangrui Song 1967fd8d5e [RISCV] Remove using namespace llvm from public header after D121984 2022-03-28 00:51:58 -07:00
Kito Cheng ad57e10dbc [RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support
This patch is split from https://reviews.llvm.org/D111617, we need those
stuffs on clang, so must moving those stuff to llvm/Support.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121984
2022-03-28 14:35:28 +08:00
Luo, Yuanke 321cbf75be [Verifier] Verify parameter alignment.
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the shift value by 1, so the max aligment
ISel can support is 2^14. This patch verify the parameter and return
value for alignment.

Differential Revision: https://reviews.llvm.org/D121898
2022-03-27 08:35:05 +08:00
Fangrui Song 02f20a09c3 [Option] Remove the error-prone default argument true from 4-argument hasFlag 2022-03-26 01:09:18 -07:00
Fangrui Song 522712e2d2 [Option] Remove the error-prone default argument true from 3-argument hasFlag 2022-03-26 00:58:39 -07:00
Adrian Prantl 1f98e09bf8 Add missing include diagnosed in modules build. (NFC) 2022-03-25 12:40:08 -07:00
Dávid Bolvanský 39d348c602 [NFCI] Fix set-but-unused warning in DenseMap.h in some configurations 2022-03-25 17:12:53 +01:00
Johannes Doerfert a81fff8afd Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit c5f789050d and
reapplies 7aea3ea8c3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Carlos Alberto Enciso 75112133b8 [llvm-pdbutil] Move InputFile/FormatUtil/LinePrinter to PDB library.
At Sony we are developing llvm-dva

https://lists.llvm.org/pipermail/llvm-dev/2020-August/144174.html

For its PDB support, it requires functionality already present in
llvm-pdbutil.

We intend to move that functionaly into the PDB library to be
shared by both tools. That change will be done in 2 steps, that
will be submitted as 2 patches:

(1) Replace 'ExitOnError' with explicit error handling.
(2) Move the intended shared code to the PDB library.

Patch for step (1): https://reviews.llvm.org/D121801

This patch is for step (2).

Move InputFile.cpp[h], FormatUtil.cpp[h] and LinePrinter.cpp[h]
files to the debug PDB library.

It exposes the following functionality that can be used by tools:

- Open a PDB file.
- Get module debug stream.
- Traverse module sections.
- Traverse module subsections.

Most of the needed functionality is in InputFile, but there are
dependencies from LinePrinter and FormatUtil.

Some other functionality is in the following functions in
DumpOutputStyle.cpp file:

- iterateModuleSubsections
- getModuleDebugStream
- iterateOneModule
- iterateSymbolGroups
- iterateModuleSubsections

Only these specific functions from DumpOutputStyle are moved to
the PDB library.

Reviewed By: aganea, dblaikie, rnk

Differential Revision: https://reviews.llvm.org/D122226
2022-03-25 07:12:58 +00:00
Stanislav Mekhanoshin 6e3e14f600 [AMDGPU] Support gfx940 smfmac instructions
Differential Revision: https://reviews.llvm.org/D122191
2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin 27439a7642 [AMDGPU] New gfx940 mfma instructions
Differential Revision: https://reviews.llvm.org/D122044
2022-03-24 12:12:52 -07:00
Johannes Doerfert c5f789050d Revert "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit 7aea3ea8c3 as it
breaks the buildbots.

I didn't see these failures in the pre-merge checks, looking into it.
2022-03-24 14:04:41 -05:00
Johannes Doerfert 7aea3ea8c3 [Intrinsics] Add `nocallback` to the default intrinsic attributes
Most intrinsics, especially "default" ones, will not call back into the
IR module. `nocallback` encodes this nicely. As it was not used before,
this patch also makes use of `nocallback` in the Attributor which
results in many more `norecurse` deductions.

Tablegen part is mechanical, test updates by script.

Differential Revision: https://reviews.llvm.org/D118680
2022-03-24 13:50:54 -05:00
Argyrios Kyrtzidis 7f05aa2d4c [Support/BLAKE3] LLVM-specific changes over the original BLAKE3 C implementation
Changes from original BLAKE3 sources:

* `blake.h`:
    * Changes to avoid conflicts if a client also links with its own BLAKE3 version:
        * Renamed the header macro guard with `LLVM_C_` prefix
        * Renamed the C symbols to add the `llvm_` prefix
    * Added a top header comment that references the CC0 license and points to the `LICENSE` file in the repo.
* `blake3_impl.h`: Added `#define`s to remove some of `llvm_` prefixes for the rest of the internal implementation.
* Implementation files:
    * Added a top header comment for `blake.c`
    * Used `llvm_` prefix for the C public API functions
    * Used `LLVM_LIBRARY_VISIBILITY` for internal implementation functions
    * Added `.private_extern`/`.hidden` in assembly files to reduce visibility of the internal implementation functions
* `README.md`:
    * added a note about where the sources originated from
    * Used the C++ BLAKE3 class and `llvm_` prefixed C API in place of examples and API documentation.
    * Removed instructions about how to build the files.
2022-03-24 10:26:39 -07:00
Argyrios Kyrtzidis 9aa701984d [Support] Introduce the BLAKE3 hashing function implementation
BLAKE3 is a cryptographic hash function that is secure and very performant.
The C implementation originates from https://github.com/BLAKE3-team/BLAKE3/tree/1.3.1/c
License is at https://github.com/BLAKE3-team/BLAKE3/blob/1.3.1/LICENSE

This patch adds:

* `llvm/include/llvm-c/blake3.h`: The BLAKE3 C API
* `llvm/include/llvm/Support/BLAKE3.h`: C++ wrapper of the C API
* `llvm/lib/Support/BLAKE3`: Directory containing the BLAKE3 C implementation files, including the `LICENSE` file
* `llvm/unittests/Support/BLAKE3Test.cpp`: unit tests for the BLAKE3 C++ wrapper

This initial patch contains the pristine BLAKE3 sources, a follow-up patch will introduce
LLVM-specific prefixes to avoid conflicts if a client also links with its own BLAKE3 version.

And here's some timings comparing BLAKE3 with LLVM's SHA1/SHA256/MD5.
Timings include `AVX512`, `AVX2`, `neon`, and the generic/portable implementations.
The table shows the speed-up multiplier of BLAKE3 for hashing 100 MBs:

|        Processor        | SHA1  | SHA256 |  MD5 |
|-------------------------|-------|--------|------|
| Intel Xeon W (AVX512)   | 10.4x |   27x  | 9.4x |
| Intel Xeon W (AVX2)     | 6.5x  |   17x  | 5.9x |
| Intel Xeon W (portable) | 1.3x  |  3.3x  | 1.1x |
|      M1Pro (neon)       | 2.1x  |  4.7x  | 2.8x |
|      M1Pro (portable)   | 1.1x  |  2.4x  | 1.5x |

Differential Revision: https://reviews.llvm.org/D121510
2022-03-24 10:26:39 -07:00
Shraiysh Vaishay 8722c12c12 [mlir][OpenMP][IRBuilder] Add support for nowait on single construct
This patch adds the nowait parameter to `createSingle` in
OpenMPIRBuilder and handling for IR generation from OpenMP Dialect.

Also added tests for the same.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D122371
2022-03-24 22:51:52 +05:30
Mike Rice f82ec5532b [OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target parallel loop directive.

Differential Revision: https://reviews.llvm.org/D122359
2022-03-24 09:19:00 -07:00
Florian Hahn 46432a0088
[VPlan] Add VPWidenPointerInductionRecipe.
This patch moves pointer induction handling from VPWidenPHIRecipe to its
own recipe. In the process, it adds all information required to generate
code for pointer inductions without relying on Legal to access the list
of induction phis.

Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121615
2022-03-24 14:58:45 +00:00
Antonio Frighetto 6872c8bdc4 [NFC] Mark derived destructors as `override`
Derived destructors can be marked as override, in order to prevent
possible compilation failures of projects depending on those
headers (when compiled with flags -Wall, -Wsuggest-destructor-override,
-Winconsistent-missing-destructor-override).

Differential Revision: https://reviews.llvm.org/D121993
2022-03-24 11:42:47 +01:00
Daniil Kovalev c53cbce45e [CodeGen] Define ABI breaking class members correctly
Non-static class members declared under #ifndef NDEBUG should be declared
under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to make headers library-friendly and
allow cross-linking, as discussed in D120714.

Differential Revision: https://reviews.llvm.org/D121549
2022-03-24 12:42:59 +03:00
Xiang1 Zhang 287dad13ab [InlineAsm] Fix mangle problem when global variable used in inline asm
(Add modifier P for ARR[BaseReg+IndexReg+..])

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D120887
2022-03-24 09:41:23 +08:00
Xiang1 Zhang 8a6b644c79 [Inline asm] Fix mangle problem when variable used in inline asm.
(Connect InlineAsm Memory Operand with its real value not just name)
Revert 2 history bugfix patch:

Revert "[X86][MS-InlineAsm] Make the constraint *m to be simple place holder"
This patch revert https://reviews.llvm.org/D115225 which mainly
fix problems intrduced by https://reviews.llvm.org/D113096

This reverts commit d7c07f60b3.

Revert "Reland "[X86][MS-InlineAsm] Use exact conditions to recognize MS global variables""
This patch revert https://reviews.llvm.org/D116090 which fix problem
intrduced by https://reviews.llvm.org/D115225

This reverts commit 24c68ea1eb.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D120886
2022-03-24 09:41:22 +08:00
Julian Lettner 64902d335c Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-23 18:36:55 -07:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Zequan Wu 581dc3c729 Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
This reverts commit 22570bac69.
2022-03-23 16:11:54 -07:00
Hongtao Yu 3f97016857 [llvm-profgen] Decoding pseudo probe for profiled function only.
Complete pseudo probes decoding can result in large memory usage. In practice only a small porting of the decoded probes are used in profile generation. I'm changing the full decoding mode to be decoding for profiled functions only, though we still do a full scan of the .pseudoprobe section due to a missing table-of-content but we don't have to build the in-memory data structure for functions not sampled.

To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. This is easy to read and maintain.

I also have to change the previous representation of unsymbolized context from probe-based stack to address-based stack since the profiled functions are unknown yet by the time of virtual unwinding. The address-based stack will be converted to probe-based stack after virtual unwinding and on-demand probe decoding.

I'm seeing 20GB memory is saved for one of our internal large service.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D121643
2022-03-23 14:15:11 -07:00
Arthur Eubanks 9bd66b312c [PassManager][Coroutine] Run passes under -O0 conditionally and run GlobalDCE
CoroSplit lowers various coroutine intrinsics. It's a CGSCC pass and
CGSCC passes don't run on unreachable functions. Normally GlobalDCE will
come along and delete unreachable functions, but we don't run GlobalDCE
under -O0, so an unreachable function with coroutine intrinsics may
never have CoroSplit run on it.

This patch adds GlobalDCE when coroutines intrinsics are present. It
also now runs all coroutine passes conditional when coroutine intrinsics
are present. This should also solve the -O0 regression reported in
D105877 due to LazyCallGraph construction.

Fixes https://github.com/llvm/llvm-project/issues/54117

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D122275
2022-03-23 11:03:26 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
serge-sans-paille 02c28970b2 Cleanup include: codegen second round
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122180
2022-03-23 13:54:00 +01:00
Marcus Johnson d14ccbc2e8 Re-land c346068928 with fixes
It was previously reverted in a6beb18b84
due to test failures.
2022-03-23 08:13:17 -04:00
Craig Topper 681fd2c11e Revert "[SelectionDAG] Don't create entries in ValueMap in ComputePHILiveOutRegInfo"
This reverts commit 1a9b55b63a.

Causing build bot failures
2022-03-22 23:41:47 -07:00
Craig Topper 1a9b55b63a [SelectionDAG] Don't create entries in ValueMap in ComputePHILiveOutRegInfo
Instead of using operator[], use DenseMap::find to prevent default
constructing an entry if it isn't already in the map.
2022-03-22 23:24:53 -07:00
Shengchen Kan b7a4b67380 [Bundle][Codegen] Ignore bundle for meta-instruction
The purpose is to keep the default behavior as before.
Noticed by comments in D121600.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D122221
2022-03-23 10:14:54 +08:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Snehasish Kumar 27a4f2545f Reland "[memprof] Store callsite metadata with memprof records."
This reverts commit f4b794427e.

Reland with underlying msan issue fixed in D122260.
2022-03-22 14:40:02 -07:00
Snehasish Kumar 61c75eb637 [memprof] Initialize MemInfoBlock data.
This patch updates the existing default no-arg constructor for
MemInfoBlock to explicitly initialize all members. Also add missing
DataTypeId initialization to the other constructor. These issues were
exposed by msan on patch D121179. With this patch D121179 builds cleanly
on msan.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D122260
2022-03-22 14:35:57 -07:00
Mike Rice 2cedaee6f7 [OpenMP] Initial parsing/sema for the 'omp parallel loop' construct
Adds basic parsing/sema/serialization support for the
  #pragma omp parallel loop directive.

 Differential Revision: https://reviews.llvm.org/D122247
2022-03-22 13:55:47 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Aaron Ballman a6beb18b84 Revert "Add UTF32 to/from UTF8 conversion functions"
This reverts commit c346068928.

It broke at least one of the builders:
https://lab.llvm.org/buildbot#builders/100/builds/13947
2022-03-22 15:00:40 -04:00
Marcus Johnson c346068928 Add UTF32 to/from UTF8 conversion functions
This is anticipated to be used in new format specifier checking code.
2022-03-22 13:41:43 -04:00
Zakk Chen 23d60ce164 [RISCV][NFC] Refine and refactor RISCVVEmitter and riscv_vector.td.
1. Rename nomask as unmasked to keep with the terminology in the spec.
2. Merge UnMaskpolicy and Maskedpolicy arguments into one in RVVBuiltin class.
3. Rename HasAutoDef as HasBuiltinAlias.
4. Move header definition code into one class.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120870
2022-03-22 09:58:43 -07:00
Craig Topper 49c2206b3b [VP] Preserve address space of pointer for strided load/store intrinsics.
This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt.
This allows us to preserve the address space as part of the type
overload for the intrinsic, but still require the vector element
type to match the pointer type.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D122042
2022-03-22 09:52:54 -07:00
Nathan Sidwell c354167ae2 [demangler] Add support for C++20 modules
Add support for module name demangling.  We have two new demangler
nodes -- ModuleName and ModuleEntity. The former represents a module
name in a hierarchical fashion. The latter is the combination of a
(name) node and a module name. Because module names and entity
identities use the same substitution encoding, we have to adjust the
flow of how substitutions are handled, and examine the substituted
node to know how to deal with it.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D119933
2022-03-22 09:42:52 -07:00
Zakk Chen 10fd2822b7 [RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR
intrinsics.

Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120228
2022-03-22 07:47:21 -07:00
Joseph Huber 5856f30b5a [LTO] Add configuartion option to use default optimization pipeline
This patch adds a configuration option to simply use the default pass
pipeline in favor of the LTO-specific one. We observed some severe
performance penalties when uding device-side LTO for OpenMP offloading
applications caused by the LTO-pass pipeline. This is primarily because
OpenMP uses an LLVM bitcode library to implement a GPU runtime library.
In a standard compilation we link this bitcode library into each source
file and optimize it with the default pipeline. When performing LTO we
link it late with all the files, but the bitcode library never has the
regular optimization pipeline applied to it so we miss a few
optimizations just using the LTO pipeline to optimize it.

I'm not committed to this solution, but it's the easiest method to solve
this performance regression when using LTO without changing the
optimizatin pipeline for other users.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D122133
2022-03-22 09:28:45 -04:00
Djordje Todorovic 73777b4c35 [Debugify] Optimize debugify original mode
Before we start addressing the issue with having
a lot of false positives when using debugify in
the original mode, we have made a few patches that
should speed up the execution of the testing
utility Passes.

For example, when testing a large project
(let's say LLVM project itself), we can face
a lot of potential DI issues. Usually, we use
-verify-each-debuginfo-preserve (that is very
similar to -debugify-each) -- it collects
DI metadata before each Pass, and after the Pass
it checks if the Pass preserved the DI metadata.
However, we can speed up this process, since we
don't need to collect DI metadata before each
Pass -- we could use the DI metadata that are
collected after the previous Pass from
the pipeline as an input for the next Pass.

This patch speeds up the utility for ~2x.

Differential Revision: https://reviews.llvm.org/D115622
2022-03-22 12:14:00 +01:00
Simon Moll 7de383c892 [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select
VPIntrinsic::getStaticVectorLength infers the operational vector length
of a VPIntrinsic instance from a type that is used with the intrinsic.
The function used the mask operand before. Yet, vp.merge|select do not
have a mask operand (in the predicating sense that the other VP
intrinsics are using them - it is a selection mask for them). Fallback
to the return type to fix this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121913
2022-03-22 11:41:23 +01:00
Zakk Chen 9ab18cc535 [RISCV] Add policy operand for masked vid and viota IR intrinsics.
Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120227
2022-03-22 02:32:31 -07:00
serge-sans-paille f1985a3f85 Cleanup includes: Transforms/IPO
Preprocessor output diff: -238205 lines
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122183
2022-03-22 10:06:28 +01:00
Zakk Chen abb5a985e9 [RISCV] Support mask policy for RVV IR intrinsics.
Add the UsesMaskPolicy flag to indicate the operations result
would be effected by the mask policy. (ex. mask operations).

It means RISCVInsertVSETVLI should decide the mask policy according
by mask policy operand or passthru operand.
If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to
MA, otherwise to MU to keep the current mask policy would not be changed
for unmasked operations.

Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most of implementations are using
the same pseudo multiclass. Some tests maybe be duplicated in different
tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu)
I think having different tests only for policy would make the testing
clear.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120226
2022-03-22 01:19:16 -07:00
Arthur Eubanks 2362c4ecdc Revert "Revert "[OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files""
This reverts commit 9c96a6bbfd.

Issues were already fixed at head.
2022-03-21 17:24:56 -07:00
Mitch Phillips 9c96a6bbfd Revert "[OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files"
This reverts commit 295172ef51.

Reason: Broke the ASan buildbot. More details are available on the
original Phab review at https://reviews.llvm.org/D119482.
2022-03-21 16:04:36 -07:00
Mitch Phillips f4b794427e Revert "[memprof] Store callsite metadata with memprof records."
This reverts commit 0d362c90d3.

Reason: Causes the MSan buildbot to fail (see comments on
https://reviews.llvm.org/D121179 for more information
2022-03-21 15:59:13 -07:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Snehasish Kumar 0d362c90d3 [memprof] Store callsite metadata with memprof records.
To ease profile annotation, each of the callsites in a function can be
annotated with profile data - "IR metadata format for MemProf" [1]. This
patch extends the on-disk serialized record format to store the debug
information for allocation callsites incl inline frames. This change is
incompatible with the existing format i.e. indexed profiles must be
regenerated, raw profiles are unaffected.

[1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D121179
2022-03-21 13:58:29 -07:00
Mehdi Amini 734b8eadd7 Adjust `llvm_unreachable` macro to account for platforms that don't define LLVM_BUILTIN_UNREACHABLE
Post 892c104fb7, LLVM_BUILTIN_UNREACHABLE may not be defined anymore.
Also when LLVM_UNREACHABLE_OPTIMIZE is OFF, emit LLVM_BUILTIN_UNREACHABLE
after LLVM_BUILTIN_TRAP to ensure that diagnostics are suppressed on
environments where LLVM_BUILTIN_TRAP is not marked as noreturn.

Differential Revision: https://reviews.llvm.org/D122170
2022-03-21 20:50:07 +00:00
Duncan P. N. Exon Smith 892c104fb7 Compiler: Remove empty fallback definition for LLVM_BUILTIN_UNREACHABLE
`llvm_unreachable()` and `LLVM_ASSUME_ALIGNED` use
`defined(LLVM_BUILTIN_UNREACHABLE)` to check whether it has a
definition. Remove the fallback added in 26827337df (as a drive-by
when updating the GCC logic) and add a comment to prevent future
mistakes.

Differential Revision: https://reviews.llvm.org/D122167
2022-03-21 11:52:24 -07:00
Snehasish Kumar 5cfb110090 [DebugInfo][NFC] Add a comment on the ordering of DILineInfo frames.
Add a comment to getFrame() to mention that frames are stored in
bottom-up, i.e. leaf to root in order of increasing index.

Differential Revision: https://reviews.llvm.org/D122033
2022-03-21 10:41:05 -07:00
Daniel Thornburgh 7917b3c695 [Debuginfod] Don't depend on Content-Length.
The present implementation of debuginfod lookups requires the
Content-Length field to be populated in the HTTP server response.
Unfortunately, Content-Length is optional, and there are some real
scenarios where it's missing. (For example, a Google Cloud Storage
server doing on-the-fly gunzipping.)

This changes the debuginfod response handler to directly stream the
output to the cache file as it is received. In addition to allowing
lookups to proceed without a Content-Lenght, it seems somewhat more
straightforward to implement, and it allows the disk I/O to be
interleaved with the network I/O.

Reviewed By: noajshu

Differential Revision: https://reviews.llvm.org/D121720
2022-03-21 17:27:45 +00:00
Philip Reames ee7324b898 Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc] 2022-03-21 10:01:40 -07:00
Craig Topper 800ac15dcc [VP] Make VectorBuilder take IRBuilderBase instead of IRBuilder<>
IRBuilder<> assumes specific values for template parameters.
IRBuilderBase abstracts away the template.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D122032
2022-03-21 09:17:37 -07:00
serge-sans-paille d8e0a6d5e9 [LowerConstantIntrinsics] Support phi operand in __builtin_object_size folder
The implementation is just a generalization of the Select handler.
We're no trying to be smart and compute any kind of fixed point.

Differential Revision: https://reviews.llvm.org/D121897
2022-03-21 11:30:50 +01:00
Marek Kurdej df4da5f37d [ADT] Add drop_end.
This patch adds drop_end that is analogical to drop_begin.
It tries to fill the functional gap where one could drop first elements but not the last ones.
The need for it came in when refactoring clang-format.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122009
2022-03-21 09:43:19 +01:00
Chris Bieneman 95871187bf Add DXIL triple
This patch adds triple support for:

* dxil architecture
* shadermodel OS (with version parsing)
* shader stages as environment

Reviewed By: MaskRay, pete

Differential Revision: https://reviews.llvm.org/D122031
2022-03-19 00:17:43 -05:00
Arthur Eubanks ddc702376a [NewPM] Don't skip SCCs not in current RefSCC
With D107249 I saw huge compile time regressions on a module (150s ->
5700s). This turned out to be due to a huge RefSCC in
the module. As we ran the function simplification pipeline on functions
in the SCCs in the RefSCC, some of those SCCs would be split out to
their RefSCC, a child of the current RefSCC. We'd skip the remaining
SCCs in the huge RefSCC because the current RefSCC is now the RefSCC
just split out, then revisit the original huge RefSCC from the
beginning.  This happened many times because many functions in the
RefSCC were optimizable to the point of becoming their own RefSCC.

This patch makes it so we don't skip SCCs not in the current RefSCC so
that we split out all the child RefSCCs on the first iteration of
RefSCC. When we split out a RefSCC, we invalidate the original RefSCC
and add the remainder of the SCCs into a new RefSCC in
RCWorklist. This happens repeatedly until we finish visiting all
SCCs, at which point there is only one valid RefSCC in
RCWorklist from the original RefSCC containing all the SCCs that
were not split out, and we visit that.

For example, in the newly added test cgscc-refscc-mutation-order.ll,
we'd previously run instcombine in this order:
f1, f2, f1, f3, f1, f4, f1

Now it's:
f1, f2, f3, f4, f1

This can cause more passes to be run in some specific cases,
e.g. if f1<->f2 gets optimized to f1<-f2, we'd previously run f1, f2;
now we run f1, f2, f2.

This improves kimwitu++ compile times by a lot (12-15% for various -O3 configs):
https://llvm-compile-time-tracker.com/compare.php?from=2371c5a0e06d22b48da0427cebaf53a5e5c54635&to=00908f1d67400cab1ad7bcd7cacc7558d1672e97&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D121953
2022-03-18 14:16:29 -07:00
Mike Rice 6bd8dc91b8 [OpenMP] Initial parsing/sema for the 'omp target teams loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target teams loop directive.

Differential Revision: https://reviews.llvm.org/D122028
2022-03-18 13:48:32 -07:00
Stanislav Mekhanoshin 0a79e1f30a [AMDGPU] reuse blgp as neg in 2 mfma operations on gfx940
GFX940 repurposes BLGP as NEG only in DGEMM MFMA.

Differential Revision: https://reviews.llvm.org/D121745
2022-03-18 12:56:51 -07:00
Mehdi Amini 7b983917d4 Add a cmake flag to turn `llvm_unreachable()` into builtin_trap() when assertions are disabled
This re-lands 6316129e06 after fixing the condition logic.

The new flag seems to not be working yet on Windows, where the builtin
trap isn't "no return".

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121750
2022-03-18 19:24:14 +00:00
Philip Reames 8f108c32bc Revert "[SLP] Optionally preserve MemorySSA"
This reverts commit 1cfa986d68.  See https://github.com/llvm/llvm-project/issues/54256 for why I'm discontinuing the project.

Seperately, it turns out that while this patch does correctly preserve MSSA, it's correct only at the end of the pass; not between vectorization attempts.  Even if we decide to resurrect this, we'll need to fix that before reapplying.
2022-03-18 10:45:59 -07:00
Florian Hahn 5ab421fb4e
[LICM] Add allowspeculation pass options.
This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.

This allows reproducing #54023 using opt.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D121944
2022-03-18 16:51:57 +00:00
Nikita Popov 6ffb3ad631 [SCEV] Use constant ranges when determining reachable blocks (PR54434)
This avoids false positive verification failures if the condition
is not literally true/false, but SCEV still makes use of the fact
that a loop is not reachable through more complex reasoning.

Fixes https://github.com/llvm/llvm-project/issues/54434.
2022-03-18 12:04:35 +01:00
Nikita Popov f96428e16d [MemorySSA] Don't optimize uses during construction
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.

This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.

Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).

Differential Revision: https://reviews.llvm.org/D121381
2022-03-18 09:56:16 +01:00
Nikita Popov 112aafcaf4 Revert "Add a cmake flag to turn `llvm_unreachable()` into builtin_trap() when assertions are disabled"
This reverts commit 6316129e06.

This was implemented with inverted logic.
2022-03-18 09:21:53 +01:00
Pavel Labath ab25757522 Remove a top-level "using namespace" directive from LegalizationArtifactCombiner.h
The directive pollutes the namespace of all files including that header.
Move the directive into the bodies of functions that need it instead.
2022-03-18 08:58:30 +01:00
Shengchen Kan 9e832a67fe [Codegen][tablgen][NFC] Allow meta instruction to be target dependent
An instruction is a meta-instruction if it doesn't produce any output
in the form of executable instructions. So in the concept, a
meta-instruction does not have to be target independent.

Before this patch, `isMetaInstruction` is implemented by checking the
opcode of the instruction, add we have no way to add target dependent
opcode to the list, which does not make sense.

After this patch, a bit `isMeta` is added for class `Instruction` in
tablegen, which is used to indicate whether it's a meta instruction.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D121600
2022-03-18 13:09:01 +08:00
Vasileios Porpodas 9136145eb0 Revert "[SLP] Fix lookahead operand reordering for splat loads." due to build failures
This reverts commit 5efa78985b.
2022-03-17 18:22:04 -07:00
Vasileios Porpodas 5efa78985b [SLP] Fix lookahead operand reordering for splat loads.
Splat loads are inexpensive in X86. For a 2-lane vector we need just one
instruction: `movddup (%reg), xmm0`. Using the standard Splat score leads
to worse code. This patch adds a new score dedicated for splat loads.

Please note that a splat is usually three IR instructions:
- It is usually a load and 2 inserts:
 %ld = load double, double* %gep
 %ins1 = insertelement <2 x double> poison, double %ld, i32 0
 %ins2 = insertelement <2 x double> %ins1, double %ld, i32 1

- But it can also be a load, an insert and a shuffle:
 %ld = load double, double* %gep
 %ins = insertelement <2 x double> poison, double %ld, i32 0
 %shf = shufflevector <2 x double> %ins, <2 x double> poison, <2 x i32> zeroinitializer

Because of this some of the lit tests contain more IR instructions.

Differential Revision: https://reviews.llvm.org/D121354
2022-03-17 18:05:54 -07:00
Benjamin Kramer 5d2ce7663b Use llvm::append_range instead of push_back loops where applicable. NFCI. 2022-03-18 01:25:34 +01:00
Paul Kirth 964398ccb1 Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a.
2022-03-18 00:21:33 +00:00
Paul Kirth 6cf560d69a Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.

This reverts commit 10866a1df4.
2022-03-18 00:04:22 +00:00
Paul Kirth 10866a1df4 Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713.
2022-03-17 23:54:26 +00:00
Paul Kirth e7749d4713 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Differential Revision: https://reviews.llvm.org/D115907
2022-03-17 23:46:23 +00:00
Mehdi Amini 6316129e06 Add a cmake flag to turn `llvm_unreachable()` into builtin_trap() when assertions are disabled
Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121750
2022-03-17 22:21:14 +00:00
Ben Barham 4125524112 [VFS] Add print/dump to the whole FileSystem hierarchy
For now most are implemented by printing out the name of the filesystem,
but this can be expanded in the future. Only `OverlayFileSystem` and
`RedirectingFileSystem` are properly implemented in this patch.
  - `OverlayFileSystem`: Prints each filesystem in the order that any
    operations are actually run on them. Optionally prints recursively.
  - `RedirectingFileSystem`: Prints out all mappings, as well as the
    `ExternalFS`. Most of this was already implemented other than the
    handling for the `DirectoryRemap` case and to actually print out the
    mapping.

Each FS should implement `printImpl` rather than `print`, where the
latter just fowards to the former. This is to avoid spreading the
default arguments through to the subclasses (where we may miss updating
in the future).

Differential Revision: https://reviews.llvm.org/D121421
2022-03-17 13:02:40 -07:00
Julian Lettner 22570bac69 Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-17 10:47:13 -07:00
Augusto Noronha 9b3af5e7b7 [dsymutil] Apply relocations present in Swift reflection sections
The strippable Swift reflection sections contain subtractor relocations
that need to be applied. There are two situations we need to support.
1) Both symbols used in the relocation come from the .o file (for
   example, one symbol lives in __swift5_fieldmd and the second in
   __swift5_reflstr).
2) One symbol comes from th .o file and the second from the main
   binary (for example, __swift5_fieldmd and __swift5_typeref).

Differential Revision: https://reviews.llvm.org/D120574
2022-03-17 14:23:20 -03:00
Arthur Eubanks 295172ef51 [OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files
This allows us to not have to specify -opaque-pointers when updating
IR tests from typed pointers to opaque pointers.

We detect opaque pointers in .ll files by looking for relevant tokens,
either "ptr" or "*".

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D119482
2022-03-17 08:37:18 -07:00
Marco Elver cbe1e67ead [Instruction] Introduce getAtomicSyncScopeID()
An analysis may just be interested in checking if an instruction is
atomic but system scoped or single-thread scoped, like ThreadSanitizer's
isAtomic(). Unfortunately Instruction::isAtomic() can only answer the
"atomic" part of the question, but to also check scope becomes rather
verbose.

To simplify and reduce redundancy, introduce a common helper
getAtomicSyncScopeID() which returns the scope of an atomic operation.
Start using it in ThreadSanitizer.

NFCI.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D121910
2022-03-17 14:59:37 +01:00
Jay Foad a3a4591856 [LegacyPassManager] Move structural hashing into Pass classes. NFC.
Move structural hashing into virtual methods on Pass. This will
allow MachineFunctionPass to override the method to add hashing of
the MachineFunction.

Differential Revision: https://reviews.llvm.org/D120123
2022-03-17 09:51:12 +00:00
Heejin Ahn b8038a916d [WebAssembly] Disable SimplifyDemandedVectorElts after legalization
This fixes a reported bug that caused an infinite loop during the
SelectionDAG optimization phase in ISel, by creating an overridable hook
in `TargetLowering` that allows us to bail out from running
`SimplifyDemandedVectorElts`.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D121869
2022-03-16 20:52:43 -07:00
Mike Rice 79f661edc1 [OpenMP] Initial parsing/sema for the 'omp teams loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp teams loop
directive.

Differential Revision: https://reviews.llvm.org/D121713
2022-03-16 14:39:18 -07:00
Thomas Lively 7e8913d775 [WebAssembly] Fix names of SIMD instructions containing '_zero'
Fix the instruction names to match the WebAssembly spec:

 - `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
 - `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`

Also rename related things like intrinsics, builtins, and test functions to
match.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D121661
2022-03-16 13:34:57 -07:00
Marco Elver 555df03012 [SelectionDAG][NFC] Clean up SDCallSiteDbgInfo accessors
* Consistent naming: addCallSiteInfo vs. getCallSiteInfo;
* Use ternary operator to reduce verbosity;
* const'ify getters;
* Add comments;

NFCI.

Differential Revision: https://reviews.llvm.org/D121820
2022-03-16 17:46:06 +01:00
Daniel Thornburgh 9990395859 [Symbolize] Fix overflow warning on 32-bit hosts.
The inserted cast is a no-op.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D121752
2022-03-16 16:44:36 +00:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
Matthias Gehre 09854f2af3 [SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers
Emit calls to __divei4 and friends for divison/remainder of large integers.

This fixes https://github.com/llvm/llvm-project/issues/44994.

The overall RFC is in https://discourse.llvm.org/t/rfc-add-support-for-division-of-large-bitint-builtins-selectiondag-globalisel-clang/60329

The compiler-rt part is in https://reviews.llvm.org/D120327

Differential Revision: https://reviews.llvm.org/D120329
2022-03-16 09:36:28 +00:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Philip Reames 1cfa986d68 [SLP] Optionally preserve MemorySSA
This initial patch adds code to preserve MemorySSA through a run of SLP vectorizer. The eventual plan is to use MemorySSA to accelerate SLP's memory dependence checking, but we're a ways from that.  In particular, this patch is correct, but really slow. It's being landed so that we can work incrementally in tree, not because it's expected to be useful to anyone just yet.

The broader effort is being tracked in https://github.com/llvm/llvm-project/issues/54256.  Its worth noting expicitly that this may not work out, and if not, we will be reverting all of the MSSA support in SLP at some point in the next few weeks.

Differential Revision: https://reviews.llvm.org/D117926
2022-03-15 16:36:15 -07:00
Florian Hahn 014f5bcf7a
[FunctionAttrs] Replace MemoryAccessKind with FMRB.
Update FunctionAttrs to use FunctionModRefBehavior instead
MemoryAccessKind.

This allows for adding support for inferring argmemonly and others,
see D121415.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121460
2022-03-15 19:35:54 +00:00
Shubham Sandeep Rastogi d46409fc8e Move DWARFRecordSectionSplitter code to its own file
With 229d576b31 the class EHFrameSplitter was renamed to DWARFRecordSectionSplitter. This change merely moves it to it's own .cpp/.h file

Differential Revision: https://reviews.llvm.org/D121721
2022-03-15 11:38:25 -07:00
Simon Pilgrim 7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Pavel Labath 991dc4b4e0 Remove a top-level "using namespace" in TargetTransformInfoImpl.h
Avoids polluting the namespace of all files including the header.
2022-03-15 13:49:20 +01:00
Dmitry Makogon 361034ba78 [NFC] Add LazyValueInfo::clear method
This method just calls LazyValueInfoImpl::clear
2022-03-15 17:52:50 +07:00
River Riddle 1d7120c69a [mlir] Split out AttrDef/TypeDef and pattern constructs from OpBase.td
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.

Differential Revision: https://reviews.llvm.org/D118636
2022-03-15 00:18:03 -07:00
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Joseph Huber 24ebdb6c25 [CUDA] Add CUDA fatbinary magic
Nvidia uses fatbinaries to bundle all of their device code. This patch
adds the magic number "0x50ed55ba" used in their propeitary format to
the list of magic identifies. This is technically undocumented and could
unlikely be changed by Nvidia in the future.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120932
2022-03-14 20:08:31 -04:00
Andrew Litteken 228cc2c38b [IROutliner] Ensure merged PHINodes respect order and incoming blocks, not just incoming values
When matching PHINodes when margining functions the IROutliner only checks that an incoming value exists in phi node in overall function. It doesn't check the length, the order, or that the incoming block also matches. In the given example, we see that both phi nodes have the same incoming values, but from different blocks.

The fix is to to enforce stricter a match of the incoming value, and the incoming block as well when matching the created phi nodes.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D121310
2022-03-14 16:48:21 -05:00
Nick Desaulniers 236695e70c [IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC
2 of the 3 callsite of IRMover::move() pass empty lambda functions. Just
make this parameter llvm::unique_function.

Came about via discussion in D120781. Probably worth making this change
regardless of the resolution of D120781.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121630
2022-03-14 14:37:34 -07:00
Ben Barham 3466b8e23d [Support] Add const to `FileError::getFileName`
`getFileName` returns a `StringRef`, there's no reason it shouldn't be
const.

Differential Revision: https://reviews.llvm.org/D121495
2022-03-14 11:45:29 -07:00
Ben Barham cc63ae42d7 [VFS] Rename `RedirectingFileSystem::dump` to `print`
The rest of LLVM uses `print` for the method taking the `raw_ostream`
and `dump` only for the method with no parameters. Use the same for
`RedirectingFileSystem`.

Differential Revision: https://reviews.llvm.org/D121494
2022-03-14 11:44:07 -07:00
Fangrui Song 407c721ceb [Support] Change zlib::compress to return void
With a sufficiently large output buffer, the only failure is Z_MEM_ERROR.
Check it and call the noreturn report_bad_alloc_error if applicable.
resize_for_overwrite may call report_bad_alloc_error as well.

Now that there is no other error type, we can replace the return type with void
and simplify call sites.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D121512
2022-03-14 11:38:04 -07:00
Mircea Trofin 294eca35a0 [regalloc] Remove -consider-local-interval-cost
Discussed extensively on D98232. The functionality introduced in D35816
never worked correctly. In D98232, it was fixed, but, as it was
introducing a large compile-time regression, and the value of the
original patch was called into doubt, we disabled it by default
everywhere. A year later, it appears that caused no grief, so it seems
safe to remove the disabled code.

This should be accompanied by re-opening bug 26810.

Differential Revision: https://reviews.llvm.org/D121128
2022-03-14 10:49:16 -07:00
Teresa Johnson fee0bde4c6 [WPD] Extend checking mode to support fallback to indirect call
Extend -wholeprogramdevirt-check to support both the existing
trapping mode on an incorrect devirtualization, as well as a new
mode to fallback to an indirect call on a mismatch. The new mode is

The new mode is useful in cases where we want to enable
devirtualization but cannot fully guarantee whole program visibility
(e.g in the case where LTO has been disabled for a small set of objects
that could potentially override virtual methods without having a symbol
reference to anything in the base class including the vtable).

Remove !prof and !callees metadata (which are used by indirect call
promotion) from both the new direct call and the fallback indirect call
(so that we don't perform another round of promotion on the latter).
Also remove it from the direct call in the non-fallback cases, which
was an oversight, although it didn't seem to cause any issues. Add tests
for the metadata removal covering the various cases.

Differential Revision: https://reviews.llvm.org/D121419
2022-03-14 10:16:28 -07:00
Jonas Devlieghere 8550c1f328
[llvm] Fix warning: missing submodule 'LLVM_Analysis.ScalarFuncs' 2022-03-14 10:11:52 -07:00
Arthur Eubanks 4fc7c55fff [NewPM] Actually recompute GlobalsAA before module optimization pipeline
RequireAnalysis<GlobalsAA> doesn't actually recompute GlobalsAA.
GlobalsAA isn't invalidated (unless specifically invalidated) because
it's self-updating via ValueHandles, but can be imprecise during the
self-updates.

Rather than invalidating GlobalsAA, which would invalidate AAManager and
any analyses that use AAManager, create a new pass that recomputes
GlobalsAA.

Fixes #53131.

Differential Revision: https://reviews.llvm.org/D121167
2022-03-14 09:42:34 -07:00
Jonas Devlieghere f51d7e4bae
Fix the implicit module build
This fixes the implicit module build after b1b4b6f366 broke the LLDB
build: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/42084/
2022-03-14 09:24:17 -07:00
Sam Clegg 20f7f733fe [WebAssembly] Rename member in WasmYAML.h to avoid compiler warning
Followup/fix for https://reviews.llvm.org/D121349.
2022-03-14 09:09:43 -07:00
Sam Clegg 9504ab32b7 [WebAssembly] Second phase of implemented extended const proposal
This change continues to lay the ground work for supporting extended
const expressions in the linker.

The included test covers object file reading and writing and the YAML
representation.

Differential Revision: https://reviews.llvm.org/D121349
2022-03-14 08:55:47 -07:00
Aaron Ballman 8cba72177d Implement literal suffixes for _BitInt
WG14 adopted N2775 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf)
at our Feb 2022 meeting. This paper adds a literal suffix for
bit-precise types that automatically sizes the bit-precise type to be
the smallest possible legal _BitInt type that can represent the literal
value. The suffix chosen is wb (for a signed bit-precise type) which
can be combined with the u suffix (for an unsigned bit-precise type).

The preprocessor continues to operate as-if all integer types were
intmax_t/uintmax_t, including bit-precise integer types. It is a
constraint violation if the bit-precise literal is too large to fit
within that type in the context of the preprocessor (when still using
a pp-number preprocessing token), but it is not a constraint violation
in other circumstances. This allows you to make bit-precise integer
literals that are wider than what the preprocessor currently supports
in order to initialize variables, etc.
2022-03-14 09:24:19 -04:00
Erich Keane dc152659b4 Have cpu-specific variants set 'tune-cpu' as an optimization hint
Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).

If this is not done, there is two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus
features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the event that the user spelled VendorI's processor, and the
feature
list allows it to run on VendorA's processor of similar features, AND
there
is an optimization opportunity for VendorIs that negatively affects
"A"s,
the optimizer will likely choose to do so. This can be fixed by adding
an
alias to X86TargetParser.def.

Differential Revision: https://reviews.llvm.org/D121410
2022-03-14 06:14:30 -07:00
Benoit Jacob 9879c555f2 Expose ScalarizerPass options to C++ (not just commandline)
Context: I needed this for https://github.com/google/iree/pull/8474 .
I found that TSan instrumentation expects vector sizes to be <= 16,
and in my project (IREE) we have tests with higher vector sizes.
That left some test functions uninstrumented, resulting in crashes as
instrumented code called into them.

Differential Revision: https://reviews.llvm.org/D121182
2022-03-14 12:00:35 +01:00
Kazushi (Jam) Marukawa 9260592141 [VE] Support more intrinsics
Support new intrinsics for following instrauctions.
  - VLDZ, VPCNT, VBRV
  - LCR, SCR, TSCR, FIDCR
  - FENCE
Also clean the intrinsics implementation of a following instruction.
  - SVOB

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121509
2022-03-14 19:17:15 +09:00
David Sherwood e7b89c2fc3 Add BasicTTIImpl cost model for llvm.get.active.lane.mask intrinsic
The vectoriser sometimes generates predicated vector loops using
the llvm.get.active.lane.mask intrinsic so it's important that we
are able to calculate a valid cost for the call instruction. When
SVE is enabled we are able to use a single whilelo instruction
for some vector types - in such cases I've marked the cost as 1.
For all other cases I've set the cost according to how the intrinsic
will be expanded.

Tests added here:

  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Analysis/CostModel/ARM/active_lane_mask.ll
  Analysis/CostModel/RISCV/active_lane_mask.ll

Differential Revision: https://reviews.llvm.org/D121109
2022-03-14 09:35:05 +00:00
sstwcw 65a3712af6 [yamlio] Allow parsing an entire mapping as an enumeration
For when we want to change a configuration option from an enum into a
struct.  The need arose when working on D119599.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D120363
2022-03-14 04:41:40 +00:00
Patrick Holland 55cedf9cc5 [MCA] Moved six instruction flags from InstrDesc to InstructionBase.
Differential Revision: https://reviews.llvm.org/D121508
2022-03-13 21:21:05 -07:00
Andrew Litteken 1643f01232 [IRSim][IROutliner] Ignoring Musttail Function
Musttail calls require extra handling to properly propagate the calling convention information and tail call information. The outliner does not currently do this, so we ignore call instructions that utilize the swifttailcc and tailcc calling convention as well as functions marked with the attribute musttail.

Reviewers: paquette, aschwaighofer

Differential Revision: https://reviews.llvm.org/D120733
2022-03-13 19:27:25 -05:00
Andrew Litteken 66f90fdff1 Revert "[IRSim][IROutliner] Ignoring Musttail Function"
This reverts commit c7037c7257.

Pushed too soon
2022-03-13 19:26:51 -05:00
Andrew Litteken c7037c7257 [IRSim][IROutliner] Ignoring Musttail Function 2022-03-13 18:57:24 -05:00
Austin Kerbow 62bcfcb5a5 [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic
Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D120976
2022-03-12 22:15:42 -08:00
serge-sans-paille 0467eb2cb7 Replace forward declaration by actual declaration of MemoryBuffer in Object/Binary.h
This is a partial undo of ed98c1b376, see
https://lab.llvm.org/buildbot#builders/37/builds/11529
for the actual error.
2022-03-12 21:53:14 +01:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Johannes Doerfert d6e09ce86f [CaptureTracking][NFCI] Expose capture tracking logic
The logic exposed by this patch via `llvm::DetermineUseCaptureKind` was
part of `llvm::PointerMayBeCaptured`. In the Attributor we want to keep
track of the work list items but still reuse the logic if a use might
capture a value. A follow up for the Attributor removes ~100 lines of
code and complexity while making future handling of simplified values
possible.

Differential Revision: https://reviews.llvm.org/D121272
2022-03-11 22:56:16 -06:00
Fangrui Song 689c3a2552 [MC] Fix letter case of some MCSection member functions 2022-03-11 20:07:00 -08:00
Julian Lettner 46626bc873 [NFC] Improve comment and rename test file 2022-03-11 14:47:54 -08:00
Johannes Doerfert e92891f864 [Attributor] Allow not to default initialize AAs for live internal functions
Outside users of the Attributor, e.g., OpenMP-opt, want to seed AAs
themselves. We should not seed all default AAs one an internal function
becomes live. That said, there should be a callback such that they can
do lazy seeding as well.

Differential Revision: https://reviews.llvm.org/D121489
2022-03-11 16:46:03 -06:00
Johannes Doerfert f3ad8cf00e [Attributor] Cleanup manifest and liveness for CGSCC passes
There was some ad-hoc handling of liveness and manifest to avoid
breaking CGSCC guarantees. Things always slipped through though.
This cleanup will:

1) Prevent us from manifesting any "information" outside the CGSCC.
   This might be too conservative but we need to opt-in to annotation
   not try to avoid some problematic ones.
2) Avoid running any liveness analysis outside the CGSCC. We did have
   some AAIsDeadFunction handling to this end but we need this for all
   AAIsDead classes. The reason is that AAIsDead information is only
   correct if we actually manifest it, since we don't (see point 1) we
   cannot actually derive/use it at all. We are currently trying to
   avoid running any AA updates outside the CGSCC but that seems to
   impact things quite a bit.
3) Assert, don't check, that our modifications (during cleanup) modifies
   only CGSCC functions.
2022-03-11 16:46:02 -06:00
Benjamin Kramer 3b15a7c042 [BFI] Use SmallPtrSets. NFCI. 2022-03-11 15:20:57 +01:00
serge-sans-paille fbbc41f8dd Cleanup include: TableGen
This also includes a few cleanup from Support.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121331
2022-03-11 11:41:32 +01:00
Pavel Labath 161bddf3af [ADT] Make BitmaskEnum operations constant expressions
This avoids runtime initialization (a global constructor) whenever they appear
in the initializer.

The patch just adds the constexpr keyword to a couple of functions.

Differential Revision: https://reviews.llvm.org/D121281
2022-03-11 11:11:55 +01:00
Nikita Popov dcc4b94d94 [llvm-c] Document that LLVMGetElementType on pointers is deprecated (NFC)
We can't actually deprecate the function, because it is also used
for arrays and vectors, so we can only document this.
2022-03-11 09:28:18 +01:00
Yevgeny Rouban c5f34d1692 [CommandLine] Keep option default value unset if no cl::init() is used
Current declaration of cl::opt is incoherent between class and non-class
specializations of the opt_storage template. There is an inconsistency
in the initialization of the Default field: for inClass instances
the default constructor is used - it sets the Optional Default field to
None; though for non-inClass instances the Default field is set to
the type's default value. For non-inClass instances it is impossible
to know if the option is defined with cl::init() initializer or not:

cl::opt<int> i1("option-i1");
cl::opt<int> i2("option-i2", cl::init(0));
cl::opt<std::string> s1("option-s1");
cl::opt<std::string> s2("option-s2", cl::init(""));

assert(s1.Default.hasValue() != s2.Default.hasValue()); // Ok
assert(i1.Default.hasValue() != i2.Default.hasValue()); // Fails

This patch changes constructor of the non-class specializations to keep
the Default field unset (that is None) rather than initialize it with
DataType().

Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D114645
2022-03-11 14:24:25 +07:00
Johannes Doerfert 7211dbd01d [Attributor][NFCI] Remove non-deterministic behavior and debug output 2022-03-10 23:27:47 -06:00
Lorenzo Albano 28cfa764c2 [VP] Strided loads/stores
This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114884
2022-03-10 18:46:54 +01:00
serge-sans-paille f06d487dd6 Cleanup includes: WindowsDriver & WindowsManifest
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121330
2022-03-10 17:19:06 +01:00
serge-sans-paille 2fccde0ea7 Cleanup includes: MCDisassembler
Some extra minor cleanup.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121329
2022-03-10 14:43:29 +01:00
Nikita Popov 3c47dd47a4 [FuzzMutate] Support opaque pointers
Avoid checks that are irrelevant for opaque pointers, and pick
load/GEP types independently of the pointer type.

The GEP case at least could be done more efficiently by directly
generating a type, but this would require some significant API
changes.
2022-03-10 14:36:20 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 8246ec242a Add missing include in llvm/CodeGen/CodeGenPassBuilder.h
As a follow-up to 7f230feeea
2022-03-10 10:05:25 +01:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
serge-sans-paille 3c4410dfca Cleanup includes: LLVMTarget
Most notably, Pass.h is no longer included by TargetMachine.h
before: 1063570306
after:  1063332844

Differential Revision: https://reviews.llvm.org/D121168
2022-03-10 10:00:29 +01:00
Luke 0803dba7dd [RISCV] Add fixed-length vector instrinsics for segment load
Inspired by reviews.llvm.org/D107790.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119834
2022-03-10 16:23:40 +08:00
Xiang1 Zhang c31014322c TLS loads opimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
2022-03-10 09:29:06 +08:00
Yevgeny Rouban fcd9fa416d [Support] Try 2: Reset option to its default if its Default field is undefined
opt::setDefaultImpl() is changed to set the option value to the option
type's default if the Default field is not set. This results in option
value reset by Option::reset() or ResetAllOptionOccurrences() even if
the cl::init() is not specified.

Example:
  StackOption<std::string> Str("str"); // No cl::init().
  Str = "some value";
  cl::ResetAllOptionOccurrences();
  EXPECT_EQ("", Str); // The Str is reset.

Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D115433
2022-03-10 08:26:34 +07:00
Florian Hahn f98125abb2
Revert "[PassManager] Add pretty stack entries before P->run() call."
This reverts commit 128745cc26.

This increased compile-time unnecessarily. Revert this change and follow
ups 2c7afadb47 & add0c5856d.

http://llvm-compile-time-tracker.com/compare.php?from=338dfcd60f843082bb589b287d890dbd9394eb82&to=128745cc2681c284bc6d0150a319673a6d6e8424&stat=instructions
2022-03-09 18:46:32 +00:00
Shraiysh Vaishay 7c385c4b2f [mlir][OpenMP] Generating enums in accordance with the guidelines
This patch changes the enums generated from `OMP.td` for MLIR according
to the enum naming guidelines in LLVM Coding Standards.

This also helps the issues we had with `static` being a C++ keyword and
also a value for the schedule clause.

Enumerator naming guidelines: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D120825
2022-03-09 20:10:45 +05:30
fourdim 16dc90cbe7 [JITLink][RISCV] Refactor range checking and alignment checking
This patch refactors the range checking function to make it compatible with all relocation types and supports range checking for R_RISCV_BRANCH. Moreover, it refactors the alignment check functions.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D117946
2022-03-09 22:13:57 +08:00
Florian Hahn 128745cc26
[PassManager] Add pretty stack entries before P->run() call.
This patch adds PrettyStackEntries before running passes. The entries
include the pass name and the IR unit the pass runs on.

The information is used the print additional information when a pass
crashes, including the name and a reference to the IR unit on which it
crashed. This is similar to the behavior of the legacy pass manager.

The improved stack trace now includes:

Stack dump:
0.	Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll
1.	Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll'
2.	Running pass 'LoopVectorizePass' on function '@a'

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D120993
2022-03-09 13:01:09 +00:00
Simon Pilgrim 95969c5dbd [IR] MatrixBuilder - CreateIndexAssumption - fix unused variable warning on NDEBUG builds 2022-03-09 12:02:53 +00:00
Florian Mayer e86bd32b71 [NFC] [HWASan] [MTE] Use function_ref over template. 2022-03-08 15:49:55 -08:00
spupyrev 81aedab7dd introducing some profi flags
Differential Revision: https://reviews.llvm.org/D120508
2022-03-08 12:35:15 -08:00
Lang Hames 151f809c55 [JITLink] Demote symbol scope to local during external-to-absolute conversion.
When an external symbol is converted to an absolute it should be demoted to
local scope so that the symbol does not become a new definition within this
LinkGraph.
2022-03-08 10:31:20 -08:00
eopXD 550b2eaaa6 [RISCV] Add combination crypto extensions in ISAInfo
The crypto extension have several shorthand extensions that don't consist of any extra instructions.
Take `zk` for example, while the extension would imply `zkn, zkr, zkt`. The 3 extensions should also
combine back into `zk` to maintain the canonical order in isa strings.

This patch addresses the above.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D119530
2022-03-08 09:52:38 -08:00
Philip Reames 8e06058bfe [MSSA] Add clarifying comment for isOptimized on MemoryUse [nfc] 2022-03-08 09:43:16 -08:00
Philip Reames 52f7578489 [MSSA] Add comments describing optimized uses for MemoryDefs [nfc]
As clarified by a recent email chain with Alina.
2022-03-08 09:39:11 -08:00
Hongtao Yu 50bc945a8f [CSSPGO][SCCIterator] Fix a non-determinism in scc_member_iterator
Previously we initialed the work queue with MST roots based on NodeInfoMap which is an unordered map. This could cause a non-determinism. I'm fixing this by initializing the queue based on SortedEdges.

I don't see any performance move with this change. However this helps debugging.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D120670
2022-03-08 09:08:47 -08:00
Arthur Eubanks 53e5e58670 [NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive
Introduce a new attribute "function-inline-cost-multiplier" which
multiplies the inline cost of a call site (or all calls to a callee) by
the multiplier.

When processing the list of calls created by inlining, check each call
to see if the new call's callee is in the same SCC as the original
callee. If so, set the "function-inline-cost-multiplier" attribute of
the new call site to double the original call site's attribute value.
This does not happen when the original call site is intra-SCC.

This is an alternative to D120584, which marks the call sites as
noinline.

Hopefully fixes PR45253.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D121084
2022-03-07 23:51:09 -08:00
Arthur Eubanks 79a1f3e7c6 [NFC] Cleanup StandardInstrumentations 2022-03-07 16:24:36 -08:00
Stanislav Mekhanoshin 932f628121 [AMDGPU] new gfx940 fp atomics
Differential Revision: https://reviews.llvm.org/D121028
2022-03-07 12:32:02 -08:00
Mitch Phillips 73df82572a [MTE] Add NT_ANDROID_TYPE_MEMTAG
This ELF note is aarch64 and Android-specific. It specifies to the
dynamic loader that specific work should be scheduled to enable MTE
protection of stack and heap regions.

Current synthesis of the ".note.android.memtag" ELF note is done in the
Android build system. We'd like to move that to the compiler, and this
is the first step.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D119381
2022-03-07 11:28:56 -08:00
Craig Topper 845bfcede1 [RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121037
2022-03-07 11:28:32 -08:00
Richard Howell 5917219438 [llvm] remove empty __LLVM segment in llvm-bitcode-strip
When running llvm-bitcode-strip we want to remove the __LLVM
segment as well as the __bundle section when there are no other
sections in the segment.

Differential Revision: https://reviews.llvm.org/D120927
2022-03-07 08:52:25 -08:00
Florian Hahn 542c335159
[ConstraintElimination] Remove dead variables when dropping constraints.
This patch extends ConstraintElimination to also remove dead variables
when removing a constraint. When a constraint is removed because it is
out of scope, all new variables added for this constraint can also be
removed.

This keeps the total size of the systems much smaller, because it
reduces the number of variables drastically.

It also fixes a bug where variables where removed incorrectly.

Fixes https://github.com/llvm/llvm-project/issues/54228
2022-03-07 09:04:07 +00:00
Simon Moll 5f62156762 [VP] Introducing VectorBuilder, the VP intrinsic builder
VectorBuilder wraps around an IRBuilder and
VectorBuilder::createVectorInstructions emits VP intrinsics as if they
were regular instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105283
2022-03-07 10:02:07 +01:00
Nikita Popov d1e880acaa [SCEV] Enable verification in LoopPM
Currently, we hardly ever actually run SCEV verification, even in
tests with -verify-scev. This is because the NewPM LPM does not
verify SCEV. The reason for this is that SCEV verification can
actually change the result of subsequent SCEV queries, which means
that you see different transformations depending on whether
verification is enabled or not.

To allow verification in the LPM, this limits verification to
BECounts that have actually been cached. It will not calculate
new BECounts.

BackedgeTakenInfo::getExact() is still not entirely readonly,
it still calls getUMinFromMismatchedTypes(). But I hope that this
is not problematic in the same way. (This could be avoided by
performing the umin in the other SCEV instance, but this would
require duplicating some of the code.)

Differential Revision: https://reviews.llvm.org/D120551
2022-03-07 09:46:20 +01:00
Johannes Doerfert 5af11ec34b [Attributor] Determine potentially loaded values through memory
We already look through memory to determine where a value that is stored
might pop up again (potential copies). This patch introduces the other
direction with similar logic. If a value is loaded, we can follow all
the accesses to the pointer (or better object) and try to determine what
value might have been stored.
2022-03-06 23:26:37 -06:00
Johannes Doerfert ad26e199ff [Attributor] Use CFG reasoning also for read accesses
With D106397 we used CFG reasoning to filter out writes that will not
interfere with a given load instruction. With this patch we use the
same logic (modulo the reversal in reachability check order) for store
instructions. As an example, we can now proof stores to shared memory
are dead if all the loads of the shared memory are not reachable from
them.
2022-03-06 23:26:22 -06:00
Qiu Chaofan b2497e5435 [PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for fnmsub operation:
one for float/double vector, they'll be transformed into IR operations;
one for float/double scalar, they'll generate corresponding intrinsics.

But for the vector version of builtin, the 3 op chain may be recognized
as expensive by some passes (like early cse). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015
2022-03-07 13:00:06 +08:00
Johannes Doerfert efedf70aa5 [Attributor][NFC] Expose helper with more generic interface
This simply makes the function argument of the
`Attributor::checkForAllInstructions` helper explicit so one can iterate
over instructions in other functions.
2022-03-06 19:59:23 -06:00
William S. Moses 87ec6f41bb [OpenMPIRBuilder] Allocate temporary at the correct block in a nested parallel
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested openmp parallel regions (writing with MLIR for ease)

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder,
the entirety of the function (at the LLVM level) is currently inlined with blocks marking the corresponding start and end of each
region.

```
entry:
  ...

parallel1:
  %a = ...
  ...

parallel2:
  use(%a)
  ...

endparallel2:
  ...

endparallel1:
  ...
```

When the allocation is inserted, it presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope to the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.

This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.

This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D121061
2022-03-06 18:34:25 -05:00
Benjamin Kramer 4d2669002e [YAML] Simplify code a bit. NFC. 2022-03-05 22:24:35 +01:00
Arthur Eubanks f909aed671 Revert "[SCEV] Infer ranges for SCC consisting of cycled Phis"
This reverts commit fc539b0004.

Causes miscompiles, see D110620.
2022-03-04 19:52:44 -08:00
Augie Fackler e1895a46dc OpenMP: add allocsize(0) attribute to __kmpc_alloc_shared
This is the second step in obviating two columns about allocation
functions in MemoryBuiltins.cpp.

Differential Revision: https://reviews.llvm.org/D119583
2022-03-04 16:26:03 -05:00
Augie Fackler dba73135c8 getAllocAlignment: respect allocalign attribute if present
As with allocsize(), we prefer the table data to attributes.

Differential Revision: https://reviews.llvm.org/D118263
2022-03-04 15:57:54 -05:00
Augie Fackler d664c4b73c Attributes: add a new allocalign attribute
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.

Differential Revision: https://reviews.llvm.org/D117921
2022-03-04 15:57:53 -05:00
Snehasish Kumar 11314f4059 [memprof] Filter out callstack frames which cannot be symbolized.
This patch filters out callstack frames which can't be symbolized or if
the frames belong to the runtime. Symbolization may not be possible if
debug information is unavailable or if the addresses are from a shared
library. For now we only support optimization of the main binary which
is statically linked to the compiler runtime.

Differential Revision: https://reviews.llvm.org/D120860
2022-03-04 11:10:08 -08:00
Augie Fackler 5e4c75db3b InstructionCombining: avoid eliding mismatched alloc/free pairs
Prior to this change LLVM would happily elide a call to any allocation
function and a call to any free function operating on the same unused
pointer. This can cause problems in some obscure cases, for example if
the body of operator::new can be inlined but the body of
operator::delete can't, as in this example from jyknight:

    #include <stdlib.h>
    #include <stdio.h>

    int allocs = 0;

    void *operator new(size_t n) {
        allocs++;
        void *mem = malloc(n);
        if (!mem) abort();
        return mem;
    }

    __attribute__((noinline)) void operator delete(void *mem) noexcept {
        allocs--;
        free(mem);
    }

    void deleteit(int*i) { delete i; }
    int main() {
        int*i = new int;
        deleteit(i);
        if (allocs != 0)
          printf("MEMORY LEAK! allocs: %d\n", allocs);
    }

This patch addresses the issue by introducing the concept of an
allocator function family and uses it to make sure that alloc/free
function pairs are only removed if they're in the same family.

Differential Revision: https://reviews.llvm.org/D117356
2022-03-04 10:41:10 -05:00
Nathan Sidwell 64221645a8 [demangler] Make OutputBuffer non-copyable
In addressing the buffer ownership API, I discovered a rogue member
function that returned by value rather than by reference. It clearly
intended to return by reference, but because the copy ctor wasn't
deleted this wasn't caught.

It is not necessary to make this a move-only type, although that would
be an alternative.

Reviewed By: bruno

Differential Revision: https://reviews.llvm.org/D120901
2022-03-04 04:43:37 -08:00
River Riddle 81f2f4dfb2 [PDLL] Add support for tablegen includes and importing ODS information
This commit adds support for processing tablegen include files, and importing
various information from ODS. This includes operations, attribute+type constraints,
attribute/operation/type interfaces, etc. This will allow for much more robust tooling,
and also allows for referencing ODS constructs directly within PDLL (imported interfaces
can be used as constraints, operation result names can be used for member access, etc).

Differential Revision: https://reviews.llvm.org/D119900
2022-03-03 16:14:03 -08:00
River Riddle e865fa7530 [TableGen] Add a library-based entry point for parsing td files
This commit adds a new `TableGenParseFile` entry point for tablegen
that parses an input buffer and invokes a callback function with
a record keeper (notably without an output buffer). This kind of entry
point is very useful for tablegen consuming tools that don't create
output, and want invoke tablegen multiple times. The current way
that we interact with tablegen is via relative includes to
TGParser(not great).

Differential Revision: https://reviews.llvm.org/D119899
2022-03-03 16:14:03 -08:00
Matt Arsenault 27712243ab Revert "Inliner: Correctly merge amdgpu-unsafe-fp-atomics attribute"
This reverts commit 169ebf03ab.

This was effectively rendering the attribute useless in the real
world, although this is still broken.
2022-03-03 15:25:32 -05:00
Snehasish Kumar dda7b74967 [memprof] Symbolize and cache stack frames.
Currently, symbolization of stack frames occurs on demand when the instrprof writer
iterates over all the records in the raw memprof reader. With this
change we symbolize and cache the frames immediately after reading the
raw profiles. For a large internal binary this results in a runtime
reduction of ~50% (2m -> 48s) when merging a memprof raw profile with a
raw instr profile to generate an indexed profile. This change also makes
it simpler in the future to generate additional calling context
metadata to attach to each memprof record.

Differential Revision: https://reviews.llvm.org/D120430
2022-03-03 11:00:37 -08:00
Craig Topper 608161225e [InstCombine][Analysis] Move getFCmpCode and getPredForFCmpCode to CmpInstAnalysis. NFC
The similar getICmpCode and getPredForICmpCode are already there.
This moves FP for consistency.

I think InstCombine is currently the only user of both.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120754
2022-03-03 09:33:24 -08:00
Paul Robinson 7b85f0f32f [PS4] isPS4 and isPS4CPU are not meaningfully different 2022-03-03 11:36:59 -05:00
Alexandros Lamprineas 910eb988eb [FuncSpec][NFC] Refactor internal structures.
`ArgInfo` is reduced to only contain a pair of {formal,actual} values.
The specialized function `Fn` and the `Partial` flag are redundant in
this structure. The `Gain` is moved to a new struct `SpecializationInfo`.

The value mappings created by cloneCandidateFunction() are being used
by rewriteCallSites() for matching the formal arguments of recursive
functions.

The list of specializations is passed by reference to calculateGains()
instead of being returned by value.

The `IsPartial` flag is removed from isArgumentInteresting() and
getPossibleConstants() as it's no longer used anywhere in the code.

Differential Revision: https://reviews.llvm.org/D120753
2022-03-03 13:08:13 +00:00
Simon Moll 8de8731591 Revert "[VP] Introducing VectorBuilder, the VP intrinsic builder"
This reverts commit 8bcbfb50e8.

Taking this patch offline to fix breakage: https://lab.llvm.org/buildbot/#/builders/110/builds/10912
2022-03-03 13:34:37 +01:00
Simon Moll 8bcbfb50e8 [VP] Introducing VectorBuilder, the VP intrinsic builder
VectorBuilder wraps around an IRBuilder and
VectorBuilder::createVectorInstructions emits VP intrinsics as if they
were regular instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105283
2022-03-03 11:31:57 +01:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Aakanksha 840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin 2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Pavel Labath e8784289c0 Revert "Remove a top-level "using namespace" from TargetTransformInfoImpl.h"
Causing failures on many bots.

This reverts commit 31efecfde9.
2022-03-02 15:47:41 +01:00
Pavel Labath 31efecfde9 Remove a top-level "using namespace" from TargetTransformInfoImpl.h
Move it into the implementation of the function that needs it.

Avoids polluting the namespace of all files including the header.
2022-03-02 15:38:20 +01:00
Pavel Labath 11511e9357 Remove "using namespace llvm" from ReleaseModeModelRunner.h
A using directive in a header pollutes the namespace of all files which
include that header. It seems this snuck in in D115764 by moving some
code from a cpp file.
2022-03-02 15:29:12 +01:00
Simon Moll d05ddb86f6 [VP] vp.sitofp cast intrinsic and docs
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119922
2022-03-02 10:16:19 +01:00
Martin Storsjö 6ec18aafec [Object] [COFF] Improve error messages
This aids debugging when working with possibly broken files,
instead of just flat out erroring out without telling what's wrong.

Differential Revision: https://reviews.llvm.org/D120679
2022-03-02 10:44:41 +02:00
Xiang1 Zhang 65588a0776 Revert "TLS loads opimization (hoist)"
Revert for more reviews

This reverts commit 30e612ebdf.
2022-03-02 14:10:11 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Xiang1 Zhang 30e612ebdf TLS loads opimization (hoist)
Reviewed By: Wang Pheobe, Topper Craig

Differential Revision: https://reviews.llvm.org/D120000
2022-03-02 10:37:24 +08:00
Lang Hames 34e539dcd7 [ORC] Set ResolverBlockAddr in EPCIndirectionUtils::writeResolverBlock.
Without this, EPCIndirectionUtils::getResolverBlockAddr (and lazy compilation
via EPC) won't work.

No test case: lli is still using LocalLazyCallThroughManager. I'll revisit this
soon when I look at adding lazy compilation support to the ORC runtime.
2022-03-01 16:44:55 -08:00
Duncan P. N. Exon Smith 15ab7bc3af Testing: Make TempFile safe to move; test Temp{Dir,File,Link}
Default the moves and delete the copies for TempFile, matching TempDir
and TempLink, and add tests for all of them to confirm that the
destructor is not harmful after it has been moved from.

Differential Revision: https://reviews.llvm.org/D120691
2022-03-01 13:45:51 -08:00
Zequan Wu 5c9e20d7d0 [PDB] Add char8_t type
Differential Revision: https://reviews.llvm.org/D120690
2022-03-01 13:39:51 -08:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Craig Topper 7bc6667845 [Analysis] Simplify the interface to llvm::getICmpCode. NFC
Instead of passing an InstCmpInt * and a bool just pass the predicate
from the caller.

I'm considering moving the similar FCmp functions from InstCombine
over here and this makes the interface consistent with what is used
for FCmp.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120609
2022-03-01 09:53:27 -08:00
Tong Zhang 17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
serge-sans-paille 71c3a5519d Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor:
before: 1065940348
after:  1065307662

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-03-01 18:01:54 +01:00
Nathan Sidwell 75db1795e4 [demangler] Add co_await demangling
The demangler doesn't understand 'aw' as an operator name. This adds
the necessary smarts -- you may use this as an operator functionname,
but not as an expression operator.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D120143
2022-03-01 04:49:19 -08:00
Nathan Sidwell 7f89fa32e8 [demangler][NFC] Tabularize operator name parsing
We need to parse operator names in 3 places -- expressions, names &
fold expressions.  Currently we have 3 separate pieces to do this, and a FIXME.

The operator name and expression parsing are implemented as
handwritten two-character nested switches, the fold expression is a
sequence of string comparisons.

This adds a new OperatorInfo class to encode the operator info
(encoding, kind, name), and has a table that it can binary search.
From that each of the above 3 uses are altered to use the new scheme.

Existing tests cover parsing operator encodings.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D119467
2022-03-01 04:44:56 -08:00
Nathan Sidwell 024495e626 [demangler] Improve buffer hysteresis
Improve demangler buffer hysteresis.  If we needed more than double
the buffer, the original code would allocate exactly the amount
needed, and thus consequently the next request would also realloc.
We're very unlikely to get into wanting more than double, after the
first allocation, as it would require the user to have used an
identifier larger than the hysteresis.  With machine generated code
that's possible, but unlikely.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D119972
2022-03-01 04:37:24 -08:00
Nathan Sidwell e5c98e22fb [demangler] Simplify SwapAndRestore
The SwapAndRestore class is over engineered.  Nothing makes use of the
early restoration machinery.  Let's just remove that cognative burdon.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D120673
2022-03-01 04:37:24 -08:00
Alexey Lapshin a6f3fedc3f [objcopy] Refactor CommonConfig to add posibility to specify added/updated sections as MemoryBuffer.
Current objcopy implementation has a possibility to add or update sections.
The incoming section is specified as a pair: section name and name of the file
containing section data. The interface does not allow to specify incoming
section as a memory buffer. This patch adds possibility to specify incoming
section as a memory buffer.

Differential Revision: https://reviews.llvm.org/D120486
2022-03-01 14:49:41 +03:00
Kristina Bessonova 57aaab3b17 [NVPTX] Fix nvvm.match.sync*.i64 intrinsics return type (i64 -> i32)
NVVM IR specification defines them with i32 return type:

  declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
  declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
  ...
  The i32 return value is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

as well as PTX ISA:

  9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync

  match.any.sync.type  d, a, membermask;
  match.all.sync.type  d[|p], a, membermask;
  ...
  Destination d is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

Additionally, ptxas doesn't accept intructions, produced by NVPTX backend.
After this patch, it compiles with no issues.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120499
2022-03-01 12:26:16 +02:00
Jonas Devlieghere 4429cf146e
[Support] Allow the ability to change WithColor's auto detection function
WithColor has an "auto detection mode" which looks whether the
corresponding whether the corresponding cl::opt is enabled or not. While
this is great when opting into cl::opt, it's not so great for downstream
users of this utility, which might have their own competing options to
enable or disable colors. The WithColor constructor takes a color mode,
but the big benefit of the class are its static error and warning
helpers and default error handlers.

In order to allow users of this utility to enable or disable colors
globally, this patch adds the ability to specify a global auto detection
function. By default, the auto detection function behaves the way that
it does today. The benefit of this patch lies in that it can be
overwritten. In addition to a ability to change the auto detection
function, I've also made it possible to get your hands on the default
auto detection function, so you swap it back if if you so desire.

This patch allow downstream users (like LLDB) to globally disable colors
with its own command line flag.

Differential revision: https://reviews.llvm.org/D120593
2022-02-28 20:30:06 -08:00
Phoebe Wang e03d216c28 [X86] Use bit test instructions to optimize some logic atomic operations
This is to match GCC's optimizations: https://gcc.godbolt.org/z/3odh9e7WE

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120199
2022-03-01 09:57:08 +08:00
Kirill Stoimenov b7fd30eac3 [ASan] Removed unused AddressSanitizerPass functional pass.
This is a clean-up patch. The functional pass was rolled into the module pass in D112732.

Reviewed By: vitalybuka, aeubanks

Differential Revision: https://reviews.llvm.org/D120674
2022-03-01 00:41:29 +00:00
Michael Kruse a66f7769a3 [OpenMPIRBuilder] Implement static-chunked workshare-loop schedules.
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.

This patch includes the following related changes:
 * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
 * Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
 * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
 * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.

Differential Revision: https://reviews.llvm.org/D114413
2022-02-28 18:18:33 -06:00
Jonas Devlieghere 3a167c4a90
Revert "[Support] Allow the ability to change WithColor's auto detection function"
This reverts commit a83cf7a846 because it
breaks a bunch of build bots.
2022-02-28 15:32:15 -08:00
Jonas Devlieghere a83cf7a846
[Support] Allow the ability to change WithColor's auto detection function
WithColor has an "auto detection mode" which looks whether the
corresponding whether the corresponding cl::opt is enabled or not. While
this is great when opting into cl::opt, it's not so great for downstream
users of this utility, which might have their own competing options to
enable or disable colors. The WithColor constructor takes a color mode,
but the big benefit of the class are its static error and warning
helpers and default error handlers.

In order to allow users of this utility to enable or disable colors
globally, this patch adds the ability to specify a global auto detection
function. By default, the auto detection function behaves the way that
it does today. The benefit of this patch lies in that it can be
overwritten. In addition to a ability to change the auto detection
function, I've also made it possible to get your hands on the default
auto detection function, so you swap it back if if you so desire.

This patch allow downstream users (like LLDB) to globally disable colors
with its own command line flag.

Differential revision: https://reviews.llvm.org/D120593
2022-02-28 15:03:04 -08:00
Scott Linder f3487c7be9 [YAMLParser] Add multi-line literal folding support
Last year I was working at Swift to add support for [Localization of Compiler Diagnostic Messages](https://forums.swift.org/t/localization-of-compiler-diagnostic-messages/36412/41). We are currently using YAML as the new diagnostic format. The LLVM::YAMLParser didn't have a support for multiline string literal folding and it's crucial to have that for the diagnostic message to help us keep up with the 80 columns rule. Therefore, I decided to add a multiline string literal folding support to the YAML parser.

Patch By: @HassanElDesouky (Hassan ElDesouky)

Differential Revision: https://reviews.llvm.org/D102590
2022-02-28 21:03:36 +00:00
luxufan 3362f54d08 [JITLink] Add R_RISCV_SUB6 relocation
Add R_RISCV_SUB6 relocation

Differential Revision: https://reviews.llvm.org/D120001
2022-03-01 01:45:03 +08:00
esmeyi 61835d19a8 [llvm-objcopy] Initial XCOFF32 support.
Summary: This is an initial implementation of lvm-objcopy for XCOFF32.
Currently only supports simple copying, op-passthrough to follow.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D97656
2022-02-28 04:59:46 -05:00
Zi Xuan Wu f467aa1b64 [Support] Fix the build errors because missing CSKYTargetParser.def in module.modulemap of 21bce9007a
Add textual header "Support/CSKYTargetParser.def" in module.modulemap.

Build Failure: https://green.lab.llvm.org/green/job/lldb-cmake/41771
2022-02-28 13:47:55 +08:00
Zi Xuan Wu 21bce9007a [Support] Add CSKY target parser and attributes parser
Construct LLVM Support module about CSKY target parser and attribute parser.
It refers CSKY ABIv2 and implementation of GNU binutils and GCC.

https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf

Now we only support CSKY 800 series cpus and newer cpus in the future undering CSKYv2 ABI specification.
There are 11 archs including ck801, ck802, ck803, ck803s, ck804, ck805, ck807, ck810, ck810v, ck860, ck860v.

Every arch has base extensions, the cpus of that arch family have more extended extensions than base extensions.
We need specify extended extensions for every cpu. Every extension has its enum value, name and related llvm feature string with +/-.
Every enum value represents a bit of uint64_t integer.

Differential Revision: https://reviews.llvm.org/D119917
2022-02-28 11:35:07 +08:00