Commit Graph

49278 Commits

Author SHA1 Message Date
Victor Campos 1d66c5ebbc [ARM] Fix bug in also_compatible_with attribute parser
Check ScopedPrinter pointer before attempting to print the attribute's
parsed information.

Patch by Michael Platings and Victor Campos

Reviewed By: pratlucas

Differential Revision: https://reviews.llvm.org/D132214
2022-08-22 09:40:37 +01:00
Max Kazantsev e587199a50 [SCEV] Prove condition invariance via context, try 2
Initial implementation had too weak requirements to positive/negative
range crossings. Not crossing zero with nuw is not enough for two reasons:

- If ArLHS has negative step, it may turn from positive to negative
  without crossing 0 boundary from left to right (and crossing right to
  left doesn't count for unsigned);
- If ArLHS crosses SINT_MAX boundary, it still turns from positive to
  negative;

In fact we require that ArLHS always stays non-negative or negative,
which an be enforced by the following set of preconditions:

- both nuw and nsw;
- positive step (looks liftable);

Because of positive step, boundary crossing is only possible from left
part to the right part. And because of no-wrap flags, it is guaranteed
to never happen.
2022-08-22 14:31:19 +07:00
Ting Wang d2d77e050b [PowerPC][Coroutines] Add tail-call check with call information for coroutines
Fixes #56679.

Reviewed By: ChuanqiXu, shchenz

Differential Revision: https://reviews.llvm.org/D131953
2022-08-21 22:20:40 -04:00
Joe Loser 4a51b0c05b [ADT] Remove `is_invocable` from `STLExtras.h`
As a follow-up of https://reviews.llvm.org/D132318, now that the callers have
been adjusted to use `std::is_invocable`, remove `llvm::is_invocable` and its
tests.

Differential Revision: https://reviews.llvm.org/D132321
2022-08-21 18:15:38 -06:00
Joe Loser 7e2cf2679e [ADT] Clarify llvm::bit_cast implementation comment
When reviewing https://reviews.llvm.org/D132330, I noticed a few pre-existing
comments regarding the implementation of `llvm::bit_cast`. One comment is a bit
misleading since `std::bit_cast` is a C++20 standard library thing, not a
C++17 one (otherwise we could use it directly). Clarify that in the comment.

Differential Revision: https://reviews.llvm.org/D132332
2022-08-21 18:13:41 -06:00
Kazu Hirata be35870dc8 [ADT] Simplify llvm::bit_cast (NFC)
This patch removes macro tricks to check GCC versions.

The commit message from 19262fc596
states that "is_trivially_copyable is only in GCC 5.1 and later".
Note that we now require GCC 7.1 or higher.

Since both std::is_trivially_constructible and
std::is_trivially_copyable are C++11 features, and we now require
C++17, we probably don't need to worry about the availability of the
C++11 features.

Differential Revision: https://reviews.llvm.org/D132330
2022-08-21 10:39:21 -07:00
Kazu Hirata 36357c967c Remove llvm::is_trivially_copyable (NFC)
This patch removes llvm::is_trivially_copyable as it seems to be dead.
Once I remove it, HAVE_STD_IS_TRIVIALLY_COPYABLE has no users, so this
patch removes the macro also.

The comment on llvm::is_trivially_copyable mentions GCC 4.9, but note
that we now require GCC 7.1 or higher.

Differential Revision: https://reviews.llvm.org/D132328
2022-08-21 10:39:19 -07:00
Sesha Kalyur d9ff670330 [flang][OpenMP] Parser support for Target directive and Device clause
This patch adds support for the device clause on `Target` directive.
Device clause was added in OpenMP specification version 4.5 to
create a device data environment for the extent of a region. On
target construct, the device expression be either be `ancestor`
(taking after the parent) or assign a new `device_num`.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D126441
2022-08-21 22:26:02 +05:30
Joe Loser 99694a5d1d
[ADT] Replace `void_t` equivalent with `std::void_t`
Use `std::void_t` instead of defining our own equivalent in `STLExtras.h` now
that C++17 is available for use.

Differential Revision: https://reviews.llvm.org/D132319
2022-08-21 10:37:38 -06:00
Simon Pilgrim 5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Kazu Hirata 8b1b0d1d81 Revert "Use std::is_same_v instead of std::is_same (NFC)"
This reverts commit c5da37e42d.

This patch seems to break builds with some versions of MSVC.
2022-08-20 23:00:39 -07:00
Kazu Hirata c5da37e42d Use std::is_same_v instead of std::is_same (NFC) 2022-08-20 22:36:26 -07:00
Kazu Hirata ce377df57e Ensure newlines at the end of files (NFC) 2022-08-20 21:18:23 -07:00
Kazu Hirata 01ffe31cbb [llvm] Remove llvm::is_trivially_{copy/move}_constructible (NFC)
This patch removes llvm::is_trivially_{copy/move}_constructible in
favor of std::is_trivially_{copy/move}_constructible.

The previous attempt to remove them in Dec 2020,
c8d406c93c, broke builds with "some
versions of GCC" according to
6cd9608fb3.

It's been 20 months since then, and the minimum requirement for GCC
has been updated to 7.1 from 5.1.

FWIW, I was able to build llvm with gcc 8.4.0.

Differential Revision: https://reviews.llvm.org/D132311
2022-08-20 14:06:42 -07:00
Lang Hames e0fc85e092 [JITLink] Fix LinkGraph::makeAbsolute, add unit test.
makeAbsolute was not updating the symbol address when applied to external
symbols.

This commit adds a unit test for makeAbsolute, and updates the makeExternal unit
test to check that makeExternal works correctly for absolute symbols.
2022-08-20 13:43:21 -07:00
Kazu Hirata 7dec4648c4 [ADT] Simplify llvm::sort with constexpr if (NFC)
Differential Revision: https://reviews.llvm.org/D132305
2022-08-20 09:34:36 -07:00
Kazu Hirata abb6271d80 [ADT] Deprecate Any::hasValue
This patch deprecates Any::hasValue as I've migrated all known uses of
it to Any::has_value.  I'm planning to remove the deprecated method in
3 months or so.

Differential Revision: https://reviews.llvm.org/D132304
2022-08-20 09:34:35 -07:00
Kazu Hirata e15359debf [ADT] Implement Any::has_value
This patch implements Any::has_value for consistency with std::any in
C++17.

My plan is to deprecate Any::hasValue after migrating all of its uses
to Any::has_value.  Since I am about to do so, this patch simply
replaces hasValue with has_value in the unit test instead of adding
tests for has_value.

Differential Revision: https://reviews.llvm.org/D132278
2022-08-20 07:28:04 -07:00
Kazu Hirata 0e0e638249 [ADT] Simplify llvm::reverse with constexpr if (NFC)
Differential Revision: https://reviews.llvm.org/D132279
2022-08-20 07:28:03 -07:00
Philip Reames b0a2c48e9f [tti] Consolidate getOperandInfo without OperandValueProperties copies [nfc] 2022-08-19 16:22:22 -07:00
Austin Kerbow b0f4678b90 [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy
Adds a builtin that serves as an optimization hint to apply specific optimized
DAG mutations during scheduling. This also disables any other mutations or
clustering that may interfere with the desired pipeline. The first optimization
strategy that is added here is designed to improve the performance of small gemm
kernels on gfx90a.

Reviewed By: jrbyrnes

Differential Revision: https://reviews.llvm.org/D132079
2022-08-19 15:38:36 -07:00
Alexey Bataev c167028684 [SLP]Delay vectorization of postponable values for instructions with no users.
SLP vectorizer tries to find the reductions starting the operands of the
instructions with no-users/void returns/etc. But such operands can be
postponable instructions, like Cmp, InsertElement or InsertValue. Such
operands still must be postponed, vectorizer should not try to vectorize
them immediately.

Differential Revision: https://reviews.llvm.org/D131965
2022-08-19 08:39:16 -07:00
Alexey Bataev d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Max Kazantsev f798c042f4 Revert "[SCEV] Prove condition invariance via context"
This reverts commit a3d1fb3b59.

Reverting until investigation of https://github.com/llvm/llvm-project/issues/57247
has concluded.
2022-08-19 21:02:06 +07:00
Archibald Elliott 3a729069e4 [IR] Update llvm.prefetch to match docs
The current llvm.prefetch intrinsic docs state "The rw, locality and
cache type arguments must be constant integers."

This change:
- Makes arg 3 (cache type) an ImmArg
- Improves the verifier error messages to reference the incorrect
  argument.
- Fixes two tests which contradict the docs.

This is needed as the lowering to GlobalISel is different for ImmArgs
compared to other constants. The non-ImmArgs create a G_CONSTANT MIR
instruction, the for ImmArgs the constant is put directly on the
intrinsic's MIR instruction as an immediate.

Differential Revision: https://reviews.llvm.org/D132042
2022-08-19 09:11:17 +01:00
Craig Topper 37c47b2cac [RISCV] Change how mtune aliases are implemented.
The previous implementation translated from names like sifive-7-series
to sifive-7-rv32 or sifive-7-rv64. This also required sifive-7-rv32
and sifive-7-rv64 to be valid CPU names. As those are not real
CPUs it doesn't make sense to accept them in -mcpu.

This patch does away with the translation and adds sifive-7-series
directly to RISCV.td. Removing sifive-7-rv32 and sifive-7-rv64.
sifive-7-series is only allowed in -mtune.

I've also added "rocket" to RISCV.td but have not removed rocket-rv32
or rocket-rv64.

To prevent -mcpu=sifive-7-series or -mcpu=rocket being used with llc,
I've added a Feature32Bit to all rv32 CPUs. And made it an error to
have an rv32 triple without Feature32Bit. sifive-7-series and rocket
do not have Feature32Bit or Feature64Bit set so the user would need
to provide -mattr=+32bit or -mattr=+64bit along with the -mcpu to
avoid the error.

SiFive no longer names their newer products with 3, 5, or 7 series.
Instead we have p200 series, x200 series, p500 series, and p600 series.
Following the previous behavior would require a sifive-p500-rv32 and
sifive-p500-rv64 in order to support -mtune=sifive-p500-series. There
is currently no p500 product, but it could start getting confusing if
there was in the future.

I'm open to hearing alternatives for how to achieve my main goal
of removing sifive-7-rv32/rv64 as a CPU name.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D131708
2022-08-18 16:22:25 -07:00
Prabhdeep Singh Soni bce94ea551 [OMPIRBuilder] Add support for safelen clause
This patch adds OMPIRBuilder support for the safelen clause for the
simd directive.

Reviewed By: shraiysh, Meinersbur

Differential Revision: https://reviews.llvm.org/D131526
2022-08-18 15:43:08 -04:00
Florian Hahn b8709a9d03
[LV] Support fixed order recurrences.
If the incoming previous value of a fixed-order recurrence is a phi in
the header, go through incoming values from the latch until we find a
non-phi value. Use this as the new Previous, all uses in the header
will be dominated by the original phi, but need to be moved after
the non-phi previous value.

At the moment, fixed-order recurrences are modeled as a chain of
first-order recurrences.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D119661
2022-08-18 19:15:52 +01:00
Philip Reames 1436adae2c [LV-L] Add const and move method body out of line [nfc] 2022-08-18 11:10:19 -07:00
Vitaly Buka a0e402d41f [test] Disable DynamicLibrary with HWASAN
Re-enabled when https://github.com/llvm/llvm-project/issues/57206 fixed.
2022-08-18 10:19:18 -07:00
Paul Walker 96c8d615d6 [SVE] Extend findMoreOptimalIndexType so BUILD_VECTORs do not force 64bit indices.
Extends findMoreOptimalIndexType to allow ISD::BUILD_VECTOR based
indices to be truncated when such truncation is lossless. This can
enable the use of 32bit gather/scatter indices thus making it less
likely to have to split a gather/scatter in two.

Depends on D125194

Differential Revision: https://reviews.llvm.org/D130533
2022-08-18 18:00:53 +01:00
Simon Pilgrim fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Daniel Bertalan 11443ef85d [llvm-objdump] Support dumping segment information with -chained_fixups
This commit adds the definitions for `dyld_chained_starts_in_image`,
`dyld_chained_starts_in_segment`, and related enums. Dumping their
contents is possible with the -chained_fixups flag of llvm-otool.

The chained-fixups.yaml test was changed to cover bindings/rebases, as
well as weak imports, weak symbols and flat namespace symbols. Now that
we have actual fixup entries, the __DATA segment contains data that
would need to be hexdumped in YAML. We also test empty pages (to look
for the "DYLD_CHAINED_PTR_START_NONE" annotation), so the YAML would end
up quite large. So instead, this commit includes a binary file.

When Apple's effort to upstream their chained fixups code continues,
we'll replace this code with the then-upstreamed code. But we need
something in the meantime for testing ld64.lld's chained fixups code.

Differential Revision: https://reviews.llvm.org/D131961
2022-08-18 09:29:27 +02:00
Lang Hames 6494920987 [JITLink] Pass Allocator (rather than storage) into Symbol named constructors.
Also switch from orc::ExecutorAddrDiff to uint64_t for the Symbol::Size field.

These changes help to prepare for the introduction of symbol alias support:
Aliases will require an auxiliary data structure which will also need to be
allocated (hence the need to pass the allocator down). The Size field will be
re-tasked to track the auxiliary data (which will hold a replacement Size field)
if the symbol is either an alias, or aliased by some other symbol.
2022-08-17 15:55:42 -07:00
Daniil Fukalov 7ed3d81333 [NFCI] Move cost estimation from TargetLowering to TargetTransformInfo.
TragetLowering had two last InstructionCost related `getTypeLegalizationCost()`
and `getScalingFactorCost()` members, but all other costs are processed in TTI.

E.g. it is not comfortable to use other TTI members in these two functions
overrided in a target.

Minor refactoring: `getTypeLegalizationCost()` now doesn't need DataLayout
parameter - it was always passed from TTI.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D117723
2022-08-18 00:38:55 +03:00
Nick Desaulniers 6b0e2fa6f0 [SelectionDAG] make INLINEASM_BR use MachineBasicBlocks instead of BlockAddresses
As part of re-architecting callbr to no longer use blockaddresses
(https://reviews.llvm.org/D129288), we don't really need them in MIR.
They make comparing MachineBasicBlocks of indirect targets during
MachineVerifier a PITA.

Suggested by @efriedma from the discussion:
https://reviews.llvm.org/D130290#3669531

Reviewed By: efriedma, void

Differential Revision: https://reviews.llvm.org/D130316
2022-08-17 09:34:31 -07:00
David Penry 1c9f0408bc Revert "[ModuloSchedule] Add interface call to accept/reject SMS schedules"
This reverts commit 8c4aea438c.

Needed because buildbot failures (warnings) gave a clue that there was
a functional bug in the ARM rejection logic.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D132037
2022-08-17 09:32:43 -07:00
David Penry 8c4aea438c [ModuloSchedule] Add interface call to accept/reject SMS schedules
This interface allows a target to reject a proposed
SMS schedule.  For Hexagon/PowerPC, all schedules
are accepted, leaving behavior unchanged.  For ARM,
schedules which exceed register pressure limits are
rejected.

Also, two RegisterPressureTracker methods now need to be public so
that register pressure can be computed by more callers.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D128941
2022-08-17 08:13:26 -07:00
Paul Kirth 656c5d652c [clang][llvm][NFC] Change misexpect's tolerance option to be 32-bit
In D131869 we noticed that we jump through some hoops because we parse the
tolerance option used in MisExpect.cpp into a 64-bit integer. This is
unnecessary, since the value can only be in the range [0, 100).

This patch changes the underlying type to be 32-bit from where it is
parsed in Clang through to it's use in LLVM.

Reviewed By: jloser

Differential Revision: https://reviews.llvm.org/D131935
2022-08-17 14:38:53 +00:00
Simon Pilgrim 1d522a39f7 [TTI] Remove getInstructionThroughput cost helper.
Pulled out of D79483 - we can just as easily use getUserCost directly
2022-08-17 11:41:47 +01:00
Eli Friedman cfd2c5ce58 Untangle the mess which is MachineBasicBlock::hasAddressTaken().
There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code.  Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.

Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.

So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.

Discovered while trying to sort out related stuff on D102817.

Differential Revision: https://reviews.llvm.org/D124697
2022-08-16 16:15:44 -07:00
Martin Sebor 345514e991 [InstCombine] Add support for strlcpy folding
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130666
2022-08-16 16:43:40 -06:00
Alexander Shaposhnikov d68ba43ad2 [Intrinsics] Add initial support for NonNull attribute
Add initial support for NonNull attribute.
(https://github.com/llvm/llvm-project/issues/57113)

Test plan:

verify that for
__thread int x;
int main() {

int* y = &x;
return *y;
}
(with this patch) clang -O -fsanitize=null -S -emit-llvm -o -
doesn't emit a null-pointer check

Differential revision: https://reviews.llvm.org/D131872
2022-08-16 21:28:23 +00:00
Victor Campos 784da8a722 [ARM] Simplify the creation of escaped build attribute values
There is an existing mechanism to escape strings, therefore the
functions created to escape Tag_also_compatible_with values are not
really needed. We can simply use the pre-existing utilities.

Reviewed By: pratlucas

Differential Revision: https://reviews.llvm.org/D131680
2022-08-16 11:49:33 +01:00
Victor Campos 08c6840f25 [ARM] Parse Tag_also_compatible_with attribute
The ARM Attribute Parser used to parse the value of also_compatible_with
as it is, disregarding the way it is encoded.

This patch does a context aware parsing of the also_compatible_with
attribute. Additionally, some error handling is also done for incorrect
cases.

Reviewed By: pratlucas

Differential Revision: https://reviews.llvm.org/D130913
2022-08-16 11:22:56 +01:00
Max Kazantsev ebabd6bf18 Return "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 354fa0b480.

Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by `false` condition, and then someone
else to managed to propagate the flag from dead code outside.

Returning the patch to be able to reproduce the issue.
2022-08-16 14:12:36 +07:00
Kshitij Jain 29fe204b4e Re-apply "[JITLink] Introduce ELF/i386 backend " with correct authorship.
I (lhames) accidentally pushed 5f300397c6 on
Kshitij Jain's behalf without updating the patch author first (my apologies
Kshitij!).

Re-applying with correct authorship.

https://reviews.llvm.org/D131347
2022-08-15 18:44:43 -07:00
Lang Hames 73600b7c8a Revert "[JITLink] Introduce ELF/i386 backend support for JITLink."
This reverts commit 5f300397c6.

No functional issues, I just failed to correctly set authorship on the patch.
2022-08-15 18:44:43 -07:00
Lang Hames 5f300397c6 [JITLink] Introduce ELF/i386 backend support for JITLink.
This initial ELF/i386 JITLink backend enables JIT-linking of minimal ELF i386
object files. No relocations are supported yet.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D131347
2022-08-15 18:35:51 -07:00
David Blaikie c63f2581f4 Enable -Wctad-maybe-unsupported in LLVM build
Warns on potentially unintended use of C++17 Class Template Argument
Deduction. Use of this feature with types that aren't intended to
support it may may future refactorings of those types more difficult -
so this warning fires whenever the feature is used with a type that may
not have intended to be used with CTAD (the warning uses the existence
of at least one explicit deduction guide to indicate that a type
intentionally supports CTAD - absent that, it's assumed to not be
intended to support CTAD & produces a warning).

This is disabled in libcxx because lots of the standard library is
assumed to provide ctad-usable APIs and the false positive suppression
in the diagnostic is based on system header classification which doesn't
apply in the libcxx build itself.

Differential Revision: https://reviews.llvm.org/D131727
2022-08-15 23:28:51 +00:00
Martin Sebor 65967708d2 [InstCombine] Adjust snprintf folding of constant strings (PR #56598)
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130494
2022-08-15 15:59:21 -06:00
Arthur Eubanks 633f5663c3 [LegacyPM] Remove ThinLTO bitcode writer legacy pass
Using the legacy PM for the optimization pipeline is deprecated and in
the process of being removed. This is a small step in that direction.

For an example of migrating to the new PM:
853b57fe80
2022-08-15 14:21:16 -07:00
Sunho Kim 0c69f9f32c [ORC][COFF] Introduce DLLImportDefinitionGenerator.
This class will be used to properly solve the `__imp_` symbol and jump-thunk generation issues. It is assumed to be the last definition generator to be called, and as it's the last generator the only symbols remaining in the lookup set are the symbols that are supposed to be queried outside this jitdylib. Instead of just letting them through, we issue another lookup invocation and fetch the allocated addresses, and then create jitlink graph containing `__imp_` GOT symbols and jump-thunks targetting the fetched addresses.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D131833
2022-08-16 02:06:57 +09:00
Nico Weber 940e178c00 [llvm-objdump] Start on -chained_fixups for llvm-otool
And --chained-fixups for llvm-objdump.

For now, this only prints the dyld_chained_fixups_header and adds
plumbing for the flag. This will be expanded in future commits.

When Apple's effort to upstream their chained fixups code continues,
we'll replace this code with the then-upstreamed code. But we need
something in the meantime for testing ld64.lld's chained fixups
code.

Update chained-fixups.yaml with a file that actually contains
the chained fixup data (`LinkEditData` doesn't encode it yet,
so use `__LINKEDIT` via `--raw-segment=data`).

Differential Revision: https://reviews.llvm.org/D131890
2022-08-15 10:58:52 -04:00
wangyihan 91d784a021 [NFC][SmallVector] Use std::conditional_t instead of std::conditional
Signed-off-by: wangyihan <yihan.wang@intel.com>
2022-08-15 21:51:13 +08:00
Dmitry Vassiliev 5371ab4456 [IR] Change access rights of PredIterator members
These members were made private here 6177386b05 without an explanation.
Our customers have an own implementation inherited from PredIterator with updated advancePastNonTerminators().
The access specifier protected looks resonable and safe here.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D131608
2022-08-15 14:25:58 +02:00
Max Kazantsev 354fa0b480 Revert "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 34ae308c73.

Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
2022-08-15 18:51:59 +07:00
Zijia Zhu 8719faafdb [ADT] Make SmallSet::insert(const T &) return const_iterator
This patch makes `SmallSet::insert(const T &)` return
`std::pair<const_iterator, bool>` instead of
`std::pair<NoneType, bool>`. This will exactly match std::set's behavior
and make deduplicating items with SmallSet easier.

Reviewed By: dblaikie, lattner

Differential Revision: https://reviews.llvm.org/D131549
2022-08-15 13:53:34 +08:00
Fangrui Song d797c2ffdb [DebugInfo] -fdebug-prefix-map: handle '#line "file"' for asm source
`getContext().setMCLineTableRootFile` (from D62074) sets `RootFile.Name` to
`FirstCppHashFilename`. `RootFile.Name` is not processed by -fdebug-prefix-map
and will go to DW_TAG_compile_unit's DT_AT_name and DW_TAG_label's
DW_AT_decl_file. Remap `RootFile.Name`.

Fix another issue reported by https://github.com/llvm/llvm-project/issues/56609

Reviewed By: #debug-info, dblaikie, raj.khem

Differential Revision: https://reviews.llvm.org/D131848
2022-08-14 20:58:23 -07:00
Kazu Hirata eeac9e9232 [ADT] Deprecate Optional::map
This patch deprecates Optional::map in favor of Optional::transform
for consistency with std::optional::transform in C++23.

Note that I've migrated all known users of Optional::map.

Differential Revision: https://reviews.llvm.org/D131842
2022-08-14 17:51:59 -07:00
Kazu Hirata f5a68feab3 Use llvm::none_of (NFC) 2022-08-14 16:25:39 -07:00
Kazu Hirata 9144e49334 [Support] Drop unnecessary const from a return type (NFC)
Identified with readability-const-return-type.
2022-08-14 12:51:56 -07:00
Lang Hames 1cf81274f4 [JITLink] Add eh-frame CFI inspector, fix crash on malformed FDEs.
Add a fix to check that FDE pc-begin targets are defined before calling
getBlock (which will crash if the target is not defined). FDE pc-begins
pointing at undefined symbols are expected to arise only in obscure
circumstances (malformed objects, or removal of targets by JITLink
passes), but we want to handle them gracefully. With this patch the
FDE will be retained, but without any keepalive edge to it. Unless
some pass takes action to mark it as live it will be dead-stripped.

To make it easier for passes to connect FDEs to their targets a new
EHFrameCFIBlockInspector utility is added. This allows clients to
quickly determine whether a CFI record is a CIE or an FDE (assuming
that it's valid), and retrieve any personality, pc-begin, cie, or
LSDA edges associated with it.
2022-08-14 10:49:26 -07:00
Anubhab Ghosh 23d0e71fcb [Orc] Use IntervalMap to store free memory regions in MapperJITLinkMemoryManager
MapperJITLinkMemoryManager uses a free list to keep track of available
memory regions. Using an IntervalMap instead of vector allow automatic
coalescing of memory regions as they are freed.

Differential Revision: https://reviews.llvm.org/D131831
2022-08-14 14:35:08 +05:30
Alexey Baturo b2f31cac28 [Triple] Add llvm::Triple::isRISCV{32,64}
Reviewed By: vitalybuka, MaskRay, craig.topper

Differential Revision: https://reviews.llvm.org/D131339
2022-08-13 18:51:35 -07:00
Joe Loser 9a75033402
[MC] Leverage constexpr `std::array` in `SubtargetFeature.h`
Replace C-style array with `std::array` since `std::array<T, N>::operator[]` is
`constexpr` in C++17. This also allows us to replace `array_lengthof` calls with
member `size()` function.

Differential Revision: https://reviews.llvm.org/D131826
2022-08-13 12:54:32 -06:00
Kazu Hirata 2a4748576e [ADT] Implement Optional::transform
This patch implements Optional::transform for consistency with
std::optional::transform in C++23.

Note that the new function is identical to Optional::map.  My plan is
to deprecate Optional::map after migrating all of its uses to
Optional::transform.

Differential Revision: https://reviews.llvm.org/D131829
2022-08-13 11:48:25 -07:00
Anubhab Ghosh a31af32183 Reapply [Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager
When memory is deallocated from MapperJITLinkMemoryManager deinitialize
actions are run through mapper and in case of InProcessMapper, memory
protections of the region are reset to read/write as they were previously
changed and can be reused in future.

Differential Revision: https://reviews.llvm.org/D131768
2022-08-13 13:07:50 +05:30
Anubhab Ghosh 8180105143 Revert "[Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager"
This reverts commit 143555b2ed.
2022-08-13 10:22:31 +05:30
Sunho Kim 9189a26664 [ORC_RT][COFF] Initial platform support for COFF/x86_64.
Initial platform support for COFF/x86_64.

Completed features:
* Statically linked orc runtime.
* Full linking/initialization of static/dynamic vc runtimes and microsoft stl libraries.
* SEH exception handling.
* Full static initializers support
* dlfns
* JIT side symbol lookup/dispatch

Things to note:
* It uses vc runtime libraries found in vc toolchain installations.
* Bootstrapping state is separated because when statically linking orc runtime it needs microsoft stl functions to initialize the orc runtime, but static initializers need to be ran in order to fully initialize stl libraries.
* Process symbols can't be used blidnly on msvc platform; otherwise duplicate definition error gets generated. If process symbols are used, it's destined to get out-of-reach error at some point.
* Atexit currently not handled -- will be handled in the follow-up patches.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D130479
2022-08-13 13:48:40 +09:00
Anubhab Ghosh 143555b2ed [Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager
When memory is deallocated from MapperJITLinkMemoryManager deinitialize
actions are run through mapper and in case of InProcessMapper, memory
protections of the region are reset to read/write as they were previously
changed and can be reused in future.

Differential Revision: https://reviews.llvm.org/D131768
2022-08-13 10:08:25 +05:30
Fangrui Song f62e60fb23 [MCDwarf] Respect -fdebug-prefix-map= for generated assembly debug info (DWARF v5)
For generated assembly debug info, MCDwarfLineTableHeader::CompilationDir is an
unmapped path set in MCContext::setGenDwarfRootFile. Remap it.

A relative destination path of -fdebug-prefix-map= exposes a llvm-dwarfdump bug
which joins relative DW_AT_comp_dir and directories[0].

Fix https://github.com/llvm/llvm-project/issues/56609

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D131749
2022-08-12 12:52:36 -07:00
Joe Loser ec7e7797b1
[ADT] Mark variable inline to avoid ODR violations in Sequence.h
Mark `force_iteration_on_noniterable_enum` as an `inline` variable
to avoid ODR violations.

Differential Revision: https://reviews.llvm.org/D131777
2022-08-12 12:55:07 -06:00
Joe Loser 7e521ed1ac
[ADT] Remove STLForwardCompat.h's C++17 equivalents
As a follow-up of e8578968f6 which replaced the
callers to use the C++17 equivalents, remove the equivalents from
`STLForwardCompat.h` entirely and their corresponding tests.

Differential Revision: https://reviews.llvm.org/D131769
2022-08-12 12:50:52 -06:00
Wolfgang Pieb 7ddfb4dfeb [Inlining] Introduce the function attribute "inline-max-stacksize"
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stacksizes exceed the given value.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D129904
2022-08-12 11:07:18 -07:00
James Y Knight 20451cb06b Update license on Unicode.org's ConvertUTF code.
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.

Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.

Fixes https://github.com/llvm/llvm-project/issues/32309

Differential Revision: https://reviews.llvm.org/D66390
2022-08-12 16:51:08 +00:00
Dawid Jurczak 8a17e74ca9 [NFC] Introduce llvm::to_vector_of to allow creation of SmallVector<T> from range of items convertible to type T
It's https://reviews.llvm.org/D129565 follow-up.

Differential Revision: https://reviews.llvm.org/D129781
2022-08-12 15:22:12 +02:00
Joe Loser e8578968f6
[ADT] Replace STLForwardCompat.h's C++17 equivalents
STLForwardCompat.h defines several utilities and type traits to mimic that of
the ones in the C++17 standard library. Now that LLVM is built with the C++17
standards mode, remove use of these equivalents in favor of the ones from the
standard library.

Differential Revision: https://reviews.llvm.org/D131717
2022-08-12 06:55:59 -06:00
Max Kazantsev a3d1fb3b59 [SCEV] Prove condition invariance via context
Contextual knowledge may be used to prove invariance of some conditions.
For example, in this case:
```
  ; %len >= 0
  guard(%iv = {start,+,1}<nuw> <s %len)
  guard(%iv = {start,+,1}<nuw> <u %len)
```
the 2nd check always fails if `start` is negative and always passes otherwise.

It looks like there are more opportunities of this kind that are still to be
implemented in the future.

Differential Revision: https://reviews.llvm.org/D129753
Reviewed By: apilipenko
2022-08-12 14:23:35 +07:00
Arthur Eubanks eddcfe3a9e [NFC] Format ilist_node_options.h to cycle bots 2022-08-11 15:39:12 -07:00
Martin Storsjö 2c2fb0c737 [llvm] Use hidden visibility when building for MinGW with Clang
Since c5b3de6745 (git main,
August 11th), Clang does generate working hidden visibility
on MinGW targets. Using that reduces the number of exports from
a dylib build of LLVM significantly, which is vital for fitting
within the limit of 64k exported symbols from a DLL.

It's essential that if we set CMAKE_CXX_VISIBILITY_PRESET=hidden
(which passes -fvisibility=hidden on the command line), we also
must define LLVM_EXTERNAL_VISIBILITY consistently to override
it. (If there are mismatches, e.g. setting hidden visibility generally
but never overriding it back to default for the symbols that do need
to be exported, we'd get broken builds in such configurations.)

We don't want to be using __attribute__((visibility("hidden"))) on
MinGW with GCC, because GCC produces a warning about it. (GCC hasn't
warned about the command line options that set hidden visibility
though.) Clang has historically not warned about either of them, so
it is harmless to use the hidden visibility when building with older
Clang (so we don't need to detect the exact version of Clang/LLVM where
it has an effect).

This reduces the number of exported symbols for a dylib build of LLVM;
previously libLLVM exported around 64650 symbols (when the maximum is
65536) when the ARM, AArch64 and X86 targets were enabled. If enabling
more targets (or if building with e.g. assertions enabled), it would
exceed the limit. Now with visibility flags in use, the same build
with ARM, AArch64 and X86 ends up at around 35k exported symbols.

Differential Revision: https://reviews.llvm.org/D131661
2022-08-12 00:57:05 +03:00
Arnold Schwaighofer 6ef223c041 [coro async] Mark async suspend function and its resume function pointer intrinsic as nomerge
Coroutine splitting is not possible if the one-to-one mapping between the two is
lost. Every suspend point must have a matching continuation function
pointer.

rdar://98404664

Differential Revision: https://reviews.llvm.org/D131684
2022-08-11 11:43:30 -07:00
Fangrui Song c2d293ea25 Compiler.h: remove unused LLVM_NODISCARD
Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D131695
2022-08-11 11:06:24 -07:00
Fangrui Song 57f334d817 [Support] Remove Log2 workaround for Android API level < 18
The function added by D9467 is unneeded.
https://github.com/android/ndk/wiki/Changelog-r24 shows that the NDK has
moved forward to at least a minimum target API of 19.

Reviewed By: srhines

Differential Revision: https://reviews.llvm.org/D131656
2022-08-11 17:39:41 +00:00
Fangrui Song 1ca5fee228 [Support] Remove some #if __cplusplus > 201402L 2022-08-11 17:35:02 +00:00
Marc Auberer 84b7055afc [Docs] Fix duplicate enum item name
Removes duplicated names as recommended here: https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D131193
2022-08-11 09:59:08 -07:00
Marco Elver c47ec95531 [MemorySanitizer] Support memcpy.inline and memset.inline
Other sanitizers (ASan, TSan, see added tests) already handle
memcpy.inline and memset.inline by not relying on InstVisitor to turn
the intrinsics into calls. Only MSan instrumentation currently does not
support them due to missing InstVisitor callbacks.

Fix it by actually making InstVisitor handle Mem*InlineInst.

While the mem*.inline intrinsics promise no calls to external functions
as an optimization, for the sanitizers we need to break this guarantee
since access into the runtime is required either way, and performance
can no longer be guaranteed. All other cases, where generating a call is
incorrect, should instead use no_sanitize.

Fixes: https://github.com/llvm/llvm-project/issues/57048

Reviewed By: vitalybuka, dvyukov

Differential Revision: https://reviews.llvm.org/D131577
2022-08-11 10:43:49 +02:00
Martin Storsjö 5563c38fde [JITLink] Silence GCC warnings about parentheses around && and || operators
This silences the following warnings:

../include/llvm/ExecutionEngine/JITLink/JITLink.h:1108:56: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
 1105 |     assert(S == Scope::Local || llvm::count_if(AbsoluteSymbols,
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1106 |                                                [&](const Symbol *Sym) {
      |                                                ~~~~~~~~~~~~~~~~~~~~~~~~
 1107 |                                                  return Sym->getName() == Name;
      |                                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1108 |                                                }) == 0 &&
      |                                                ~~~~~~~~^~
 1109 |                                     "Duplicate absolute symbol");
      |                                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-08-11 09:58:11 +03:00
Sunho Kim 7260cdd2e1 [ORC][COFF] Introduce COFFVCRuntimeBootstrapper.
Introduces COFFVCRuntimeBootstrapper that loads/initialize vc runtime libraries. In COFF, we *must* jit-link vc runtime libraries as COFF relocation types have no proper way to deal with out-of-reach data symbols ragardless of linking mode. (even dynamic version msvcrt.lib have tons of static data symbols that must be jit-linked) This class tries to load vc runtime library files from msvc installations with an option to override the path.

There are some complications when dealing with static version of vc runtimes. First, they need static initializers to be ran that requires COFFPlatform support but orc runtime will not be usable before vc runtimes are fully initialized. (as orc runtime will use msvc stl libraries) COFFPlatform that will be introduced in a following up patch will collect static initializers and run them manually in host before boostrapping itself. So, the user will have to do the following.
1. Create COFFPlatform that addes static initializer collecting passes.
2. LoadVCRuntime
3. InitializeVCRuntime
4. COFFPlatform.bootstrap()
Second, the internal crt initialization function had to be reimplemented in orc side. There are other ways of doing this, but this is the simplest implementation that makes platform fully responsible for static initializer. The complication comes from the fact that crt initialization functions (such as acrt_initialize or dllmain_crt_process_attach) actually run all static initializers by traversing from `__xi_a` symbol to `__xi_z`. This requires symbols to be contiguously allocated in sections alphabetically sorted in memory, which is not possible right now and not practical in jit setting. We might ignore emission of `__xi_a` and `__xi_z` symbol and allocate them ourselves, but we have to take extra care after orc runtime boostrap has been done -- as that point orc runtime should be the one running the static initializers.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D130456
2022-08-11 15:27:47 +09:00
Sunho Kim 5cf0082ae3 [JITLink][COFF][x86_64] Implement SECTION/SECREL relocation.
Implements SECTION/SECREL relocation. These are used by debug info (pdb) data.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D130275
2022-08-11 15:12:24 +09:00
Craig Topper bc1f78cc3b [RISCV] Rename PROC_ALIAS to TUNE_ALIAS to reflect it's usage. NFC
This is not used as general CPU alias. Only to support -mtune. Name it as such.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D131602
2022-08-10 21:44:08 -07:00
aqjune 02e56e2533 [CodeGen] Generate efficient assembly for freeze(poison) version of `mm*_cast*` intel intrinsics
This patch makes the variants of `mm*_cast*` intel intrinsics that use `shufflevector(freeze(poison), ..)` emit efficient assembly.
(These intrinsics are planned to use `shufflevector(freeze(poison), ..)` after shufflevector's semantics update; relevant thread: D103874)

To do so, this patch

1. Updates `LowerAVXCONCAT_VECTORS` in X86ISelLowering.cpp to recognize `FREEZE(UNDEF)` operand of `CONCAT_VECTOR` in addition to `UNDEF`
2. Updates X86InstrVecCompiler.td to recognize `insert_subvector` of `FREEZE(UNDEF)` vector as its first operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130339
2022-08-11 13:36:21 +09:00
WANG Xuerui 0c8bfbb374 [LoongArch] Define the new-style reloc types
Differential Revision: https://reviews.llvm.org/D131467
2022-08-11 10:37:30 +08:00
Martin Sebor 0dcfe7aa35 [InstCombine] Tighten up known library function signature tests (PR #56463)
Replace a switch statement used to validate arguments to known library
functions with a more consistent table-driven approach and tighten it
up.
2022-08-10 14:15:46 -06:00
Freddy Ye e4888a37d3 [X86][BF16] Enable __bf16 for x86 targets.
X86 psABI has updated to support __bf16 type, the ABI of which is the
same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D130964
2022-08-10 09:00:47 +08:00
Adrian Prantl af4a39f800 Fix modeline 2022-08-09 16:35:24 -07:00
Kazu Hirata a044d0491e [llvm-profdata] Support JSON as as an output-only format
This patch teaches llvm-profdata to output the sample profile in the
JSON format.  The new option is intended to be used for research and
development purposes.  For example, one can write a Python script to
take a JSON file and analyze how similar different inline instances of
a given function are to each other.

I've chosen JSON because Python can parse it reasonably fast, and it
just takes a couple of lines to read the whole data:

  import json
  with open ('profile.json') as f:
    profile = json.load(f)

Differential Revision: https://reviews.llvm.org/D130944
2022-08-09 16:24:53 -07:00
Dinar Temirbulatov cab6cd6834 [AArch64][LoopVectorize] Introduce trip count minimal value threshold to ignore tail-folding.
After D121595 was commited, I noticed regressions assosicated with small trip
count numbersvectorisation by tail folding with scalable vectors. As a solution
for those issues I propose to introduce the minimal trip count threshold value.

  Differential Revision: https://reviews.llvm.org/D130755
2022-08-09 22:10:17 +01:00
Archibald Elliott b20fe2c25b [docs][AArch64] Label Features with Arm ARM Names
This patch adds the names of the Arm Architecture Reference Manual (ARM)
features to the corresponding Subtarget Features in the AArch64 backend
and target parser.

The aim of this is to make it clearer what architectural features a
subtarget feature might enable (so, which features a CPU must provide to
support that subtarget feature), and so make it easier to add new CPUs
in the future.

Differential Revision: https://reviews.llvm.org/D131257
2022-08-09 18:45:50 +01:00
Markus Böck 205701fd47 [llvm][ADT] Allow using structured bindings with `llvm::enumerate`
This patch adds the ability to deconstruct the `value_type` returned by `llvm::enumarate` into index and value of the wrapping range. Main use case is the common occurence of using it during loop iteration. After this patch it'd then be possible to write code such as:
```
for (auto [index, value] : enumerate(container)) {
   ...
}
```
where `index` is the current index and `value` a reference to elements in the given container.

Differential Revision: https://reviews.llvm.org/D131486
2022-08-09 18:12:40 +02:00
Jun Zhang 0981975ad0
[LLVM] Use range based for loop, NFC
Signed-off-by: Jun Zhang <jun@junz.org>
2022-08-10 00:04:26 +08:00
Simon Pilgrim adb0cd62a9 [Support] TaskQueue.h - replace std::result_of_t with std::invoke_result_t
std::result_of_t is deprecated in C++17

Fixes #57023
2022-08-09 10:52:39 +01:00
Jonas Devlieghere e8c807fade
[llvm] Don't rely on C++17 deduction guide for array creation
Seems like at least one bot (clang-ppc64-aix) is having trouble with the
C++17 deduction guide for array creation. Specify the template arguments
explicitly.
2022-08-08 22:29:26 -07:00
Jonas Devlieghere 2db6b34ea8
[llvm] Alternative attempt at fixing the modules build with C++17
My initial attempt in db008af501 resulted in "error: no viable
constructor or deduction guide for deduction of template arguments of
'array'". Let's see if we can work around that by using an ArrayRef with
an explicit template argument.
2022-08-08 21:59:22 -07:00
Yuta Mukai 5357dd2f43 [MachinePipeliner] Fix Phi generation failure for large stages
The previous code overwrites VRMap for prologue stages during Phi
generation if a register spans many stages.
As a result, the wrong register is used as the one coming from
the prologue in Phis at later stages. (A process exists to correct
this, but it does not work in all cases.)
In addition, VRMap for prologue must be preserved until addBranches().

This patch fixes them by separating the map for Phis into a different
variable (VRMapPhi).

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D127840
2022-08-09 13:14:26 +09:00
Jonas Devlieghere 860efb10b4
Partially revert "[llvm] Repair the modules build with C++17"
This reverts commit db008af501 because
this now breaks the non-module build...
2022-08-08 15:25:10 -07:00
Jonas Devlieghere db008af501
[llvm] Repair the modules build with C++17
I'm honestly not sure if this is a legitimate issue or not, but after
switching from C++14 to C++17, the modules build started confusing
arrays and initializer lists. Work around the issue by being explicit.
2022-08-08 15:04:46 -07:00
Alex Brachet dbd04b853b [ELF] Support --package-metadata
This was recently introduced in GNU linkers and it makes sense for
ld.lld to have the same support. This implementation omits checking if
the input string is valid json to reduce size bloat.

Differential Revision: https://reviews.llvm.org/D131439
2022-08-08 21:31:58 +00:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Daniel Thornburgh bf48b128b0 [Symbolizer] Implement pc element in symbolizing filter.
Implements the pc element for the symbolizing filter, including it's
"ra" and "pc" modes. Return addresses ("ra") are adjusted by
decrementing one. By default, {{{pc}}} elements are assumed to point to
precise code ("pc") locations. Backtrace elements will adopt the
opposite convention.

Along the way, some minor refactors of value printing and colorization.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D131115
2022-08-08 11:08:48 -07:00
Benjamin Kramer 2960299986 [ADT] Retire llvm::apply_tuple in favor of C++17 std::apply 2022-08-08 18:23:38 +02:00
Jakub Kuderski ba9dc5f577 [ADT] Add is_splat overload accepting initializer_list
Allow for `is_splat` to be used inline, similar to `is_contained`, e.g.,
```
if (is_splat({type1, type2, type3, type4}))
  ...
```

which is much more concise and less typo-prone than an equivalent chain of equality comparisons.

My immediate motivation is to clean up some code in the SPIR-V dialect that currently needs to either construct a temporary container or use `makeArrayRef` before calling `is_splat`.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D131289
2022-08-08 10:20:02 -04:00
Simon Pilgrim 9641a201a5 [DAG] Add initial SelectionDAG::canCreateUndefOrPoison support
This patch adds basic support for a DAG variant of the canCreateUndefOrPoison call and updates DAGCombiner::visitFREEZE to use it, further Opcodes (including target specific Opcodes) can be handled when we have test coverage.

So far, I've left visitFREEZE to just use this for unary nodes (which currently means the existing BITCAST/FREEZE cases) - later patches will add other unary opcodes (with test coverage) and we can also refactor visitFREEZE to support a general number of operands like we do in InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating.

I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place. Similarly we will be able to handle poison generating SDNodeFlags as and when it becomes an issue.

Part of the work for D106675 / PR50468

Differential Revision: https://reviews.llvm.org/D130646
2022-08-08 15:16:06 +01:00
Tobias Hieta 576375a2d6
[LLD][COFF] Ignore DEBUG_S_XFGHASH_TYPE/VIRTUAL
These are new debug types that ships with the latest
Windows SDK and would warn and finally fail lld-link.

The symbols seems to be related to Microsoft's XFG
which is their version of CFG. We can't handle any of
this yet, so for now we can just ignore these types
so that lld doesn't fail with a new version of Windows
SDK.

Fixes: #56285

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D129378
2022-08-08 15:53:52 +02:00
Benjamin Kramer 090bdaad34 [Support] Use std::shared_mutex when we're not on old MacOS
In C++17 mode this is always available.
2022-08-08 13:31:05 +02:00
Stefan Gränitz 7a66fe1075 Wrap `llvm_unreachable` macro in do-while loop
Macros that expand into multiple terms can cause interesting preprocessor hickups depending
on the context they are used in. https://github.com/llvm/llvm-project/issues/56867 reported
a miscompilation of `llvm_unreachable(msg)` inside a `LLVM_DEBUG({ ... })` block. We were
able to fix it by wrapping the expansion in a `do {} while(false)`.

Differential Revision: https://reviews.llvm.org/D131337
2022-08-08 13:15:32 +02:00
Shubham Narlawar ab4fc87a9d [DAG] Emit table lookup from TargetLowering::expandCTTZ()
This patch emits table lookup in expandCTTZ.

Context -
https://reviews.llvm.org/D113291 transforms set of IR instructions to
cttz intrinsic but there are some targets which does not support CTTZ or
CTLZ. Hence, I generate a table lookup in TargetLowering::expandCTTZ().

Differential Revision: https://reviews.llvm.org/D128911
2022-08-08 12:08:05 +01:00
Benjamin Kramer b4e9977fc1 Remove C++17 #ifdefs around the implicit conversion between StringRef and string_view
This is no longer needed as LLVM is built with C++17 now. Also drop the
explicit conversion to std::string as the implicit conversion to
std::string_view gets picked first anyways.
2022-08-08 12:59:23 +02:00
Simon Tatham 72017e9b16 [llvm-objdump,ARM] Fix big-endian AArch32 disassembly.
The ABI for big-endian AArch32, as specified by AAELF32, is above-
averagely complicated. Relocatable object files are expected to store
instruction encodings in byte order matching the ELF file's endianness
(so, big-endian for a BE ELF file). But executable images can
//either// do that //or// store instructions little-endian regardless
of data and ELF endianness (to support BE32 and BE8 platforms
respectively). They signal the latter by setting the EF_ARM_BE8 flag
in the ELF header.

(In the case of the Thumb instruction set, this all means that each
16-bit halfword of a Thumb instruction is stored in one or other
endianness. The two halfwords of a 32-bit Thumb instruction must
appear in the same order no matter what, because the first halfword is
the one that must avoid overlapping the encoding of any 16-bit Thumb
instruction.)

llvm-objdump was unconditionally expecting Arm instructions to be
stored little-endian. So it would correctly disassemble a BE8 image,
but if you gave it a BE32 image or a BE object file, it would retrieve
every instruction in byte-swapped form and disassemble it to
nonsense. (Even an object file output by LLVM itself, because
ARMMCCodeEmitter outputs instructions big-endian in big-endian mode,
which is correct for writing an object file.)

This patch allows llvm-objdump to correctly disassemble all three of
those classes of Arm ELF file. It does it by introducing a new
SubtargetFeature for big-endian instructions, setting it from the ELF
image type and flags during llvm-objdump setup, and teaching both
ARMDisassembler and llvm-objdump itself to pay attention to it when
retrieving instruction data from a section being disassembled.

Differential Revision: https://reviews.llvm.org/D130902
2022-08-08 10:49:51 +01:00
Anubhab Ghosh 1eee6de873 [Orc][JITLink] Slab based memory allocator to reduce RPC calls
Calling reserve() used to require an RPC call. This commit allows large
ranges of executor address space to be reserved. Subsequent calls to
reserve() will return subranges of already reserved address space while
there is still space available.

Differential Revision: https://reviews.llvm.org/D130392
2022-08-08 15:13:41 +05:30
Nathan James 5512f398a0
[ADT] Update Optional Deprecation with fix-it
When Optional accessors were deprecated, in D131349,  the standard c++14 style attribute was used.
Clang has a slightly better deprecated attribute that enables simpler migration by embedding fix-its.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D131381
2022-08-08 10:39:31 +01:00
Carlos Alberto Enciso a3f7a2c183 [CodeView] Add function to get size in bytes for TypeIndex/CVType.
Given a TypeIndex or CVType return its size in bytes. Basically it
is the inverse to 'CodeViewDebug::lowerTypeBasic', that returns a
TypeIndex based in a size.

Reviewed By: rnk, djtodoro

Differential Revision: https://reviews.llvm.org/D129846
2022-08-08 08:48:23 +01:00
Kazu Hirata e20d210eef [llvm] Qualify auto (NFC)
Identified with readability-qualified-auto.
2022-08-07 23:55:27 -07:00
Jun Zhang 98339ac7af
[Support] move llvm::llvm_is_multithread to header, NFC
This allow optimization without LTO. Also remove some useless else-ifs.
Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D131313
2022-08-08 08:49:02 +08:00
Kazu Hirata b5f8d42efe [ADT] Deprecate Optional::{hasValue,getValue} (NFC)
Differential Revision: https://reviews.llvm.org/D131349
2022-08-07 11:30:58 -07:00
Alexey Bader f0f1bcadc7 [demangler] Add getters for Qual/Vector/Pointer types
These are useful for downstream tool aligning the mangling of data types which differ between different languages/targets.

Patch by Steffen Larsen <steffen.larsen@intel.com>

Differential Revision: https://reviews.llvm.org/D130909
2022-08-07 02:10:05 -07:00
Kazu Hirata ba0407ba86 [llvm] Use range-based for loops (NFC) 2022-08-07 00:16:21 -07:00
Kazu Hirata a2d4501718 [llvm] Fix comment typos (NFC) 2022-08-07 00:16:14 -07:00
Kazu Hirata 3b114087c3 [llvm] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2022-08-07 00:16:11 -07:00
Fangrui Song fa66789d06 [llvm] LLVM_NODISCARD => [[nodiscard]]. NFC
With C++17 there is no Clang pedantic warning.
2022-08-07 00:26:33 +00:00
Fangrui Song bf5550b679 [ADT] Fix signature of StringSet::insert
to match StringMap and unordered_set.
2022-08-06 22:48:41 +00:00
Krzysztof Parzyszek 2bc390bdd6 [RDF] Use default TargetOperandInfo if not given in constructor
All current in-tree users use the default implementation.
2022-08-06 14:32:52 -05:00
Markus Böck f7b73b7e8e [llvm] Remove uses of deprecated `std::iterator`
std::iterator has been deprecated in C++17 and some standard library implementations such as MS STL or libc++ emit deperecation messages when using the class.
Since LLVM has now switched to C++17 these will emit warnings on these implementations, or worse, errors in build configurations using -Werror.

This patch fixes these issues by replacing them with LLVMs own llvm::iterator_facade_base which offers a superset of functionality of std::iterator.

Differential Revision: https://reviews.llvm.org/D131320
2022-08-06 14:07:37 +02:00
Tobias Hieta 4b8db17c32 [llvm][macos] Fix usage of std::shared_mutex on old macOS SDK versions
When setting CMAKE_CXX_STANDARD to 17 and targeting a macOS version
under 10.12 the ifdefs would try to use std::shared_mutex because
the of the C++ standard. This should also check the targeted SDK.

See discussion in: https://reviews.llvm.org/D130689

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D131063
2022-08-05 21:45:23 +02:00
Zhaoshi Zheng 99e50e5838 [WinEH][ARM64] Split Unwind Info for Fucntions Larger than 1MB
Create function segments and emit unwind info of them.

A segment must be less than 1MB and no prolog or epilog is splitted between two
segments.

This patch should generate correct, though not optimal, unwind info for large
functions. Currently it only generate pacted info (.pdata) only for functions
that are less than 1MB (single-segment functions). This is NFC from before this
patch.

The next step is to enable (.pdata) only unwind info for the first segment or
segments that have neither prolog or epilog in a multi-segment function.

Another future work item is to further split segments that require more than 255
code words or have more than 65535 epilogs.

Reference:
https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling#function-fragments

Differential Revision: https://reviews.llvm.org/D130049
2022-08-05 11:46:41 -07:00
Dawid Jurczak 1bd31a6898 [NFC] Add SmallVector constructor to allow creation of SmallVector<T> from ArrayRef of items convertible to type T
Extracted from https://reviews.llvm.org/D129781 and address comment:
https://reviews.llvm.org/D129781#3655571

Differential Revision: https://reviews.llvm.org/D130268
2022-08-05 13:35:41 +02:00
Paul Kirth a812b39e8c [llvm][ir] Add missing license to ProfDataUtils
We failed to add these in D128860 or D128858

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D131226
2022-08-05 03:39:13 +00:00
Fangrui Song 7d6017fd31 [TTI] Change new getVectorInstrCost overload to use const reference after D131114
A const reference is preferred over a non-null const pointer.
`Type *` is kept as is to match the other overload.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D131197
2022-08-04 15:16:51 -07:00
Mingming Liu bc8f2f3649 [AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation.
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
   SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.

Differential Revision: https://reviews.llvm.org/D131114
2022-08-04 12:58:25 -07:00
Marc Auberer 9dbe839627 [Docs] Fix missing docs strings for CallingConv.h
Replaces
```
//
```
with
```
///
```
for some code lines to make it visible in the auto-generated documentation.

Reviewed By: dblaikie, MaskRay

Differential Revision: https://reviews.llvm.org/D131152
2022-08-04 19:14:53 +00:00
Daniel Thornburgh 22df238d4a [Symbolizer] Implement data symbolizer markup element.
This connects the Symbolizer to the markup filter and enables the first
working end-to-end flow using the filter.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D130187
2022-08-04 10:20:29 -07:00
Ellis Hoag 12e78ff881 [InstrProf] Add the skipprofile attribute
As discussed in [0], this diff adds the `skipprofile` attribute to
prevent the function from being profiled while allowing profiled
functions to be inlined into it. The `noprofile` attribute remains
unchanged.

The `noprofile` attribute is used for functions where it is
dangerous to add instrumentation to while the `skipprofile` attribute is
used to reduce code size or performance overhead.

[0] https://discourse.llvm.org/t/why-does-the-noprofile-attribute-restrict-inlining/64108

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D130807
2022-08-04 08:45:27 -07:00
Joshua Cranmer 2138c90645 [IR] Move support for dxil::TypedPointerType to LLVM core IR.
This allows the construct to be shared between different backends. However, it
still remains illegal to use TypedPointerType in LLVM IR--the type is intended
to remain an auxiliary type, not a real LLVM type. So no support is provided for
LLVM-C, nor bitcode, nor LLVM assembly (besides the bare minimum needed to make
Type->dump() work properly).

Reviewed By: beanz, nikic, aeubanks

Differential Revision: https://reviews.llvm.org/D130592
2022-08-04 10:41:11 -04:00
Lang Hames b5f76d83ff [ORC] Ensure that llvm_orc_registerJITLoaderGDBAllocAction is linked into tools.
Add a reference to llvm_orc_registerJITLoaderGDBAllocAction from the
linkComponents function in the lli, llvm-jitlink, and llvm-jitlink-executor
tools. This ensures that llvm_orc_registerJITLoaderGDBAllocAction is not
dead-stripped in optimized builds, which may cause failures in these tools.

The llvm_orc_registerJITLoaderGDBAllocAction function was originally added with
MachO debugging support in 69be352a19, but that patch failed to update the
linkComponents functions.

http://llvm.org/PR56817
2022-08-03 17:51:45 -07:00
Arthur Eubanks 203296d642 [BoundsChecking] Fix merging of sizes
BoundsChecking uses ObjectSizeOffsetEvaluator to keep track of the
underlying size/offset of pointers in allocations.  However,
ObjectSizeOffsetVisitor (something ObjectSizeOffsetEvaluator
uses to check for constant sizes/offsets)
doesn't quite treat sizes and offsets the same way as
BoundsChecking.  BoundsChecking wants to know the size of the
underlying allocation and the current pointer's offset within
it, but ObjectSizeOffsetVisitor only cares about the size
from the pointer to the end of the underlying allocation.

This only comes up when merging two size/offset pairs. Add a new mode to
ObjectSizeOffsetVisitor which cares about the underlying size/offset
rather than the size from the current pointer to the end of the
allocation.

Fixes a false positive with -fsanitize=bounds.

Reviewed By: vitalybuka, asbirlea

Differential Revision: https://reviews.llvm.org/D131001
2022-08-03 17:21:19 -07:00
Vitaly Buka a2aa6809a8 [NFC][Inliner] Add cl::opt<int> to tune InstrCost
The plan is tune this for sanitizers.

Differential Revision: https://reviews.llvm.org/D131123
2022-08-03 17:14:10 -07:00
Congzhe Cao 76be554931 [DependenceAnalysis][PR56275] Normalize negative dependence analysis results
This patch is the first of the two-patch series (D130188, D130179) that
resolve PR56275 (https://github.com/llvm/llvm-project/issues/56275)
which is a missed opportunity, where a perfrectly valid case for loop
interchange failed interchange legality.

If the distance/direction vector produced by dependence analysis (DA) is
negative, it needs to be normalized (reversed). This patch provides helper
functions `isDirectionNegative()` and `normalize()` in DA that does the
normalization, and clients can query DA to do normalization if needed.

A pass option `<normalized-results>` is added to DependenceAnalysisPrinterPass,
and we leverage it to update DA test cases to make sure of test coverage. The
test cases added in `Banerjee.ll` shows that negative vectors are normalized
with `print<da><normalized-results>`.

Reviewed By: bmahjour, Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D130188
2022-08-03 19:59:00 -04:00
Bill Wendling 239c831de4 Add switch to use "source_filename" instead of a hash ID for globally promoted local
During LTO a local promoted to a global gets a unique suffix based on
a hash of the module IR. This means that changes in the local's module
can affect the contents in another module that imported it (because the name
of the imported promoted local is changed, but that doesn't reflect a
real change in the importing module). So any tool that's
validating changes to the importing module will see a superficial change.

Instead of using the module hash, we can use the "source_filename" if it
exists to generate a unique identifier that doesn't change due to LTO
shenanigans.

Differential Revision: https://reviews.llvm.org/D128863
2022-08-03 16:41:56 -07:00
Mircea Trofin 0cb9746a7d [nfc][mlgo] Separate logger and training-mode model evaluator
This just shuffles implementations and declarations around. Now the
logger and the TF C API-based model evaluator are separate.

Differential Revision: https://reviews.llvm.org/D131116
2022-08-03 16:20:28 -07:00
Vitaly Buka e056e74dda [NFC][inline] Add const to an argument 2022-08-03 13:20:47 -07:00
Fraser Cormack 646e2f4803 [VP] Rename VP int<->float conversion ISD opcodes
These should be named like the non-VP versions for consistency.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130967
2022-08-03 10:04:38 +01:00
Jannik Silvanus d0bfebda5b [Docs] Improve cycle and closed path definitions
Improve the cycle definition, by avoiding usage of not yet defined
or only vaguely defined terminology inside definitions.
More precisely, the existing definition defined "outermost cycles",
and then proceeded to use the term "cycles" for further definitions,
which in turn were used to actually define "cycles".

Now, instead only define "cycles". This does not change the meaning
of a cycle, which depends on the chosen surrounding (subgraph) of a CFG.

Also mention the function CFG in the first definition, because later
later definitions require it anyways.

Also slightly improve the definition of a closed path, by explicitly
requiring the inner nodes to be distinct.

Differential Revision: https://reviews.llvm.org/D130891
2022-08-03 10:28:13 +02:00
Nikita Popov b128e057c1 [AA] Make ModRefInfo a bitmask enum (NFC)
Mark ModRefInfo as a bitmask enum, which allows using normal
& and | operators on it. This supersedes various functions like
unionModRef() and intersectModRef(). I think this makes the code
cleaner than going through helper functions...

Differential Revision: https://reviews.llvm.org/D130870
2022-08-03 10:05:55 +02:00
Max Kazantsev 34ae308c73 [SCEV] Use context to strengthen flags of BinOps
Sometimes SCEV cannot infer nuw/nsw from something as simple as
```
  len in [0, MAX_INT]
...
  iv = phi(0, iv.next)
  guard(iv <s len)
  guard(iv <u len)
  iv.next = iv + 1
```
just because flag strenthening only relies on definition and does not use local facts.
This patch adds support for the simplest case: inference of flags of `add(x, constant)`
if we can contextually prove that `x <= max_int - constant`.

In case if it has negative CT impact, we can add an option to switch it off. I woudln't
expect that though.

Differential Revision: https://reviews.llvm.org/D129643
Reviewed By: apilipenko
2022-08-03 14:08:57 +07:00
Paul Kirth d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Chris Bieneman ee4d815008 [DX] Remove IntrNoMem from create handle intrinsic
The create handle intrinsic calls can't be removed, so it was incorrect
to mark them as IntrNoMem.
2022-08-02 16:57:22 -05:00
Nicolai Hähnle f7872cdce1 CommandLine: add and use cl::SubCommand::get{All,TopLevel}
Prefer using these accessors to access the special sub-commands
corresponding to the top-level (no subcommand) and all sub-commands.

This is a preparatory step towards removing the use of ManagedStatic:
with a subsequent change, these global instances will be moved to
be regular function-scope statics.

It is split up to give downstream projects a (albeit short) window in
which they can switch to using the accessors in a forward-compatible
way.

Differential Revision: https://reviews.llvm.org/D129118
2022-08-02 23:49:16 +02:00
Mircea Trofin 4146c1756d [nfc] Remove unused parameter in TailDuplicator::duplicateSimpleBB
Differential Revision: https://reviews.llvm.org/D131008
2022-08-02 13:39:34 -07:00
Vladislav Dzhidzhoev f6d9f00031 [DebugInfo] Test commit: update irrelevant comments
Differential Revision: https://reviews.llvm.org/D130998
2022-08-02 20:21:24 +03:00
Jay Foad bb2832410e [IRBuilder] CreateIntrinsic with implicit mangling
Add a new IRBuilderBase::CreateIntrinsic which takes the return type and
argument values for the intrinsic call but does not take an explicit
list of types to mangle. Instead the builder works this out from the
intrinsic declaration and the types of the supplied arguments.

This means that the mangling is hidden from the client, which in turn
means that intrinsic definitions can change which arguments are mangled
without requiring any changes to the client code.

Differential Revision: https://reviews.llvm.org/D130776
2022-08-02 13:08:35 +01:00
Dawid Jurczak 78650b7861 [NFC] Remove some boilerplate from SmallVector header
In SmallVector header we use couple of times exactly same enable_if,
in this change we de-duplicate it and declare only once.

Extracted from: https://reviews.llvm.org/D129990

Differential Revision: https://reviews.llvm.org/D130779
2022-08-02 11:45:55 +02:00
David Sherwood 4ef9cb6c17 [AArch64][LoopVectorize] Disable tail-folding for SVE when loop has interleaved accesses
If we have interleave groups in the loop we want to vectorise then
we should fall back on normal vectorisation with a scalar epilogue. In
such cases when tail-folding is enabled we'll almost certainly go on to
create vplans with very high costs for all vector VFs and fall back on
VF=1 anyway. This is likely to be worse than if we'd just used an
unpredicated vector loop in the first place.

Once the vectoriser has proper support for analysing all the costs
for each combination of VF and vectorisation style, then we should
be able to remove this.

Added an extra test here:

  Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

Differential Revision: https://reviews.llvm.org/D128342
2022-08-02 09:52:33 +01:00
jacquesguan e38af7ba95 [LV] Refactor getExtendedAddReductionCost to support other extended reduction more than Add.
Now the API getExtendedAddReductionCost is used to determine the cost of extended Add reduction with optional Mul. For Arm, it could cover the cases. But for other target, for example: RISCV, they support other kinds of extended recution, such as FAdd.

This patch does the following changes:
1, Split getExtendedAddReductionCost into 2 new API: getExtendedReductionCost which handles the extended reduction with addtional input of Opcode; getMulAccReductionCost which handle the MLA cases the getExtendedAddReductionCost.
2, Refactor getReductionPatternCost, add some contraint condition to make sure the getMulAccReductionCost should only handle the reuction of Add + Mul.

Differential Revision: https://reviews.llvm.org/D130868
2022-08-02 16:02:38 +08:00
Fangrui Song 2b70bebc6d [MachineFunctionPass] Support -print-changed={,c}diff{,-quiet}
Follow-up to D130434.
Move doSystemDiff to PrintPasses.cpp and call it in MachineFunctionPass.cpp.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D130833
2022-08-01 12:56:15 -07:00
Simon Pilgrim 630a65f3da SelectionDAGNodes.h - fix Wdocumentation warnings. NFC. 2022-08-01 15:09:16 +01:00
Simon Pilgrim 27105e2f30 MisExpect.h - fix Wdocumentation warnings. NFC. 2022-08-01 15:06:30 +01:00
Anubhab Ghosh ac3cb4ecd0 [Orc] Disable use of shared memory on Android
shm_open and shm_unlink are not available on Android. This commit
disables SharedMemoryMapper on Android until a better solution is
available.

https://android.googlesource.com/platform/bionic/+/refs/heads/master/docs/status.md
https://github.com/llvm/llvm-project/issues/56812

Differential Revision: https://reviews.llvm.org/D130814
2022-08-01 18:48:39 +05:30
Lucas Prates ba9caf9170 [Arm] Fix parsing and emission of Tag_also_compatible_with eabi attribute
According to the ABI for the Arm Architecture, the value for the
Tag_also_compatible_with eabi attribute is represented by an NTBS entry.
This string value, in turn, is composed of a pair of tag+value encoded
in one of two formats:
- ULEB128: tag, ULEB128: value, 0.
- ULEB128: tag, NBTS: data.
(See [[ 60a8eb8c55/addenda32/addenda32.rst (3373secondary-compatibility-tag) | section 3.3.7.3 on the Addenda to, and Errata in, the ABI for the Arm Architecture ]].)

Currently the Arm assembly parser and streamer ignore the encoding of
the attribute's NTBS value, which can result in incorrect attributes
being emitted in both assembly and object file outputs.

This patch fixes these issues by properly handing the value's encoding.
An update to llvm-readobj to properly handle the attribute's value will be
covered by a separate patch.

Patch by Victor Campos and Lucas Prates.

Reviewed By: vhscampos

Differential Revision: https://reviews.llvm.org/D129500
2022-08-01 13:28:01 +01:00
Dominik Adamski d90b7bf2c5 Add support for lowering simd if clause to LLVM IR
Scope of changes:
  1) Added new function to generate loop versioning
  2) Added support for if clause to applySimd function
  2) Added tests which confirm that lowering is successful

If ifCond is specified, then collapsed loop is duplicated and if branch
is added. Duplicated loop is executed if simd ifCond is evaluated to false.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D129368

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
2022-08-01 04:43:32 -05:00
David Sherwood 41119a0f52 [DAGCombiner] Extend visitAND to include EXTRACT_SUBVECTOR
Eliminate an AND by redefining an anyext|sext|zext.

     (and (extract_subvector (anyext|sext|zext v) _) iN_mask)
  => (extract_subvector (zeroext_iN v))

Differential Revision: https://reviews.llvm.org/D130782
2022-08-01 10:32:32 +01:00
Vladislav Dzhidzhoev facb3ac385 [GlobalISel][DebugInfo] salvageDebugInfo analogue for gMIR
Salvage debug info of instruction that is about to be deleted as dead in
Combiner pass. Currently supported instructions are COPY and G_TRUNC.

It allows to salvage debug info of some dead arguments of functions, by putting
DWARF expression corresponding to the instruction being deleted into related
DBG_VALUE instruction.

Here is an example of missing variables location https://godbolt.org/z/K48osb9dK.
We see that arguments x, y of function foo are not available in debugger, and
corresponding DBG_VALUE instructions have undefined register operand instead of
variables locaton after Aarch64PreLegalizerCombiner pass. The reason is that
registers where variables are located are removed as dead (with instruction
G_TRUNC). We can use salvageDebugInfo analogue for gMIR to preserve debug
locations of dead variables.

Statistics of llvm object files built with vs without this commit on -O2
optimization level (CMAKE_BUILD_TYPE=RelWithDebInfo, -fglobal-isel) on Aarch64 (macOS):

Number of variables with 100% of parent scope covered by DW_AT_location has been increased by 7,9%.
Number of variables with 0% coverage of parent scope has been decreased by 1,2%.
Number of variables processed by location statistics has been increased by 2,9%.
Average PC ranges coverage has been increased by 1,8 percentage points.

Coverage can be improved by supporting more instructions, or by calling
salvageDebugInfo for instructions that are deleted during Combiner rules exection.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D129909
2022-08-01 11:14:53 +02:00
Nikita Popov 5b1d10bda6 [AA] Drop setModAndRef() function (NFC)
Without the "must" state, this function is pointless, because we
can just directly create a ModRef instead.
2022-08-01 07:55:39 +02:00
Nikita Popov f96ea53e89 [AA] Do not track Must in ModRefInfo
getModRefInfo() queries currently track whether the result is a
MustAlias on a best-effort basis. The only user of this functionality
is the optimized memory access type in MemorySSA -- which in turn
has no users. Given that this functionality has not found a user
since it was introduced five years ago (in D38862), I think we
should drop it again.

The context is that I'm working to separate FunctionModRefBehavior
to track mod/ref for different location kinds (like argmem or
inaccessiblemem) separately, and the fact that ModRefInfo also has
an unrelated Must flag makes this quite awkward, especially as this
means that NoModRef is not a zero value. If we want to retain the
functionality, I would probably split getModRefInfo() results into
a part that just contains the ModRef information, and a separate
part containing a (best-effort) AliasResult.

Differential Revision: https://reviews.llvm.org/D130713
2022-08-01 07:14:31 +02:00
Chuanqi Xu 9701053517 Introduce @llvm.threadlocal.address intrinsic to access TLS variable
This belongs to a series of patches which try to solve the thread
identification problem in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for a full background.

The problem consists of two concrete problems: TLS variable and readnone
functions. This patch tries to convert the TLS problem to readnone
problem by converting the access of TLS variable to an intrinsic which
is marked as readnone.

The readnone problem would be addressed in following patches.

Reviewed By: nikic, jyknight, nhaehnle, ychen

Differential Revision: https://reviews.llvm.org/D125291
2022-08-01 10:51:30 +08:00
Luís Marques 260a641068 [RISCV] Pre-RA expand pseudos pass
Expand load address pseudo-instructions earlier (pre-ra) to allow follow-up
patches to fold the addi of PseudoLLA instructions into the immediate
operand of load/store instructions.

Differential Revision: https://reviews.llvm.org/D123264
2022-07-31 23:19:00 +02:00
Dawid Jurczak 50eb5bcfcd [NFC] Remove redundant CalculateSmallVectorDefaultInlinedElements usage from to_vector utility
CalculateSmallVectorDefaultInlinedElements<..>::value is already used as default value for second template parameter in SmallVector class declaration.
There is no need to pass it explicitly in to_vector.

Extracted from: https://reviews.llvm.org/D129781

Differential Revision: https://reviews.llvm.org/D130774
2022-07-31 10:51:01 +02:00
Sunho Kim e781451140 [JITLink] Relax zero-fill edge assertions.
Relax zero-fill edge assertions to only consider relocation edges. Keep-alive edges to zero-fill blocks can cause this assertion which is too strict.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D130450
2022-07-31 08:34:10 +09:00
Kazu Hirata 5dd78c3608 [IR] Fix a header guard (NFC)
Identified with llvm-header-guard.
2022-07-30 10:35:45 -07:00
Simon Pilgrim 2f08872d81 OMPIRBuilder.h - fix Wdocumentation warning. NFC. 2022-07-30 15:43:41 +01:00
Simon Pilgrim caa971f216 SelectionDAGNodes.h - fix Wdocumentation warnings. NFC. 2022-07-30 11:05:33 +01:00
Dawid Jurczak 65053fbc0d [NFC] Use more appropriate SmallVectorImpl::append call in std::initializer_list SmallVector constructor
Since we are in constructor there is no need to perform redundant call to SmallVectorImpl::clear() inside assign function.
Although calling cheaper append function instead assign doesn't make any difference on optimized builds
(DSE does the job removing stores), we still save some cycles for debug binaries.

Differential Revision: https://reviews.llvm.org/D130361
2022-07-30 09:17:35 +02:00
Fangrui Song ce6dd4e835 Revert D130458 "[llvm-objcopy] Support --{,de}compress-debug-sections for zstd"
This reverts commit c26dc2904b.

The new Zstd dispatch has an ongoing design discussion related to https://reviews.llvm.org/D130516#3688123 .
Revert for now before it is resolved.
2022-07-29 15:46:51 -07:00
Jay Foad 9436a85eb6 [IRBuilder] Make createCallHelper a member function. NFC.
This just avoids explicitly passing in 'this' as an argument in a bunch
of places.

Differential Revision: https://reviews.llvm.org/D130752
2022-07-29 21:17:26 +01:00
Alex Bradbury 85c6fab8d3 [RISCV][doc] Improve documentation comments on atomics intrinsics
Previously, it was necessary to check the atomics lowering or expansion
code to determine which argument was which.

This patch additionally tweaks the documentation comment in
TargetLowering to clarify the return value of the intrinsic and that the
intrinsic isn't required to mask and shift the result (this is handled
by the target-independent code in AtomicExpandPass).
2022-07-29 15:09:12 +01:00
Alexey Lapshin ece341f598 [Debuginfo][DWARF][NFC] Add paired methods working with DWARFDebugInfoEntry.
This review is extracted from D96035.

DWARF Debuginfo classes have two representations for DIEs: DWARFDebugInfoEntry
(short) and DWARFDie(extended). Depending on the task, it might be more convenient
to use DWARFDebugInfoEntry or/and DWARFDie. DWARFUnit class already has methods
working with DWARFDie and DWARFDebugInfoEntry. This patch adds more
methods working with DWARFDebugInfoEntry to have paired functionality.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D126059
2022-07-29 16:40:17 +03:00
Simon Pilgrim 63bdff3eb8 VirtualFileSystem.h - don't use \param in general description - use \p instead to fix Wdocumentation warnings. 2022-07-29 12:21:44 +01:00
Simon Pilgrim 9f68bb1da5 Fix unknown parameter Wdocumentation warning. NFC. 2022-07-29 12:17:30 +01:00
Simon Pilgrim 9082c13106 [Support] Add KnownBits::concat method
Add a method for the various cases where we need to concatenate 2 KnownBits together (BUILD_PAIR and SHIFT_PARTS in particular) - uses the existing APInt::concat 'HiBits.concat(LoBits)' convention

Differential Revision: https://reviews.llvm.org/D130557
2022-07-29 11:06:39 +01:00
Florian Hahn 214e2d8fe5
[SCEV] Avoid repeated proveNoSignedWrapViaInduction calls.
At the moment, proveNoSignedWrapViaInduction may be called for the
same AddRec a large number of times via getSignExtendExpr. This can have
a severe compile-time impact for very loop-heavy code.

If proveNoSignedWrapViaInduction failed to prove NSW the first time,
it is unlikely to succeed on subsequent tries and the cost doesn't seem
to be justified.

This is the signed version of 8daa338297 / D130648.

This can drastically improve compile-time in some excessive cases and
also has a slightly positive compile-time impact on CTMark:

NewPM-O3: -0.06%
NewPM-ReleaseThinLTO: -0.04%
NewPM-ReleaseLTO-g: -0.04%

https://llvm-compile-time-tracker.com/compare.php?from=8daa338297d533db4d1ae8d3770613eb25c29688&to=aed126a196e7a5a9803543d9b4d6bdb233d0009c&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D130694
2022-07-29 09:15:03 +01:00
Sunho Kim 6b27890b2c [ORC][COFF] Handle COFF import files of static archive.
Handles COFF import files of static archive. Changes static library genrator to build up object file map keyed by symbol name that excludes the symbols from dllimported symbols so that static generator will not be responsible for them. It exposes the list of dynamic libraries that need to be imported. Client should properly load the libraries in this list beforehand. Object file map is also an improvment from the past in terms of performance. Archive.findSym does a slow O(n) linear serach of symbol list to find the symbol. (we call findSym O(n) times, thus full time complexity is O(n^2); we were the only user of findSym function in fact)

There is a room for improvements in how to load the libraries in the list. We currently just hand the responsibility over to the clinet. A better way would be let ORC read this list and hand them over to JITLink side that would also help validation (e.g. not trying to generate stub for non dllimported targets) Nevertheless, we will have to exclude the symbols from COFF import object file list and need a way to access this list, which this patch offers.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D129952
2022-07-29 16:25:08 +09:00
Alexey Lapshin e74197bc12 [Reland][Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections.
Current DWARFLinker implementation does not support some debug sections
(mainly DWARF v5 sections). This patch adds diagnostic for such sections.
The warning would be displayed for critical(such that could not be removed)
sections and the source file would be skipped. Other unsupported sections
would be removed and warning message should be displayed. The zero exit
status would be returned for both cases.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D123623
2022-07-28 21:29:16 +03:00
Fangrui Song c26dc2904b [llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Differential Revision: https://reviews.llvm.org/D130458
2022-07-28 10:45:53 -07:00
Austin Kerbow f5b21680d1 [AMDGPU] Add amdgcn_sched_group_barrier builtin
This builtin allows the creation of custom scheduling pipelines on a per-region
basis. Like the sched_barrier builtin this is intended to be used either for
testing, in situations where the default scheduler heuristics cannot be
improved, or in critical kernels where users are trying to get performance that
is close to handwritten assembly. Obviously using these builtins will require
extra work from the kernel writer to maintain the desired behavior.

The builtin can be used to create groups of instructions called "scheduling
groups" where ordering between the groups is enforced by the scheduler.
__builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter
is a mask that determines the types of instructions that you would like to
synchronize around and add to a scheduling group. These instructions will be
selected from the bottom up starting from the sched_group_barrier's location
during instruction scheduling. The second parameter is the number of matching
instructions that will be associated with this sched_group_barrier. The third
parameter is an identifier which is used to describe what other
sched_group_barriers should be synchronized with. Note that multiple
sched_group_barriers must be added in order for them to be useful since they
only synchronize with other sched_group_barriers. Only "scheduling groups" with
a matching third parameter will have any enforced ordering between them.

As an example, the code below tries to create a pipeline of 1 VMEM_READ
instruction followed by 1 VALU instruction followed by 5 MFMA instructions...
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 1 VALU
__builtin_amdgcn_sched_group_barrier(2, 1, 0)
// 5 MFMA
__builtin_amdgcn_sched_group_barrier(8, 5, 0)
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 3 VALU
__builtin_amdgcn_sched_group_barrier(2, 3, 0)
// 2 VMEM_WRITE
__builtin_amdgcn_sched_group_barrier(64, 2, 0)

Reviewed By: jrbyrnes

Differential Revision: https://reviews.llvm.org/D128158
2022-07-28 10:43:14 -07:00
Simon Pilgrim 8c99cef1e7 [DAG] Remove SelectionDAG::GetDemandedBits and use SimplifyMultipleUseDemandedBits directly.
GetDemandedBits is mainly a wrapper around SimplifyMultipleUseDemandedBits now, and is only used by DAGCombiner::visitSTORE so I've moved all remaining functionality there.

visitSTORE was making use of this to 'simplify' constants for a trunc-store. Just removing this code left to a mixture of regressions and gains - it came down to whether a target preferred a sign or zero extended constant for materialization/truncation. I've just moved the code over for now, but a next step would be to move this to targetShrinkDemandedConstant, but some targets that override the method expect a basic binop, and might react badly to a store node.....
2022-07-28 17:03:44 +01:00
Liqiang Tao d52e775b05 [llvm][ModuleInliner] Add inline cost priority for module inliner
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D130012
2022-07-28 22:44:03 +08:00
Liqiang Tao c113594378 Revert "[llvm][ModuleInliner] Add inline cost priority for module inliner"
This reverts commit bb7f62bbbd.
2022-07-28 22:36:28 +08:00
Chris Bieneman fe13002bb3 [HLSL] Add __builtin_hlsl_create_handle
This is pretty straightforward, it just adds a builtin to return a
pointer to a resource handle. This maps to a dx intrinsic.

The shape of this builtin and the underlying intrinsic will likely
shift a bit as this implementation becomes more feature complete, but
this is a good basis to get started.

Depends on D128569.

Differential Revision: https://reviews.llvm.org/D130016
2022-07-28 09:16:11 -05:00
Liqiang Tao bb7f62bbbd [llvm][ModuleInliner] Add inline cost priority for module inliner
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D130012
2022-07-28 21:28:07 +08:00
Florian Hahn 8daa338297
[SCEV] Avoid repeated proveNoUnsignedWrapViaInduction calls.
At the moment, proveNoUnsignedWrapViaInduction may be called for the
same AddRec a large number of times via getZeroExtendExpr. This can have
a severe compile-time impact for very loop-heavy code. One one
particular workload, LSR takes ~51s without this patch, almost
exlusively in proveNoUnsignedWrapViaInduction. With this patch, the time
in LSR drops to ~0.4s.

If proveNoUnsignedWrapViaInduction failed to prove NUW the first time,
it is unlikely to succeed on subsequent tries and the cost doesn't seem
to be justified.

Besides drastically improving compile-time in some excessive cases, this
also has a slightly positive compile-time impact on CTMark:

NewPM-O3: -0.07%
NewPM-ReleaseThinLTO: -0.08%
NewPM-ReleaseLTO-g: -0.06

https://llvm-compile-time-tracker.com/compare.php?from=b435da027d7774c24cdb8c88d09f6b771e07fb14&to=f2729e33e8284b502f6c35a43345272252f35d12&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D130648
2022-07-28 10:02:19 +01:00
Paul Kirth 6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a7881.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth 300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
Paul Kirth 6047deb7c2 [llvm] Provide utility function for MD_prof
Currently, there is significant code duplication for dealing with
MD_prof metadata throughout the compiler. These utility functions can
improve code reuse and simplify boilerplate code when dealing with
profiling metadata, such as branch weights. The inent is to provide a
uniform set of APIs that allow common tasks, such as identifying
specific types of MD_prof metadata and extracting branch weights.

Future patches can build on this initial implementation and clean up the
different implementations across the compiler.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D128858
2022-07-27 21:13:51 +00:00
Jonas Devlieghere a8c3d9815e
[DebugInfo] Teach LLVM and LLDB about ptrauth in DWARF
Teach libDebugInfo (llvm-dwarfdump) and lldb about DWARF tags and
attributes for pointer authentication. These values have been emitted by
Apple clang for several releases. Although upstream LLVM doesn't emit
these values yet, we hope to upstream that part sometime soon.

Differential revision: https://reviews.llvm.org/D130215
2022-07-27 11:48:35 -07:00
Amara Emerson 19cdd1908b [AArch64][GlobalISel] Add heuristics for localizing G_CONSTANT.
This adds similar heuristics to G_GLOBAL_VALUE, querying the cost of
materializing a specific constant in code size. Doing so prevents us from
sinking constants which require multiple instructions to generate into
use blocks.

Code size savings on CTMark -Os:
Program                                       size.__text
                                              before         after           diff
ClamAV/clamscan                               381940.00      382052.00       0.0%
lencod/lencod                                 428408.00      428428.00       0.0%
SPASS/SPASS                                   411868.00      411876.00       0.0%
kimwitu++/kc                                  449944.00      449944.00       0.0%
Bullet/bullet                                 463588.00      463556.00      -0.0%
sqlite3/sqlite3                               284696.00      284668.00      -0.0%
consumer-typeset/consumer-typeset             414492.00      414424.00      -0.0%
7zip/7zip-benchmark                           595244.00      594972.00      -0.0%
mafft/pairlocalalign                          247512.00      247368.00      -0.1%
tramp3d-v4/tramp3d-v4                         372884.00      372044.00      -0.2%
                           Geomean difference                               -0.0%

Differential Revision: https://reviews.llvm.org/D130554
2022-07-27 10:51:16 -07:00
Stanislav Mekhanoshin 0562cf442f Allow data prefetch into non-default address space
I am playing with the LoopDataPrefetch pass and found out that it
bails to work with a pointer in a non-zero address space. This
patch adds the target callback to check if an address space is to
be considered for prefetching. Default implementation still only
allows address space 0, so this is NFCI.

This does not currently affect any known targets, but seems to be
generally useful for the future.

Differential Revision: https://reviews.llvm.org/D129795
2022-07-27 10:01:26 -07:00
Rainer Orth 979ddfff37 [Support] Handle SPARC in sys::getHostCPUName
While working on D118450 <https://reviews.llvm.org/D118450>, I noticed that
`sys::getHostCPUName` lacks SPARC support.

This patch implements it.  The code is taken from/inspired by GCC's
`gcc/config/sparc/driver-sparc.cc`.  There's one caveat: since LLVM, unlike
GCC, doesn't support the SPARC-M7, -S7, and -M8 CPUs, I map all those to
the latest supported one (UltraSparc T4/`niagara4`).

Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu` by
running `savcov --version` on

- Netra SPARC S7-2 (SPARC-S7, Solaris 11.4)
- SPARC T5-2 (SPARC T5, Solaris 11.4)
- SPARC Enterprise T5220 (UltraSPARC T2, Solaris 11.3)
- SPARC T5 (UltraSPARC T5, Debian sid)
- SPARC T3 (UltraSPARC T3, Debian sid)
- SPARC Enterprise T5220 (Debian sid)

Differential Revision: https://reviews.llvm.org/D130272
2022-07-27 12:21:03 +02:00
Alexey Lapshin 79ff02a122 Revert "[Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections."
This reverts commit 0d191b7553.
2022-07-27 11:48:56 +03:00
Martin Sebor 4447603616 [InstCombine] Fold strtoul and strtoull and avoid PR #56293
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D129224
2022-07-26 14:11:40 -06:00
Francis Visoiu Mistrih 2c6e8b4636 [Matrix] Refactor tiled loops in a struct. NFC
The three loops have the same structure: index, header, latch.
2022-07-26 11:02:22 -07:00
Jessica Paquette 39d431d811 [GlobalISel] Import patterns for G_FMAXIMUM + G_FMINIMUM
Allows us to select scalar instructions on AArch64.

Differential Revision: https://reviews.llvm.org/D115381
2022-07-26 10:58:44 -07:00
Fangrui Song f106525de2 [MachineFunctionPass] Support -print-changed and -print-changed=quiet
-print-changed for new pass manager is handy beside -print-after-all.
Port it to MachineFunctionPass.

Note: lib/Passes/StandardInstrumentations.cpp implements a number of
misc features. If we want to use them for codegen, we may need to lift
some functionality to LLVMIR.

Reviewed By: aeubanks, jamieschmeiser

Differential Revision: https://reviews.llvm.org/D130434
2022-07-26 10:16:49 -07:00
Stefan Gränitz 1e30820483 [WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms
WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This
affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222

This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs.
* Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand.
* PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers.
* LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites.

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D128190
2022-07-26 17:52:43 +02:00
Paul Walker e5c892dd85 [SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
This patch starts small, only detecting sequences of the form
<a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes.

Differential Revision: https://reviews.llvm.org/D125194
2022-07-26 15:28:37 +00:00
Arthur Eubanks 2eade1dba4 [WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes
Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`.

Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`.

To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D128955
2022-07-26 08:01:08 -07:00
Alexey Lapshin 0d191b7553 [Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections.
Current DWARFLinker implementation does not support some debug sections
(mainly DWARF v5 sections). This patch adds diagnostic for such sections.
The warning would be displayed for critical(such that could not be removed)
sections and the source file would be skipped. Other unsupported sections
would be removed and warning message should be displayed. The zero exit
status would be returned for both cases.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D123623
2022-07-26 15:25:51 +03:00
Simon Tatham 55f1fbf005 [MC,llvm-objdump,ARM] Target-dependent disassembly resync policy.
Currently, when llvm-objdump is disassembling a code section and
encounters a point where no instruction can be decoded, it uses the
same policy on all targets: consume one byte of the section, emit it
as "<unknown>", and try disassembling from the next byte position.

On an architecture where instructions are always 4 bytes long and
4-byte aligned, this makes no sense at all. If a 4-byte word cannot be
decoded as an instruction, then the next place that a valid
instruction could //possibly// be found is 4 bytes further on.
Disassembling from a misaligned address can't possibly produce
anything that the code generator intended, or that the CPU would even
attempt to execute.

This patch introduces a new MCDisassembler virtual method called
`suggestBytesToSkip`, which allows each target to choose its own
resynchronization policy. For Arm (as opposed to Thumb) and AArch64,
I've filled in the new method to return a fixed width of 4.

Thumb is a more interesting case, because the criterion for
identifying 2-byte and 4-byte instruction encodings is very simple,
and doesn't require the particular instruction to be recognized. So
`suggestBytesToSkip` is also passed an ArrayRef of the bytes in
question, so that it can take that into account. The new test case
shows Thumb disassembly skipping over two unrecognized instructions,
and identifying one as 2-byte and one as 4-byte.

For targets other than Arm and AArch64, this is NFC: the base class
implementation of `suggestBytesToSkip` still returns 1, so that the
existing behavior is unchanged. Other targets can fill in their own
implementations as they see fit; I haven't attempted to choose a new
behavior for each one myself.

I've updated all the call sites of `MCDisassembler::getInstruction` in
llvm-objdump, and also one in sancov, which was the only other place I
spotted the same idiom of `if (Size == 0) Size = 1` after a call to
`getInstruction`.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D130357
2022-07-26 09:35:30 +01:00
David Spickett 2f9fa9ef53 [lldb][AArch64] Add support for memory tags in core files
This teaches ProcessElfCore to recognise the MTE tag segments.

https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support

These segments contain all the tags for a matching memory segment
which will have the same size in virtual address terms. In real terms
it's 2 tags per byte so the data in the segment is much smaller.

Since MTE is the only tag type supported I have hardcoded some
things to those values. We could and should support more formats
as they appear but doing so now would leave code untested until that
happens.

A few things to note:
* /proc/pid/smaps is not in the core file, only the details you have
  in "maps". Meaning we mark a region tagged only if it has a tag segment.
* A core file supports memory tagging if it has at least 1 memory
  tag segment, there is no other flag we can check to tell if memory
  tagging was enabled. (unlike a live process that can support memory
  tagging even if there are currently no tagged memory regions)

Tests have been added at the commands level for a core file with
mte and without.

There is a lot of overlap between the "memory tag read" tests here and the unit tests for
MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's
worth keeping to check ProcessElfCore doesn't cause an assert.

Depends on D129487

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D129489
2022-07-26 08:46:36 +01:00
Kazu Hirata c8cf669f60 [ADT] Deprecate Optional::getValueOr (NFC)
This patch deprecates getValueOr in favor of value_or.

Differential Revision: https://reviews.llvm.org/D130140
2022-07-25 23:01:02 -07:00
Xiang Li 57006b14fa [DirectX backend] [NFC]Add DXILOpBuilder to generate DXIL operation
A new helper class DXILOpBuilder is added to create DXIL op function calls.

TableGen backend for DXILOperation will create table for DXIL op function parameter types.
When create DXIL op function, these parameter types will used to create the function type.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D130291
2022-07-25 21:49:59 -07:00
Alexander Shaposhnikov 1e636f2676 [IRBuilder] Add assert for AtomicRMW ordering
Add assert for AtomicRMW: Ordering != AtomicOrdering::Unordered
(https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Verifier.cpp#L3944)
and adjust expandAtomicStore accordingly.

Test plan:
1/ ninja check-llvm check-clang check-lld
2/ Bootstrapped LLVM/Clang pass tests

Differential revision: https://reviews.llvm.org/D130457
2022-07-25 22:51:25 +00:00
Vladislav Dzhidzhoev 2c84b92346 Fix assertion in SmallDenseMap constructor with reserve from non-power-of-2 buckets count
`SmallDenseMap` constructor with reserve gets an arbitrary `NumInitBuckets` value and passes it below to `init` method.

If `NumInitBuckets` is greater then `InlineBuckets`, then `SmallDenseMap` initializes to large representation passing `NumInitBuckets` below to `DenseMap` initialization. `DenseMap::initEmpty` method asserts that initial buckets count must be a power of 2.

Proposed solution is to update `NumInitBuckets` value in `SmallDenseMap` constructor till the next power of 2. It should satisfy both `DenseMap` preconditions and required minimum buckets count for reservation.

Reviewed By: atrick

Differential Revision: https://reviews.llvm.org/D129825
2022-07-25 17:09:44 +00:00
Sunho Kim 0f00e58841 [JITLink][COFF][x86_64] Reimplement ADDR32NB/REL32.
Reimplements ADDR32NB/REL32 relocations properly, out-of-reach targets will be dealt in the separate patch that will generate the stub for dllimport symbols.

Reviewed By: sgraenitz

Differential Revision: https://reviews.llvm.org/D129936
2022-07-25 23:41:53 +09:00
Weining Lu aff68f5ad6 [LoongArch] Parse LoongArch base ABI in ObjectYAML and llvm-readobj
LoongArch e_flags definition:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_e_flags_identifies_abi_type_and_version

Differential Revision: https://reviews.llvm.org/D130238
2022-07-25 20:40:57 +08:00
Kazu Hirata 95a932fb15 Remove redundaunt override specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:28:11 -07:00
Kazu Hirata b5188591a0 [llvm] Remove redundaunt virtual specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 21:50:35 -07:00
Amaury Séchet 5e29360743 [NFC] Add parentheses in MathExtra.h
The code used to cause a warning:
  llvm/include/llvm/Support/MathExtras.h:751:39: warning: suggest parentheses around ‘-’ in operand of ‘&’ [-Wparentheses]
    751 |   assert(Align != 0 && (Align & Align - 1) == 0 &&
        |
2022-07-24 22:04:09 +00:00
Kazu Hirata ec8fa36d7c [ExecutionEngine] Fix a header guard (NFC)
Identified with llvm-header-guard.
2022-07-24 12:27:12 -07:00
Kazu Hirata c661bd0886 [llvm] Remove unused forward declarations (NFC) 2022-07-24 12:27:05 -07:00
Fangrui Song 85cfd91723 [ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC
My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there
were lots of divisions.
2022-07-24 11:20:49 -07:00
Fangrui Song 7feab85df8 [MC] Remove unused renameELFSection 2022-07-24 01:23:07 -07:00
Fangrui Song 6977ff4006 [MC] Delete dead zlib-gnu code and simplify writeSectionData 2022-07-24 01:17:34 -07:00
Kazu Hirata 2201d1827f [Analysis] Use default member initialization (NFC) 2022-07-23 18:36:24 -07:00
Kazu Hirata 7bfa06f6c0 [CodeGen] Use range-based for loops (NFC) 2022-07-23 16:10:46 -07:00
Fangrui Song 7225213c0a [LegacyPM] Remove {,PostInline}EntryExitInstrumenterPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-23 15:30:15 -07:00
Nuno Lopes 9df0b254d2 [NFC] Switch a few uses of undef to poison as placeholders for unreachable code 2022-07-23 21:50:11 +01:00
Kazu Hirata 85dadf6d8d [TableGen] Drop an unnecessary const from a return type (NFC)
This patch also drops "&" that binds to a temporary.

Identified with readability-const-return-type.
2022-07-23 11:30:23 -07:00
Kazu Hirata 71cdb8c6f1 [ADT] Use default member initialization (NFC)
Identified with modernize-use-default-member-init.
2022-07-23 10:50:27 -07:00
ARCHIT SAXENA 3bb1ce2319 Add a nop instruction if a section starts with landing pad for function splitter
This change adds a nop instruction if section starts with landing pad. This change is like [D73739](https://reviews.llvm.org/D73739) which avoids zero offset landing pad in basic block sections.

Detailed description:
The current machine functions splitter can create ˜sections which start with a landing pad themselves. This places landing pad at offset zero from LPStart.
```
	.section	.text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
	.cfi_startproc
	.cfi_personality 3, __gxx_personality_v0
	.cfi_lsda 3, .Lexception5
	.cfi_def_cfa %rsp, 16
.Ltmp11: <--- This is a Landing pad and also LP Start as it is start of this section
	movq	%rax, %rdi <--- first instruction is at offest 0 from LPStart
	callq	_Unwind_Resume@PLT

 ```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
	.uleb128 .Ltmp9-.Lfunc_begin2           # >> Call Site 1 <<
	.uleb128 .Ltmp10-.Ltmp9                 #   Call between .Ltmp9 and .Ltmp10
	.uleb128 .Ltmp11-foo10.cold  <---This is zero           #     jumps to .Ltmp11
	.byte	3                               #   On action: 2
	.uleb128 .Ltmp10-.Lfunc_begin2          # >> Call Site 2 <<
	.uleb128 .Lfunc_end9-.Ltmp10            #   Call between .Ltmp10 and .Lfunc_end9
	.byte	0                               #     has no landing pad
	.byte	0                               #   On action: cleanup
	.p2align	2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at start of such sections so that such a case could be avoided. Output:
```
	.section	.text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
	.cfi_startproc
	.cfi_personality 3, __gxx_personality_v0
	.cfi_lsda 3, .Lexception5
	.cfi_def_cfa %rsp, 16
	nop <--- new instruction that is added
.Ltmp11:
	movq	%rax, %rdi
	callq	_Unwind_Resume@PLT
```

Reviewed By: modimo, snehasish, rahmanl

Differential Revision: https://reviews.llvm.org/D130133
2022-07-22 15:20:10 -07:00
Arnold Schwaighofer 58e6ee0e1f llvm.swift.async.context.addr cannot be modeled as NoMem because we don't want it to be cse'd accross async suspends
An async suspend models the split between two partial async functions.
`llvm.swift.async.context.addr ` will have a different value in the two
partial functions so it is not correct to generally CSE the instruction.

rdar://97336162

Differential Revision: https://reviews.llvm.org/D130201
2022-07-22 11:50:58 -07:00
Shilei Tian 602e0eb9f0 [OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake
Multiple calls to `omp_get_wtime` could be optimized out due to the function
is mistakenly marked as `readnone`. This patch fixes the issue, and also add the
support to run optimization on `libomptarget` tests.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D130179
2022-07-22 13:46:45 -04:00
Mircea Trofin 7b81a81d5f [NFC] FunctionSamples::getEntrySamples -> getHeadSamplesEstimate
The name `getEntrySamples` was misleading for 2 reasons. One, it's
close in name to `Function::getEntryCount`, but the equivalent here is
`getHeadSamples`; second, as opposed to the other get* APIs in
`FunctionSamples`, it performs an estimate/heuristic rather than just
retrieving raw data (or a non-heuristic derivate off that data, like
`getMaxCountInside`)

The new name should more clearly communicate its intent; and, being
close (in name) to `getHeadSamples`, it should allow the reader discover
the relation between them.

Also updated the doc comments for both `getHeadSamples[Estimate]` so a
reader may better understand the relation between them.

Differential Revision: https://reviews.llvm.org/D130281
2022-07-22 09:17:59 -07:00
Shilei Tian 77cb30e3a6 Revert "[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake"
This reverts commit ad34f1dba8.
2022-07-22 11:45:13 -04:00
Shilei Tian ad34f1dba8 [OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake
Multiple calls to `omp_get_wtime` could be optimized out due to the function
is mistakenly marked as `readnone`. This patch fixes the issue, and also add the
support to run optimization on `libomptarget` tests.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D130179
2022-07-22 11:43:30 -04:00
Malhar Jajoo 41958f76d8 [Costmodel] Add "type-based-intrinsic-cost" cli option
This patch adds a command line flag to be able to test
the type based cost-model analysis for Intrinsics.

Differential Revision: https://reviews.llvm.org/D129109
2022-07-22 15:50:57 +01:00
Kazu Hirata 70257fab68 Use any_of (NFC) 2022-07-22 01:05:17 -07:00
Johannes Doerfert a50b9f9f1f [Attributor][FIX] Handle non-recursive but re-entrant functions properly
If a function is non-recursive we only performed intra-procedural
reasoning for reachability (via AA::isPotentiallyReachable). However,
if it is re-entrant that doesn't mean we can't reach. Instead of this
problematic logic in the reachability reasoning we utilize logic in
AAPointerInfo. If a location is for sure written by a function it can
be re-entrant or recursive we know only intra-procedural reasoning is
sufficient.
2022-07-22 00:00:56 -05:00
Johannes Doerfert 62f7888d6d [Attributor] Dominating must-write accesses allow unknown initial values
If we have a dominating must-write access we do not need to know the
initial value of some object to perform reasoning about the potential
values. The dominating must-write has overwritten the initial value.
2022-07-21 23:08:43 -05:00
Johannes Doerfert dfac030271 [Intrinsics] Add `nocallback` to the memset/cpy/move intrinsics
These were forgotten when D118680 was applied. Similar to D125937.

Differential Revision: https://reviews.llvm.org/D129516
2022-07-21 22:52:46 -05:00
Ilia Diachkov b8e1544b9d [SPIRV] add SPIRVPrepareFunctions pass and update other passes
The patch adds SPIRVPrepareFunctions pass, which modifies function
signatures containing aggregate arguments and/or return values before
IR translation. Information about the original signatures is stored in
metadata. It is used during call lowering to restore correct SPIR-V types
of function arguments and return values. This pass also substitutes some
llvm intrinsic calls to function calls, generating the necessary functions
in the module, as the SPIRV translator does.

The patch also includes changes in other modules, fixing errors and
enabling many SPIR-V features that were omitted earlier. And 15 LIT tests
are also added to demonstrate the new functionality.

Differential Revision: https://reviews.llvm.org/D129730

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-07-22 04:00:48 +03:00
Teresa Johnson 1dad6247d2 [MemProf] Add memprof metadata related analysis utilities
Adds a number of utilities that are used to help create and update
memprof related metadata. These will be used during profile matching
and annotation, as well as by the inliner when updating the metadata.
Also adds unit tests for the utilities.

See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]
(Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceeding patch adding the basic
metadata and verification support.)

Depends on D128141.

Differential Revision: https://reviews.llvm.org/D128854
2022-07-21 13:46:01 -07:00
Sanjay Patel 78c09f0f24 [PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression
The InstCombine test is reduced from issue #56601. Without the more
liberal match for ConstantExpr, we try to rearrange constants in
Negator forever.

Alternatively, we could adjust the definition of m_ImmConstant to be
more conservative, but that's probably a larger patch, and I don't
see any downside to changing m_ConstantExpr. We never capture and
modify a ConstantExpr; transforms just want to avoid it.

Differential Revision: https://reviews.llvm.org/D130286
2022-07-21 15:23:57 -04:00
Daniel Thornburgh 6605187103 [NFC] Fix compiler warning in MarkupFilter 2022-07-21 12:00:29 -07:00
Daniel Thornburgh 17e4c217b6 [Symbolizer] Implement contextual symbolizer markup elements.
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.

Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.

Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D129519
2022-07-21 11:29:19 -07:00
David Sherwood f15b6b2907 [AArch64] Add target hook for preferPredicateOverEpilogue
This patch adds the AArch64 hook for preferPredicateOverEpilogue,
which currently returns true if SVE is enabled and one of the
following conditions (non-exhaustive) is met:

1. The "sve-tail-folding" option is set to "all", or
2. The "sve-tail-folding" option is set to "all+noreductions"
and the loop does not contain reductions,
3. The "sve-tail-folding" option is set to "all+norecurrences"
and the loop has no first-order recurrences.

Currently the default option is "disabled", but this will be
changed in a later patch.

I've added new tests to show the options behave as expected here:

  Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

Differential Revision: https://reviews.llvm.org/D129560
2022-07-21 17:20:06 +01:00
Joseph Huber bc33c2fa0c [Binary] Hard-code the alignment of the offloading binary
Summary:
We previously used `alignof` to get the necessary alignment of the
binary header. However this was different on 32-bit platforms and caused
a few tests to fail because of it. This patch just changes this to be a
hard-coded constant of 8.
2022-07-21 09:28:26 -04:00
Nikita Popov 1f69503107 [MemoryBuiltins] Add getReallocatedOperand() function (NFC)
Replace the value-accepting isReallocLikeFn() overload with a
getReallocatedOperand() function, which returns which operand is
the one being reallocated. Currently, this is always the first one,
but once allockind(realloc) is respected, the reallocated operand
will be determined by the allocptr parameter attribute.
2022-07-21 14:54:16 +02:00
Nikita Popov 46e6dd84b7 [MemoryBuiltins] Remove isFreeCall() function (NFC)
Remove isFreeCall() in favor of getFreedOperand(). Replace the
two remaining uses with a getFreedOperand() != nullptr check, as
they only care that something is getting freed. (The usage in DSE
is correct as such. The allocator-related checks in CFLGraph look
rather questionable in general.)
2022-07-21 14:44:23 +02:00
Alexey Lapshin 8bb4451a65 [Reland][DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.
DWARF files may contain overlapping address ranges. f.e. it can happen if the two
copies of the function have identical instruction sequences and they end up sharing.
That looks incorrect from the point of view of DWARF spec. Current implementation
of DWARFLinker does not combine overlapped address ranges. It would be good if such
ranges would be handled in some useful way. Thus, this patch allows DWARFLinker
to combine overlapped ranges in a single one.

Depends on D86539

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123469
2022-07-21 14:15:39 +03:00
Matt Devereau e0fbd990c9 [AArch64][SVE] Add ISel pattern to lower DUPLANE128 to LD1RQD
Following on from https://reviews.llvm.org/D128902, lower DUPLANE128 to LD1RQD
for integer load types from instruction selection.

Differential Revision: https://reviews.llvm.org/D130010
2022-07-21 10:56:43 +00:00
Alexey Lapshin 3aad49082c Revert "[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges."
This reverts commit d2a4d6bf9c.
2022-07-21 13:40:20 +03:00
Nikita Popov c81dff3c30 [MemoryBuiltins] Add getFreedOperand() function (NFCI)
We currently assume in a number of places that free-like functions
free their first argument. This is true for all hardcoded free-like
functions, but with the new attribute-based design, the freed
argument is supposed to be indicated by the allocptr attribute.

To make sure we handle this correctly once allockind(free) is
respected, add a getFreedOperand() helper which returns the freed
argument, rather than just indicating whether the call frees *some*
argument.

This migrates most but not all users of isFreeCall() to the new
API. The remaining users are a bit more tricky.
2022-07-21 12:39:35 +02:00
Alexey Lapshin d2a4d6bf9c [DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.
DWARF files may contain overlapping address ranges. f.e. it can happen if the two
copies of the function have identical instruction sequences and they end up sharing.
That looks incorrect from the point of view of DWARF spec. Current implementation
of DWARFLinker does not combine overlapped address ranges. It would be good if such
ranges would be handled in some useful way. Thus, this patch allows DWARFLinker
to combine overlapped ranges in a single one.

Depends on D86539

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123469
2022-07-21 13:15:18 +03:00
Nikita Popov d144ae6e1b [MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC)
Default getAllocSize() to use the trivial mapper. Also switch
from using std::function to function_ref.

Furthermore, update the doc comment to point out a subtle difference
between getAllocSize() and getObjectSize(): The latter may also
return something for calls that return their argument (via "returned"
attribute or special intrinsics like invariant groups).
2022-07-21 11:43:48 +02:00
Nikita Popov f45ab43332 [MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc
Alloc directly checking whether a given call is a removable
allocation, instead of first checking whether it is an allocation
first.
2022-07-21 09:39:19 +02:00
Congzhe Cao 05ccde8023 [LoopCacheAnalysis] Fix a type mismatch problem in cost calculation
There is a problem in loop cache analysis that the types of SCEV variables
`Coeff` and `ElemSize` in function `isConsecutive()` may not match. The
mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`.

The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to
whichever is wider in those two variables. As a clean-up, duplicate calculations
of `Stride` in `computeRefCost()` is then removed.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D128877
2022-07-21 01:57:05 -04:00
Anubhab Ghosh 4fcf8434dd [ORC] Add a new MemoryMapper-based JITLinkMemoryManager implementation.
MapperJITLinkMemoryManager supports executor memory management using any
implementation of MemoryMapper to do the transfer such as InProcessMapper or
SharedMemoryMapper.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D129495
2022-07-20 17:52:37 -07:00
Teresa Johnson 0174f5553e [MemProf] Basic metadata support and verification
Add basic support for the MemProf metadata (!memprof and !callsite)
which was initially described in "RFC: IR metadata format for MemProf"
(https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165).

The bulk of the patch is verification support, along with some tests.

There are a couple of changes to the format described in the original
RFC:

Initial measurements suggested that a tree format for the stack ids in
the contexts would be more efficient, but subsequent evaluation with
large applications showed that in fact the cost of the additional
metadata nodes required by this deduplication scheme overwhelmed the
benefit from sharing stack id nodes. Therefore, the implementation here
and in follow on patches utilizes a simpler scheme of lists of stack id
integers in the memprof profile contexts and callsite metadata. The
follow on matching patch employs context trimming optimizations to
reduce the cost.

Secondly, instead of verbosely listing all profiled fields in each
profiled context (memory info block or MIB), and deferring the
interpretation of the profile data, the profile data is evaluated and
converted into string tags specifying the behavior (e.g. "cold") during
profile matching. This reduces the verbosity of the profile metadata,
and allows additional context trimming optimizations. As a result, the
named metadata schema description is also no longer needed.

Differential Revision: https://reviews.llvm.org/D128141
2022-07-20 15:30:55 -07:00
Schrodinger ZHU Yifan 304027206c [ThinLTO] Support aliased GlobalIFunc
Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is
aliased in LTO, clang will attempt to create an alias summary; however, as ifunc
is not included in the module summary, doing so will lead to crash.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129009
2022-07-20 15:30:38 -07:00
Kazu Hirata 94e03abf91 [IPO] Restore a call to has_value (NFC)
This patch restores a call to has_value to make it clear that we are
checking the presence of an optional value, not the underlying value.

This patch partially reverts d08f34b592.

Differential Revision: https://reviews.llvm.org/D129453
2022-07-20 09:40:18 -07:00
Roman Rusyaev 394a388d14 [TableGen] Add a location for a class definition that was forward-declared
This change improves ctags generation for tablegen files.

For the following example
```
class A;

class A {
  int a;
}
```
Previously, tags were generated only for a forward declaration of class 'A'.

This patch allows generating tags for the forward declarations
and further definition of class 'A'.

Reviewed By: barannikov88

Original patch by: rusyaev-roman (Roman Rusyaev)
Some adjustments by: nhaehnle (Nicolai Hähnle)

Differential Revision: https://reviews.llvm.org/D129935
2022-07-20 15:56:17 +02:00
esmeyi b1847ff068 [XCOFF] write the aux header when the visibility is specified in XCOFF32.
The n_type field in the symbol table entry has two interpretations in XCOFF32, and a single interpretation in XCOFF64.
The new interpretation is used in XCOFF32 if the value of the o_vstamp field in the auxiliary header is 2.
In XCOFF64 and the new XCOFF32 interpretation, the n_type field is used for the symbol type and visibility.
The patch writes the aux header with an o_vstamp field value of 2 when the visibility is specified in XCOFF32 to make the new XCOFF32 interpretation used.

Reviewed By: DiggerLin, jhenderson

Differential Revision: https://reviews.llvm.org/D128148
2022-07-20 07:09:34 -04:00
Chuanqi Xu 645d2dd3a9 Revert "Don't treat readnone call in presplit coroutine as not access memory"
This reverts commit 57224ff4a6. This
commit may trigger crashes on some workloads. Revert it for clearness.
2022-07-20 17:00:58 +08:00
Fangrui Song e931c2e870 [LegacyPM] Remove InstrOrderFileLegacyPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-19 23:58:51 -07:00
Chuanqi Xu 57224ff4a6 Don't treat readnone call in presplit coroutine as not access memory
To solve the readnone problems in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for details.

According to the discussion, we decide to fix the problem by inserting
isPresplitCoroutine() checks in different passes instead of
wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes.
In this direction, we might not be able to cover every case at first.
Let's take a "find and fix" strategy.

Reviewed By: nikic, nhaehnle, jyknight

Differential Revision: https://reviews.llvm.org/D127383
2022-07-20 10:37:23 +08:00
Jez Ng 2e2737cdf9 [MC][MachO] Change addrsig format + ensure its size is properly set
There were two problems with the previous setup:

1. We weren't setting its size, which caused problems when `__llvm_addrsig`
   wasn't the last section. In particular, `__debug_line` (if created) is
   generated and placed after `__llvm_addrsig`, and would result in an
   invalid object file w/ overlapping sections being emitted.

2. The symbol indices could be invalidated if e.g. `llvm-strip` ran on
   the object file. See discussion [here][1].

To fix both these issues, we use symbol relocations instead of encoding
symbol indices directly in the section contents. The section itself
doesn't contain any data. That sidesteps the layout problem in addition
to solving the second issue.

The corresponding LLD change to read in this new format: {D128938}.
It will fix the icf-safe.ll test failure on this diff.

[1]: https://discourse.llvm.org/t/problems-with-mach-o-address-significance-table-generation/63392/

Reviewed By: #lld-macho, alx32

Differential Revision: https://reviews.llvm.org/D127637
2022-07-19 21:22:23 -04:00
Lang Hames 94e6d2677b [ORC] Fix serialization / deserialization of default-constructed StringRef.
Avoids accessing the data field on zero-length strings. This is the StringRef
counterpart to the ArrayRef<char> fix in 67220c2ad7.

rdar://97285294
2022-07-19 17:22:21 -07:00
Anubhab Ghosh 1b1f1c7786 Re-re-apply 5acd471698, Add a shared-memory based orc::MemoryMapper...
...with more fixes.

The original patch was reverted in 3e9cc543f2 due to bot failures caused by
a missing dependence on librt. That issue was fixed in 32d8d23cd0, but that
commit also broke sanitizer bots due to a bug in SimplePackedSerialization:
empty ArrayRef<char>s triggered a zero-byte memcpy from a null source. The
ArrayRef<char> serialization issue was fixed in 67220c2ad7, and this patch has
also been updated with a new custom SharedMemorySegFinalizeRequest message that
should avoid serializing empty ArrayRefs in the first place.

https://reviews.llvm.org/D128544
2022-07-19 15:35:33 -07:00
Johannes Doerfert bf789b1957 [Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981

Note: A previous version was flawed and consequently reverted in
      6555558a80.
2022-07-19 16:24:42 -05:00
Yusra Syeda 6fb27bc2e3 [SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention
Differential Revision: https://reviews.llvm.org/D127328
2022-07-19 13:55:25 -04:00
Cole Kissane e939bf67e3 [llvm] add zstd to `llvm::compression` namespace
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
- debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465
2022-07-19 10:54:36 -07:00
Jon Chesterfield 3a20597776 [amdgpu] Implement lds kernel id intrinsic
Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060
2022-07-19 17:46:19 +01:00
Alexey Lapshin 4539b44148 [Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539
2022-07-19 15:11:36 +03:00
Simon Pilgrim 0f6b0461b0 [DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.

This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115

Alive2: https://alive2.llvm.org/ce/z/fl7T7K

Differential Revision: https://reviews.llvm.org/D129933
2022-07-19 10:59:07 +01:00
David Spickett 5d14873249 [llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping
Fixes https://github.com/llvm/llvm-project/issues/56484

H registers are 16 bit views of AArch64's Neon registers and
B are the 8 bit views.

msvc does not support 16 bit float (some mention in DirectX but I
couldn't find a way to get to it) so for lack of a better reference
I'm using:
85c9b41b33/server/references/dia/include/cvconst.h
(the other microsoft-pdb repo is no longer up to date)

Luckily clang does support fp16 so a test is added for that.

There is no 8 bit float type so I had to get creative with the
test case. We're not testing for correct debug info here just
that we can select the B register and not crash in the process.

For FPCR it's never going to be passed as an argument so I've
not added a test for it. It is included to keep our list looking
the same as the reference.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D129774
2022-07-19 09:33:13 +00:00
Alexey Lapshin e717f91c96 Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF."
This reverts commit e2147c26bd.
2022-07-19 12:17:47 +03:00
Alexey Lapshin e2147c26bd [Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539
2022-07-19 11:18:36 +03:00
serge-sans-paille a2ac383b44 [llvm] Fix forward declaration in Support/JSON.h
Some methods of json::Array require json::Value to be completely defined, so
they can't be defined in-class. Fix that by defining them out of class.

Fix #55780
2022-07-19 09:07:29 +02:00
Max Kazantsev 51f837a680 [NFC] Introduce API to detect tokens penetrating LCSSA form
Following discussion in PR56243, we need to somehow detect the situation
when token values penetrate LCSSA form for transforms that require that
it is maintained by all values (for example, to sustain use-def dominance
invarians). This patch introduces a parameter to LCSSA checkers to control
their ignorance about tokens.

Differential Revision: https://reviews.llvm.org/D129983
Reviewed By: efriedma
2022-07-19 13:52:30 +07:00
Shraiysh Vaishay 35fc666877 [OpenMP][IRBuilder] Add support for taskgroup
This patch adds support for generating taskgroup construct.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D128203
2022-07-19 10:49:34 +05:30
Lang Hames 67220c2ad7 [ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>.
Avoids a zero-length memcpy from a null src, which caused errors on some of the
sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather
than pointing to a zero length range in the middle of the input buffer).
2022-07-18 20:39:01 -07:00
Matt Arsenault 8d0383eb69 CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
2022-07-18 17:23:41 -04:00
Stanislav Mekhanoshin 523a99c0eb [AMDGPU] Support for gfx940 fp8 smfmac
Differential Revision: https://reviews.llvm.org/D129908
2022-07-18 12:12:41 -07:00
Stanislav Mekhanoshin 2695f0a688 [AMDGPU] Support for gfx940 fp8 mfma
Differential Revision: https://reviews.llvm.org/D129906
2022-07-18 11:49:56 -07:00
Stanislav Mekhanoshin 9fa5a6b7e8 [AMDGPU] Support for gfx940 fp8 conversions
Differential Revision: https://reviews.llvm.org/D129902
2022-07-18 11:48:43 -07:00
zhijian a6316d6da5 [AIX] support read global symbol of big archive
Reviewers: James Henderson, Fangrui Song

Differential Revision: https://reviews.llvm.org/D124865
2022-07-18 10:43:30 -04:00
Simon Pilgrim c2ab5c5514 [DAG] Fix typo in isDesirableToCommuteWithShift description. NFC. 2022-07-18 13:11:23 +01:00
Max Kazantsev d693fd29f1 [Verifier] Make Verifier recognize undef tokens as correct IR
Undef tokens may appear in unreached code as result of RAUW of some optimization,
and it should not be considered as bad IR.

Patch by Dmitry Bakunevich!

Differential Revision: https://reviews.llvm.org/D128904
Reviewed By: mkazantsev
2022-07-18 16:26:06 +07:00
Nikita Popov 11079e8820 [IR] Don't treat callbr as indirect terminator
Callbr is no longer an indirect terminator in the sense that is
relevant here (that it's successors cannot be updated). The primary
effect of this change is that callbr no longer prevents formation
of loop simplify form.

I decided to drop the isIndirectTerminator() method entirely and
replace it with isa<IndirectBrInst>() checks. I assume this method
was added to abstract over indirectbr and callbr, but it never
really caught on, and there is nothing left to abstract anymore
at this point.

Differential Revision: https://reviews.llvm.org/D129849
2022-07-18 09:32:08 +02:00
Valentin Clement 048aaab194
[flang][openacc] Use TableGen to generate the clause parser
This patch introduce an automatic generation of the clause parser from the TableGen
information.

New information can be stored directly in the TableGen file:
- The different aliases that a clause support.
- prefix before a value.
- whether a prefix is optional or not.

Makes it easier to add new clauses and also avoid some error (`write` clause incorrect until now).

This patch is updating only the OpenACC part. A patch with a modification of the OpenMP clause parser will follow.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D106968
2022-07-18 09:26:57 +02:00
Craig Topper a55ff6aadd [Support][CodeGen] Fix spelling Divison->Division. NFC 2022-07-17 23:16:29 -07:00
Fangrui Song 0e3447bf8a [LegacyPM] Remove WholeProgramDevirt
Unused after LTO removal from legacy optimization passline.
2022-07-17 23:14:53 -07:00
Fangrui Song 1f90cc589e [LegacyPM] Remove FunctionImportLegacyPass
Unused after ThinLTO was removed from legacy optimization pipeline.
2022-07-17 23:06:46 -07:00
Abinav Puthan Purayil d96361d714 [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map
This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.

Differential Revision: https://reviews.llvm.org/D128344
2022-07-18 10:07:13 +05:30
Kazu Hirata 3112987d5c Remove unused forward declarations (NFC) 2022-07-17 15:37:48 -07:00
Fangrui Song bbaa015e82 [LegacyPM] Remove LowerTypeTestsPass
Unused after LTO removal from optimization passline.
2022-07-17 15:06:38 -07:00
Fangrui Song a6942256ca [LegacyPM] Remove NameAnonGlobalLegacyPass
Unused after LTO removal from optimization passline.
2022-07-17 14:38:29 -07:00
Fangrui Song d74b88c69d [LegacyPM] Remove CanonicalizeAliasesLegacyPass
Unused after LTO removal from optimization passline.
2022-07-17 14:30:22 -07:00
Fangrui Song 70519a1fba [LegacyPM] Remove LTO passes from optimization pipeline
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-17 14:24:36 -07:00
Fangrui Song f502115561 [LegacyPM] Remove PGO options from PassManagerBuilder
They have been dead since legacy PGO/SamplePGO passes were removed.
2022-07-17 14:03:23 -07:00
Fangrui Song dd5e3f0e27 [LegacyPM] Remove SampleProfileLoaderLegacyPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline (e.g. PGO), remove SamplePGO.
2022-07-17 12:09:46 -07:00
Kazu Hirata c13a09a462 [llvm] Fix header guards (NFC)
Identified with llvm-header-guard.
2022-07-17 02:18:55 -07:00
Kazu Hirata 92a1b2afc8 [Analysis] Remove isArithmeticRecurrenceKind
The last use was removed on Jul 30, 2021 in commit
9d35594993.
2022-07-16 13:23:32 -07:00
Fangrui Song f9d6f37201 [LegacyPM] Remove ControlHeightReductionLegacyPass
This pass tries to reduce the number of conditional branches in the hot path
based on profile. It's mostly a no-op after legacy PGO passes are moved.
2022-07-16 01:35:56 -07:00
Fangrui Song 3a42c499c2 [LegacyPM] Remove createInstrProfilingLegacyPass
Follow the steps of removing non-core instrumentation passes like PGO.
2022-07-16 01:26:40 -07:00
Fangrui Song 685775bbab [LegacyPM] Remove CGProfileLegacyPass
It's mostly a no-op after I removed legacy PGO passes in D123834.
2022-07-16 00:39:56 -07:00
Fangrui Song df8f5be596 [LegacyPM] Remove ModuleSanitizerCoverageLegacyPass
Follow the steps of various other legacy instrumentation passes removed for
15.0.0.
2022-07-15 19:01:20 -07:00
Mitch Phillips 4162aefad1 Revert "Re-apply 5acd471698, Add a shared-memory based orc::MemoryMapper, with fixes."
This reverts commit 32d8d23cd0.

Reason: Broke the UBSan buildbots. See more details on Phabricator:
https://reviews.llvm.org/D128544
2022-07-15 17:11:55 -07:00
Rong Xu 5e0443292b [PGO] Report number of counts being dropped when a hash-mismatch happens
This patch reports number of counts being dropped when a hash-mismatch
happens. This information will be helpful to the users -- if the dropped
counts are large, the user should redo the instrumentation build and
recollect the profile.

Differential Revision: https://reviews.llvm.org/D129001
2022-07-15 14:53:59 -07:00
Fangrui Song 0d5a62faca [sanitizer] Add "mainfile" prefix to sanitizer special case list
When an issue exists in the main file (caller) instead of an included file
(callee), using a `src` pattern applying to the included file may be
inappropriate if it's the caller's responsibility. Add `mainfile` prefix to check
the main filename.

For the example below, the issue may reside in a.c (foo should not be called
with a misaligned pointer or foo should switch to an unaligned load), but with
`src` we can only apply to the innocent callee a.h. With this patch we can use
the more appropriate `mainfile:a.c`.
```
//--- a.h
// internal linkage
static inline int load(int *x) { return *x; }

//--- a.c, -fsanitize=alignment
#include "a.h"
int foo(void *x) { return load(x); }
```

See the updated clang/docs/SanitizerSpecialCaseList.rst for a caveat due
to C++ vague linkage functions.

Reviewed By: #sanitizers, kstoimenov, vitalybuka

Differential Revision: https://reviews.llvm.org/D129832
2022-07-15 10:39:26 -07:00
Anubhab Ghosh 32d8d23cd0 Re-apply 5acd471698, Add a shared-memory based orc::MemoryMapper, with fixes.
The original commit was reverted in 3e9cc543f2 due to buildbot failures, which
should be fixed by the addition of dependencies on librt.

Differential Revision: https://reviews.llvm.org/D128544
2022-07-15 09:45:30 -07:00
David Kreitzer c720b6fddd Clarify the behavior of the llvm.vector.insert/extract intrinsics when the index
is out of range. Both intrinsics return a poison value.

Consequently, mark the intrinsics speculatable.
Differential Revision: https://reviews.llvm.org/D129656
2022-07-15 07:56:44 -07:00
Edd Barrett 2e62a26fd7
[stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint
(instead of llvm.experimental.stackmap).

Differential review: https://reviews.llvm.org/D129268
2022-07-15 12:01:59 +01:00
Fangrui Song 3c849d0aef Modernize Optional::{getValueOr,hasValue} 2022-07-15 01:20:39 -07:00
Nikita Popov 2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Nikita Popov f75ccadcdd [LSR] Create SCEVExpander earlier, use member isSafeToExpand() (NFC)
This is a followup to D129630, which switches LSR to the member
isSafeToExpand() variant, and removes the freestanding function.

This is done by creating the SCEVExpander early (already during the
analysis phase). Because the SCEVExpander is now available for the
whole lifetime of LSRInstance, I've also made it into a member
variable, rather than passing it around in even more places.

Differential Revision: https://reviews.llvm.org/D129769
2022-07-15 09:41:23 +02:00
Fangrui Song 141c9d7759 [llvm-dwp] Add SHF_COMPRESSED support and remove .zdebug support
clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support
recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and
remove .zdebug support.

Simplify llvm::object::Decompressor which has no .zdebug user now.

While here, add tests for ELF32LE, ELF32BE, and ELF64BE.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D129728
2022-07-14 16:19:32 -07:00
Dawid Jurczak d71128d97d [NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>
This patch is https://reviews.llvm.org/D129468 follow-up and address one of comment
coming from that review: https://reviews.llvm.org/D129468#3643295

Differential Revision: https://reviews.llvm.org/D129565
2022-07-14 17:22:32 +02:00
Nikita Popov 9e6e631b38 [LoopPredication] Use isSafeToExpandAt() member function (NFC)
As a followup to D129630, this switches a usage of the freestanding
function in LoopPredication to use the member variant instead. This
was the last use of the freestanding function, so drop it entirely.
2022-07-14 14:49:07 +02:00
Nikita Popov dcf4b733ef [SCEVExpander] Make CanonicalMode handing in isSafeToExpand() more robust (PR50506)
isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.

Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.

Fixes https://github.com/llvm/llvm-project/issues/50506.

Differential Revision: https://reviews.llvm.org/D129630
2022-07-14 14:41:51 +02:00
Namhyung Kim 69b312cde4 [llvm-objdump] Create fake sections for a ELF core file
The linux perf tools use /proc/kcore for disassembly kernel functions.
Actually it copies the relevant parts to a temp file and then pass it to
objdump. But it doesn't have section headers so llvm-objdump cannot
handle it.

Let's create fake section headers for the program headers. It'd have a
single section for each segment to cover the entire range. And for this
purpose we can consider only executable code segments.

With this change, I can see the following command shows proper outputs.

perf annotate --stdio --objdump=/path/to/llvm-objdump

Differential Revision: https://reviews.llvm.org/D128705
2022-07-14 13:39:59 +01:00
Cullen Rhodes 3e9cc543f2 Revert "[ORC] Add a shared-memory based orc::MemoryMapper."
This reverts commit 5acd471698.

Breaks shared library build with:

  ld.lld-12: error: undefined symbol: shm_open
  >>> referenced by ExecutorSharedMemoryMapperService.cpp:68
  (/home/culrho01/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp:68)
  >>>
  lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/ExecutorSharedMemoryMapperService.cpp.o:(llvm::orc::rt_bootstrap::ExecutorSharedMemoryMapperService::reserve[abi:cxx11](unsigned
  long))
  >>> did you mean: sem_open
  >>> defined in:
  /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../aarch64-linux-gnu/libpthread.so
2022-07-14 09:52:57 +00:00
Amara Emerson 6e6be5f950 Revert "[llvm] add zstd to llvm::compression namespace"
This reverts commit d449c60076.

Breaks macOS builds with this:
llvm/lib/Support/Compression.cpp:24:10: fatal error: 'zstd.h' file not found
2022-07-14 01:23:20 -07:00
Jannik Silvanus e5c4cde451 [AMDGPU] SIMachineScheduler: Add support for several MachineScheduler features
The SI machine scheduler inherits from ScheduleDAGMI.
This patch adds support for a few features that are implemented
in ScheduleDAGMI (or its base classes) that were missing so far
because their support is implemented in overridden functions.

* Support cl::opt -view-misched-dags
  This option allows to open a graphical window of the scheduling DAG.

* Support cl::opt -misched-print-dags
  This option allows to print the scheduling DAG in text form.

* After constructing the scheduling DAG, call postprocessDAG()
  to apply any registered DAG mutations.
  Note that currently there are no mutations defined in AMDGPUTargetMachine.cpp
  in case SIScheduler is used.
  Still add this to avoid surprises in the future in case mutations are added.

Differential Revision: https://reviews.llvm.org/D128808
2022-07-14 09:45:31 +02:00
Kazu Hirata 611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Cole Kissane d449c60076 [llvm] add zstd to llvm::compression namespace
- add `FindZSTD.cmake`
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465
2022-07-13 19:58:42 -07:00
Cole Kissane 5ecb161c64 Revert "[llvm] add zstd to `llvm::compression` namespace"
This reverts commit cef07169ec.
2022-07-13 19:48:29 -07:00
Cole Kissane cef07169ec [llvm] add zstd to `llvm::compression` namespace
- add `FindZSTD.cmake`
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465
2022-07-13 19:06:27 -07:00
Fangrui Song e690137dde [Support] Change compression::zlib::{compress,uncompress} to use uint8_t *
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to make adaption anyway.
2022-07-13 16:26:54 -07:00
Anubhab Ghosh 5acd471698 [ORC] Add a shared-memory based orc::MemoryMapper.
This is an implementation of orc::MemoryMapper that maps shared memory
pages in both executor and controller process and writes directly to
them avoiding transferring content over EPC. All allocations are properly
deinitialized automatically on the executor side at shutdown by the
ExecutorSharedMemoryMapperService.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D128544
2022-07-13 15:24:28 -07:00
Philip Reames dde2a7fb6d [RISCV] Exploit fact that vscale is always power of two to replace urem sequence
When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale.

vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.)

We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, the must be a power of two numbers of blocks. (For everything other than VLEN<=32, but that's already broken.)

It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic.

Differential Revision: https://reviews.llvm.org/D129609
2022-07-13 10:54:47 -07:00
Fangrui Song b28412d539 [llvm-objcopy][ELF] Add --set-section-type
The request is mentioned on D129053. I feel that having this functionality is
mildly useful (not strong).

* Rename .ctors to .init_array and change sh_type to SHT_INIT_ARRAY (GNU objcopy
  detects the special name but we don't).
* Craft tests for a new SHT_LLVM_* extension

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D129337
2022-07-13 10:04:21 -07:00
Mitch Phillips 90e5a8ac47 Remove 'no_sanitize_memtag'. Add 'sanitize_memtag'.
For MTE globals, we should have clang emit the attribute for all GV's
that it creates, and then use that in the upcoming AArch64 global
tagging IR pass. We need a positive attribute for this sanitizer (rather
than implicit sanitization of all globals) because it needs to interact
with other parts of LLVM, including:

  1. Suppressing certain global optimisations (like merging),
  2. Emitting extra directives by the ASM writer, and
  3. Putting extra information in the symbol table entries.

While this does technically make the LLVM IR / bitcode format
non-backwards-compatible, nobody should have used this attribute yet,
because it's a no-op.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D128950
2022-07-13 08:54:41 -07:00
Nikita Popov 6f9d990a6e [TargetFolder] Use DL-aware folding for icmp
The Fold() call was accidentally dropped in
138fcc5f76, though it doesn't seem
to make a difference in practice (no test changes).
2022-07-13 15:35:13 +02:00
Nikita Popov 6d6983ced9 [IRBuilder] Migrate fneg to fold infrastructure
Make use of a single FoldUnOpFMF() API, though in practice FNeg
is the only unary operation that exists.

This is likely NFC in practice, because users of InstSimplifyFolder
don't create fneg.
2022-07-13 15:29:52 +02:00
Max Kazantsev 30e33b4b81 [SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional 2022-07-13 18:54:25 +07:00
Corentin Jabot d4892a168f [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-13 10:19:26 +02:00
Kazu Hirata 3361a364e6 [llvm] Use has_value instead of hasValue (NFC) 2022-07-12 22:25:42 -07:00
Nathan James a565509308
[ADT] Use Empty Base Optimization for Allocators
In D94439, BumpPtrAllocator changed its implementation to use an empty base optimization for the underlying allocator.
This patch builds on that by extending its functionality to more classes as well as enabling the underlying allocator to be a reference type, something not currently possible as you can't derive from a reference.

The main place this sees use is in StringMaps which often use the default MallocAllocator, yet have to pay the size of a pointer for no reason.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D129206
2022-07-12 23:57:04 +01:00
Jonas Devlieghere a262f4dbd7 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit cc309721d2 because it
breaks the following tests on GreenDragon:

  TestDataFormatterObjCCF.py
  TestDataFormatterObjCExpr.py
  TestDataFormatterObjCKVO.py
  TestDataFormatterObjCNSBundle.py
  TestDataFormatterObjCNSData.py
  TestDataFormatterObjCNSError.py
  TestDataFormatterObjCNSNumber.py
  TestDataFormatterObjCNSURL.py
  TestDataFormatterObjCPlain.py
  TestDataFormatterObjNSException.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/
2022-07-12 15:22:29 -07:00
Kai Nacke 4ae254e488 Revert "[GISel] Unify use of getStackGuard"
This reverts commit e60b4fb2b7.
2022-07-12 17:00:43 -04:00
Kai Nacke e60b4fb2b7 [GISel] Unify use of getStackGuard
Some rework of getStackGuard() based on comments in
https://reviews.llvm.org/D129505.

- getStackGuard() now creates and returns the destination
  register, simplifying calls
- the pointer type is passed to getStackGuard() to avoid
  recomputation
- removed PtrMemTy in emitSPDescriptorParent(), because
  this type is only used here when loading the value but
  not when storing the value

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129576
2022-07-12 16:46:37 -04:00
Sunho Kim 2a0aa98c8d [ORC] Remove unused function declaration. (NFC)
Differential Revision: https://reviews.llvm.org/D129582
2022-07-13 05:13:31 +09:00
Sunho Kim db995d72db [JITLink][COFF] Initial COFF support.
Adds initial COFF support in JITLink. This is able to run a hello world c program in x86 windows successfully.

Implemented
- COFF object loader
- Static local symbols
- Absolute symbols
- External symbols
- Weak external symbols
- Common symbols
- COFF jitlink-check support
- All COMDAT selection type execpt largest
- Implicit symobl size calculation
- Rel32 relocation with PLT stub.
- IMAGE_REL_AMD64_ADDR32NB relocation

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D128968
2022-07-13 03:52:43 +09:00
Yuanfang Chen fcb7d76d65 [coroutine] add nomerge function attribute to `llvm.coro.save`
It is illegal to merge two `llvm.coro.save` calls unless their
`llvm.coro.suspend` users are also merged. Marks it "nomerge" for
the moment.

This reverts D129025.

Alternative to D129025, which affects other token type users like WinEH.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D129530
2022-07-12 10:39:38 -07:00
Nick Desaulniers 2240d72f15 [X86] initial -mfunction-return=thunk-extern support
Adds support for:
* `-mfunction-return=<value>` command line flag, and
* `__attribute__((function_return("<value>")))` function attribute

Where the supported <value>s are:
* keep (disable)
* thunk-extern (enable)

thunk-extern enables clang to change ret instructions into jmps to an
external symbol named __x86_return_thunk, implemented as a new
MachineFunctionPass named "x86-return-thunks", keyed off the new IR
attribute fn_ret_thunk_extern.

The symbol __x86_return_thunk is expected to be provided by the runtime
the compiled code is linked against and is not defined by the compiler.
Enabling this option alone doesn't provide mitigations without
corresponding definitions of __x86_return_thunk!

This new MachineFunctionPass is very similar to "x86-lvi-ret".

The <value>s "thunk" and "thunk-inline" are currently unsupported. It's
not clear yet that they are necessary: whether the thunk pattern they
would emit is beneficial or used anywhere.

Should the <value>s "thunk" and "thunk-inline" become necessary,
x86-return-thunks could probably be merged into x86-retpoline-thunks
which has pre-existing machinery for emitting thunks (which could be
used to implement the <value> "thunk").

Has been found to build+boot with corresponding Linux
kernel patches. This helps the Linux kernel mitigate RETBLEED.
* CVE-2022-23816
* CVE-2022-28693
* CVE-2022-29901

See also:
* "RETBLEED: Arbitrary Speculative Code Execution with Return
Instructions."
* AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion
* TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0
  2022-07-12
* Return Stack Buffer Underflow / Return Stack Buffer Underflow /
  CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702

SystemZ may eventually want to support "thunk-extern" and "thunk"; both
options are used by the Linux kernel's CONFIG_EXPOLINE.

This functionality has been available in GCC since the 8.1 release, and
was backported to the 7.3 release.

Many thanks for folks that provided discrete review off list due to the
embargoed nature of this hardware vulnerability. Many Bothans died to
bring us this information.

Link: https://www.youtube.com/watch?v=IF6HbCKQHK8
Link: https://github.com/llvm/llvm-project/issues/54404
Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html
Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html

Reviewed By: aaron.ballman, craig.topper

Differential Revision: https://reviews.llvm.org/D129572
2022-07-12 09:17:54 -07:00
Dawid Jurczak 165240fe38 [NFC] Fix compile time regression seen on some benchmarks after a630ea3003 commit
The goal of this change is fixing most of compile time slowdown seen after a630ea3003 commit on lencod and sqlite3 benchmarks.
There are 3 improvements included in this patch:

1. In getNumOperands when possible get value directly from SmallNumOps.
2. Inline getLargePtr by moving its definition to header.
3. In TBAAStructTypeNode::getField get all operands once instead taking operands in loop one after one.

Differential Revision: https://reviews.llvm.org/D129468
2022-07-12 15:00:27 +02:00
Corentin Jabot cc309721d2 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-12 14:34:30 +02:00
Nikita Popov 00797b88e0 [InlineAsm] Improve error messages for invalid constraint strings
InlineAsm constraint string verification can fail for many reasons,
but used to always print a generic "invalid type for inline asm
constraint string" message -- which is especially confusing if
the actual error is unrelated to the type, e.g. a failure to parse
the constraint string.

Change the verify API to return an Error with a more specific
error message, and print that in the IR parser.
2022-07-12 11:41:16 +02:00
Nikita Popov 4bb7b6fae3 [IR] Remove support for float binop constant expressions
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes support for the floating-point binop constant expressions
fadd, fsub, fmul, fdiv and frem.

As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub,
LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed.
The LLVMBuild APIs should be used instead.

Differential Revision: https://reviews.llvm.org/D129478
2022-07-12 09:40:49 +02:00
Kazu Hirata ec9a0e36d9 [IPO] Remove addLTOOptimizationPasses and addLateLTOOptimizationPasses (NFC)
The last uses were removed on Apr 15, 2022 in commit
2e6ac54cf4.

Differential Revision: https://reviews.llvm.org/D129460
2022-07-11 20:15:24 -07:00
Xiang1 Zhang a45dd3d814 [X86] Support -mstack-protector-guard-symbol
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D129346
2022-07-12 10:17:00 +08:00
Xiang1 Zhang 643786213b Revert "[X86] Support -mstack-protector-guard-symbol"
This reverts commit efbaad1c4a.
due to miss adding review info.
2022-07-12 10:14:32 +08:00
Xiang1 Zhang efbaad1c4a [X86] Support -mstack-protector-guard-symbol 2022-07-12 10:13:48 +08:00
Prabhdeep Singh Soni ac892c70a4 [OMPIRBuilder] Add support for simdlen clause
This patch adds OMPIRBuilder support for the simdlen clause for the
simd directive. It uses the simdlen support in OpenMPIRBuilder when
it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by
generating the loop.vectorize.width metadata.

Reviewed By: jdoerfert, Meinersbur

Differential Revision: https://reviews.llvm.org/D129149
2022-07-11 13:29:06 -04:00
spupyrev eecd41aa09 Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type"
This reverts commit 6d0528636a.
2022-07-11 09:50:47 -07:00
Rafael Auler 6d0528636a Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary:
Introduce NeverAlign fragment type.

The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.

In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).

This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.

The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.

LLVM: https://reviews.llvm.org/D97982

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D31361547
2022-07-11 09:31:52 -07:00
David Sherwood 03fee6712a [LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask
intrinsic we only use the mask for predicated vector operations,
such as masked loads and stores, etc. The loop itself is still
controlled by comparing the canonical induction variable with the
trip count. However, for some targets this is inefficient when it's
cheap to use the mask itself to control the loop.

This patch adds support for using the active lane mask for control
flow by:

1. Generating the active lane mask for the next iteration of the
vector loop, rather than the current one. If there are still any
remaining iterations then at least the first bit of the mask will
be set.
2. Extract the first bit of this mask and use this bit for the
conditional branch.

I did this by creating a new VPActiveLaneMaskPHIRecipe that sets
up the initial PHI values in the vector loop pre-header. I've also
made use of the new BranchOnCond VPInstruction for the final
instruction in the loop region.

Differential Revision: https://reviews.llvm.org/D125301
2022-07-11 13:46:55 +01:00
Abhina Sreeskantharajan 6e2329e33a [SystemZ][z/OS] Force alignment to fix build failure on z/OS
The following commit https://reviews.llvm.org/D125998 added a static_assert which was triggered on z/OS because bitfields are always aligned to 1 regardless of type.

```
error: static_assert failed due to requirement 'alignof(llvm::SmallVector<llvm::MDOperand, 0>) <= alignof(llvm::MDNode::Header)' "LargeStorageVector too strongly aligned"
```

The solution was to force the alignment to be size_t.

Reviewed By: wolfgangp

Differential Revision: https://reviews.llvm.org/D129369
2022-07-11 08:29:29 -04:00
Kazu Hirata c13d04e599 [DWARFLinker] Remove unused declaration copyAbbrev (NFC)
The corresponding definition was removed on Apr 26, 2021 in commit
233c24330b.
2022-07-10 22:10:23 -07:00
Kazu Hirata f2e1d2cec0 [GlobalISel] Remove unused declaration fewerElementsVectorSextInReg (NFC)
The corresponding definition was removed on Dec 23, 2021 in commit
29f88b93fd.
2022-07-10 20:41:02 -07:00
Nicolai Hähnle ede600377c ManagedStatic: remove many straightforward uses in llvm
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 10:29:15 +02:00
Nicolai Hähnle e9ce1a5880 Revert "ManagedStatic: remove many straightforward uses in llvm"
This reverts commit e6f1f06245.

Reverting due to a failure on the fuchsia-x86_64-linux buildbot.
2022-07-10 09:54:30 +02:00
Nicolai Hähnle e6f1f06245 ManagedStatic: remove many straightforward uses in llvm
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 09:15:08 +02:00
Fangrui Song 2c18e817ee [Support] Delete redundant 'static' from namespace scope 'static constexpr'. NFC 2022-07-09 23:36:01 -07:00
Corentin Jabot 50416e5454 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
It is probable thart this change crashes on the powerpc bots.

This reverts commit 355532a149.
2022-07-09 17:18:35 +02:00
Lang Hames 7ac7837080 [JITLink][AArch64] Rename PointerToGOT and fix typo.
PointerToGOT lowering was accidentally changed from Delta32 to Delta64 in
db37225803. This patch moves it back to Delta32 and renames the generic
aarch64 edge to Delta32ToGOT to avoid the ambiguity.

No test case yet -- I haven't figured out how to write a succinct test case
(this typically appears in CIEs in eh-frames).
2022-07-09 08:09:23 -07:00
Corentin Jabot 355532a149 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-09 11:26:45 +02:00
Leonard Chan 474c873148 Revert "[llvm] cmake config groundwork to have ZSTD in LLVM"
This reverts commit f07caf20b9 which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.
2022-07-08 13:48:05 -07:00
Cole Kissane f07caf20b9 [llvm] cmake config groundwork to have ZSTD in LLVM
- added `FindZSTD.cmake`
- added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465
2022-07-08 11:46:52 -07:00
Joseph Huber 5300263c70 [OpenMP] Add loop tripcount argument to kernel launch and remove push function
Previously we added the `push_target_tripcount` function to send the
loop tripcount to the device runtime so we knew how to configure the
teams / threads for execute the loop for a teams distribute construct.
This was implemented as a separate function mostly to avoid changing the
interface for backwards compatbility. Now that we've changed it anyway
and the new interface can take an arbitrary number of arguments via the
struct without changing the ABI, we can move this to the new interface.
This will simplify the runtime by removing unnecessary state between
calls.

Depends on D128550

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D128816
2022-07-08 14:44:16 -04:00
Joseph Huber 1fff116645 [OpenMP] Change OpenMP code generation for target region entries
This patch changes the code we generate to enter a target region on the
device. This is in-line with the new definition in the runtime that was
added previously. Additionally we implement this in the OpenMPIRBuilder
so that this code can be shared with Flang in the future.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D128550
2022-07-08 14:44:11 -04:00
Cole Kissane 96063bfa90 [llvm] Remove unused and redundant crc32 funcction from llvm::compression::zlib namespace
* Remove crc32 from zlib compression namespace, people should use the `llvm::crc32` instead.

Reviewed By: MaskRay, leonardchan

Differential Revision: https://reviews.llvm.org/D128754
2022-07-08 11:24:45 -07:00
Cole Kissane ea61750c35 [NFC] Refactor llvm::zlib namespace
* Refactor compression namespaces across the project, making way for a possible
  introduction of alternatives to zlib compression.
  Changes are as follows:
  * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.

Reviewed By: MaskRay, leonardchan, phosek

Differential Revision: https://reviews.llvm.org/D128953
2022-07-08 11:19:07 -07:00
Nicolai Hähnle 5a731d733c Fix test: LLVMGetBitcodeModule takes ownership of memory buffer
Clarify this behavior in the C interface header file and fix a related
bug in a test.

Differential Revision: https://reviews.llvm.org/D129113
2022-07-08 20:06:44 +02:00
Matt Arsenault 13ac4c3de9 GlobalISel: Add buildBoolExtInReg helper 2022-07-08 11:55:08 -04:00
Matt Arsenault 1ee6ce9bad GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.

The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.

I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
2022-07-08 11:55:08 -04:00
Valentin Clement 015834e455
[flang][openacc][NFC] Extract device_type parser to its own
Move the device_type parser to a separate parser AccDeviceTypeExprList. Preparatory work for D106968.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D106967
2022-07-08 16:02:04 +02:00
Valentin Clement 36e24da8eb
[flang][openacc][NFC] Make self clause value optional in ACC.td and extract the parser
Set the isOptional flag for the self clause. Move the optional and parenthesis part of the parser. Update the rest of the code to deal with the optional value.

Preparatory work for D106968.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D106965
2022-07-08 15:45:12 +02:00
Johannes Doerfert f6e0c05e3d Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"
This reverts commit f17639ea0c as three
AMDGPU tests haven't been updated. Will need to verify the changes are
not regressions we should avoid.
2022-07-08 00:53:38 -05:00
Johannes Doerfert f17639ea0c [Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981

Note: A previous version was flawed and consequently reverted in
      6555558a80.
2022-07-08 00:38:27 -05:00
Abinav Puthan Purayil c42fe5bd7a [GlobalISel][SelectionDAG] Implement the HasNoUse builtin predicate
This change introduces the HasNoUse builtin predicate in PatFrags that
checks for the absence of use of the first result operand.
GlobalISelEmitter will allow source PatFrags with this predicate to be
matched with destination instructions with empty outs. This predicate is
required for selecting the no-return variant of atomic instructions in
AMDGPU.

Differential Revision: https://reviews.llvm.org/D125212
2022-07-08 09:47:33 +05:30
Joseph Huber 41fba3c107 [Metadata] Add 'exclude' metadata to add the exclude flags on globals
This patchs adds a new metadata kind `exclude` which implies that the
global variable should be given the necessary flags during code
generation to not be included in the final executable. This is done
using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it
easier to specify this flag on a variable without needing to explicitly
check the section name in the target backend.

Depends on D129053 D129052

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D129151
2022-07-07 12:20:40 -04:00
Joseph Huber 1d2ce4da84 [Object] Add ELF section type for offloading objects
Currently we use the `.llvm.offloading` section to store device-side
objects inside the host, creating a fat binary. The contents of these
sections is currently determined by the name of the section while it
should ideally be determined by its type. This patch adds the new
`SHT_LLVM_OFFLOADING` section type to the ELF section types. Which
should make it easier to identify this specific data format.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D129052
2022-07-07 12:20:30 -04:00
Joseph Huber ed801ad5e5 [Clang] Use metadata to make identifying embedded objects easier
Currently we use the `embedBufferInModule` function to store binary
strings containing device offloading data inside the host object to
create a fatbinary. In the case of LTO, we need to extract this object
from the LLVM-IR. This patch adds a metadata node for the embedded
objects containing the embedded pointers and the sections they were
stored at. This should create a cleaner interface for identifying these
values.

In the future it may be worthwhile to also encode an `ID` in the
metadata corresponding to the object's special section type if relevant.
This would allow us to extract the data from an object file and LLVM-IR
using the same ID.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D129033
2022-07-07 12:20:25 -04:00
Nicolai Hähnle fdf7e437bf llvm-c: Add LLVMDeleteInstruction to fix a test issue
Not deleting the loose instruction with metadata associated to it causes
an assertion when the LLVMContext is destroyed. This was previously
hidden by the fact that llvm-c-test does not call LLVMShutdown. The
planned removal of ManagedStatic exposed this issue.

Differential Revision: https://reviews.llvm.org/D129114
2022-07-07 14:29:20 +02:00
Sven van Haastregt 1d9086bf05 Fix use of uninitialized member in constructor
The constructor does `Saver(Alloc)`, so `Alloc` should be
initialized first.  Move `Alloc` up in the declaration order.

Fixes a -Wuninitialized warning when building with GCC 12.1.

Reported-by: Mihail Atanassov <mihail.atanassov@arm.com>
2022-07-07 12:05:24 +01:00
Nikita Popov 4a579abd9f [GlobalsModRef] Don't override getModRefBehavior() for CallBase
BasicAA will already call getModRefBehavior() on the Function of
the CallBase if there are no operand bundles. This happens through
getBestAAResults(), i.e. it is a recursive call that will query
other AA providers, not just the BasicAA implementation.

As such, there is no need to reimplement the same functionality
in GlobalsModRef, a combination of BasicAA and GlobalsModRef already
handles it. This does mean that this no longer works under
-disable-basic-aa, but that's a testing only option.
2022-07-07 10:35:44 +02:00
Sander de Smalen 6106a767b7 [AArch64][SME] Update load/store intrinsics to take predicate corresponding to element size.
Instead of using <vscale x 16 x i1> for all the loads/stores, we now use the appropriate
predicate type according to the element size, e.g.

  ld1b uses <vscale x 16 x i1>
  ld1w uses <vscale x 4 x i1>
  ld1q uses <vscale x 1 x i1>

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D129083
2022-07-07 07:39:27 +00:00
Nico Weber e9fe20dab3 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit 4174f0ca61.

Also revert follow-up "[Clang] Fix invalid utf-8 detection"
This reverts commit bf45e27a67.

The second commit broke tests, see comments on
https://reviews.llvm.org/D129223, and it sounds like the first
commit isn't valid without the second one. So reverting both for now.
2022-07-06 22:51:52 +02:00
Nico Weber 39ed08f8d4 try to fix build after babef908cc 2022-07-06 22:15:09 +02:00