Currently the only way to do this is to work with the instruction list directly.
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.
Differential Revision: https://reviews.llvm.org/D139142
Currently the only way to do this is to work with the instruction list directly.
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.
Differential Revision: https://reviews.llvm.org/D138977
Re-land with constexpr StringRef::substr():
The TargetParser depends heavily on a collection of macros and enums to tie
together information about architectures, CPUs and extensions. Over time this
has led to some pretty awkward API choices. For example, recently a custom
operator-- has been added to the enum, which effectively turns iteration into
a graph traversal and makes the ordering of the macro calls in the header
significant. More generally there is a lot of string <-> enum conversion
going on. I think this shows the extent to which the current data structures
are constraining us, and the need for a rethink.
Key changes:
- Get rid of Arch enum, which is used to bind fields together. Instead of
passing around ArchKind, use the named ArchInfo objects directly or via
references.
- The list of all known ArchInfo becomes an array of pointers.
- ArchKind::operator-- is replaced with ArchInfo::implies(), which defines
which architectures are predecessors to each other. This allows features
from predecessor architectures to be added in a more intuitive way.
- Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some
functions become unnecessary and are deleted.
- Version number and profile are added to the ArchInfo. This makes comparison
of architectures easier and moves a couple of functions out of clang and
into AArch64TargetParser.
- clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID.
- AArch64::ArchProfile which is distinct from ARM::ArchProfile
- Give things sensible names and add some comments.
Differential Revision: https://reviews.llvm.org/D138792
The TargetParser depends heavily on a collection of macros and enums to tie
together information about architectures, CPUs and extensions. Over time this
has led to some pretty awkward API choices. For example, recently a custom
operator-- has been added to the enum, which effectively turns iteration into
a graph traversal and makes the ordering of the macro calls in the header
significant. More generally there is a lot of string <-> enum conversion
going on. I think this shows the extent to which the current data structures
are constraining us, and the need for a rethink.
Key changes:
- Get rid of Arch enum, which is used to bind fields together. Instead of
passing around ArchKind, use the named ArchInfo objects directly or via
references.
- The list of all known ArchInfo becomes an array of pointers.
- ArchKind::operator-- is replaced with ArchInfo::implies(), which defines
which architectures are predecessors to each other. This allows features
from predecessor architectures to be added in a more intuitive way.
- Free functions of the form f(ArchKind) are converted to ArchInfo::f(). Some
functions become unnecessary and are deleted.
- Version number and profile are added to the ArchInfo. This makes comparison
of architectures easier and moves a couple of functions out of clang and
into AArch64TargetParser.
- clang::AArch64TargetInfo ArchInfo is initialised to Armv8a not INVALID.
- AArch64::ArchProfile which is distinct from ARM::ArchProfile
- Give things sensible names and add some comments.
Differential Revision: https://reviews.llvm.org/D138792
Add ELFObjectFileBase::getLoongArchFeatures, and return the proper ELF
relative reloc type for LoongArch.
Reviewed By: MaskRay, SixWeining
Differential Revision: https://reviews.llvm.org/D138016
It's an artifact very specific to using TFAgents during training, so it
belongs with ModelUnderTrainingRunner.
Differential Revision: https://reviews.llvm.org/D139031
PHI Node can't be modeled like other instructions since its operand
number depends on predecessors. So we have a stand alone strategy for it.
Signed-off-by: Peter Rong <PeterRong96@gmail.com>
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138959
In C++20, types that declare or delete any constructors are no longer aggregates, breaking compilation of many existing uses of aggregate initialization. In this test, provide a one-arg constructor so that `StructWithoutCopyOrMove{1}` still works.
D137836 changed what llvm::get_physical_cores returns when threads are
disabled, to bring it inline with the other parts of Threading. It now
returns the value for "unknown" when threading is disabled.
This commit updates the tests (which are failing on some platforms), to
also reflect this change.
Differential Revision: https://reviews.llvm.org/D139015
Virtual Memory System Architecture (VMSA)
This is part of the 2022 A-Profile Architecture extensions and adds support for
the following:
- Translation Hardening Extension (FEAT_THE)
- 128-bit Page Table Descriptors (FEAT_D128)
- 56-bit Virtual Address (FEAT_LVA3)
- Support for 128-bit System Registers (FEAT_SYSREG128)
- System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128)
- 128-bit Atomic Instructions (FEAT_LSE128)
- Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
- Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
- Memory Attribute Index Enhancement (FEAT_AIE)
New instructions added:
- FEAT_SYSREG128 adds MRRS and MSRR.
- FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases.
- FEAT_LSE128 adds LDCLRP, LDSET, and SWPP instructions.
- FEAT_THE adds the set of RCW* instructions.
Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
Contributors:
Keith Walker
Lucas Prates
Sam Elliott
Son Tuan Vu
Tomas Matheson
Differential Revision: https://reviews.llvm.org/D138920
Currently the only way to do this is to work with the instruction list directly.
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.
Differential Revision: https://reviews.llvm.org/D138875
Add a new version of `zip` that assumes that all iteratees have equal
lengths. The difference compared to `zip_first` is that `zip_equal`
checks this assumption in builds with assertions enabled.
This will allow us to clearly express the intent when working with
equally-sized ranges without having to write this assertion manually.
This is similar to Python's `zip(..., equal=True)` [1] or
`more_itertools.zip_equal` [2].
I saw this first suggested by @benvanik.
[1] https://peps.python.org/pep-0618/
[2] https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.zip_equal
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D138865
Update the documentation comment and add a new test case.
Add an assertion in `zip_first` checking the iteratee length precondition.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D138858
Randomlly select an instruction and try to use it in the future by replacing it with another instruction's operand.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138948
`connectToSink` uses a value by putting it in a future instruction.
It will replace the operand of a future instruction with the current value.
However, if current value is an `Instruction` and put into a switch case, the module is invalid.
We fix that by only connecting to Br/Switch's condition, and don't touch other operands.
Will have other strategies to mutate other Br/Switch operands to be patched once this patch is passed
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138890
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
The big change here is that now if you have threading disabled,
`llvm::get_physical_cores()` will return -1, as if it had not been able
to work out the right info. This is due to how Threading.cpp includes
OS-specific code/headers. This seems ok, as if threading is disabled,
LLVM should not need to know the number of physical cores.
Differential Revision: https://reviews.llvm.org/D137836
In revision B.q and before of the Armv8-M architecture reference
manual, the vector/scalar forms of the `vmla` and `vmlas` instructions
came in signed and unsigned integer forms, such as `vmla.s8 q0,q1,r2`
or `vmlas.u32 q3,q4,r5`.
Revision B.r has changed this. There are no longer signed and unsigned
versions of these instructions, since they were functionally identical
anyway. Now there is just `vmla.i8` (or `i16` or `i32`, and similarly
for `vmlas`). Bit 28 of the instruction encoding, which was previously
0 for signed or 1 for unsigned, is now expected to be 0 always.
This change updates LLVM to the new version of the architecture. The
obsoleted encodings for unsigned integers are now decoding errors, and
only the still-valid encoding is ever emitted. This shouldn't break
any existing assembly code, because the old signed and unsigned
versions of the mnemonic are still accepted by the assembler (which is
standard practice anyway for all signedness-agnostic MVE integer
instructions).
Reviewed By: dmgreen, lenary
Differential Revision: https://reviews.llvm.org/D138827
`ShuffleBlockStrategy` will shuffle the instructions in a basic block without breaking the dependency of instructions.
It is implemented as a topological sort, only we randomly select instructions with no dependency.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138339
All east asian width wide and full-width codepoints
are considered double width, as well as emojis and
symbols commonely rendered as emoji.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D138518
This patch implements assembly support for the 2022 A-Profile Architecture
extension FEAT_LRCPC3. FEAT_LRCPC3 is AArch64 only and introduces new
variants of load/store instructions with release consistency ordering.
Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
This feature is optionally available from v8.2a and therefore not enabled by
default.
Contributors:
Lucas Prates
Sam Elliot
Son Tuan Vu
Tomas Matheson
Differential Revision: https://reviews.llvm.org/D138579
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
Differential Revision: https://reviews.llvm.org/D137836
Fix replaceVariableLocationOp unconditionally replacing the first operand of a
dbg.assign.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D138561
I noticed these were missing, so this adds Host identifiers for
cortex-a55, cortex-a510, cortex-a710 and cortex-x2, taken from their
respective TRMs.
Differential Revision: https://reviews.llvm.org/D138497
This patch introudces the OpenMPIRBuilderConfig class which contains various
flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to
keep the flags in one place so they do not have to be passed in every time.
The flags can be set optionally since some uses cases don't rely on functions
that depend on these flags.
Reviewed By: jdoerfert, tschuett
Differential Revision: https://reviews.llvm.org/D138220
This patch implements the 2022 Architecture General Data-Processing Instructions
They include:
Common Short Sequence Compression (CSSC) instructions
- scalar comparison instructions
SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate
- ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes)
- command-line options for CSSC
Associated with these instructions in the documentation is the Range Prefetch
Memory (RPRFM) instruction, which signals to the memory system that data memory
accesses from a specified range of addresses are likely to occur in the near
future. The instruction lies in hint space, and is made unconditional.
Specs for the individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
contributors to this patch:
- Cullen Rhodes
- Son Tuan Vu
- Mark Murray
- Tomas Matheson
- Sam Elliott
- Ties Stuij
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D138488
This patch replaces None with a custom base class in
FormatVariadicTest.cpp.
As part of the migration from llvm::Optional to std::optional, I'd
like to define None as std::nullopt, but FormatVariadicTest.cpp blocks
that.
When you specialize indexed_accessor_range with the base class being
None, the template instantiation eventually generates code to compare
two instances of None. That's not guaranteed with std::nullopt.
Replacing None with a custom base class allows me to define None as
std::nullopt.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Differential Revision: https://reviews.llvm.org/D138381
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction
itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer
elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind
of information impacts how the instruction takes to execute and what dependencies this may cause.
On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or
vl, in addition with the instruction itself. But MCA does not track or use the data in these
registers. This patch fixes this problem by introducing Instruments into MCA.
* Replace `CodeRegions` with `AnalysisRegions`
* Add `Instrument` and `InstrumentManager`
* Add `InstrumentRegions`
* Add RISCV Instrument and `InstrumentManager`
* Parse `Instruments` in driver
* Use instruments to override schedule class
* RISCV use lmul instrument to override schedule class
* Fix unit tests to pass empty instruments
* Add -ignore-im clopt to disable this change
A prior version of this patch was commited in 5e82ee5373. 2323a4ee61 reverted
that change because the unit test files caused build errors. The change with fixes
were committed in b88b8307bf but reverted once again e8e92c8313 due to more
build errors.
This commit adds the prior changes and fixes the build error.
Differential Revision: https://reviews.llvm.org/D137440
This is useful where tests previously encoded information in the name
names of ranges and points. Currently, this is pretty limited because
names consist of only alphanumeric characters and '_'.
With this patch, we can keep the names simple and attach optional
payloads to ranges and points instead.
The new syntax should be fully backwards compatible (if I haven't missed
anything). I tested this against clangd unit tests and everything still passes.
Differential Revision: https://reviews.llvm.org/D137909
The invalid case is now represented by an empty StringRef rather than
a nullptr.
Previously ARCH_FEATURE was build from SUB_ARCH by prepending "+".
This is now reverse, so that the "+arch-feature" is now visible in
the .def, which is a bit clearer. This meant converting one StringSwitch
into a loop.
Removed getters which are now mostly unnecessary.
Removed some old FIXMEs.
Differential Revision: https://reviews.llvm.org/D138026
After 32f1c5531b, getDefiningRecipe returns a VPRecipeBase* so there's
no need to cast to VPRecipeBase.
Suggested by @Ayal during review of D136068, thanks!
The return value of getDef is guaranteed to be a VPRecipeBase and all
users can also accept a VPRecipeBase *. Most users actually case to
VPRecipeBase or a specific recipe before using it, so this change
removes a number of redundant casts.
Also rename it to getDefiningRecipe to make the name a bit clearer.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D136068