By default `llvm::seq` would happily iterate over enums, which may be unsafe if the enum values are not continuous. This patch disable enum iteration with `llvm::seq` and `llvm::seq_inclusive` and adds two new functions: `enum_seq` and `enum_seq_inclusive`.
To make sure enum iteration is safe, we require users to declare their enum types as iterable by specializing `enum_iteration_traits<SomeEnum>`. Because it's not always possible to add these traits next to enum definition (e.g., for enums defined in external libraries), we provide an escape hatch to allow iteration on per-callsite basis by passing `force_iteration_on_noniterable_enum`.
The main benefit of this approach is that these global declarations via traits can appear just next to enum definitions, making easy to spot when enums are miss-labeled, e.g., after introducing new enum values, whereas `force_iteration_on_noniterable_enum` should stand out and be easy to grep for.
This emerged from a discussion with gchatelet@ about reusing llvm's `Sequence.h` in lieu of https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/interface/lgc/EnumIterator.h.
Reviewed By: dblaikie, gchatelet, aaron.ballman
Differential Revision: https://reviews.llvm.org/D107378
This patch allows iterating typed enum via the ADT/Sequence utility.
It also changes the original design to better separate concerns:
- `StrongInt` only deals with safe `intmax_t` operations,
- `SafeIntIterator` presents the iterator and reverse iterator
interface but only deals with safe `StrongInt` internally.
- `iota_range` only deals with `SafeIntIterator` internally.
This design ensures that operations are always valid. In particular,
"Out of bounds" assertions fire when:
- the `value_type` is not representable as an `intmax_t`
- iterator operations make internal computation underflow/overflow
- the internal representation cannot be converted back to `value_type`
Differential Revision: https://reviews.llvm.org/D106279
Details:
Switch all #includes to use <> because that is consistent with what happens in the cmake checks.
Otherwise, we could be in the situation where cmake checks see that headers exist at <perfmon/...>
but in llvm-exegesis code, we use "perfmon/...", which may not exist.
Related PR/revisions: D84076, PR51017+D105615
Differential Revision: https://reviews.llvm.org/D105861
As was reported on PR50620, the X86LbrCounter destructor was double-closing the filedescriptor and not unmapping the buffer.
Differential Revision: https://reviews.llvm.org/D104201
[llvm-exegesis] Disable the LBR check on AMD
https://bugs.llvm.org/show_bug.cgi?id=48918
The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW.
Theory we've got is that the lbr-checking code could be confused on AMD.
Differential Revision: https://reviews.llvm.org/D97504
New change:
- Surround usages of x86 helper in llvm-exegesis/X86/Target.cpp with ifdef
- Fix bug which caused the caller of getVendorSignature to not have a copy of EAX that it expected.
https://bugs.llvm.org/show_bug.cgi?id=48918
The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW.
Theory we've got is that the lbr-checking code could be confused on AMD.
Differential Revision: https://reviews.llvm.org/D97504
Include x86 intrinsics only when compiling for x86_64
or i386. _MSC_VER no longer implies x86.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D96498
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9).
With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D90970
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9).
With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D90970
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
The addition of TILELOADD instructions with a new encoding format
triggered a hard abort instead of proper error reporting due to the use
of `llvm_unreachable` for actually reachable code.
Properly report an error when the encoding format is unknown.
Differential Revision: https://reviews.llvm.org/D90289
This is mostly for the benefit of the LBR latency mode.
Right now, it performs no checking. If this is run on non-supported hardware, it will produce all zeroes for latency.
Differential Revision: https://reviews.llvm.org/D85254
New change: Updated lit.local.cfg to use pass the right argument to llvm-exegesis to actually request the LBR mode.
Differential Revision: https://reviews.llvm.org/D88670
This reverts commit 4fcd1a8e65 as
`llvm/test/tools/llvm-exegesis/X86/lbr/mov-add.s` failed on hosts
without LBR supported if the build has LIBPFM enabled. On that host,
`perf_event_open` fails with `EOPNOTSUPP` on LBR config. That change's
basic assumption
> If this is run on a non-supported hardware, it will produce all zeroes for latency.
could not stand as `perf_event_open` system call will fail if the
underlying hardware really don't have LBR supported.
This is mostly for the benefit of the LBR latency mode.
Right now, it performs no checking. If this is run on non-supported hardware, it will produce all zeroes for latency.
Differential Revision: https://reviews.llvm.org/D85254
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
New change: check for existence of field `cycles` in perf_branch_entry before enabling this mode.
This should prevent compilation errors when building for older kernel whose headers don't support it.
From @erichkeane:
```
This patch doesn't seem to build for me:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp: In function ‘llvm::Error llvm::exegesis::parseDataBuffer(const char*, size_t, const void*, const void*, llvm::SmallVector<long int, 4>*)’:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp:99:37: error: ‘struct perf_branch_entry’ has no member named ‘cycles’
CycleArray->push_back(Entry.cycles);
I'm on RHEL7, so I have kernel 3.10, so it doesn't have 'cycles'.
According ot this: https://elixir.bootlin.com/linux/v4.3/source/include/uapi/linux/perf_event.h#L963 kernel 4.3 is the first time that 'cycles' appeared in this structure.
```
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8
respectively. But hardware ignore bits 2:0. So 0xe9-0xef is treated
the same as 0xe8. Similar for the other two.
Fixing this required adding 8 new formats to the X86 instructions
to convey this information. Could have gotten away with 3, but
adding all 8 made for a more logical conversion from format to
modrm encoding.
I renumbered the format encodings to keep the register modrm
formats grouped together.
isPrefix was added to support the patches to align branches.
it relies on a switch over instruction names.
This moves those opcodes to a new format so the information is
tablegen and we can just check for a specific value in some bits
in TSFlags instead.
I've left the other function in place for now so that the
existing patches in phabricator will still work. I'll work with
the owner to get them migrated.
function_ref is non-owning, so if we get it as a parameter in constructor,
our reference goes out-of-scope as soon as constructor returns.
Instead, let's just take it as a parameter to the actual `generate()` call
Summary:
Currently, we only have nice exploration for LEA instruction,
while for the rest, we rely on `randomizeUnsetVariables()`
to sometimes generate something interesting.
While that works, it isn't very reliable in coverage :)
Here, i'm making an assumption that while we may want to explore
multi-instruction configs, we are most interested in the
characteristics of the main instruction we were asked about.
Which we can do, by taking the existing `randomizeMCOperand()`,
and turning it on it's head - instead of relying on it to randomly fill
one of the interesting values, let's pregenerate all the possible interesting
values for the variable, and then generate as much `InstructionTemplate`
combinations of these possible values for variables as needed/possible.
Of course, that requires invasive changes to no longer pass just the
naked `Instruction`, but sometimes partially filled `InstructionTemplate`.
As it can be seen from the test, this allows us to explore
`X86::OperandType::OPERAND_COND_CODE` for instructions
that take such an operand.
I'm hoping this will greatly simplify exploration.
Reviewers: courbet, gchatelet
Reviewed By: gchatelet
Subscribers: orodley, mgorny, sdardis, tschuett, jrtc27, atanasyan, mstojanovic, andreadb, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74156
Summary:
It turns out that CUR_DIRECTION is just an internal placeholder, not an actual
valid encoded value.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73343
Summary:
What we're redoing already exists in the X86 backend, it's called
`X86II::getOperandBias`.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73340
Summary:
... instead of crashing.
On typical exmaple is when there are no available registers.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73196
Summary:
Right now when picking a back-to-back instruction at random, we might select
instructions that we do not know how to handle.
Add a ExegesisTarget hook to possibly filter instructions.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73161
Summary:
Add unit test to show the issue: We must select an *aliasing* output
register, not the exact register.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73095
The addition of `inverse_throughput` mode highlighted the disjointedness
of snippet generators and benchmark runners because it used the
`UopsSnippetGenerator` with the `LatencyBenchmarkRunner`.
To keep the code consistent tie the snippet generators to
parallelization/serialization rather than their benchmark runners.
Renaming `LatencySnippetGenerator` -> `SerialSnippetGenerator`.
Renaming `UopsSnippetGenerator` -> `ParallelSnippetGenerator`.
Differential Revision: https://reviews.llvm.org/D72928
Summary: This is a following up to D70874. It adds the initialization of FPCW in llvm-exegesis.
Reviewers: craig.topper, RKSimon, courbet, gchatelet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70891
Summary: This patch is used to initialize the new added register MXCSR.
Reviewers: craig.topper, RKSimon
Subscribers: tschuett, courbet, llvm-commits, LiuChen3
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70874