llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	1c3bcc6ce5	[llvm-mca] Speed up the computation of the wait/ready/issued sets in the Scheduler. This patch is a follow-up to r338702. We don't need to use a map to model the wait/ready/issued sets. It is much more efficient to use a vector instead. This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained after r338702). llvm-svn: 338883	2018-08-03 12:55:28 +00:00
Andrea Di Biagio	c2619a2f3d	[llvm-mca] Use a vector to store ResourceState objects in the ResourceManager. We don't need to use a map to store ResourceState objects. The number of processor resources is known statically from the scheduling model. We can therefore use a vector, and reserve a slot for each processor resource that we want to simulate. Every time the ResourceManager queries the ResourceState vector, the index to the vector of ResourceState objects can be easily computed from the processor resource mask. This drastically reduces the time complexity of method ResourceManager::use() and method ResourceManager::release(). This patch gives an average speedup of 12%. llvm-svn: 338702	2018-08-02 11:12:35 +00:00
Andrea Di Biagio	7f3bf5c1f9	[llvm-mca] Correctly update the rank in `Scheduler::select()`. Found by inspection. llvm-svn: 338579	2018-08-01 16:06:33 +00:00
Andrea Di Biagio	23fbe7cbb5	[llvm-mca] Improve a few debug prints. NFC llvm-svn: 337003	2018-07-13 14:55:47 +00:00
Andrea Di Biagio	61c52af9d9	[llvm-mca] improve the instruction issue logic implemented by the Scheduler. This patch modifies the Scheduler heuristic used to select the next instruction to issue to the pipelines. The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for that test should have been ~2.00. It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be predicted by a Scheduler that only prioritizes instructions based on their "age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s, for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00. Instructions in the ReadyQueue are now ranked based on two factors: - The "age" of an instruction. - The number of unique users of writes associated with an instruction. The new logic still prioritizes older instructions over younger instructions to minimize the pressure on the reorder buffer. However, the number of users of an instruction now also affects the overall rank. This potentially increases the ability of the Scheduler to extract instruction level parallelism. This patch fixes the problem with the wrong IPC reported for test add-sequence.s and test dependent-pmuld-paddd.s. llvm-svn: 336420	2018-07-06 08:08:30 +00:00
Andrea Di Biagio	eb1bef60b9	[llvm-mca] Avoid calling method update() on instructions that are already in the IS_READY state. NFCI When promoting instructions from the wait queue to the ready queue, we should check if an instruction has already reached the IS_READY state before calling method update(). llvm-svn: 335722	2018-06-27 11:17:07 +00:00
Andrea Di Biagio	580f3eb226	[llvm-mca] Removed wrong NDEBUG guards introduced by my last commit. This partially reverts r335589. llvm-svn: 335592	2018-06-26 11:00:21 +00:00
Andrea Di Biagio	eec6b81922	[llvm-mca] Remove unused header files and correctly guard some include headers under NDEBUG. NFC llvm-svn: 335589	2018-06-26 10:44:12 +00:00
Matt Davis	dea343d2b3	[llvm-mca] Rename Backend to Pipeline. NFC. Summary: This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. Variables and comments have also been updated to reflect this change. The reason for this rename, is to be slightly more correct about what MCA is modeling. MCA models a Pipeline, which implies some logical sequence of stages. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48496 llvm-svn: 335496	2018-06-25 16:53:00 +00:00
Matt Davis	488ac4cb39	[llvm-mca] Introduce the ExecuteStage (was originally the Scheduler class). Summary: This patch transforms the Scheduler class into the ExecuteStage. Most of the logic remains. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47246 llvm-svn: 334679	2018-06-14 01:20:18 +00:00
Andrea Di Biagio	4037011404	[llvm-mca] Fixed a problem caused by an invalid use of a processor resource mask in the Scheduler. The lambda function used by method ResourceManager::mustIssueImmediately() was incorrectly truncating masks of buffered processor resources to 32-bit quantities. The invalid mask values were then used to access a map of processor resource descriptors. Fixes PR37643. llvm-svn: 333692	2018-05-31 20:27:46 +00:00
Matt Davis	5b79ffc5bc	[llvm-mca] Add the RetireStage. Summary: This class maintains the same logic as the original RetireControlUnit. This is just an intermediate patch to make the RCU a Stage. Future patches will remove the dependency on the DispatchStage, and then more properly populate the pre/execute/post Stage interface. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb, courbet Subscribers: javed.absar, mgorny, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47244 llvm-svn: 333292	2018-05-25 18:00:25 +00:00
Matt Davis	679083e3d8	[llvm-mca] Make Dispatch a subclass of Stage. Summary: The logic of dispatch remains the same, but now DispatchUnit is a Stage (DispatchStage). This change has the benefit of simplifying the backend runCycle() code. The same logic applies, but it belongs to different components now. This is just a start, eventually we will need to remove the call to the DispatchStage in Scheduler.cpp, but that will be a separate patch. This change is mostly a renaming and moving of existing logic. This change also encouraged me to remove the Subtarget (STI) member from the Backend class. That member was used to initialize the other members of Backend and to eventually call DispatchUnit::dispatch(). Now that we have Stages, we can eliminate this by instantiating the DispatchStage with everything it needs at the time of construction (e.g., Subtarget). That change allows us to call DispatchStage::execute(IR) as we expect to call execute() for all other stages. Once we add the Stage list (D46907) we can more cleanly call preExecute() on all of the stages, DispatchStage, will probably wrap cycleEvent() in that case. Made some formatting and minor cleanups to README.txt. Some of the text was re-flowed to stay within 80 cols. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46983 llvm-svn: 332652	2018-05-17 19:22:29 +00:00
Andrea Di Biagio	8ea3a34e39	[llvm-mca] Improved support for dependency-breaking instructions. The tool assumes that a zero-latency instruction that doesn't consume hardware resources is an optimizable dependency-breaking instruction. That means, it doesn't have to wait on register input operands, and it doesn't consume any physical register. The PRF knows how to optimize it at register renaming stage. llvm-svn: 332249	2018-05-14 15:08:22 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Matt Davis	21a8d32307	[llvm-mca] Avoid exposing index values in the MCA interfaces. Summary: This patch eliminates many places where we originally needed to pass index values to represent an instruction. The index is still used as a key in various parts of MCA. I'm not comfortable eliminating the index just yet. By burying the index in the instruction, we can avoid exposing that value in many places. Eventually, we should consider removing the Instructions list in the Backend altogether; it's only used to hold and reclaim the memory for the allocated Instruction instances. Instead we could pass around a smart pointer. But that's a separate discussion/patch. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46367 llvm-svn: 331660	2018-05-07 18:29:15 +00:00
Andrea Di Biagio	e047d3529b	[llvm-mca] Correctly handle zero-latency stores that consume pipeline resources. This fixes PR37293. We can have scheduling classes with no write latency entries, that still consume processor resources. We don't want to treat those instructions as zero-latency instructions; they still have to be issued to the underlying pipelines, so they still consume resource cycles. This is likely to be a regression which I have accidentally introduced at revision 330807. Now, if an instruction has a non-empty set of write processor resources, we conservatively treat it as a normal (i.e. non zero-latency) instruction. llvm-svn: 331193	2018-04-30 15:55:04 +00:00
Matt Davis	ad78e6673c	[MCA] [NFC] Remove unused Index formal from ResourceManager::issueInstruction Summary: The instruction index was never referenced in the body. Just a minor cleanup. Reviewers: andreadb Reviewed By: andreadb Subscribers: javed.absar, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46142 llvm-svn: 331001	2018-04-26 22:30:40 +00:00
Andrea Di Biagio	db66efcb6a	[llvm-mca] Remove method Instruction::isZeroLatency(). NFCI llvm-svn: 330807	2018-04-25 09:38:58 +00:00
Andrea Di Biagio	27c4b09626	[llvm-mca] Refactor the Scheduler interface in preparation for PR36663. Zero latency instructions are now scheduled the same way as other instructions. Before this patch, there was a specialized code path for those instructions. All scheduler events are now generated from method `scheduleInstruction()` and from method `cycleEvent()`. This will make it easier to implement an "execution stage", and let that stage publish all the scheduler events. No functional change intended. llvm-svn: 330723	2018-04-24 14:53:16 +00:00
Andrea Di Biagio	c752616f30	[llvm-mca] Ensure that instructions with a schedule read-advance are always issued in the right order. Normally, the Scheduler prioritizes older instructions over younger instructions during the instruction issue stage. In one particular case where a dependent instruction had a schedule read-advance associated to one of the input operands, this rule was not correctly applied. This patch fixes the issue and adds a test to verify that we don't regress that particular case. llvm-svn: 330032	2018-04-13 15:19:07 +00:00
Andrea Di Biagio	3e64644de8	[llvm-mca] Removed unused argument from cycleEvent. NFC llvm-svn: 329895	2018-04-12 10:49:40 +00:00
Andrea Di Biagio	b24953bbfb	[llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources. This patch moves part of the logic that notifies dispatch stall events from the DispatchUnit to the Scheduler. The main goal of this patch is to remove (yet another) dependency between the DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know about `Scheduler::Event` and how to classify stalls due to the lack of scheduling resources. This patch removes that knowledge and simplifies the logic in DispatchUnit::checkScheduler. This is another change done in preparation for the work to fix PR36663. No functional change intended. llvm-svn: 329835	2018-04-11 18:05:23 +00:00
Andrea Di Biagio	0a837ef6b1	[llvm-mca] Correctly set the ReadAdvance information for register use operands. The tool was passing the wrong operand index to method MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and not the operand index. This was found when testing X86 code where instructions had a memory folded operand. This patch fixes the issue and adds test read-advance-1.s to ensure that the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used. llvm-svn: 328790	2018-03-29 14:26:56 +00:00
Andrea Di Biagio	51dba7d3ab	[llvm-mca] Make the resource cost a double. This is done in preparation for the fix for PR36874. The number of cycles consumed for each pipe is now a double quantity. This allows reuse of the resource pressure view to print out instruction tables. llvm-svn: 328335	2018-03-23 17:36:07 +00:00
Andrea Di Biagio	2dee62bd0a	[llvm-mca] Minor refactoring. NFCI Also, removed a couple of unused methods from class Instruction. llvm-svn: 328198	2018-03-22 14:14:49 +00:00
Andrea Di Biagio	09ea09e478	[llvm-mca] Simplify (and better standardize) the Instruction interface. llvm-svn: 328190	2018-03-22 11:39:34 +00:00
Andrea Di Biagio	3562248825	[llvm-mca] Simplify code. NFC llvm-svn: 328187	2018-03-22 10:19:20 +00:00
Andrea Di Biagio	847accd001	[llvm-mca] Remove const from a bunch of ArrayRef. NFC llvm-svn: 328018	2018-03-20 19:06:34 +00:00
Andrea Di Biagio	a3f2e483dd	[llvm-mca] Move the logic that computes the scheduler's queue usage to the BackendStatistics view. This patch introduces two new callbacks in the event listener interface to handle the "buffered resource reserved" event and the "buffered resource released" event. Every time a buffered resource is used, an event is generated. Before this patch, the Scheduler (with the help of the ResourceManager) was responsible for tracking the scheduler's queue usage. However, that design forced the Scheduler to 'publish' scheduler's queue pressure information through the Backend interface. The goal of this patch is to break the dependency between the BackendStatistics view, and the Backend. Now the Scheduler knows how to notify "buffer reserved/released" events. The scheduler's queue usage analysis has been moved to the BackendStatistics. Differential Revision: https://reviews.llvm.org/D44686 llvm-svn: 328011	2018-03-20 18:20:39 +00:00
Andrea Di Biagio	4704f0386b	[llvm-mca] Move the routine that computes processor resource masks to its own file. Function computeProcResourceMasks is used by the ResourceManager (owned by the Scheduler) to compute resource masks for processor resources. Before this refactoring, there was an implicit dependency between the Scheduler and the InstrBuilder. That is because InstrBuilder has to know about resource masks when computing the set of processor resources consumed by a new instruction. With this patch, the functionality that computes resource masks has been extracted from the ResourceManager, and moved to a separate file (Support.h). This helps removing the dependency between the Scheduler and the InstrBuilder. No functional change intended. llvm-svn: 327973	2018-03-20 12:25:54 +00:00
Andrea Di Biagio	44bfcd2d63	[llvm-mca] Simplify code. NFC llvm-svn: 327886	2018-03-19 19:09:38 +00:00
Andrea Di Biagio	b52297508e	[llvm-mca] Remove the logic that computes the reciprocal throughput, and make the SummaryView independent from the Backend. NFCI Since r327420, the tool can query the MCSchedModel interface to obtain the reciprocal throughput information. As a consequence, method `ResourceManager::getRThroughput`, and method `Backend::getRThroughput` are no longer needed. This patch simplifies the code by removing the custom RThroughput computation. This patch also refactors class SummaryView by removing the dependency with the Backend object. No functional change intended. llvm-svn: 327425	2018-03-13 17:24:32 +00:00
Andrea Di Biagio	e1a1da1126	[llvm-mca] Use a const ArrayRef in a few places. NFC llvm-svn: 327396	2018-03-13 13:58:02 +00:00
Clement Courbet	844f22d3c3	[llvm-mca] Refactor event listeners to make the backend agnostic to event types. Summary: This is a first step towards making the pipeline configurable. Subscribers: llvm-commits, andreadb Differential Revision: https://reviews.llvm.org/D44309 llvm-svn: 327389	2018-03-13 13:11:01 +00:00
Andrea Di Biagio	0c54129907	[llvm-mca] Views are now independent from resource masks. NFCI This change removes method Backend::getProcResourceMasks() and simplifies some logic in the Views. This effectively removes yet another dependency between the views and the Backend. No functional change intended. llvm-svn: 327214	2018-03-10 16:55:07 +00:00
Andrea Di Biagio	373c38a2db	[llvm-mca] Fix handling of zero-latency instructions. This patch fixes a problem found when testing zero latency instructions on target AArch64 -mcpu=exynos-m3 / -mcpu=exynos-m1. On Exynos-m3/m1, direct branches are zero-latency instructions that don't consume any processor resources. The DispatchUnit marks zero-latency instructions as "executed", so that no scheduling is required. The event of instruction executed is then notified to all the listeners, and the reorder buffer (managed by the RetireControlUnit) is updated. In particular, the entry associated to the zero-latency instruction in the reorder buffer is marked as executed. Before this patch, the DispatchUnit forgot to assign a retire control unit token (RCUToken) to the zero-latency instruction. As a consequence, the RCUToken was used uninitialized. This was causing a crash in the RetireControlUnit logic. Fixes PR36650. llvm-svn: 327056	2018-03-08 20:21:55 +00:00
Andrea Di Biagio	3a6b092017	[llvm-mca] LLVM Machine Code Analyzer. llvm-mca is an LLVM based performance analysis tool that can be used to statically measure the performance of code, and to help triage potential problems with target scheduling models. llvm-mca uses information which is already available in LLVM (e.g. scheduling models) to statically measure the performance of machine code in a specific cpu. Performance is measured in terms of throughput as well as processor resource consumption. The tool currently works for processors with an out-of-order backend, for which there is a scheduling model available in LLVM. The main goal of this tool is not just to predict the performance of the code when run on the target, but also help with diagnosing potential performance issues. Given an assembly code sequence, llvm-mca estimates the IPC (instructions per cycle), as well as hardware resources pressure. The analysis and reporting style were mostly inspired by the IACA tool from Intel. This patch is related to the RFC on llvm-dev visible at this link: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html Differential Revision: https://reviews.llvm.org/D43951 llvm-svn: 326998	2018-03-08 13:05:02 +00:00
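
The vector-based lookup described in r338702 (c2619a2f3d) can be illustrated with a minimal C++ sketch. It assumes that every processor resource mask has a distinct most-significant bit, so that the position of that bit can serve as a vector index. The names `indexFromMask`, `ResourceSlot`, and `SlotTable` are hypothetical and are not taken from the llvm-mca sources; the actual patch may compute the index differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 0-based position of the highest set bit in Mask.
// Assumption: Mask is non-zero and its top bit is unique per resource.
inline unsigned indexFromMask(std::uint64_t Mask) {
  assert(Mask != 0 && "a resource mask must have at least one bit set");
  unsigned Index = 0;
  while (Mask >>= 1)
    ++Index;
  return Index;
}

// Hypothetical stand-in for a per-resource state object.
struct ResourceSlot {
  std::uint64_t Mask = 0;
  unsigned UseCount = 0;
};

// One slot per resource, reserved up front; lookups index the vector
// directly instead of searching a map keyed by mask.
class SlotTable {
  std::vector<ResourceSlot> Slots;

public:
  explicit SlotTable(const std::vector<std::uint64_t> &Masks) {
    unsigned MaxIndex = 0;
    for (std::uint64_t M : Masks)
      MaxIndex = std::max(MaxIndex, indexFromMask(M));
    Slots.resize(Masks.empty() ? 0 : MaxIndex + 1);
    for (std::uint64_t M : Masks)
      Slots[indexFromMask(M)].Mask = M;
  }

  void use(std::uint64_t Mask) { ++Slots[indexFromMask(Mask)].UseCount; }

  void release(std::uint64_t Mask) {
    ResourceSlot &S = Slots[indexFromMask(Mask)];
    assert(S.UseCount > 0 && "release() without a matching use()");
    --S.UseCount;
  }

  unsigned useCount(std::uint64_t Mask) const {
    return Slots[indexFromMask(Mask)].UseCount;
  }

  std::size_t size() const { return Slots.size(); }
};
```

With a table built from the masks 0x1, 0x2, 0x4 and 0xB, a call such as `use(0xB)` touches slot 3 directly. The per-call cost is a short bit scan rather than a map lookup, which is the kind of saving the commit message attributes to the change.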

38 Commits
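
The two-factor ranking described in r336420 (61c52af9d9) can be sketched in C++ as follows. The commit message names the two factors (instruction age and number of unique users of its writes) but does not give the formula, so the combination used here (source index minus user count, lowest rank wins, ties go to the older instruction) is an assumption made for illustration only. The names `ReadyEntry`, `computeRank`, and `selectNext` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of an instruction waiting in the ready queue.
struct ReadyEntry {
  unsigned SourceIndex; // position in program order; smaller means older
  unsigned NumUsers;    // number of unique users of this instruction's writes
};

// Assumed ranking: older instructions and instructions with more users
// get a lower (better) rank. Not necessarily the formula used by llvm-mca.
inline long computeRank(const ReadyEntry &E) {
  return static_cast<long>(E.SourceIndex) - static_cast<long>(E.NumUsers);
}

// Returns the position in Ready of the entry to issue next, or
// Ready.size() if the queue is empty.
inline std::size_t selectNext(const std::vector<ReadyEntry> &Ready) {
  std::size_t Best = Ready.size();
  for (std::size_t I = 0; I < Ready.size(); ++I) {
    if (Best == Ready.size()) {
      Best = I;
      continue;
    }
    const long RankI = computeRank(Ready[I]);
    const long RankBest = computeRank(Ready[Best]);
    // Prefer the lower rank; on a tie, prefer the older instruction.
    if (RankI < RankBest ||
        (RankI == RankBest &&
         Ready[I].SourceIndex < Ready[Best].SourceIndex))
      Best = I;
  }
  return Best;
}
```

Under this assumed formula, an instruction at source index 1 with three users outranks an older instruction at index 0 with no users. A purely age-based selection would pick the older one, which is the behavior the commit message says could not reach the measured IPC.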