llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Savonichev	bba25a9cd8	[MCA] Support carry-over instructions for in-order processors Instructions that have more uops than the processor's IssueWidth are issued in multiple cycles. The patch fixes PR49712. Differential Revision: https://reviews.llvm.org/D99339	2021-03-26 00:06:19 +03:00
Andrew Savonichev	292da93d59	[MCA] Disable RCU for InOrderIssueStage This is a follow-up for: D98604 [MCA] Ensure that writes occur in-order When instructions are aligned by the order of writes, they retire in-order naturally. There is no need for an RCU, so it is disabled. Differential Revision: https://reviews.llvm.org/D98628	2021-03-24 13:54:04 +03:00
Andrew Savonichev	e6ce0db378	[MCA] Ensure that writes occur in-order Delay the issue of a new instruction if that leads to out-of-order commits of writes. This patch fixes the problem described in: https://bugs.llvm.org/show_bug.cgi?id=41796#c3 Differential Revision: https://reviews.llvm.org/D98604	2021-03-18 17:10:20 +03:00
Andrew Savonichev	d791695cb5	[MCA] Add support for in-order CPUs This patch adds a pipeline to support in-order CPUs such as ARM Cortex-A55. In-order pipeline implements a simplified version of Dispatch, Scheduler and Execute stages as a single stage. Entry and Retire stages are common for both in-order and out-of-order pipelines. Differential Revision: https://reviews.llvm.org/D94928	2021-03-04 14:08:19 +03:00
David Green	6c89f6fae4	[AArch64] Attempt to fix Mac tests with a more specific triple. NFC	2021-01-04 11:29:18 +00:00
Usman Nadeem	685c8b537a	[AARCH64] Improve accumulator forwarding for Cortex-A57 model The old CPU model only had MLA->MLA forwarding. I added some missing MUL->MLA read advances and a missing absolute diff accumulator read advance according to the Cortex A57 Software Optimization Guide. The patch improves performance in EEMBC rgbyiqv2 by about 6%-7% and spec2006/milc by 8% (repeated runs on multiple devices), causes no significant regressions (none in SPEC). Differential Revision: https://reviews.llvm.org/D92296	2021-01-04 10:58:43 +00:00
Sjoerd Meijer	630d37dc1b	[AArch64] Enable Cortex-A55 schedmodel The model was committed in `4b8ade837e` but not yet enabled to allow for a few fix ups. This adds a few of these fixes, and also an LLVM MCA test to check most instructions. While I do have plans to look into some more tuning, it's time to enable this as it is better than using the A53 schedule. Differential Revision: https://reviews.llvm.org/D88017	2020-11-30 19:28:34 +00:00
Caroline Concatto	71038788ce	Revert "[AArch64][AsmParser] Remove 'x31' alias for 'sp/xzr' register." This reverts commit `8b281bfaf3`.	2020-11-02 08:15:50 +00:00
Caroline Concatto	8b281bfaf3	[AArch64][AsmParser] Remove 'x31' alias for 'sp/xzr' register. Only the aliases 'xzr' and 'sp' exist for the physical register x31. The reason for wanting to remove the alias 'x31' is because it allows users to write invalid asm that is not accepted by the GNU assembler. Is there any objection to removing this alias? Or do we want to keep this for compatibility with existing code that uses w31/x31? Differential Revision: https://reviews.llvm.org/D90153	2020-11-02 07:57:05 +00:00
Evgeny Leviant	2e61cd1295	[MachineScheduler] Fix operand scheduling for pre/post-increment loads Differential revision: https://reviews.llvm.org/D87557	2020-09-12 16:53:12 +03:00
Andrea Di Biagio	5578ec32f9	[MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as dependent. Fixes PR45793. This fixes a regression introduced by a very old commit `280ac1fd1d` (was llvm-svn 361950). Commit `280ac1fd1d` redesigned the logic in the LSUnit with the goal of speeding up isReady() queries, and stabilising the LSUnit API (while also making the load store unit more customisable). The concept of MemoryGroup (effectively an alias set) was added by that commit to better describe and track dependencies between memory operations. However, that concept was not just used for alias dependencies, but it was also used for describing memory "order" dependencies (enforced by the memory consistency model). Instructions of the same memory group were considered "equivalent" as in: independent operations that can potentially execute in parallel. The problem was that the cost of a dependency (in terms of number of cycles) should have been different for "order" dependency. Instructions in an order dependency simply have to wait until their predecessors are "issued" to an underlying pipeline (rather than having to wait until predecessors have been fully executed). For simple "order" dependencies, this was effectively introducing an artificial delay on the "issue" of independent loads and stores. This patch fixes the issue and adds a new test named 'independent-load-stores.s' to a bunch of x86 targets. That test contains the reproducible posted by Fabian Ritter on PR45793. I had to rerun the update-mca-tests script on several files. To avoid expected regressions on some Exynos tests, I have added a -noalias=false flag (to match the old strict behavior on latencies). Some tests for processor Barcelona are improved/fixed by this change and they now show better results. In a few tests we were incorrectly counting the time spent by instructions in a scheduler queue. In one case in particular we now correctly see a store executed out of order. That test was affected by the same underlying issue reported as PR45793. Reviewers: mattd Differential Revision: https://reviews.llvm.org/D79351	2020-05-05 10:25:36 +01:00
Evandro Menezes	ff0f407e90	[MCA] Fix test cases (NFC) Fix the test cases for Exynos M5 that break under Darwin.	2019-11-22 16:19:58 -06:00
Evandro Menezes	48b7fe02a1	[AArch64] Add the pipeline model for Exynos M5 Add the scheduling and cost models for Exynos M5.	2019-11-22 15:09:17 -06:00
Eric Christopher	8259182e51	Revert "[AArch64] Add the pipeline model for Exynos M5" as it's causing test failures in llvm-mca. This reverts commit `9bdfee2a3b`.	2019-11-20 16:04:52 -08:00
Evandro Menezes	9bdfee2a3b	[AArch64] Add the pipeline model for Exynos M5 Add the scheduling and cost models for Exynos M5.	2019-11-20 16:56:07 -06:00
Evandro Menezes	80c03fb5c2	[mca] Fix test case (NFC) Fix test case for Darwin builds.	2019-10-31 16:44:52 -05:00
Evandro Menezes	f9af4ccb8a	[AArch64] Update for Exynos Fix the costs of `add` and `orr` with an immediate operand.	2019-10-31 15:25:22 -05:00
Evandro Menezes	215da6606c	[clang][llvm] Obsolete Exynos M1 and M2	2019-10-30 15:02:59 -05:00
Andrea Di Biagio	f6a60f1f80	[llvm-mca][scheduler-stats] Print issued micro opcodes per cycle. NFCI It makes more sense to print out the number of micro opcodes that are issued every cycle rather than the number of instructions issued per cycle. This behavior is also consistent with the dispatch-stats: numbers from the two views can now be easily compared. llvm-svn: 357919	2019-04-08 16:05:54 +00:00
Evandro Menezes	946fe976fd	[llvm-mca] Update tests for Exynos (NFC) Update test cases for Exynos M4. llvm-svn: 350961	2019-01-11 19:36:27 +00:00
Evandro Menezes	9b7b5b1dcc	[llvm-mca] Update the Exynos test cases (NFC) Add more entropy to the test cases. llvm-svn: 350662	2019-01-08 22:29:56 +00:00
Evandro Menezes	7927a45cdb	[llvm-mca] Rename directory for the Cortex tests (NFC) llvm-svn: 349688	2018-12-19 22:24:42 +00:00
Evandro Menezes	7f37ec7cd3	[llvm-mca] Update Exynos test cases (NFC) llvm-svn: 349687	2018-12-19 22:24:39 +00:00
Evandro Menezes	5d409b2278	[AArch64] Improve the Exynos M3 pipeline model llvm-svn: 349652	2018-12-19 17:37:51 +00:00
Evandro Menezes	1cfab9747d	[llvm-mca] Split test (NFC) Split the Exynos test of the register offset addressing mode into separate loads and stores tests. llvm-svn: 349651	2018-12-19 17:37:14 +00:00
Evandro Menezes	031abc2bd7	[llvm-mca] Improve test (NFC) Add more instruction variations for Exynos. llvm-svn: 349567	2018-12-18 23:19:52 +00:00
Evandro Menezes	4bfd4ce1bc	[llvm-mca] Update the Exynos test cases (NFC) Add more entropy to the test cases. llvm-svn: 349537	2018-12-18 20:46:03 +00:00
Evandro Menezes	53f0d41dc4	[AArch64] Refactor the Exynos scheduling predicates Refactor the scheduling predicates based on `MCInstPredicate`. In this case, for the Exynos processors. Differential revision: https://reviews.llvm.org/D55345 llvm-svn: 348774	2018-12-10 17:17:26 +00:00
Evandro Menezes	7ea7de55ea	[llvm-mca] Add new tests for Exynos (NFC) llvm-svn: 348766	2018-12-10 16:22:29 +00:00
Hans Wennborg	c56cc3a889	Fix test/tools/llvm-mca/AArch64/Exynos/direct-branch.s on Mac It was failing as below. Adding a triple seems to help. -- : 'RUN: at line 2'; /work/llvm.combined/build.release/bin/llvm-mca -march=aarch64 -mcpu=exynos-m1 -resource-pressure=false < /work/llvm.combined/llvm/test/tools/llvm-mca/AArch64/Exynos/direct-branch.s \| /work/llvm.combined/build.release/bin/FileCheck /work/llvm.combined/llvm/test/tools/llvm-mca/AArch64/Exynos/direct-branch.s -check-prefixes=ALL,M1 : 'RUN: at line 3'; /work/llvm.combined/build.release/bin/llvm-mca -march=aarch64 -mcpu=exynos-m3 -resource-pressure=false < /work/llvm.combined/llvm/test/tools/llvm-mca/AArch64/Exynos/direct-branch.s \| /work/llvm.combined/build.release/bin/FileCheck /work/llvm.combined/llvm/test/tools/llvm-mca/AArch64/Exynos/direct-branch.s -check-prefixes=ALL,M3 -- Exit Code: 1 Command Output (stderr): -- /work/llvm.combined/llvm/test/tools/llvm-mca/AArch64/Exynos/direct-branch.s:36:12: error: M1-NEXT: expected string not found in input ^ <stdin>:21:2: note: scanning from here 1 0 0.25 b Ltmp0 ^ -- llvm-svn: 348577	2018-12-07 09:58:33 +00:00
Evandro Menezes	51df880e70	[llvm-mca] Improve test (NFC) Add more instructions to the test for Cortex. llvm-svn: 348565	2018-12-07 03:23:36 +00:00
Evandro Menezes	83beb91450	[llvm-mca] Improve test (NFC) Add a label to make explicit that the branch is short for Exynos. llvm-svn: 348564	2018-12-07 03:23:14 +00:00
Evandro Menezes	5d42bc7ce8	[llvm-mca] Simplify test (NFC) llvm-svn: 348395	2018-12-05 18:34:51 +00:00
Evandro Menezes	86953e4350	[llvm-mca] Sort test run lines (NFC) llvm-svn: 348393	2018-12-05 18:30:06 +00:00
Evandro Menezes	56368c6fa5	[AArch64] Refactor the scheduling predicates (2/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::hasShiftedReg()`. Differential revision: https://reviews.llvm.org/D54820 llvm-svn: 347598	2018-11-26 21:47:41 +00:00
Evandro Menezes	b02ac8bd21	[AArch64] Refactor the scheduling predicates (1/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::isScaledAddr()` Differential revision: https://reviews.llvm.org/D54777 llvm-svn: 347597	2018-11-26 21:47:28 +00:00
Evandro Menezes	079bf4b7b4	[TableGen] Emit more variant transitions `llvm-mca` relies on the predicates to be based on `MCSchedPredicate` in order to resolve the scheduling for variant instructions. Otherwise, it aborts the building of the instruction model early. However, the scheduling model emitter in `TableGen` gives up too soon, unless all processors use only such predicates. In order to allow more processors to be used with `llvm-mca`, this patch emits scheduling transitions if any processor uses these predicates. The transition emitted for the processors using legacy predicates is the one specified with `NoSchedPred`, which is based on `MCSchedPredicate`. Preferably, `llvm-mca` should instead assume a reasonable default when a variant transition is not based on `MCSchedPredicate` for a given processor. This issue should be revisited in the future. Differential revision: https://reviews.llvm.org/D54648 llvm-svn: 347504	2018-11-23 21:17:33 +00:00
Evandro Menezes	d0792170a3	[llvm-mca] Add test case (NFC) Add test case that will serve as the base for D54820. llvm-svn: 347440	2018-11-22 00:38:36 +00:00
Evandro Menezes	b9f9042648	[llvm-mca] Add test case (NFC) Fix previous commit r347434. llvm-svn: 347437	2018-11-21 23:36:40 +00:00
Evandro Menezes	34b32a3019	[llvm-mca] Add test case (NFC) Add test case that will serve as the base for D54777. llvm-svn: 347434	2018-11-21 22:57:46 +00:00
Andrea Di Biagio	a2eee47450	[llvm-mca] Add fields "Total uOps" and "uOps Per Cycle" to the report generated by the SummaryView. This patch adds two new fields to the perf report generated by the SummaryView. Fields are now logically organized into two small groups; only the second group contains throughput indicators. Example: ``` Iterations: 100 Instructions: 300 Total Cycles: 414 Total uOps: 700 Dispatch Width: 4 uOps Per Cycle: 1.69 IPC: 0.72 Block RThroughput: 4.0 ``` This patch also updates the docs for llvm-mca. Due to the nature of this change, several tests in the tools/llvm-mca directory were affected, and had to be updated using script `update_mca_test_checks.py`. llvm-svn: 340946	2018-08-29 17:56:39 +00:00
Andrea Di Biagio	a03f2a77f8	[llvm-mca] Fix PR38575: Avoid an invalid implicit truncation of a processor resource mask (an uint64_t value) to unsigned. This patch fixes a regression introduced at revision 338702. A processor resource mask was incorrectly implicitly truncated to an unsigned quantity. Later on, the truncated mask was used to initialize an element of a vector of processor resource descriptors. On targets with more than 32 processor resources, some elements of the vector are left uninitialized. As a consequence, this bug might have eventually caused a crash due to null dereference in the Scheduler. This patch fixes PR38575, and adds a test for it. llvm-svn: 339768	2018-08-15 12:53:38 +00:00
Andrea Di Biagio	d2e2c053cf	[llvm-mca] Use a different character to flag instructions with side-effects in the Instruction Info View. NFC This makes it easier to identify changes in the instruction info flags. It also helps spot potential regressions similar to the one recently introduced at r336728. Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and spaces. A change in position of the flag marker may not trigger a test failure. This patch only changes the character used for flag `hasSideEffects`. The reason why I didn't touch other flags is because I want to avoid spamming the mailing list because of the massive diff due to the numerous tests affected by this change. In future, each instruction flag should be associated with a different character in the Instruction Info View. llvm-svn: 336797	2018-07-11 12:44:44 +00:00
Sanjay Patel	59313be8d3	[CodeGen] assume max/default throughput for unspecified instructions This is a fix for the problem arising in D47374 (PR37678): https://bugs.llvm.org/show_bug.cgi?id=37678 We may not have throughput info because it's not specified in the model or it's not available with variant scheduling, so assume that those instructions can execute/complete at max-issue-width. Differential Revision: https://reviews.llvm.org/D47723 llvm-svn: 334055	2018-06-05 23:34:45 +00:00
Roman Lebedev	7b53d1454f	[llvm-mca] Make sure not to end the test files with an empty line. Summary: It's super irritating. [properly configured] git client then complains about that double-newline, and you have to use `--force` to ignore the warning, since even if you fix it manually, it will be reintroduced the very next runtime :/ Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell Reviewed By: gbedwell Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47697 llvm-svn: 333887	2018-06-04 11:48:46 +00:00
Greg Bedwell	e790f6fb06	[UpdateTestChecks] Improved update_mca_test_checks block analysis Previously update_mca_test_checks worked entirely at "block" level where a block is some sequence of lines delimited by at least one empty line. This generally worked well, but could sometimes lead to excessive repetition of check lines for various prefixes if some block was almost identical between prefixes, but not quite (for example, due to a different dispatch width in the otherwise identical summary views). This new analysis attempts to split blocks further in the case where the following conditions are met: a) There is some prefix common to every RUN line (typically 'ALL'). b) The first line of the block is common to the output with every prefix. c) The block has the same number of lines for the output with every prefix. Also, regenerated all llvm-mca test files with the following command: update_mca_test_checks.py "../test/tools/llvm-mca/*/*.s" "../test/tools/llvm-mca/*/*/*.s" The new analysis showed a "multiple lines not disambiguated by prefixes" warning for test "AArch64/Exynos/scheduler-queue-usage.s" so I've also added some explicit prefixes to each of the RUN lines in that test. Differential Revision: https://reviews.llvm.org/D47321 llvm-svn: 333204	2018-05-24 16:36:44 +00:00
Andrea Di Biagio	cb1ed400a4	[llvm-mca] Removed an empty line generated by the timeline view. NFC. Also, regenerate all tests. llvm-svn: 332853	2018-05-21 17:11:56 +00:00
Andrea Di Biagio	45ccdd1785	[llvm-mca] Regenerate tests after r332381 and r332361. NFC llvm-svn: 332447	2018-05-16 10:12:06 +00:00
Andrea Di Biagio	e047d3529b	[llvm-mca] Correctly handle zero-latency stores that consume pipeline resources. This fixes PR37293. We can have scheduling classes with no write latency entries, that still consume processor resources. We don't want to treat those instructions as zero-latency instructions; they still have to be issued to the underlying pipelines, so they still consume resource cycles. This is likely to be a regression which I have accidentally introduced at revision 330807. Now, if an instruction has a non-empty set of write processor resources, we conservatively treat it as a normal (i.e. non zero-latency) instruction. llvm-svn: 331193	2018-04-30 15:55:04 +00:00
Greg Bedwell	90d141a295	[UpdateTestChecks] Add update_mca_test_checks.py script This script can be used to regenerate tests in the test/tools/llvm-mca directory (PR36904). Regenerated a number of tests using the pattern: test/tools/llvm-mca/*/*/*.s Differential Revision: https://reviews.llvm.org/D45369 llvm-svn: 330246	2018-04-18 10:27:45 +00:00


56 Commits