llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	815cdbff29	[X86][Btver2] Improved latency/throughput model for scalar int-to-float conversions. Account for bypass delays when computing the latency of scalar int-to-float conversions. On Jaguar we need to account for an extra 6cy latency (see AMD fam16h SOG). This patch also fixes the number of micro opcodes for the register-memory variants of scalar int-to-float conversions. Differential Revision: https://reviews.llvm.org/D57148 llvm-svn: 352518	2019-01-29 16:47:27 +00:00
Simon Pilgrim	4e03b2496d	[llvm-mca][X86] Add missing mfence/pinsrw tests llvm-svn: 351831	2019-01-22 16:01:08 +00:00
Simon Pilgrim	aa6a4339ac	[X86][BtVer2] SSE2 vector shifts have local forwarding disabled Similar to horizontal ops on D56777, the sse2 (but not mmx) bit shift ops have local forwarding disabled, adding +1cy to the use latency for the result. Differential Revision: https://reviews.llvm.org/D57026 llvm-svn: 351817	2019-01-22 13:27:18 +00:00
Andrea Di Biagio	b68dd05c14	[X86][BtVer2] Update the WriteLoad latency. r327630 introduced new write definitions for float/vector loads. Before that revision, WriteLoad was used by both integer and float (scalar/vector) loads, so WriteLoad had to conservatively declare a latency of 5cy, because the load-to-use latency for float/vector loads is 5cy. Now that we have dedicated writes for float/vector loads, there is no reason to keep the latency of WriteLoad at 5cy. At the moment, WriteLoad is used by scalar integer loads only; we can assume an optimistic 3cy latency for them. This patch changes that latency from 5cy to 3cy, and regenerates the affected scheduling/mca tests. Differential Revision: https://reviews.llvm.org/D56922 llvm-svn: 351742	2019-01-21 12:04:10 +00:00
Simon Pilgrim	66da1ed29d	[X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe) Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343314	2018-09-28 13:19:22 +00:00
Andrea Di Biagio	d2e2c053cf	[llvm-mca] Use a different character to flag instructions with side-effects in the Instruction Info View. NFC This makes it easier to identify changes in the instruction info flags. It also helps spot potential regressions similar to the one recently introduced at r336728. Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and spaces. A change in position of the flag marker may not trigger a test failure. This patch only changes the character used for flag `hasSideEffects`. The reason why I didn't touch other flags is because I want to avoid spamming the mailing list with the massive diff due to the numerous tests affected by this change. In future, each instruction flag should be associated with a different character in the Instruction Info View. llvm-svn: 336797	2018-07-11 12:44:44 +00:00
Roman Lebedev	7b53d1454f	[llvm-mca] Make sure not to end the test files with an empty line. Summary: It's super irritating. [properly configured] git client then complains about that double-newline, and you have to use `--force` to ignore the warning, since even if you fix it manually, it will be reintroduced the very next runtime :/ Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell Reviewed By: gbedwell Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47697 llvm-svn: 333887	2018-06-04 11:48:46 +00:00
Simon Pilgrim	1273f4ad93	[X86] Add GPR<->XMM Schedule Tags BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMM->GPR delays (see rL332737) The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1: SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM) Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner) llvm-svn: 332745	2018-05-18 17:58:36 +00:00
Simon Pilgrim	3ecb0b80f6	[X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency llvm-svn: 332722	2018-05-18 14:22:22 +00:00
Simon Pilgrim	c4b8d367a8	[X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332718	2018-05-18 14:08:01 +00:00
Simon Pilgrim	d749b321b2	[X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPS/MOVHPD/MOVLPD/MOVLPS etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332714	2018-05-18 13:13:59 +00:00
Andrea Di Biagio	45ccdd1785	[llvm-mca] Regenerate tests after r332381 and r332361. NFC llvm-svn: 332447	2018-05-16 10:12:06 +00:00
Simon Pilgrim	be9a206883	[X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes BtVer2 - Fixes schedules for (V)CVTPS2PD instructions A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332376	2018-05-15 17:36:49 +00:00
Simon Pilgrim	4135de2e93	[llvm-mca][x86] Add scalar nt-store instruction tests llvm-svn: 332262	2018-05-14 17:10:33 +00:00
Simon Pilgrim	061096d2c2	[llvm-mca][x86] Remove addsubpd from SSE2 tests llvm-svn: 331678	2018-05-07 21:10:48 +00:00
Simon Pilgrim	f209321d61	[llvm-mca][X86] Add mmx instruction to btver2 resource tests Useful to see scheduler class deltas against xmm equivalents llvm-svn: 330335	2018-04-19 15:09:46 +00:00
Simon Pilgrim	c310bfa193	[llvm-mca][X86] Add mmx versions of SSSE3 instructions Move PABS instructions incorrectly tested under SSE2 llvm-svn: 330295	2018-04-18 20:47:48 +00:00
Greg Bedwell	90d141a295	[UpdateTestChecks] Add update_mca_test_checks.py script This script can be used to regenerate tests in the test/tools/llvm-mca directory (PR36904). Regenerated a number of tests using the pattern: test/tools/llvm-mca///*.s Differential Revision: https://reviews.llvm.org/D45369 llvm-svn: 330246	2018-04-18 10:27:45 +00:00
Simon Pilgrim	86588fc809	[X86][Btver2] Add vector extract costs llvm-svn: 329524	2018-04-08 11:26:26 +00:00
Simon Pilgrim	8139a88cb6	[X86][Btver2] Strip unnecessary check prefixes from resources tests llvm-svn: 329192	2018-04-04 13:25:45 +00:00
Simon Pilgrim	fcf49df21c	[X86][Btver2] Add (U)COMISS/(U)COMISD scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) llvm-svn: 328573	2018-03-26 19:01:06 +00:00
Simon Pilgrim	8815105cd5	[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs llvm-svn: 328541	2018-03-26 16:24:13 +00:00
Simon Pilgrim	aa40148cae	[X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write) llvm-svn: 328536	2018-03-26 16:10:08 +00:00
Simon Pilgrim	0b73b29388	[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) This also adds missing vcvttss2si tests llvm-svn: 328505	2018-03-26 15:30:47 +00:00
Simon Pilgrim	67df1cf597	[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps llvm-svn: 328497	2018-03-26 14:03:40 +00:00
Andrea Di Biagio	d1569290ef	[llvm-mca] Add flag -instruction-tables to print the theoretical resource pressure distribution for instructions (PR36874) The goal of this patch is to address most of PR36874. To fully fix PR36874 we need to split the "InstructionInfo" view from the "SummaryView". That would make it easy to check the latency and rthroughput as well. The patch reuses all the logic from ResourcePressureView to print out the "instruction tables". We have an entry for every instruction in the input sequence. Each entry reports the theoretical resource pressure distribution. Resource pressure is uniformly distributed across all the processor resource units of a group. At the moment, the backend pipeline is not configurable, so the only way to fix this is by creating a different driver that simply sends instruction events to the resource pressure view. That means we don't use the Backend interface. Instead, it is simpler to just have a different code-path for when flag -instruction-tables is specified. Once Clement addresses bug 36663, we can port the "instruction tables" logic into a stage of our configurable pipeline. Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag -instruction-tables to each modified test. Differential Revision: https://reviews.llvm.org/D44839 llvm-svn: 328487	2018-03-26 12:04:53 +00:00
Simon Pilgrim	e5c0a041ff	[X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit Add missing non-VEX and (V)PMOVMSKB instructions to the pattern llvm-svn: 328338	2018-03-23 17:38:59 +00:00
Simon Pilgrim	ee282b3160	[X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units llvm-svn: 328328	2018-03-23 15:35:13 +00:00
Simon Pilgrim	a1e3ea01ef	[X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs llvm-svn: 328304	2018-03-23 11:27:31 +00:00
Simon Pilgrim	bcb86bb927	[X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe llvm-svn: 328226	2018-03-22 18:29:16 +00:00
Simon Pilgrim	0e031afa95	[X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe The ymm instructions are double pumped as well. llvm-svn: 328222	2018-03-22 17:43:12 +00:00
Simon Pilgrim	e5b51f6786	[X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe llvm-svn: 328217	2018-03-22 17:25:38 +00:00
Simon Pilgrim	e16790b133	[X86][Btver2] Modelled float bitwise instructions as being performed on the float cluster (FPA/FPM) not the integer. llvm-svn: 327793	2018-03-18 12:37:35 +00:00
Simon Pilgrim	e409f84e7e	[X86][Btver2] Correctly distinguish between scheduling pipe and functional unit for JWriteResFpuPair defs Jaguar's FPU has 2 scheduler pipes (JFPU0/JFPU1) which forward to multiple functional sub-units each. We need to model that a micro-op will consume both the scheduler pipe and a functional unit. This patch just handles the ops defined through JWriteResFpuPair, I'll go through the custom cases later. llvm-svn: 327791	2018-03-18 12:09:17 +00:00
Simon Pilgrim	0ba4a0f3a6	[X86][Btver2] Add llvm-mca tests to show pipe resource usage of most vector instructions Hopefully these tests can be easily reused should any other subtarget get in depth llvm-mca coverage (we can either copy the tests or move them into a common dir and run it with multiple prefixes). llvm-svn: 327788	2018-03-18 09:32:38 +00:00

35 Commits