llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	ff630c2cdc	[llvm-mca][BtVer2] teach how to identify false dependencies on partially written registers. The goal of this patch is to improve the throughput analysis in llvm-mca for the case where instructions perform partial register writes. On x86, partial register writes are quite difficult to model, mainly because different processors tend to implement different register merging schemes in hardware. When the code contains partial register writes, the IPC (instructions per cycles) estimated by llvm-mca tends to diverge quite significantly from the observed IPC (using perf). Modern AMD processors (at least, from Bulldozer onwards) don't rename partial registers. Quoting Agner Fog's microarchitecture.pdf: " The processor always keeps the different parts of an integer register together. For example, AL and AH are not treated as independent by the out-of-order execution mechanism. An instruction that writes to part of a register will therefore have a false dependence on any previous write to the same register or any part of it." This patch is a first important step towards improving the analysis of partial register updates. It changes the semantic of RegisterFile descriptors in tablegen, and teaches llvm-mca how to identify false dependences in the presence of partial register writes (for more details: see the new code comments in include/Target/TargetSchedule.h - class RegisterFile). This patch doesn't address the case where a write to a part of a register is followed by a read from the whole register. On Intel chips, high8 registers (AH/BH/CH/DH)) can be stored in separate physical registers. However, a later (dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which adds extra latency (and potentially affects the pipe usage). This is a very interesting article on the subject with a very informative answer from Peter Cordes: https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to In future, the definition of RegisterFile can be extended with extra information that may be used to identify delays caused by merge opcodes triggered by a dirty read of a partial write. Differential Revision: https://reviews.llvm.org/D49196 llvm-svn: 337123	2018-07-15 11:01:38 +00:00
Andrea Di Biagio	bb25e27f58	[llvm-mca] A write latency cannot be a negative value. NFC llvm-svn: 336437	2018-07-06 13:46:10 +00:00
Andrea Di Biagio	fa2d16f4ab	[llvm-mca] Fix RegisterFile debug prints. NFC llvm-svn: 336367	2018-07-05 16:13:49 +00:00
Andrea Di Biagio	877f9a7e39	[llvm-mca] Use a WriteRef to describe register writes in class RegisterFile. This patch introduces a new class named WriteRef. A WriteRef is used by the RegisterFile to keep track of register definitions. Internally it wraps a WriteState, as well as the source index of the defining instruction. This patch allows the tool to propagate additional information to support future analysis on data dependencies. llvm-svn: 335867	2018-06-28 15:50:26 +00:00
Andrea Di Biagio	eb1bef60b9	[llvm-mca] Avoid calling method update() on instructions that are already in the IS_READY state. NFCI When promoting instructions from the wait queue to the ready queue, we should check if an instruction has already reached the IS_READY state before calling method update(). llvm-svn: 335722	2018-06-27 11:17:07 +00:00
Matt Davis	dea343d2b3	[llvm-mca] Rename Backend to Pipeline. NFC. Summary: This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. Variables and comments have also been updated to reflect this change. The reason for this rename, is to be slightly more correct about what MCA is modeling. MCA models a Pipeline, which implies some logical sequence of stages. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48496 llvm-svn: 335496	2018-06-25 16:53:00 +00:00
Andrea Di Biagio	757600bccb	[llvm-mca] Correctly update the CyclesLeft of a register read in the presence of partial register updates. This patch fixe the logic in ReadState::cycleEvent(). That method was not correctly updating field `TotalCycles`. Added extra code comments in class ReadState to better describe each field. llvm-svn: 334028	2018-06-05 17:12:02 +00:00
Andrea Di Biagio	db66efcb6a	[llvm-mca] Remove method Instruction::isZeroLatency(). NFCI llvm-svn: 330807	2018-04-25 09:38:58 +00:00
Andrea Di Biagio	0a837ef6b1	[llvm-mca] Correctly set the ReadAdvance information for register use operands. The tool was passing the wrong operand index to method MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and not the operand index. This was found when testing X86 code where instructions had a memory folded operand. This patch fixes the issue and adds test read-advance-1.s to ensure that the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used. llvm-svn: 328790	2018-03-29 14:26:56 +00:00
Andrea Di Biagio	2dee62bd0a	[llvm-mca] Minor refactoring. NFCI Also, removed a couple of unused methods from class Instruction. llvm-svn: 328198	2018-03-22 14:14:49 +00:00
Andrea Di Biagio	09ea09e478	[llvm-mca] Simplify (and better standardize) the Instruction interface. llvm-svn: 328190	2018-03-22 11:39:34 +00:00
Andrea Di Biagio	3562248825	[llvm-mca] Simplify code. NFC llvm-svn: 328187	2018-03-22 10:19:20 +00:00
Andrea Di Biagio	3a6b092017	[llvm-mca] LLVM Machine Code Analyzer. llvm-mca is an LLVM based performance analysis tool that can be used to statically measure the performance of code, and to help triage potential problems with target scheduling models. llvm-mca uses information which is already available in LLVM (e.g. scheduling models) to statically measure the performance of machine code in a specific cpu. Performance is measured in terms of throughput as well as processor resource consumption. The tool currently works for processors with an out-of-order backend, for which there is a scheduling model available in LLVM. The main goal of this tool is not just to predict the performance of the code when run on the target, but also help with diagnosing potential performance issues. Given an assembly code sequence, llvm-mca estimates the IPC (instructions per cycle), as well as hardware resources pressure. The analysis and reporting style were mostly inspired by the IACA tool from Intel. This patch is related to the RFC on llvm-dev visible at this link: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html Differential Revision: https://reviews.llvm.org/D43951 llvm-svn: 326998	2018-03-08 13:05:02 +00:00

13 Commits