==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length.  LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension, and NEC SX-Aurora.  Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extension is a concrete RFC and prototype
implementation of native vector predication in LLVM.  The VP prototype
and all related discussions can be found in the VP patch on Phabricator
[VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is a consensus on the semantics/instruction set of VP.
- VP intrinsics and attributes are available at the IR level.
- TTI has capability flags for VP (names tentative: ``supportsVP()``,
  ``haveActiveVectorLength()``).

Result: VP is usable by IR-level vectorizers (LV, VPlan, RegionVectorizer),
with potential integration in Clang through builtins.

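As a rough illustration of what a predicated operation with both a mask and an
explicit vector length computes, the following is a minimal scalar model in
C++. It is a sketch, not part of the proposal: the name ``vpAddModel`` is
hypothetical, and it assumes that a lane is active only if its mask bit is set
and its index is below the explicit vector length. What inactive lanes contain
is left open here and modeled as an empty ``std::optional``.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Hypothetical scalar reference model of a predicated vector add.
// This is NOT an LLVM API; it only illustrates the assumed lane semantics.
template <std::size_t N>
std::array<std::optional<int>, N> vpAddModel(const std::array<int, N> &a,
                                             const std::array<int, N> &b,
                                             const std::array<bool, N> &mask,
                                             std::size_t evl) {
  // All lanes start out empty, i.e. with no defined result.
  std::array<std::optional<int>, N> result{};
  for (std::size_t i = 0; i < N; ++i) {
    // Assumption: a lane is active iff its mask bit is set AND its index
    // lies below the explicit vector length.
    if (mask[i] && i < evl)
      result[i] = a[i] + b[i];
  }
  return result;
}
```

With four lanes, a mask of ``{true, false, true, true}`` and an explicit vector
length of 3, only lanes 0 and 2 receive a value in this model: lane 1 is masked
off and lane 3 lies beyond the vector length.
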
2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.*`` -> ``vp_fdiv``).
- VP legalization: fold the explicit vector length into the mask (AVX512) and
  legalize VP SDNodes to pre-existing ones (SSE, NEON).

Result: Backend development based on VP SDNodes.

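The first legalization step, turning an explicit vector length into a mask for
targets that only support masking, can be sketched in scalar C++ as follows.
This is an illustrative model under the assumption that a lane is enabled only
if its index is below the vector length; ``foldEVLIntoMask`` is a hypothetical
name and not an existing LLVM function.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, NOT an LLVM API: combine a lane mask with an explicit
// vector length into a single mask, so that a mask-only target can execute
// the operation without a separate vector length operand.
template <std::size_t N>
std::array<bool, N> foldEVLIntoMask(const std::array<bool, N> &mask,
                                    std::size_t evl) {
  std::array<bool, N> folded{};
  for (std::size_t i = 0; i < N; ++i) {
    // Assumption: lanes at or beyond the vector length are disabled,
    // regardless of their original mask bit.
    folded[i] = mask[i] && i < evl;
  }
  return folded;
}
```

In this model, a vector length that is at least the number of lanes leaves the
mask unchanged, so the folding only has an effect for shorter vector lengths.
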
3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce ``PredicatedInstruction``, ``PredicatedBinaryOperator``, and
  similar helper classes that match both standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating
  them to VP.
- Remove (DCE) the transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, keeping only those that are not equivalent to
  vectorized scalar instructions (reductions, shuffles, and so on).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] `llvm.masked.*` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504