Code Generation Notes for MSA
=============================

Intrinsics are lowered to SelectionDAG nodes where possible in order to enable
optimisation, reduce the size of the ISel matcher, and reduce repetition in
the implementation. In a small number of cases, this can cause different
(semantically equivalent) instructions to be used in place of the requested
instruction, even when no optimisation has taken place.

Instructions
============

This section describes any quirks of instruction selection for MSA. For
example, two instructions might be equally valid for a given piece of IR, in
which case one is chosen in preference to the other.

bclri.b:
        It is not possible to emit bclri.b since andi.b covers exactly the
        same cases: clearing bit m of each byte is an AND with the immediate
        ~(1 << m) & 0xff. andi.b should use fractionally less power than
        bclri.b in most hardware implementations, so it is used in preference
        to bclri.b.

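        The equivalence can be checked with a small bit-level model. This is
        a sketch in Python, not LLVM code; the function names are hypothetical
        and a vector is modelled as a plain list of byte values.

```python
def bclri_b(ws, m):
    """Model of bclri.b: clear bit m (0..7) in every byte of ws."""
    return [b & ~(1 << m) & 0xFF for b in ws]


def andi_b(ws, imm8):
    """Model of andi.b: AND every byte of ws with an 8-bit immediate."""
    return [b & imm8 for b in ws]


def andi_imm_for_bclri(m):
    """Hypothetical helper: the andi.b immediate that clears bit m."""
    return ~(1 << m) & 0xFF


# A 16-byte vector with a spread of bit patterns.
vec = list(range(0, 256, 17))
for bit in range(8):
    assert bclri_b(vec, bit) == andi_b(vec, andi_imm_for_bclri(bit))
```

        Every bclri.b immediate has a matching andi.b immediate, which is why
        the selector never needs the former.
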
vshf.w:
        It is not possible to emit vshf.w when the shuffle description is
        constant since shf.w covers exactly the same cases. shf.w is used
        instead. It is also impossible for the shuffle description to be
        unknown at compile time, because shufflevector in LLVM IR requires
        its mask operand to be a constant.

vshf.[bhwd]:
        When the shuffle description describes a splat operation, splat.[bhwd]
        instructions will be selected instead of vshf.[bhwd]. Unlike the ilv*
        and pck* instructions, this is matched from MipsISD::VSHF instead of
        a special-case MipsISD node.

ilvl.d, pckev.d:
        It is not possible to emit ilvl.d or pckev.d since ilvev.d covers the
        same shuffle. ilvev.d will be emitted instead.

ilvr.d, ilvod.d, pckod.d:
        It is not possible to emit ilvr.d or pckod.d since ilvod.d covers the
        same shuffle. ilvod.d will be emitted instead.

splat.[bhwd]:
        The intrinsic will work as expected. However, unlike other intrinsics,
        it lowers directly to MipsISD::VSHF instead of using common IR.

splati.w:
        It is not possible to emit splati.w since shf.w covers the same cases.
        shf.w will be emitted instead.

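        As an illustration of why shf.w subsumes splati.w, the sketch below
        models both on a 4 x 32-bit vector. It is a Python model under the
        assumption that shf.w picks element i of the result using bits
        2i+1..2i of its 8-bit immediate; the helper names are hypothetical.

```python
def shf_w(ws, imm8):
    """Model of shf.w: result[i] = ws[k], with k taken from bits 2i+1..2i
    of the 8-bit immediate."""
    return [ws[(imm8 >> (2 * i)) & 0x3] for i in range(4)]


def splati_w(ws, n):
    """Model of splati.w: replicate element n (0..3) into all four lanes."""
    return [ws[n]] * 4


def shf_imm_for_splat(n):
    """Hypothetical helper: copy the 2-bit index n into all four 2-bit
    fields of the immediate (0x55 is 0b01010101)."""
    return n * 0x55


vec = [10, 20, 30, 40]
for lane in range(4):
    assert shf_w(vec, shf_imm_for_splat(lane)) == splati_w(vec, lane)
```

        A splat is simply the shuffle whose four index fields are all equal,
        so each splati.w immediate maps onto one shf.w immediate.
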
copy_s.w:
        On MIPS32, the copy_u.d intrinsic will emit this instruction instead of
        copy_u.w. This is semantically equivalent since the general-purpose
        register file is 32 bits wide.

binsri.[bhwd], binsli.[bhwd]:
        These two operations are equivalent to each other with the operands
        swapped and condition inverted. The compiler may use either one as
        appropriate.
        Furthermore, the compiler may use bsel.[bhwd] for some masks that do
        not survive the legalization process (this is a bug and will be fixed).

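        One way to read "operands swapped and condition inverted" is sketched
        below for a single element. This is a Python model, not LLVM code, and
        it assumes that both instructions copy m+1 bits (the most significant
        ones for binsli, the least significant ones for binsri) from ws into
        wd. The function names and the width parameter are hypothetical.

```python
def binsli(wd, ws, m, width=8):
    """Model of binsli: copy the top m+1 bits of ws into wd."""
    full = (1 << width) - 1
    mask = (((1 << (m + 1)) - 1) << (width - m - 1)) & full
    return (ws & mask) | (wd & (full ^ mask))


def binsri(wd, ws, m, width=8):
    """Model of binsri: copy the bottom m+1 bits of ws into wd."""
    full = (1 << width) - 1
    mask = (1 << (m + 1)) - 1
    return (ws & mask) | (wd & (full ^ mask))


# Keeping the top m+1 bits of ws is the same as overwriting the remaining
# width-(m+1) low bits of ws with those of wd, so the operands swap roles
# and the bit count is complemented.
wd, ws, m = 0xA5, 0x3C, 2
assert binsli(wd, ws, m) == binsri(ws, wd, 8 - m - 2)
```

        Under this model the two forms compute the same value but leave it in
        a different register, which is consistent with the compiler being free
        to pick whichever suits the surrounding code.
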
bmnz.v, bmz.v, bsel.v:
        These three operations differ only in the operand that is tied to the
        result and the order of the operands.
        It is (currently) not possible to emit bmz.v or bsel.v since bmnz.v is
        the same operation and will be emitted instead.
        In future, the compiler may choose between these three instructions
        according to register allocation.
        These three operations can be very confusing, so here is a mapping
        between the instructions and the vselect node in one place:
                bmz.v  wd, ws, wt/i8 -> (vselect wt/i8, wd, ws)
                bmnz.v wd, ws, wt/i8 -> (vselect wt/i8, ws, wd)
                bsel.v wd, ws, wt/i8 -> (vselect wd, wt/i8, ws)

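        The mapping above can be exercised with a bitwise model. The sketch
        below is Python rather than LLVM code and treats vselect as a per-bit
        select over a 128-bit integer, which is an assumption made for
        illustration; the function names simply mirror the instructions.

```python
MASK128 = (1 << 128) - 1


def vselect(cond, if_set, if_clear):
    """Per-bit select: take if_set where cond is 1, if_clear where it is 0."""
    return ((if_set & cond) | (if_clear & ~cond)) & MASK128


def bmz_v(wd, ws, wt):
    return vselect(wt, wd, ws)


def bmnz_v(wd, ws, wt):
    return vselect(wt, ws, wd)


def bsel_v(wd, ws, wt):
    return vselect(wd, wt, ws)


wd = 0x0123456789ABCDEFFEDCBA9876543210
ws = 0x0F0F0F0F0F0F0F0FF0F0F0F0F0F0F0F0
wt = 0x00FF00FF00FF00FFFF00FF00FF00FF00

# bmz.v and bsel.v are bmnz.v with the operands permuted.
assert bmz_v(wd, ws, wt) == bmnz_v(ws, wd, wt)
assert bsel_v(wd, ws, wt) == bmnz_v(ws, wt, wd)
```

        Only the operand that doubles as the destination changes between the
        three forms, which is why register allocation is the natural place to
        choose among them.
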
bmnzi.b, bmzi.b:
        Like their non-immediate counterparts, bmnzi.b and bmzi.b are the same
        operation with the operands swapped. bmnzi.b will (currently) be emitted
        for both cases.

bseli.b:
        Unlike the non-immediate versions, bseli.b is distinguishable from
        bmnzi.b and bmzi.b and can be emitted.