forked from OSchip/llvm-project
				
			
		
			
				
	
	
		
			134 lines
		
	
	
		
			4.9 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
			
		
		
	
	
			134 lines
		
	
	
		
			4.9 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
//===----------------------------------------------------------------------===//
 | 
						|
// Representing sign/zero extension of function results
 | 
						|
//===----------------------------------------------------------------------===//
 | 
						|
 | 
						|
Mar 25, 2009  - Initial Revision
 | 
						|
 | 
						|
Most ABIs specify that functions which return small integers do so in a
 | 
						|
specific integer GPR.  This is an efficient way to go, but raises the question:
 | 
						|
if the returned value is smaller than the register, what do the high bits hold?
 | 
						|
 | 
						|
There are three (interesting) possible answers: undefined, zero extended, or
 | 
						|
sign extended.  The number of bits in question depends on the data-type that
 | 
						|
the front-end is referencing (typically i1/i8/i16/i32).
 | 
						|
 | 
						|
Knowing the answer to this is important for two reasons: 1) we want to be able
 | 
						|
to implement the ABI correctly.  If we need to sign extend the result according
 | 
						|
to the ABI, we really really do need to do this to preserve correctness.  2)
 | 
						|
this information is often useful for optimization purposes, and we want the
 | 
						|
mid-level optimizers to be able to process this (e.g. eliminate redundant
 | 
						|
extensions).
 | 
						|
 | 
						|
For example, lets pretend that X86 requires the caller to properly extend the
 | 
						|
result of a return (I'm not sure this is the case, but the argument doesn't
 | 
						|
depend on this).  Given this, we should compile this:
 | 
						|
 | 
						|
int a();
 | 
						|
short b() { return a(); }
 | 
						|
 | 
						|
into:
 | 
						|
 | 
						|
_b:
 | 
						|
	subl	$12, %esp
 | 
						|
	call	L_a$stub
 | 
						|
	addl	$12, %esp
 | 
						|
	cwtl
 | 
						|
	ret
 | 
						|
 | 
						|
An optimization example is that we should be able to eliminate the explicit
 | 
						|
sign extension in this example:
 | 
						|
 | 
						|
short y();
 | 
						|
int z() {
 | 
						|
  return ((int)y() << 16) >> 16;
 | 
						|
}
 | 
						|
 | 
						|
_z:
 | 
						|
	subl	$12, %esp
 | 
						|
	call	_y
 | 
						|
	;;  movswl %ax, %eax   -> not needed because eax is already sext'd
 | 
						|
	addl	$12, %esp
 | 
						|
	ret
 | 
						|
 | 
						|
//===----------------------------------------------------------------------===//
 | 
						|
// What we have right now.
 | 
						|
//===----------------------------------------------------------------------===//
 | 
						|
 | 
						|
Currently, these sorts of things are modelled by compiling a function to return
 | 
						|
the small type and a signext/zeroext marker is used.  For example, we compile
 | 
						|
Z into:
 | 
						|
 | 
						|
define i32 @z() nounwind {
 | 
						|
entry:
 | 
						|
	%0 = tail call signext i16 (...)* @y() nounwind
 | 
						|
	%1 = sext i16 %0 to i32
 | 
						|
	ret i32 %1
 | 
						|
}
 | 
						|
 | 
						|
and b into:
 | 
						|
 | 
						|
define signext i16 @b() nounwind {
 | 
						|
entry:
 | 
						|
	%0 = tail call i32 (...)* @a() nounwind		; <i32> [#uses=1]
 | 
						|
	%retval12 = trunc i32 %0 to i16		; <i16> [#uses=1]
 | 
						|
	ret i16 %retval12
 | 
						|
}
 | 
						|
 | 
						|
This has some problems: 1) the actual precise semantics are really poorly
 | 
						|
defined (see PR3779).  2) some targets might want the caller to extend, some
 | 
						|
might want the callee to extend 3) the mid-level optimizer doesn't know the
 | 
						|
size of the GPR, so it doesn't know that %0 is sign extended up to 32-bits 
 | 
						|
here, and even if it did, it could not eliminate the sext. 4) the code
 | 
						|
generator has historically assumed that the result is extended to i32, which is
 | 
						|
a problem on PIC16 (and is also probably wrong on alpha and other 64-bit
 | 
						|
targets).
 | 
						|
 | 
						|
//===----------------------------------------------------------------------===//
 | 
						|
// The proposal
 | 
						|
//===----------------------------------------------------------------------===//
 | 
						|
 | 
						|
I suggest that we have the front-end fully lower out the ABI issues here to
 | 
						|
LLVM IR.  This makes it 100% explicit what is going on and means that there is
 | 
						|
no cause for confusion.  For example, the cases above should compile into:
 | 
						|
 | 
						|
define i32 @z() nounwind {
 | 
						|
entry:
 | 
						|
        %0 = tail call i32 (...)* @y() nounwind
 | 
						|
	%1 = trunc i32 %0 to i16
 | 
						|
        %2 = sext i16 %1 to i32
 | 
						|
        ret i32 %2
 | 
						|
}
 | 
						|
define i32 @b() nounwind {
 | 
						|
entry:
 | 
						|
	%0 = tail call i32 (...)* @a() nounwind
 | 
						|
	%retval12 = trunc i32 %0 to i16
 | 
						|
	%tmp = sext i16 %retval12 to i32
 | 
						|
	ret i32 %tmp
 | 
						|
}
 | 
						|
 | 
						|
In this model, no functions will return an i1/i8/i16 (and on a x86-64 target
 | 
						|
that extends results to i64, no i32).  This solves the ambiguity issue, allows us 
 | 
						|
to fully describe all possible ABIs, and now allows the optimizers to reason
 | 
						|
about and eliminate these extensions.
 | 
						|
 | 
						|
The one thing that is missing is the ability for the front-end and optimizer to
 | 
						|
specify/infer the guarantees provided by the ABI to allow other optimizations.
 | 
						|
For example, in the y/z case, since y is known to return a sign extended value,
 | 
						|
the trunc/sext in z should be eliminable.
 | 
						|
 | 
						|
This can be done by introducing new sext/zext attributes which mean "I know
 | 
						|
that the result of the function is sign extended at least N bits.  Given this,
 | 
						|
and given that it is stuck on the y function, the mid-level optimizer could
 | 
						|
easily eliminate the extensions etc with existing functionality.
 | 
						|
 | 
						|
The major disadvantage of doing this sort of thing is that it makes the ABI
 | 
						|
lowering stuff even more explicit in the front-end, and that we would like to
 | 
						|
eventually move to having the code generator do more of this work.  However,
 | 
						|
the sad truth of the matter is that this is a) unlikely to happen anytime in
 | 
						|
the near future, and b) this is no worse than we have now with the existing
 | 
						|
attributes.
 | 
						|
 | 
						|
C compilers fundamentally have to reason about the target in many ways.  
 | 
						|
This is ugly and horrible, but a fact of life.
 | 
						|
 |