Matt Arsenault
3dbeefa978
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel

Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
							
						 
						
							2017-03-21 21:39:51 +00:00  
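
The conversion in the commit above is a plain textual substitution. A minimal Python sketch of the same rewrite follows, assuming the IR is already loaded into a string. The function name `mark_kernels` and the sample IR are hypothetical and not part of the commit; like the perl one-liner (which has no `/g` flag), it replaces only the first match on each line.

```python
import re

def mark_kernels(ir_text: str) -> str:
    """Mirror s/define void/define amdgpu_kernel void/ line by line."""
    out = []
    for line in ir_text.splitlines(keepends=True):
        # count=1 matches perl's behaviour without the /g modifier.
        out.append(re.sub(r"define void", "define amdgpu_kernel void", line, count=1))
    return "".join(out)

# Hypothetical test input, not taken from the LLVM tree.
sample = "define void @foo(i32 addrspace(1)* %out) {\n  ret void\n}\n"
converted = mark_kernels(sample)
```

Running the rewrite a second time changes nothing, because `define amdgpu_kernel void` no longer contains the literal `define void`.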
						
					 
				
					
						
							
							
								 
Matt Arsenault
5d8eb25e78
AMDGPU: Use unsigned compare for eq/ne

For some reason there are both of these available, except
for scalar 64-bit compares, which only have u64. I'm not sure
why there are both (I'm guessing it's for the one-bit inputs we
don't use), but for consistency always use the
unsigned one.
llvm-svn: 282832
							
						 
						
							2016-09-30 01:50:20 +00:00  
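
Choosing the unsigned form is safe for eq/ne because equality depends only on the bit pattern, not on how the bits are interpreted. A small Python sketch of that property follows, assuming 32-bit operands. The helper names are hypothetical and this is not the backend's code.

```python
def as_unsigned32(bits: int) -> int:
    """Interpret the low 32 bits as an unsigned integer."""
    return bits & 0xFFFFFFFF

def as_signed32(bits: int) -> int:
    """Interpret the low 32 bits as a two's-complement signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits

patterns = [0x00000000, 0x00000001, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

# eq gives the same answer under both interpretations for every pair.
eq_agrees = all(
    (as_signed32(a) == as_signed32(b)) == (as_unsigned32(a) == as_unsigned32(b))
    for a in patterns for b in patterns
)

# Ordered compares do not: 0x80000000 < 0 holds signed but not unsigned.
lt_disagrees = any(
    (as_signed32(a) < as_signed32(b)) != (as_unsigned32(a) < as_unsigned32(b))
    for a in patterns for b in patterns
)
```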
						
					 
				
					
						
							
							
								 
Matt Arsenault
3d1c1deb04
AMDGPU: Run SIFoldOperands after PeepholeOptimizer

PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.

shader-db stats:

Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)

Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)

Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)
llvm-svn: 266378
							
						 
						
							2016-04-14 21:58:24 +00:00  
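
The percentages in the "Totals" block of the commit above are relative changes from the old value to the new one. A short Python sketch that recomputes them from the before/after counts quoted in the message follows. The function name `percent_change` is hypothetical, and returning 0.0 for a zero baseline is an assumption made to match the "0 -> 1024 (0.00 %)" line; it is not known how shader-db itself handles that case.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change in percent, rounded to two decimals."""
    if before == 0:
        # Assumption: mirror the "0 -> 1024 (0.00 %)" line in the message.
        return 0.0
    return round((after - before) / before * 100.0, 2)

# Before/after values copied from the "Totals" block of the commit message.
totals = {
    "SGPRS": (34200, 34336),
    "VGPRS": (22118, 21655),
    "Code Size": (632144, 633460),
    "Scratch": (10240, 11264),
    "Max Waves": (8822, 8918),
}
changes = {name: percent_change(b, a) for name, (b, a) in totals.items()}
```

The recomputed values agree with the figures reported in the message, including the VGPR reduction of about 2.09 %.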
						
					 
				
					
						
							
							
								 
Matt Arsenault
7900334dd5
AMDGPU: Fold bitcasts of scalar constants to vectors

This cleans up some messes since the individual scalar components
can be CSEed.
llvm-svn: 266376
							
						 
						
							2016-04-14 21:58:07 +00:00  
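
At the value level, folding a bitcast of a scalar constant to a vector means splitting the constant into its per-lane constants. A minimal Python sketch for an i64 constant viewed as two 32-bit lanes follows, assuming little-endian layout so that lane 0 holds the low bits. The function name is hypothetical, and this illustrates the idea only, not the backend's implementation.

```python
import struct

def bitcast_i64_to_v2i32(value: int) -> tuple:
    """Reinterpret a 64-bit constant as two 32-bit lanes (lane 0 = low bits)."""
    raw = struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)
    return struct.unpack("<2I", raw)

lanes = bitcast_i64_to_v2i32(0x0000000100000002)

# When both halves are equal, the two lanes are the same scalar constant,
# which is the kind of component that can then be shared.
same_lanes = bitcast_i64_to_v2i32(0x0000000200000002)
```

Here `lanes` is `(2, 1)` and `same_lanes` is `(2, 2)`.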
						
					 
				
					
						
							
							
								 
Matt Arsenault
9c47dd583a
AMDGPU: Remove some old intrinsic uses from tests

llvm-svn: 260493
							
						 
						
							2016-02-11 06:02:01 +00:00  
						
					 
				
					
						
							
							
								 
Matt Arsenault
e8df879948
AMDGPU: Improve accuracy of instruction rates for some FP instructions

llvm-svn: 245774
							
						 
						
							2015-08-22 00:50:41 +00:00  
						
					 
				
					
						
							
							
								 
Tom Stellard
e48fe2a27a
AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions

Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11061
llvm-svn: 242146
							
						 
						
							2015-07-14 14:15:03 +00:00  
						
					 
				
					
						
							
							
								 
Tom Stellard
45bb48ea19
R600 -> AMDGPU rename

llvm-svn: 239657
							
						 
						
							2015-06-13 03:28:10 +00:00