Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								3a7e8e77e6 
								
							 
						 
						
							
							
								
								amdgcn: Consolidate atomic minmax helpers  
							
							 
							
							
							
							
							Removes most overrides
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 347665 
							
						 
						
							2018-11-27 16:01:13 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								f663e7e6da 
								
							 
						 
						
							
							
								
								amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 346082 
							
						 
						
							2018-11-04 00:39:27 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								0e95b6a579 
								
							 
						 
						
							
							
								
								amdgcn: Convert get_num_groups to clc  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 346081 
							
						 
						
							2018-11-04 00:39:25 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								97283de27d 
								
							 
						 
						
							
							
								
								amdgcn: Convert get_global_size to clc  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 346080 
							
						 
						
							2018-11-04 00:39:20 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								ea2f32b75d 
								
							 
						 
						
							
							
								
								amdgcn: Convert get_local_size to clc  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 346079 
							
						 
						
							2018-11-04 00:39:16 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								faa1ff16c1 
								
							 
						 
						
							
							
								
								amdgcn: Use __constant AS for amdgcn builtins.  
							
							 
							
							
							
							
							Fixes build after clang r338707.
Reviewer: Matthew.Arsenault@amd.com 
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 338898 
							
						 
						
							2018-08-03 15:14:08 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								70a270da5f 
								
							 
						 
						
							
							
								
								Add initial support for half precision builtins  
							
							 
							
							
							
							
							v2: fix fmax implementation
    use consistent checks for __CLC_FP_SIZE
    add missing TODOs
    fix whitespace in definitions.h
v3: undef ZERO in modf.inc
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 332677 
							
						 
						
							2018-05-17 22:55:30 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								8fa100dfe3 
								
							 
						 
						
							
							
								
								amdgcn/fmin: Fix typos that reduced precision  
							
							 
							
							
							
							
							Not sure how these sneaked in.
Fixes fminD and a few other tests (fractD, cosD) on carrizo
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 330198 
							
						 
						
							2018-04-17 18:11:29 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								fd11db19c2 
								
							 
						 
						
							
							
								
								amdgcn: Update datalayout after LLVM r328656  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 329290 
							
						 
						
							2018-04-05 14:47:44 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								f96b1b88f8 
								
							 
						 
						
							
							
								
								amdgcn/fmax: fcanonicalize operands  
							
							 
							
							
							
							
							v_max instruction needs canonicalized operands.
Passes CTS on carrizo
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327076 
							
						 
						
							2018-03-08 23:01:01 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								e724e346ab 
								
							 
						 
						
							
							
								
								amdgcn/fmin: fcanonicalize operands  
							
							 
							
							
							
							
							v_min instruction needs canonicalized operands.
Passes CTS on carrizo
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327075 
							
						 
						
							2018-03-08 23:00:58 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								04a46bf0a2 
								
							 
						 
						
							
							
								
								amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs  
							
							 
							
							
							
							
							This is only really needed for VI+ ASICs. However, llvm would cast the value to
i32 for older asics anyway. The proper fix is in LLVM-7 (r326535).
Fixes CTS popcount on carrizo.
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327044 
							
						 
						
							2018-03-08 18:58:07 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								1ad6a94676 
								
							 
						 
						
							
							
								
								amdgcn: Fix build after GDS/const AS swap in r325030  
							
							 
							
							
							
							
							Acked-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325866 
							
						 
						
							2018-02-23 07:37:01 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								eda1872d04 
								
							 
						 
						
							
							
								
								amdgcn: Fix datalayout after addition of 32bit const AS in r324747  
							
							 
							
							
							
							
							Acked-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325865 
							
						 
						
							2018-02-23 07:36:54 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								911666f3fa 
								
							 
						 
						
							
							
								
								amdgcn: Fix datalayout after clang r324101  
							
							 
							
							
							
							
							r324101 switched around AS numbering
Acked-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325863 
							
						 
						
							2018-02-23 07:36:39 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								c420b61b26 
								
							 
						 
						
							
							
								
								amdgcn: Add missing datalayout info to .ll files  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 316239 
							
						 
						
							2017-10-20 21:10:18 +00:00  
						
					 
				
					
						
							
							
								 
								Jeroen Ketema
							
						 
						
							 
							
							
							
							
								
							
							
								fe9fa89854 
								
							 
						 
						
							
							
								
								Let get_work_dim take exactly 0 arguments  
							
							 
							
							
							
							
							Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 314634 
							
						 
						
							2017-10-01 20:11:46 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								ce29e8cde1 
								
							 
						 
						
							
							
								
								Restore support for llvm-3.9  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 314543 
							
						 
						
							2017-09-29 19:06:41 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								c9bbbe2403 
								
							 
						 
						
							
							
								
								Implement cl_khr_int64_extended_atomics builtins  
							
							 
							
							
							
							
							Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 313811 
							
						 
						
							2017-09-20 20:42:19 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								3d1db3de74 
								
							 
						 
						
							
							
								
								amdgcn,waitcnt: Add datalayout info  
							
							 
							
							
							
							
							This file is only compiled for GCN which all share the same layout
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 312493 
							
						 
						
							2017-09-04 15:52:07 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								999b1d9426 
								
							 
						 
						
							
							
								
								amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier  
							
							 
							
							
							
							
							Specs require using fences when barrier() is invoked:
"The barrier function will either flush any variables stored in local memory
or queue a memory fence to ensure correct ordering of memory operations to local memory."
and
"The barrier function will queue a memory fence to ensure correct ordering
of memory operations to global memory."
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 311022 
							
						 
						
							2017-08-16 17:09:00 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								1977092dc3 
								
							 
						 
						
							
							
								
								amdgcn: Implement {read_,write_,}mem_fence builtin  
							
							 
							
							
							
							
							v2: add more detailed comment about waitcnt instruction
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 311021 
							
						 
						
							2017-08-16 17:08:56 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								958fce3192 
								
							 
						 
						
							
							
								
								amdgcn: Fix return type of get_num_groups  
							
							 
							
							
							
							
							llvm-svn: 279723 
							
						 
						
							2016-08-25 07:31:40 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								26d9c41ff6 
								
							 
						 
						
							
							
								
								amdgcn: Fix return type for get_global_size  
							
							 
							
							
							
							
							llvm-svn: 279644 
							
						 
						
							2016-08-24 17:52:04 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								314364cbd2 
								
							 
						 
						
							
							
								
								amdgpu: Fix default case value for get_local_size  
							
							 
							
							
							
							
							llvm-svn: 279359 
							
						 
						
							2016-08-20 04:17:17 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								220268d177 
								
							 
						 
						
							
							
								
								amdgcn: Fix get_local_size IR return type  
							
							 
							
							
							
							
							llvm-svn: 279350 
							
						 
						
							2016-08-20 00:01:21 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								2ce3d94a01 
								
							 
						 
						
							
							
								
								amdgcn: Correct return types to be size_t  
							
							 
							
							
							
							
							llvm-svn: 279343 
							
						 
						
							2016-08-19 22:49:39 +00:00  
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								a82e080b57 
								
							 
						 
						
							
							
								
								AMDGPU: Implement get_global_offset builtin  
							
							 
							
							
							
							
							Also fix get_global_id to consider offset
No idea how to add this for ptx, so they are stuck with the old get_global_id
implementation.
v2: split to a separate patch
v3: Switch R600 to use implicitarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 276443 
							
						 
						
							2016-07-22 17:24:24 +00:00  
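
The get_global_id change described in the commit above can be illustrated with a small host-side sketch. This is not the libclc source: the struct and function names below are hypothetical, and the only thing taken as given is the OpenCL definition that a work-item's global ID in one dimension is its group ID times the local size, plus its local ID, plus the global offset supplied at enqueue time.

```c
#include <stddef.h>

/* Hypothetical per-dimension launch state; field names are illustrative
 * stand-ins for the values the corresponding OpenCL builtins would return. */
typedef struct {
    size_t group_id;      /* get_group_id(dim) */
    size_t local_size;    /* get_local_size(dim) */
    size_t local_id;      /* get_local_id(dim) */
    size_t global_offset; /* get_global_offset(dim) */
} work_item_dim;

/* Global ID for one dimension, including the enqueue-time offset.
 * Leaving out global_offset gives the older, offset-unaware result. */
static size_t global_id(const work_item_dim *d) {
    return d->group_id * d->local_size + d->local_id + d->global_offset;
}
```

With a zero offset this reduces to the offset-unaware formula, which is presumably what the commit message means by the "old get_global_id implementation" that ptx keeps.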
						
					 
				
					
						
							
							
								 
								Jan Vesely
							
						 
						
							 
							
							
							
							
								
							
							
								74f02db922 
								
							 
						 
						
							
							
								
								AMDGPU: Use clang intrinsics for workitem builtins  
							
							 
							
							
							
							
							v2: split into 2 patches
    use clang builtins for other intrinsics as well
v3: Fix warnings
    Switch r600 to use implicitarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 276442 
							
						 
						
							2016-07-22 17:24:20 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								b456c6dd56 
								
							 
						 
						
							
							
								
								Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp  
							
							 
							
							
							
							
							It didn't really work on r600 to begin with, which should
get its own intrinsic.
llvm-svn: 275813 
							
						 
						
							2016-07-18 16:42:50 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								45e6eaaa05 
								
							 
						 
						
							
							
								
								amdgcn: Use new workitem intrinsics  
							
							 
							
							
							
							
							llvm-svn: 261042 
							
						 
						
							2016-02-17 00:27:27 +00:00  
						
					 
				
					
						
							
							
								 
								Matt Arsenault
							
						 
						
							 
							
							
							
							
								
							
							
								a48e15c6cb 
								
							 
						 
						
							
							
								
								Split sources for amdgcn and r600  
							
							 
							
							
							
							
							Most files remain in a common amdgpu directory.
Also switches barriers to use convergent,
and use llvm.amdgcn.s.barrier.
This now requires 3.9/trunk to build amdgcn.
llvm-svn: 260777 
							
						 
						
							2016-02-13 01:01:59 +00:00