Tom Stellard
bf9f76fbe0
Implement log1p builtin
...
llvm-svn: 219230
2014-10-07 20:22:42 +00:00
Jan Vesely
8f64c3d842
Implement fmod
...
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 219087
2014-10-05 20:24:52 +00:00
Tom Stellard
081e778d22
Implement async_work_group_copy builtin v3
...
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
v3:
- Fix possible race condition by splitting the copy among multiple
work items.
llvm-svn: 219008
2014-10-03 19:49:39 +00:00
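The v3 note on async_work_group_copy above (splitting the copy among work items) can be illustrated on the host. This is only a sketch in plain C, not the committed OpenCL C code: copy_slice and simulated_group_copy are hypothetical names, and the lid and group_size parameters stand in for whatever the real builtin derives from get_local_id() and get_local_size().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one work item's share of the copy.
 * lid        - this work item's local id within the group
 * group_size - number of work items in the group
 * Each work item copies elements lid, lid + group_size, ... so no two
 * work items ever write the same destination element. */
static void copy_slice(int *dst, const int *src, size_t num_elements,
                       size_t lid, size_t group_size)
{
    assert(group_size > 0);
    for (size_t i = lid; i < num_elements; i += group_size)
        dst[i] = src[i];
}

/* Simulate the whole work group sequentially on the host. On a device
 * the slices would run concurrently, one per work item. */
static void simulated_group_copy(int *dst, const int *src,
                                 size_t num_elements, size_t group_size)
{
    for (size_t lid = 0; lid < group_size; ++lid)
        copy_slice(dst, src, num_elements, lid, group_size);
}
```

Because the slices are disjoint, every element is written exactly once regardless of the order in which work items run, which is one way to avoid the kind of race the v3 note mentions.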
Tom Stellard
ed5bbfdb1b
Implement async_work_group_strided_copy builtin v2
...
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
llvm-svn: 219007
2014-10-03 19:49:37 +00:00
Tom Stellard
b5064f79ef
Implement wait_group_events builtin v2
...
This is a simple default implementation which just calls barrier().
v2:
- Only call barrier() once.
llvm-svn: 219006
2014-10-03 19:49:34 +00:00
Aaron Watry
0d976ba497
atomic: Add generic atom[ic]_cmpxchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217918
2014-09-16 22:34:49 +00:00
Aaron Watry
025d79ad6c
atomic: Implement generic atom[ic]_xchg
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217917
2014-09-16 22:34:45 +00:00
Aaron Watry
7cfa12c2a5
atomic: Add generic atomic_min implementation
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217916
2014-09-16 22:34:41 +00:00
Aaron Watry
3f0a1a4c27
atomic: Add generic atom[ic]_xor
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217915
2014-09-16 22:34:36 +00:00
Aaron Watry
31e67d1cff
atomic: Add atom[ic]_or
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217914
2014-09-16 22:34:32 +00:00
Aaron Watry
cc68405761
atomics: Add generic atom[ic]_and
...
Not used yet.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217913
2014-09-16 22:34:28 +00:00
Aaron Watry
49614fbfd9
atomic: Add generic implementation of atom[ic]_max
...
Not used yet...
v2: Correct int/uint behavior
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217912
2014-09-16 22:34:24 +00:00
Aaron Watry
c9b88d32be
atomic: define extension functions for existing atomic implementations
...
The local versions of the atom_* functions were previously missing.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217911
2014-09-16 22:34:21 +00:00
Aaron Watry
947bdd059a
math: Add tan implementation
...
Uses the algorithm:
tan(x) = sin(x) / sqrt(1-sin^2(x))
An alternative is:
tan(x) = sin(x) / cos(x)
Which produces more verbose bitcode and longer assembly.
Either way, the generated bitcode seems pretty nasty and a more optimized
but still precise-enough solution is welcome.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217511
2014-09-10 15:43:35 +00:00
Aaron Watry
951ab64d19
math: Add asin implementation
...
asin(x) = atan2(x, sqrt( 1-x^2 ))
alternatively:
asin(x) = PI/2 - acos(x)
Use the atan2 implementation since it produces slightly shorter bitcode and
R600 machine code.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217510
2014-09-10 15:43:32 +00:00
Aaron Watry
268beab921
math: Add acos implementation
...
Passes the tests that were submitted to the piglit list
Tested on R600 (Pitcairn)
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 217509
2014-09-10 15:43:29 +00:00
Jan Vesely
05a60b7ac3
add isordered builtin
...
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217247
2014-09-05 13:59:15 +00:00
Jan Vesely
63486c1f0e
add isunordered builtin
...
v2: remove trailing newline
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217246
2014-09-05 13:59:13 +00:00
Jan Vesely
41a0c491de
add islessgreater builtin
...
v2: remove trailing newline
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217245
2014-09-05 13:59:11 +00:00
Jan Vesely
369e20353c
add isnormal builtin
...
v2: simplify and remove isnan leftovers
remove trailing newline
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217244
2014-09-05 13:59:09 +00:00
Jan Vesely
a5a3b023b4
add isfinite builtin
...
v2: simplify and remove isinf leftovers
remove trailing newline
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217243
2014-09-05 13:59:06 +00:00
Tom Stellard
7a9e2c6879
Implement isinf builtin
...
llvm-svn: 217046
2014-09-03 15:55:40 +00:00
Tom Stellard
d8a73abfc3
Fix implementation of copysign
...
This was previously implemented with a macro and we were using
__builtin_copysign(), which takes double inputs for the float
version of copysign().
Reviewed-and-Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 217045
2014-09-03 15:55:38 +00:00
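The copysign problem described above comes down to which compiler builtin backs each overload. A minimal sketch in C, assuming a GCC- or Clang-compatible compiler; the wrapper names are hypothetical and the committed fix itself is not shown in the log.

```c
/* Float overload: __builtin_copysignf takes and returns float, so the
 * arguments are never widened to double. */
static float copysign_f32(float x, float y)
{
    return __builtin_copysignf(x, y);
}

/* Double overload: __builtin_copysign takes and returns double. */
static double copysign_f64(double x, double y)
{
    return __builtin_copysign(x, y);
}
```

Routing the float overload through __builtin_copysign instead would still give the right value after conversion, but it forces a round trip through double, which matters on targets where double support is slow or absent.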
Jan Vesely
ef513d392b
Implement generic mad_sat
...
v2: Fix trailing whitespace
Fix signed long overflow
improve comment
v3: fix typo
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 216923
2014-09-02 17:55:02 +00:00
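For reference, mad_sat(a, b, c) computes a * b + c and clamps the result to the range of the operand type. A sketch of the 32-bit signed case in C, using a wider intermediate; mad_sat_i32 is a hypothetical name and this is not the committed generic implementation.

```c
#include <stdint.h>

/* Saturating multiply-add for 32-bit signed integers. */
static int32_t mad_sat_i32(int32_t a, int32_t b, int32_t c)
{
    /* |a * b| <= 2^62 and |c| <= 2^31, so the sum fits in int64_t. */
    int64_t wide = (int64_t)a * (int64_t)b + (int64_t)c;
    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}
```

This widening trick has no direct equivalent for 64-bit operands, since there is no wider standard integer type to compute in. That is presumably why the signed long case needed separate attention in v2, though the log does not say how it was handled.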
Aaron Watry
9447097636
Revert "Implement generic mad_sat"
...
This reverts commit cf62eded8b623a1c10d3692d25e5882b7939f564.
I didn't mean to commit this... Jan has a v3 incoming
llvm-svn: 216322
2014-08-23 14:06:01 +00:00
Aaron Watry
6bfac7ae69
Implement generic mad_sat
...
v2: Fix trailing whitespace
Fix signed long overflow
improve comment
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 216320
2014-08-23 14:04:33 +00:00
Tom Stellard
2ad4243bf7
Implement prefetch builtin
...
The default implementation is a no-op. Targets should override this
with their own implementations.
llvm-svn: 216127
2014-08-20 21:23:03 +00:00
Aaron Watry
f991505d02
vload/vstore: Use casts instead of scalarizing everything in CLC version
...
This generates bitcode which is indistinguishable from what was
hand-written for int32 types in v[load|store]_impl.ll.
v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
v2: (Per Matt Arsenault) Fix alignment issues with vector load stores
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 216069
2014-08-20 13:58:57 +00:00
Jan Vesely
12c660827e
relational: Add islessequal(floatN) builtin
...
v2: remove the initial undef
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 214568
2014-08-01 21:50:59 +00:00
Jan Vesely
acba2c98eb
relational: Add isless(floatN) builtin
...
v2: remove the initial undef
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 214567
2014-08-01 21:50:55 +00:00
Tom Stellard
903a78b7c6
Implement sin builtin for float types
...
The double version still uses @llvm.sin.
llvm-svn: 213762
2014-07-23 15:16:21 +00:00
Tom Stellard
c0ab2f81e3
Implement cos builtin for float types
...
The double version still uses @llvm.cos.
llvm-svn: 213761
2014-07-23 15:16:18 +00:00
Tom Stellard
f9caca8b9d
Implement atan2 builtin
...
llvm-svn: 213760
2014-07-23 15:16:16 +00:00
Tom Stellard
47882923c7
Implement atan builtin
...
llvm-svn: 213759
2014-07-23 15:16:13 +00:00
Aaron Watry
d7f022a582
relational: Implement isnotequal
...
v2: Use relational macros instead of hand-rolled ones
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213320
2014-07-17 22:07:32 +00:00
Aaron Watry
30102536c0
relational: Implement isgreaterequal
...
v2: Use relational macros instead of hand-rolled macros
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213319
2014-07-17 22:07:27 +00:00
Aaron Watry
803a992f04
relational: Implement isgreater
...
v2: Use relational macros instead of hand-rolled macros
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 213318
2014-07-17 22:07:19 +00:00
Aaron Watry
d9ee196eab
relational: Implement signbit
...
v2 Changes:
- use __builtin_signbit instead of shifting by hand
- significantly improve vector shuffling
- Works correctly now for signbit(float16) on radeonsi
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 211696
2014-06-25 13:29:23 +00:00
Jeroen Ketema
42df5d2a8f
Add exp10
...
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211680
2014-06-25 10:06:35 +00:00
Jeroen Ketema
09516fa27d
Add pown
...
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211211
2014-06-18 19:42:23 +00:00
Aaron Watry
6af2969a61
math: Implement mix builtin
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211047
2014-06-16 19:53:59 +00:00
Aaron Watry
f7f79d2a94
relational: Add isequal(floatN) builtin
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211046
2014-06-16 19:53:57 +00:00
Aaron Watry
e167db9238
Add all(igentype) builtin
...
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 211045
2014-06-16 19:53:54 +00:00
Jeroen Ketema
82aaa41286
Implementations for exp(float) and exp(double) v2
...
Use separate implementations instead of a macro
to ensure the constant multiplied with is of
higher precision.
v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 210891
2014-06-13 09:40:09 +00:00
Tom Stellard
3a12fc6a07
Add sincos
...
Patch by: Jeroen Ketema
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 204478
2014-03-21 16:22:01 +00:00
Tom Stellard
457e35912e
Implement builtins for cl_khr_global_int32_base_atomics extension
...
llvm-svn: 195021
2013-11-18 18:21:23 +00:00
Tom Stellard
436bf70519
Implement sign() builtin
...
llvm-svn: 192384
2013-10-10 19:08:56 +00:00
Tom Stellard
6c7b86c106
Implement nextafter() builtin
...
There are two implementations of nextafter():
1. Using clang's __builtin_nextafter. Clang replaces this builtin with
a call to nextafter which is part of libm. Therefore, this
implementation will only work for targets with an implementation of
libm (e.g. most CPU targets).
2. The other implementation is written in OpenCL C. This function is
known internally as __clc_nextafter and can be used by targets that
don't have access to libm.
llvm-svn: 192383
2013-10-10 19:08:51 +00:00
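The second variant described above, a nextafter that does not depend on libm, can be written by stepping the bit pattern of the float. The following is a host-side sketch in C and not the committed __clc_nextafter; all names in it are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its IEEE-754 bit pattern and back. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Next representable float after x in the direction of y. */
static float nextafter_f32(float x, float y)
{
    if (x != x || y != y)
        return x + y;                 /* NaN in, NaN out */
    if (x == y)
        return y;
    if (x == 0.0f)                    /* step off zero to the smallest subnormal */
        return bits_float(y > 0.0f ? 0x00000001u : 0x80000001u);

    uint32_t u = float_bits(x);
    /* Moving away from zero grows the magnitude bits by one;
     * moving toward zero shrinks them by one. */
    if ((x < y) == (x > 0.0f))
        u += 1;
    else
        u -= 1;
    return bits_float(u);
}
```

This works because, for finite floats of one sign, the ordering of the values matches the ordering of their bit patterns taken as unsigned integers, so adjacent floats differ by exactly one in that representation.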
Tom Stellard
e36e9dec65
Implement isnan() builtin
...
llvm-svn: 192382
2013-10-10 19:08:41 +00:00
Aaron Watry
50a7bcbac9
Add atomic_inc and atomic_add builtins
...
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 190058
2013-09-05 16:04:01 +00:00