Introduce an option to request global visibility settings be applied to
declarations without a definition or an explicit visibility, rather than
the existing behavior of giving these default visibility. When the
visibility of all or most extern definitions are known this allows for
the same optimisations -fvisibility permits without updating source code
to annotate all declarations.
Differential Revision: https://reviews.llvm.org/D56868
llvm-svn: 352391
Summary:
The 512-bit cvt(u)qq2tops, cvt(u)qqtopd, and cvt(u)dqtops intrinsics all have the possibility of taking an explicit rounding mode argument. If the rounding mode is CUR_DIRECTION we'd like to emit a sitofp/uitofp instruction and a select like we do for 256-bit intrinsics.
For cvt(u)qqtopd and cvt(u)dqtops we do this when the form of the software intrinsics that doesn't take a rounding mode argument is used. This is done by using convertvector in the header with the select builtin. But if the explicit rounding mode form of the intrinsic is used and CUR_DIRECTION is passed, we don't do this. We shouldn't have this inconsistency.
For cvt(u)qqtops nothing is done because we can't use the select builtin in the header without avx512vl. So we need to use custom codegen for this.
Even when the rounding mode isn't CUR_DIRECTION we should also use select in IR for consistency. And it will remove another scalar integer mask from our intrinsics.
To accomplish all of these goals I've taken a slightly unusual approach. I've added two new X86 specific intrinsics for sitofp/uitofp with rounding. These intrinsics are variadic on the input and output type so we only need 2 instead of 6. This avoids the need for a switch to map them in CGBuiltin.cpp. We just need to check signed vs unsigned. I believe other targets also use variadic intrinsics like this.
So if the rounding mode is CUR_DIRECTION we'll use an sitofp/uitofp instruction. Otherwise we'll use one of the new intrinsics. After that we'll emit a select instruction if needed.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D56998
llvm-svn: 352267
Relocatable code generation is meaningless on MSP430, as the platform is too small to use shared libraries.
Patch by Dmitry Mikushev!
Differential Revision: https://reviews.llvm.org/D56927
llvm-svn: 352181
ACLE specifies that return type for rsr and rsr64 is uint32_t and
uint64_t respectively. D56852 change the return type of rsr64 from
unsigned long to unsigned long long which at least on Linux doesn't
match uint64_t, but the test isn't strict enough to detect that
because compiler implicitly converts unsigned long long to uint64_t,
but it breaks other uses such as printf with PRIx64 type specifier.
This change makes the test stricter enforcing that the return type
of rsr and rsr64 builtins is what is actually specified in ACLE.
Differential Revision: https://reviews.llvm.org/D57210
llvm-svn: 352156
This reverts commit r351740: this broke on platforms where unsigned long
long isn't the same as uint64_t which is what ACLE specifies for the
return value of rsr64.
Differential Revision: https://reviews.llvm.org/D57209
llvm-svn: 352153
This adds a C/C++ attribute which corresponds to the LLVM IR wasm-import-module
attribute. It allows code to specify an explicit import module.
Differential Revision: https://reviews.llvm.org/D57160
llvm-svn: 352106
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.
After fixing PR37395.
After fixing problems in LiveDebugVariables.
After fixing NULL symbol problems in AddressPool when enabling
split-dwarf-file.
After fixing PR39094.
After landing D54199 and D54465 to fix Chromium build failed.
Differential Revision: https://reviews.llvm.org/D45045
llvm-svn: 352025
We can't use any other string, anyway, because its type wouldn't
match the type of the PredefinedExpr.
With this change, we don't compute a "nice" name for the __func__ global
when it's used in the initializer for a constant. This doesn't seem like
a great loss, and I'm not sure how to fix it without either storing more
information in the AST, or somehow threading through the information
from ExprConstant.cpp.
This could break some situations involving BlockDecl; currently,
CodeGenFunction::EmitPredefinedLValue has some logic to intentionally
emit a string different from what Sema computed. This code skips that
logic... but that logic can't work correctly in general anyway. (For
example, sizeof(__func__) returns the wrong result.) Hopefully this
doesn't affect practical code.
Fixes https://bugs.llvm.org/show_bug.cgi?id=40313 .
Differential Revision: https://reviews.llvm.org/D56821
llvm-svn: 351766
The ACLE states that 64-bit crc32, wsr, rsr and rbit operands are
uint64_t so we should have the clang builtin match this description
- which is what we already do for AArch32.
Differential Revision: https://reviews.llvm.org/D56852
llvm-svn: 351740
For some reason we were missing tests for several unmasked conversion intrinsics, but had their mask form.
Also use a non-default rounding mode on some tests to provide better coverage for a future patch.
llvm-svn: 351708
These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.
Noticed while cleaning up vector integer comparison costs for PR40376.
A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.
llvm-svn: 351687
With commit r351627, LLVM gained the ability to apply (existing) IPO
optimizations on indirections through callbacks, or transitive calls.
The general idea is that we use an abstraction to hide the middle man
and represent the callback call in the context of the initial caller.
It is described in more detail in the commit message of the LLVM patch
r351627, the llvm::AbstractCallSite class description, and the
language reference section on callback-metadata.
This commit enables clang to emit !callback metadata that is
understood by LLVM. It does so in three different cases:
1) For known broker functions declarations that are directly
generated, e.g., __kmpc_fork_call for the OpenMP pragma parallel.
2) For known broker functions that are identified by their name and
source location through the builtin detection, e.g.,
pthread_create from the POSIX thread API.
3) For user annotated functions that carry the "callback(callee, ...)"
attribute. The attribute has to include the name, or index, of
the callback callee and how the passed arguments can be
identified (as many as the callback callee has). See the callback
attribute documentation for detailed information.
Differential Revision: https://reviews.llvm.org/D55483
llvm-svn: 351629
Summary:
This attribute will allow users to opt specific functions out of
speculative load hardening. This compliments the Clang attribute
named speculative_load_hardening. When this attribute or the attribute
speculative_load_hardening is used in combination with the flags
-mno-speculative-load-hardening or -mspeculative-load-hardening,
the function level attribute will override the default during LLVM IR
generation. For example, in the case, where the flag opposes the
function attribute, the function attribute will take precendence.
The sticky inlining behavior of the speculative_load_hardening attribute
may cause a function with the no_speculative_load_hardening attribute
to be tagged with the speculative_load_hardening tag in
subsequent compiler phases which is desired behavior since the
speculative_load_hardening LLVM attribute is designed to be maximally
conservative.
If both attributes are specified for a function, then an error will be
thrown.
Reviewers: chandlerc, echristo, kristof.beyls, aaron.ballman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54909
llvm-svn: 351565
llvm.flt.rounds returns an i32, but the builtin expects an integer.
On targets where integers are not 32-bits clang tries to bitcast the result, causing an assertion failure.
The patch enables newlib build for msp430.
Patch by Edward Jones!
Differential Revision: https://reviews.llvm.org/D24461
llvm-svn: 351449
We need to custom handle these so we can turn the scalar mask into a vXi1 vector.
Differential Revision: https://reviews.llvm.org/D56530
llvm-svn: 351390
* Accept as an argument constants in range 0..63 (aligned with TI headers and linker scripts provided with TI GCC toolchain).
* Emit function attribute 'interrupt'='xx' instead of aliases (used in the backend to create a section for particular interrupt vector).
* Add more diagnostics.
Patch by Kristina Bessonova!
Differential Revision: https://reviews.llvm.org/D56663
llvm-svn: 351344
Pass the frame pointer that the first finally block receives onto the nested
finally block, instead of generating it using localaddr.
Differential Revision: https://reviews.llvm.org/D56463
llvm-svn: 351302
Summary:
UB isn't nice. It's cool and powerful, but not nice.
Having a way to detect it is nice though.
[[ https://wg21.link/p1007r3 | P1007R3: std::assume_aligned ]] / http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r2.pdf says:
```
We propose to add this functionality via a library function instead of a core language attribute.
...
If the pointer passed in is not aligned to at least N bytes, calling assume_aligned results in undefined behaviour.
```
This differential teaches clang to sanitize all the various variants of this assume-aligned attribute.
Requires D54588 for LLVM IRBuilder changes.
The compiler-rt part is D54590.
This is a second commit, the original one was r351105,
which was mass-reverted in r351159 because 2 compiler-rt tests were failing.
Reviewers: ABataev, craig.topper, vsk, rsmith, rnk, #sanitizers, erichkeane, filcab, rjmccall
Reviewed By: rjmccall
Subscribers: chandlerc, ldionne, EricWF, mclow.lists, cfe-commits, bkramer
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D54589
llvm-svn: 351177
Summary:
This patch attempts to redo what was tried in r278783, but was reverted.
These intrinsics should be available on non-windows platforms with "xsave" feature check. But on Windows platforms they shouldn't have feature check since that's how MSVC behaves.
To accomplish this I've added a MS builtin with no feature check. And a normal gcc builtin with a feature check. When _MSC_VER is not defined _xgetbv/_xsetbv will be macros pointing to the gcc builtin name.
I've moved the forward declarations from intrin.h to immintrin.h to match the MSDN documentation and used that as the header file for the MS builtin.
I'm not super happy with this implementation, and I'm open to suggestions for better ways to do it.
Reviewers: rnk, RKSimon, spatel
Reviewed By: rnk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D56686
llvm-svn: 351160
Summary:
UB isn't nice. It's cool and powerful, but not nice.
Having a way to detect it is nice though.
[[ https://wg21.link/p1007r3 | P1007R3: std::assume_aligned ]] / http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r2.pdf says:
```
We propose to add this functionality via a library function instead of a core language attribute.
...
If the pointer passed in is not aligned to at least N bytes, calling assume_aligned results in undefined behaviour.
```
This differential teaches clang to sanitize all the various variants of this assume-aligned attribute.
Requires D54588 for LLVM IRBuilder changes.
The compiler-rt part is D54590.
Reviewers: ABataev, craig.topper, vsk, rsmith, rnk, #sanitizers, erichkeane, filcab, rjmccall
Reviewed By: rjmccall
Subscribers: chandlerc, ldionne, EricWF, mclow.lists, cfe-commits, bkramer
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D54589
llvm-svn: 351105
This removes the old grow_memory and mem.grow-style builtins, leaving just
the memory.grow-style builtins.
Differential Revision: https://reviews.llvm.org/D56645
llvm-svn: 351089
Summary:
Adds a new -f[no]split-lto-unit flag that is disabled by default to
control module splitting during ThinLTO. It is automatically enabled
for -fsanitize=cfi and -fwhole-program-vtables.
The new EnableSplitLTOUnit codegen flag is passed down to llvm
via a new module flag of the same name.
Depends on D53890.
Reviewers: pcc
Subscribers: ormris, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D53891
llvm-svn: 350949
As reported in PR33035, LLVM crashes if given a common object with an
alignment of greater than 32 bits. This is because the COFF file format
does not support these alignments, so emitting them is broken anyway.
This patch changes any global definitions greater than 32 bit alignment
to no longer be in 'common'.
https://bugs.llvm.org/show_bug.cgi?id=33035
Differential Revision: https://reviews.llvm.org/D56391
Change-Id: I48609289753b7f3b58c5e2bc1712756750fbd45a
llvm-svn: 350643
As noted on PR40203, for gcc compatibility we need to support non-immediate values in the 'slli/srli/srai' shift by immediate vector intrinsics.
llvm-svn: 350619
The problem is similar to D55986 but for threads: a process with the
interceptor hwasan library loaded might have some threads started by
instrumented libraries and some by uninstrumented libraries, and we
need to be able to run instrumented code on the latter.
The solution is to perform per-thread initialization lazily. If a
function needs to access shadow memory or add itself to the per-thread
ring buffer its prologue checks to see whether the value in the
sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter
and reloads from the TLS slot. The runtime does the same thing if it
needs to access this data structure.
This change means that the code generator needs to know whether we
are targeting the interceptor runtime, since we don't want to pay
the cost of lazy initialization when targeting a platform with native
hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform}
has been introduced for selecting the runtime ABI to target. The
default ABI is set to interceptor since it's assumed that it will
be more common that users will be compiling application code than
platform code.
Because we can no longer assume that the TLS slot is initialized,
the pthread_create interceptor is no longer necessary, so it has
been removed.
Ideally, lazy initialization should only cost one instruction in the
hot path, but at present the call may cause us to spill arguments
to the stack, which means more instructions in the hot path (or
theoretically in the cold path if the spills are moved with shrink
wrapping). With an appropriately chosen calling convention for
the per-thread initialization function (TODO) the hot path should
always need just one instruction and the cold path should need two
instructions with no spilling required.
Differential Revision: https://reviews.llvm.org/D56038
llvm-svn: 350429