If a shuffle is required for reduced structures or other large data types,
the current code may crash the compiler because it loads the
aggregate values. This patch fixes the problem.
llvm-svn: 335377
D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled.
This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled.
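A minimal sketch of the masking idea, not the actual codegen change: once the range check can be disabled, an out-of-range immediate may reach later stages, so the consumer keeps only the bits it defines. The helper name `maskImmediate` and the 8-bit width in the checks are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep only the low `bits` bits of an immediate so that
// stray upper bits are ignored instead of leaking into later computation.
constexpr std::uint64_t maskImmediate(std::uint64_t imm, unsigned bits) {
  assert(bits > 0 && bits <= 64);
  return bits == 64 ? imm : imm & ((std::uint64_t{1} << bits) - 1);
}

// An in-range value is unchanged; an out-of-range one is truncated.
static_assert(maskImmediate(0x2A, 8) == 0x2A, "in-range value preserved");
static_assert(maskImmediate(0x1FF, 8) == 0xFF, "upper bits dropped");
```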
llvm-svn: 335308
This is breaking a couple of buildbots. We need to run the
NameAnonGlobal pass for regular LTO now as well (since we're producing a
summary). I'll post a separate patch for review to make this happen and
then re-commit.
This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.
llvm-svn: 335291
parallel region.
If the current construct requires sharing of the local variable in the
inner parallel region, this variable must be globalized to avoid
runtime crash.
llvm-svn: 335285
Summary:
With D33921, we gained the ability to have module summaries in regular
LTO modules without triggering ThinLTO compilation. Module summaries in
regular LTO allow garbage collection (dead stripping) before LTO
compilation and thus open up additional optimization opportunities.
This patch enables summary emission in regular LTO for all targets
except ld64-based ones (which use the legacy LTO API).
Reviewers: pcc, tejohnson, mehdi_amini
Subscribers: inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D34156
llvm-svn: 335284
Summary:
This test is a stripped-down version of a function inside the
amalgamated sqlite source. When converted to IR clang produces
a phi instruction without debug location.
This patch fixes the above issue.
Differential Revision: https://reviews.llvm.org/D47720
llvm-svn: 335255
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.
Fixed point literals are declared using the suffixes
```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values
```
unsigned short _Accum u_short_accum = 256.0uhk; // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```
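A rough model of the range check behind that error, assuming for illustration that `unsigned short _Accum` has 8 integral bits (the real bit counts come from the target info and may differ per target). The helper name `integralPartFits` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical check: does the integral part of a non-negative literal fit in
// `integralBits` bits? The real widths are set per target in the target info.
bool integralPartFits(double literal, unsigned integralBits) {
  assert(literal >= 0.0 && integralBits < 63);
  // 2^integralBits is the first integral value that no longer fits.
  const double limit = std::ldexp(1.0, static_cast<int>(integralBits));
  return std::floor(literal) < limit;
}
```

Under the 8-bit assumption, `integralPartFits(255.5, 8)` holds while `integralPartFits(256.0, 8)` does not, which matches the boundary in the test case above.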
Differential Revision: https://reviews.llvm.org/D46915
llvm-svn: 335148
This is not only semantically correct but ensures that they will not
be marked as address-significant once D48155 lands.
Differential Revision: https://reviews.llvm.org/D48206
llvm-svn: 334982
Summary: All *_sqrt_round_s[s|d] intrinsics should execute a square root on
the zeroth element of B (Ops[1]) and insert it into A (Ops[0]), not the other way around.
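The intended operand order can be modelled in scalar code as below. This is only a sketch: it ignores the rounding-mode operand and any masking, and the names `Vec4f` and `sqrtScalarModel` are made up for illustration.

```cpp
#include <array>
#include <cmath>

using Vec4f = std::array<float, 4>; // stand-in for a 4 x float vector

// Scalar model: take the square root of element 0 of B and insert it into
// element 0 of a copy of A; the upper elements come from A unchanged.
Vec4f sqrtScalarModel(const Vec4f &a, const Vec4f &b) {
  Vec4f result = a;
  result[0] = std::sqrt(b[0]);
  return result;
}
```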
Reviewers: itaraban, craig.topper
Reviewed By: craig.topper
Subscribers: craig.topper, cfe-commits
Differential Revision: https://reviews.llvm.org/D48288
llvm-svn: 334964
The previous names took the shift amount in bits to match gcc and required a multiply by 8 in the header. This creates a misleading error message when we check the range of the immediate to the builtin since the allowed range also got multiplied by 8.
This commit changes the builtins to use a byte shift amount to match the underlying instruction and the Intel intrinsic.
Fixes the remaining issue from PR37795.
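For reference, a byte-granular shift can be modelled as follows. This is a sketch assuming a 16-byte value with index 0 as the least significant byte; `Bytes16` and `byteShiftLeft` are illustrative names, not the builtins themselves. The shift amount is expressed directly in bytes, so a range check on it reports the same unit the user wrote.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>; // index 0 = least significant byte

// Shift the whole 16-byte value left by `bytes` bytes, filling with zeros.
// Any amount of 16 or more produces an all-zero result.
Bytes16 byteShiftLeft(const Bytes16 &v, unsigned bytes) {
  Bytes16 out{}; // zero-filled
  for (std::size_t i = bytes; i < out.size(); ++i)
    out[i] = v[i - bytes];
  return out;
}
```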
llvm-svn: 334773
This diff includes changes for the remaining _Fract and _Sat fixed point types.
```
signed short _Fract s_short_fract;
signed _Fract s_fract;
signed long _Fract s_long_fract;
unsigned short _Fract u_short_fract;
unsigned _Fract u_fract;
unsigned long _Fract u_long_fract;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
short _Fract short_fract;
_Fract fract;
long _Fract long_fract;
// Saturated fixed point types
_Sat signed short _Accum sat_s_short_accum;
_Sat signed _Accum sat_s_accum;
_Sat signed long _Accum sat_s_long_accum;
_Sat unsigned short _Accum sat_u_short_accum;
_Sat unsigned _Accum sat_u_accum;
_Sat unsigned long _Accum sat_u_long_accum;
_Sat signed short _Fract sat_s_short_fract;
_Sat signed _Fract sat_s_fract;
_Sat signed long _Fract sat_s_long_fract;
_Sat unsigned short _Fract sat_u_short_fract;
_Sat unsigned _Fract sat_u_fract;
_Sat unsigned long _Fract sat_u_long_fract;
// Aliased saturated fixed point types
_Sat short _Accum sat_short_accum;
_Sat _Accum sat_accum;
_Sat long _Accum sat_long_accum;
_Sat short _Fract sat_short_fract;
_Sat _Fract sat_fract;
_Sat long _Fract sat_long_fract;
```
This diff only allows for declaration of these fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches.
Differential Revision: https://reviews.llvm.org/D46911
llvm-svn: 334718
Summary: These intrinsics result in hint instructions. They are provided here for MSVC ARM64 compatibility.
Reviewers: mstorsjo, compnerd, javed.absar
Reviewed By: mstorsjo
Subscribers: kristof.beyls, chrib, cfe-commits
Differential Revision: https://reviews.llvm.org/D48132
llvm-svn: 334639
Summary:
In many cases we can't devirtualize
because the definition of the vtable is not present. Most of the
time this is caused by an inline virtual function not being
emitted. Forcing emission of the vtable adds a reference to these
inline virtual functions.
Note that GCC has always done this.
Reviewers: rjmccall, rsmith, amharc, kuhar
Subscribers: llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D47108
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 334600
Currently clang sets the kernel calling convention for CUDA/HIP after
arranging the function, which causes an incorrect kernel function type since
it depends on the calling convention.
This patch moves setting the kernel calling convention before arranging
the function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
This should reduce the binary size penalty of ASan on Windows. After
r334313, ASan will add red zones to globals in comdats, so we will still
find OOB accesses to string literals.
llvm-svn: 334417
Summary: We've had these target independent intrinsics for at least a year and a half. Looks like they do exactly what we need here and the backend already supports them.
Reviewers: RKSimon, delena, spatel, GBuella
Reviewed By: RKSimon
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D47693
llvm-svn: 334366
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.
Differential Revision: https://reviews.llvm.org/D47446
llvm-svn: 334362
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.
These places were found by hiding the begin/end methods in the SmallSet forwarding.
llvm-svn: 334339
I'd like to make the select builtins require an avx512f, avx512bw, or avx512vl feature to match what is normally required to get masking. Truncate is special in that there are instructions with a 128/256-bit masked result even without avx512vl.
By using special builtins we can emit a select without using the 128/256-bit select builtins.
llvm-svn: 334331
I'm looking into making the select builtins require avx512f, avx512bw, or avx512vl since masking operations generally require those features.
The extract builtins are funny because the 512-bit versions return a 128 or 256 bit vector with masking even when avx512vl is not supported.
llvm-svn: 334330
CGM.GetAddrOfConstantCString() sets the address of the created GlobalValue
to unnamed. When emitting the object file LLVM will mark the surrounding
section as SHF_MERGE iff the string is nul-terminated and contains no
other nuls (see IsNullTerminatedString). This results in problems when
saving temporaries because LLVM doesn't set an EntrySize, so reading in
the serialized assembly file fails.
This never happened for the GPU binaries because they usually contain
a nul-character somewhere. Instead this only affected the module ID
when compiling relocatable device code.
However, this points to a potentially larger problem: If we put a
constant string into a named section, we really want the data to end
up in that section in the object file. To avoid LLVM merging sections
this patch unmarks the GlobalVariable's address as unnamed which also
fixes the problem of invalid serialized assembly files when saving
temporaries.
Differential Revision: https://reviews.llvm.org/D47902
llvm-svn: 334281
Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc.
llvm-svn: 334261
The windows-msvc target is meant to be ABI compatible with MSVC,
including the exception handling. Ensure that a windows-msvc triple
always equates to the MSVC personality being used.
This mostly affects the GNUStep and ObjFW Obj-C runtimes. To the best of
my knowledge, those are normally not used with windows-msvc triples. I
believe WinObjC is based on GNUStep (or it at least uses libobjc2), but
that also takes the approach of wrapping Obj-C exceptions in C++
exceptions, so the MSVC personality function is the right one to use
there as well.
Differential Revision: https://reviews.llvm.org/D47862
llvm-svn: 334253
Adds support for these intrinsics, which are ARM and ARM64 only:
_interlockedbittestandreset_acq
_interlockedbittestandreset_rel
_interlockedbittestandreset_nf
_interlockedbittestandset_acq
_interlockedbittestandset_rel
_interlockedbittestandset_nf
Refactor the bittest intrinsic handling to decompose each intrinsic into
its action, its width, and its atomicity.
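A rough model of that decomposition using `std::atomic`, restricted to a single 32-bit word for simplicity. The mapping of suffixes to memory orders (`_acq` to acquire, `_rel` to release, `_nf` to relaxed) is an assumption of this sketch, and `BitAction` and `interlockedBitTestAnd` are hypothetical names, not the CGBuiltin code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class BitAction { Set, Reset, Complement };

// Perform `action` on one bit of `word` with the given ordering and return
// the previous value of that bit.
bool interlockedBitTestAnd(std::atomic<std::uint32_t> &word, unsigned bit,
                           BitAction action, std::memory_order order) {
  assert(bit < 32); // this sketch only models a single 32-bit word
  const std::uint32_t mask = std::uint32_t{1} << bit;
  std::uint32_t old = 0;
  switch (action) {
  case BitAction::Set:
    old = word.fetch_or(mask, order);
    break;
  case BitAction::Reset:
    old = word.fetch_and(~mask, order);
    break;
  case BitAction::Complement:
    old = word.fetch_xor(mask, order);
    break;
  }
  return (old & mask) != 0;
}
```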
llvm-svn: 334239
We still emit shufflevector instructions we just do it from CGBuiltin.cpp now. This ensures the intrinsics that use this are only available on CPUs that support the feature.
I also added range checking to the immediate, but only checked it is 8 bits or smaller. We should maybe be stricter since we never use all 8 bits, but gcc doesn't seem to do that.
llvm-svn: 334237
We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.
This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.
llvm-svn: 334208
Summary:
When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.
This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:
```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}
int main(void)
{
x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
x();
^
1 error generated.
```
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338
Reviewers: craig.topper, echristo, dblaikie
Reviewed By: craig.topper, echristo
Differential Revision: https://reviews.llvm.org/D46541
llvm-svn: 334174
Summary:
We recently switched to using selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over-expanding macros. Or if it's something we can CSE later it would show up multiple times when it shouldn't.
I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim.
The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform.
I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it.
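The double-evaluation problem can be reproduced with a scalar toy model. Everything here (`selectModel`, `MASK3_FMA_MACRO`, `mask3FmaBuiltin`, `nextValue`) is a hypothetical stand-in for the real intrinsics, chosen only to show why a macro that names its third argument twice differs from a single builtin call.

```cpp
// Stand-in for a vector select: both value operands are always evaluated.
inline int selectModel(bool mask, int ifSet, int ifClear) {
  return mask ? ifSet : ifClear;
}

// Macro form: `c` appears twice in the expansion, so it is evaluated twice.
#define MASK3_FMA_MACRO(a, b, c, m) selectModel((m), (a) * (b) + (c), (c))

// Builtin-like form: each argument is evaluated once, then replicated inside.
inline int mask3FmaBuiltin(int a, int b, int c, bool m) {
  return selectModel(m, a * b + c, c);
}

// Helper with a visible side effect, to count evaluations.
inline int nextValue(int &counter) { return ++counter; }
```

Passing `nextValue(n)` as the third argument bumps the counter twice through the macro but only once through the function, which is the behavior the dedicated builtins restore.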
Reviewers: tkrupa, RKSimon, spatel, GBuella
Reviewed By: tkrupa
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D47724
llvm-svn: 334159
Factor out the common setjmp call emission code.
Based on a patch by Chris January
Differential Revision: https://reviews.llvm.org/D47784
llvm-svn: 334112
I tested these locally on an x86 machine by disabling the inline asm
codepath and confirming that it does the same bitflips as we do with the
inline asm.
Addresses code review feedback.
llvm-svn: 334059
Previously we were just using extended vector operations in the header file.
This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load.
By adding the builtins we can check for the index to be a constant and ensure it's in range of the vector element count.
User code still has the option to use extended vector operations themselves if they need non-constant indexing.
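The effect of requiring a constant, in-range index can be approximated in plain C++ by moving the index into a template parameter. This is only an analogy for the builtin's check; `extractElement` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>

// The index is a template parameter, so it must be a compile-time constant,
// and an out-of-range value is rejected at compile time.
template <std::size_t Index, typename T, std::size_t N>
constexpr T extractElement(const std::array<T, N> &v) {
  static_assert(Index < N, "index out of range for vector element count");
  return v[Index];
}
```

With a four-element array `v`, `extractElement<2>(v)` compiles, while `extractElement<4>(v)` or a runtime index would not.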
llvm-svn: 334057
This builtin takes an index as its second operand, but the codegen hardcodes an index of 0 and doesn't use the operand. The only use of the builtin in the header file passes 0 to the operand so this works for that usage. But it's more correct to use the real operand.
llvm-svn: 334054
CUDA/HIP does not support RTTI on the device side, therefore there
is no point in emitting type info when compiling for device.
Emitting type info for device not only clutters the IR with useless
global variables, but also causes an undefined symbol at link time
since the vtable for cxxabiv1::class_type_info has external linkage.
Differential Revision: https://reviews.llvm.org/D47694
llvm-svn: 334021
We need to implement _interlockedbittestandset as a builtin for
windows.h, so we might as well do the whole family. It reduces code
duplication anyway.
Fixes PR33188, a long standing bug in our bittest implementation
encountered by Chakra.
llvm-svn: 333978
Adding __attribute__((aligned(32))) to __m256 breaks the implementation
of _mm256_loadu_ps on Windows. On Windows, alignment attributes have
higher precedence than packing attributes.
We also might want to carefully consider the consequences of changing
our vector typedefs, since many users copy them and invent their own
new, non-Intel specific vector type names.
llvm-svn: 333958
Summary:
Because `llvm::Triple` can be derived from `TargetInfo`, it is simpler
to take only `TargetInfo` argument.
Reviewers: sbc100
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D47620
llvm-svn: 333938
```
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
```
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent _Fract types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Fixed the test that was failing by not checking for dso_local on some
targets.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333923
This seems like a premature optimization. It's unlikely a user would pass something the frontend can tell is all ones to the masked load/store intrinsics.
We do this optimization for emitting select for masking because we have builtin calls in header files that pass an all ones mask in. Though at this point we may no longer have any builtins that emit some IR and a select. We may only have the select builtins so maybe we can remove that optimization too.
llvm-svn: 333847
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.
Differential Revision: https://reviews.llvm.org/D47121
llvm-svn: 333829
```
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
```
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent `_Fract` types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333814
This fixes two major problems:
- We were not capping vector alignment as desired on 32-bit ARM.
- We were using different alignments based on the AVX settings on
Intel, so we did not have a consistent ABI.
This is an ABI break, but we think we can get away with it because
vectors tend to be used mostly in inline code (which is why not having
a consistent ABI has not proven disastrous on Intel).
Intel's AVX types are specified as having 32-byte / 64-byte alignment,
so align them explicitly instead of relying on the base ABI rule.
Note that this sort of attribute is stripped from template arguments
in template substitution, so there's a possibility that code templated
over vectors will produce inadequately-aligned objects. The right
long-term solution for this is for alignment attributes to be
interpreted as true qualifiers and thus preserved in the canonical type.
llvm-svn: 333791
Summary:
clang's current wasm EH implementation is a non-MVP feature in progress.
We had a `-mexception-handling` wasm feature but were not using it. This
patch hides the non-MVP wasm EH behind a flag, so it does not affect
other code for now.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Differential Revision: https://reviews.llvm.org/D47614
llvm-svn: 333716
A deferred region should end before the start of a label, and should not
extend to the start of the label sub-statement.
Fixes llvm.org/PR35867.
llvm-svn: 333715
The WebAssembly committee has decided on the names `memory.size` and
`memory.grow` for the memory intrinsics, so update the clang builtin
functions to follow those names, keeping both sets of old names in place
for compatibility.
llvm-svn: 333712
Summary:
Because wasm control flow needs to be structured, using WinEH
instructions to support wasm EH brings several benefits. This patch
makes wasm EH use Windows EH instructions, with some changes:
1. Because wasm uses a single catch block to catch all C++ exceptions,
this merges all catch clauses into a single catchpad, within which we
test the EH selector as in Itanium EH.
2. Generates a call to `__clang_call_terminate` in case a cleanup
throws. Wasm does not have a runtime to handle this.
3. In case there is no catch-all clause, inserts a call to
`__cxa_rethrow` at the end of a catchpad in order to unwind to an
enclosing EH scope.
Reviewers: majnemer, dschuff
Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits
Differential Revision: https://reviews.llvm.org/D44931
llvm-svn: 333703
Ensure latest MPT decl has a MSInheritanceAttr when instantiating
templates, to avoid null MSInheritanceAttr deref in
CXXRecordDecl::getMSInheritanceModel().
See PR#37399 for repro / details.
Patch by Andrew Rogers!
Differential Revision: https://reviews.llvm.org/D46664
llvm-svn: 333680
Discard the last uncompleted deferred region in a decl, if one exists.
This prevents lines at the end of a function containing only whitespace
or closing braces from being marked as uncovered, if they follow a
region terminator (return/break/etc).
The previous behavior was to heuristically complete deferred regions at
the end of a decl. In practice this ended up being too brittle for too
little gain. Users would complain that there was no way to reach full
code coverage because whitespace at the end of a function would be
marked uncovered.
rdar://40238228
Differential Revision: https://reviews.llvm.org/D46918
llvm-svn: 333609
This patch replaces all packed (and scalar without rounding
mode) fused intrinsics with fmadd/fmaddsub variations.
Then fmadd/fmaddsub are lowered to native IR.
Patch by tkrupa
Reviewers: craig.topper, sroland, spatel, RKSimon
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D47444
llvm-svn: 333555
These intrinsics are used by MSVC's header files on AArch64 Windows as
well as AArch32, so we should support them for both targets. I've
factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate
functions that EmitAArch64BuiltinExpr can call as well.
Reviewers: javed.absar, mstorsjo
Reviewed By: mstorsjo
Subscribers: kristof.beyls, cfe-commits
Differential Revision: https://reviews.llvm.org/D47476
llvm-svn: 333513
This helps especially when the collision is for a template specialization,
where the template arguments are not available from anywhere else in the
diagnostic, and are likely relevant to the problem.
llvm-svn: 333489
initialization functions to 'cxx_fast_tlscc'.
This fixes a bug where instructions calling initialization functions for
thread-local static members of c++ template classes were using calling
convention 'cxx_fast_tlscc' while the called functions weren't annotated
with the calling convention.
rdar://problem/40447463
Differential Revision: https://reviews.llvm.org/D47354
llvm-svn: 333447
The checksum will not reflect the real source, so there's no clear
reason to include them in the debug info. Also this was causing a
crash on the DWARF side.
Differential Revision: https://reviews.llvm.org/D47260
llvm-svn: 333311
If an orphaned parallel region is found, the following code must be emitted:
```
if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid))
Serialized execution.
else if (IsMasterThread())
Prepare and signal worker.
else
Outlined function call.
```
llvm-svn: 333301
It caused asserts, see PR37560.
> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166
llvm-svn: 333067
more reliably.
Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.
In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.
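The splitting rule can be sketched as below. This is a simplified model over a plain integer vector, not the constant emitter itself; `ArraySplit` and `splitTrailingZeros` are hypothetical names, and only the "more than 8 trailing zeros" threshold is taken from the description above.

```cpp
#include <cstddef>
#include <vector>

struct ArraySplit {
  std::size_t explicitCount; // leading elements emitted explicitly
  std::size_t zeroCount;     // trailing elements covered by zeroinitializer
};

// Split off the zero suffix only when it is longer than 8 elements;
// otherwise keep the whole array as one explicit initializer.
ArraySplit splitTrailingZeros(const std::vector<int> &elems) {
  std::size_t nonZeroEnd = elems.size();
  while (nonZeroEnd > 0 && elems[nonZeroEnd - 1] == 0)
    --nonZeroEnd;
  const std::size_t trailing = elems.size() - nonZeroEnd;
  if (trailing > 8)
    return {nonZeroEnd, trailing};
  return {elems.size(), 0};
}
```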
Differential Revision: https://reviews.llvm.org/D47166
llvm-svn: 333044
The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."
That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.
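In source terms the lowering corresponds to something like the sketch below, where the negation is an ordinary signed subtraction whose overflow is undefined. `absModel` is an illustrative name, not a clang function.

```cpp
#include <climits>

// Model of the abs lowering: negate with a plain signed subtraction. Signed
// overflow is undefined here, just as abs(INT_MIN) is undefined, which is
// what allows the subtraction to carry 'nsw'.
constexpr int absModel(int x) {
  return x < 0 ? 0 - x : x; // not valid for x == INT_MIN
}

static_assert(absModel(-5) == 5, "negative input is negated");
static_assert(absModel(INT_MAX) == INT_MAX, "positive input is unchanged");
static_assert(absModel(-INT_MAX) == INT_MAX, "largest representable negation");
```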
Differential Revision: https://reviews.llvm.org/D47202
llvm-svn: 333038
Introduced CreateMemTempWithoutCast and CreateTemporaryAllocaWithoutCast to emit alloca
without casting to default addr space.
ActiveFlag is a temporary variable emitted for clean up. It is defined as AllocaInst* type and there is
a cast to AllocaInst in SetActiveFlag. An alloca cast to a generic pointer causes an assertion in
SetActiveFlag.
Since there is only load/store of ActiveFlag, it is safe to use the original alloca, therefore
CreateMemTempWithoutCast is used.
Differential Revision: https://reviews.llvm.org/D47099
llvm-svn: 332982
This change will help Visual Studio resolve forward references to C++ lambda
routines used by captured variables.
Differential Revision: https://reviews.llvm.org/D45438
llvm-svn: 332975
Summary:
This includes initial support for the (hopefully final) updated Objective-C ABI, developed here:
https://github.com/davidchisnall/clang-gnustep-abi-2
It also includes some cleanups and refactoring from older GNU ABIs.
The current version is ELF only, other formats to follow.
Reviewers: rjmccall, DHowett-MSFT
Reviewed By: rjmccall
Subscribers: smeenai, cfe-commits
Differential Revision: https://reviews.llvm.org/D46052
llvm-svn: 332950
Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions.
To avoid that just generate IR directly in CGBuiltin.cpp.
Differential Revision: https://reviews.llvm.org/D47125
llvm-svn: 332891
1. added restrictions to memory scope, order and volatile parameters
2. added custom processing for these builtins; this code is currently unused, but is
needed to switch off the GCCBuiltin link to the builtins (ongoing change to the llvm
tree)
3. builtins renamed as requested
Differential Revision: https://reviews.llvm.org/D43281
llvm-svn: 332848
If a variable has an initializer, codegen tries to build its value. If
the variable is large, building its value requires substantial
resources. This causes strange behavior from the user's viewpoint: compilation
of huge zero-initialized arrays like:
char data_1[2147483648u] = { 0 };
consumes an enormous amount of time and memory.
With this change codegen tries to determine whether the variable initializer is
equivalent to a zero initializer. In that case the variable value is not
constructed.
This change fixes PR18978.
Differential Revision: https://reviews.llvm.org/D46241
llvm-svn: 332847
The first version of the patch (r332228) was flawed because it was
putting structors into C5/D5 comdats very eagerly. This is correct only
if we can ensure the comdat contains all required versions of the
structor (which wasn't the case). This version uses a more nuanced
approach:
- for local structor symbols we use an alias because we don't have to
worry about comdats or other compilation units.
- linkonce symbols are emitted separately, as we cannot guarantee we
will have all symbols we need to form a comdat (they are emitted
lazily, only when referenced).
- available_externally symbols are also emitted separately, as the code
seemed to be worried about emitting an alias in this case.
- other linkage types are not affected by the optimization level. They
either get put into a comdat (weak) or get aliased (external).
Reviewers: rjmccall, aprantl
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46685
llvm-svn: 332839
When a lambda capture captures a __block in the same statement, the compiler asserts out because isCapturedBy assumes that an Expr can only be a BlockExpr, StmtExpr, or if it's a Stmt then all the statement's children are expressions. That's wrong, we need to visit all sub-statements even if they're not expressions to see if they also capture.
Fix this issue by pulling out the isCapturedBy logic to use RecursiveASTVisitor.
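The shape of the fix can be shown with a toy tree walk. This does not use Clang's RecursiveASTVisitor; `Node` and `anyCapture` are hypothetical, and the point is only that the search descends through every child, including statements that are not expressions.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Toy statement tree; real Clang AST nodes are far richer.
struct Node {
  bool capturesVar = false; // e.g. a block or lambda capturing the variable
  bool isExpr = true;       // some statements are not expressions
  std::vector<std::unique_ptr<Node>> children;
};

// Visit every sub-statement, whether or not it is an expression.
bool anyCapture(const Node &n) {
  if (n.capturesVar)
    return true;
  for (const auto &child : n.children)
    if (child && anyCapture(*child))
      return true;
  return false;
}
```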
<rdar://problem/39926584>
llvm-svn: 332801
To support linking device code in different source files, it is necessary to
embed the fat binary at the host linking stage.
This patch emits an external symbol for the fat binary in host codegen, then
embeds the fat binary by lld through a linker script.
Differential Revision: https://reviews.llvm.org/D46472
llvm-svn: 332724
MethodVFTableLocations in MicrosoftVTableContext contains canonicalized
decls. But it is sometimes asked to look up a non-canonicalized decl,
which causes an assertion failure and a compilation failure.
Fixes PR37481.
Patch by Taiju Tsuiki!
Differential Revision: https://reviews.llvm.org/D46929
llvm-svn: 332639
lifetime.start/end expects a pointer argument in the alloca address space.
However, in C++ a temporary variable is in the default address space.
This patch changes the CreateMemTemp and CreateTempAlloca APIs to
get the original alloca instruction and pass it to lifetime.start/end.
It only affects targets with non-zero alloca address space.
Differential Revision: https://reviews.llvm.org/D45900
llvm-svn: 332593
functions.
If the combined construct is specified in the declare target function
and the device code is emitted, the compiler crashes because of the
incorrectly chosen captured stmt. We should choose the innermost
captured statement, not the outermost.
llvm-svn: 332477
If the orphaned directive is executed in SPMD mode, we need to emit the
check for the SPMD mode and run the orphaned parallel directive in
sequential mode.
llvm-svn: 332467
In generic data-sharing mode we do not need to globalize
variables/parameters of reference/pointer types. They already are placed
in the global memory.
llvm-svn: 332380
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
Explicitly avoided changing the strings in the clang-format tests.
Differential Revision: https://reviews.llvm.org/D44975
llvm-svn: 332350
Some targets have a constant address space (e.g. amdgcn). For them, string literals should be
emitted in the constant address space and then cast to the default address space.
Differential Revision: https://reviews.llvm.org/D46643
llvm-svn: 332279
Summary:
Removing the full structor and replacing all usages with the base one
can degrade debug quality as it will leave the debugger unable to locate
the full object structor. This is apparent when evaluating an expression
in the debugger which requires constructing an object of class which has
had this optimization applied to it. When compiling the expression, we
pretend that the class and its methods have been defined in another
compilation unit, so the expression compiler assumes the structor
definition must be available. This didn't use to be the case for
structors with internal linkage. Less aggressive optimizations like
emitting the full structor as an alias remain in place, as they do not
cause the structor symbol to disappear completely.
This improves debug quality on non-darwin platforms (darwin does not
have -mconstructor-aliases on by default, so it is spared these
problems) and enables us to remove some workarounds from LLDB which attempt to
mitigate this issue.
Reviewers: rjmccall, aprantl
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46685
llvm-svn: 332228
These intrinsics work exactly like all other atomic_fetch_* intrinsics and allow creating *atomicrmw* instructions with ordering.
Updated the clang-extensions document.
Differential Revision: https://reviews.llvm.org/D46386
llvm-svn: 332193
Summary:
The Itanium ABI requires that the type info for pointer-to-incomplete types have internal linkage, so that it doesn't interfere with the type info once completed. Currently Clang also marks the type info name as internal. However, this causes a bug with the STL implementations, which use the type info name pointer to perform ordering and hashing of type infos.
For example:
```
// header.h
#include <typeinfo>
struct T;
extern std::type_info const& Info;
// tu_one.cpp
#include "header.h"
std::type_info const& Info = typeid(T*);
// tu_two.cpp
#include "header.h"
#include <cassert>
struct T {};
int main() {
  auto &TI1 = Info;
  auto &TI2 = typeid(T*);
  assert(TI1 == TI2); // Fails
  assert(TI1.hash_code() == TI2.hash_code()); // Fails
}
```
This patch fixes the STL bug by emitting the type info name as linkonce_odr when the type-info is for a pointer-to-incomplete type.
Note that libc++ could fix this without a compiler change, but the quality of fix would be poor. The library would either have to:
(A) Always perform strcmp/string hashes.
(B) Determine if we have a pointer-to-incomplete type, and only do strcmp then. This would require an ABI break for libc++.
Reviewers: rsmith, rjmccall, majnemer, vsapsai
Reviewed By: rjmccall
Subscribers: smeenai, cfe-commits
Differential Revision: https://reviews.llvm.org/D46665
llvm-svn: 332028
This commit relands r331904.
Adding a SrcMgr::CharacteristicKind parameter to the InclusionDirective
in PPCallbacks, and updating calls to that function. This will be useful
in https://reviews.llvm.org/D43778 to determine which includes are
system headers.
Differential Revision: https://reviews.llvm.org/D46614
llvm-svn: 332021
Added initial support for L2 parallelism in SPMD mode. Note, though,
that the orphaned parallel directives are not currently supported in
SPMD mode.
llvm-svn: 332016
This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs.
Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well.
llvm-svn: 331958
Summary:
The Itanium ABI requires that the type info for pointer-to-incomplete types have internal linkage, so that it doesn't interfere with the type info once completed. Currently Clang also marks the type info name as internal. However, this causes a bug with the STL implementations, which use the type info name pointer to perform ordering and hashing of type infos.
For example:
```
// header.h
#include <typeinfo>
struct T;
extern std::type_info const& Info;
// tu_one.cpp
#include "header.h"
std::type_info const& Info = typeid(T*);
// tu_two.cpp
#include "header.h"
#include <cassert>
struct T {};
int main() {
  auto &TI1 = Info;
  auto &TI2 = typeid(T*);
  assert(TI1 == TI2); // Fails
  assert(TI1.hash_code() == TI2.hash_code()); // Fails
}
```
This patch fixes the STL bug by emitting the type info name as linkonce_odr when the type-info is for a pointer-to-incomplete type.
Note that libc++ could fix this without a compiler change, but the quality of fix would be poor. The library would either have to:
(A) Always perform strcmp/string hashes.
(B) Determine if we have a pointer-to-incomplete type, and only do strcmp then. This would require an ABI break for libc++.
Reviewers: rsmith, rjmccall, majnemer, vsapsai
Reviewed By: rjmccall
Subscribers: smeenai, cfe-commits
Differential Revision: https://reviews.llvm.org/D46665
llvm-svn: 331957
Previously we emitted something like
rotl(x, n) {
n &= bitwidth-1;
return n != 0 ? ((x << n) | (x >> (bitwidth - n))) : x;
}
We use a select to avoid the undefined behavior on the (bitwidth - n) shift.
The middle and backend don't really recognize this as a rotate and end up emitting a cmov or control flow because of the select.
A better pattern is (x << (n & mask)) | (x >> (-n & mask)) where mask is bitwidth - 1.
Fixes the main complaint in PR37387. There's still some work to be done if the user writes that sequence directly on a short or char where type promotion rules can prevent it from being recognized. The builtin is emitting direct IR with unpromoted types so that isn't a problem for it.
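As a rough illustration of the masked pattern described above, here is a minimal sketch for a 32-bit value. The helper name `rotl32` is hypothetical and is not the builtin or the IR that the patch emits; it only shows why no select is needed.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): branch-free rotate-left.
// mask is bitwidth - 1 = 31, so both shift amounts stay in [0, 31] and
// neither shift is undefined, even when n is 0 or a multiple of 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
  return (x << (n & 31u)) | (x >> ((0u - n) & 31u));
}
```

When `n` is 0, both shift amounts are 0 and the result is `x | x`, which is just `x`, so the special case handled by the select in the old pattern falls out naturally.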
Differential Revision: https://reviews.llvm.org/D46656
llvm-svn: 331943
Summary:
This attribute tells clang to exclude the annotated function from stack
protection when the -fstack-protector option is passed.
The GCC spelling for this is:
__attribute__((__optimize__("no-stack-protector"))) and the
equivalent clang syntax would be: __attribute__((no_stack_protector))
This is used in Linux kernel to selectively disable stack protector
in certain functions.
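A minimal usage sketch, assuming a compiler that supports `__has_attribute`. The macro `NO_STACK_PROTECTOR` and the function `sum_bytes` are hypothetical names chosen for illustration; on compilers without the attribute the macro expands to nothing.

```cpp
#include <cstring>

// Hypothetical wrapper macro: use the attribute only where it is available.
#if defined(__has_attribute)
#  if __has_attribute(no_stack_protector)
#    define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
#  endif
#endif
#ifndef NO_STACK_PROTECTOR
#  define NO_STACK_PROTECTOR
#endif

// The local buffer would normally make this function a candidate for a
// stack protector; the attribute asks the compiler to leave it uninstrumented.
NO_STACK_PROTECTOR
int sum_bytes(const char *s) {
  char buf[16] = {0};
  std::strncpy(buf, s, sizeof(buf) - 1);  // last byte stays zero
  int total = 0;
  for (char c : buf) total += c;
  return total;
}
```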
Reviewers: aaron.ballman, rsmith, rnk, probinson
Reviewed By: aaron.ballman
Subscribers: probinson, srhines, cfe-commits
Differential Revision: https://reviews.llvm.org/D46300
llvm-svn: 331925
Adding a SrcMgr::CharacteristicKind parameter to the InclusionDirective
in PPCallbacks, and updating calls to that function. This will be useful
in https://reviews.llvm.org/D43778 to determine which includes are system
headers.
Differential Revision: https://reviews.llvm.org/D46614
llvm-svn: 331904
It is required to emit unique names for offloading regions ids. Required
to support compilation and linking of several compilation units.
llvm-svn: 331899
If the global variables are marked as declare target and they need
ctors/dtors, these ctors/dtors are emitted and then invoked by the
offloading runtime library. They are not explicitly used in the emitted
code and thus can be optimized out. Patch marks these functions as used,
so the optimizer cannot remove these functions during the optimization
phase.
llvm-svn: 331879
It broke the Chromium build (see reply on the review).
> Generate DILabel metadata and call llvm.dbg.label after label
> statement to associate the metadata with the label.
>
> Differential Revision: https://reviews.llvm.org/D45045
>
> Patch by Hsiangkai Wang.
This doesn't revert the change to backend-unsupported-error.ll
that seems to correspond to an llvm-side change.
llvm-svn: 331861
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.
Differential Revision: https://reviews.llvm.org/D45045
Patch by Hsiangkai Wang.
llvm-svn: 331843
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers in our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.
llvm-svn: 331768
This patch addresses some mostly trivial post-commit review comments received
on r331677.
Additionally, this patch fixes an assertion in `getNarrowingKind` caused by
the use of an uninitialized value from `checkThreeWayNarrowingConversion`.
llvm-svn: 331707
Summary:
This patch tackles low-hanging fruit for the builtin operator<=> expressions. It currently needs some cleanup before landing, but I want to get some initial feedback.
The main changes are:
* Lookup, build, and store the required standard library types and expressions in `ASTContext`. By storing them in ASTContext we don't need to store (and duplicate) the required expressions in the BinaryOperator AST nodes.
* Implement [expr.spaceship] checking, including diagnosing narrowing conversions.
* Implement `ExprConstant` for builtin spaceship operators.
* Implement builtin operator<=> support in `CodeGenAgg`. Initially I emitted the required comparisons using `ScalarExprEmitter::VisitBinaryOperator`, but this caused the operand expressions to be emitted once for every required cmp.
* Implement [builtin.over] with modifications to support the intent of P0946R0. See the note on `BuiltinOperatorOverloadBuilder::addThreeWayArithmeticOverloads` for more information about the workaround.
Reviewers: rsmith, aaron.ballman, majnemer, rnk, compnerd, rjmccall
Reviewed By: rjmccall
Subscribers: rjmccall, rsmith, aaron.ballman, junbuml, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D45476
llvm-svn: 331677
Added initial codegen for level 2, 3 etc. parallelism. Currently, all
the second, the third etc. parallel regions will run sequentially.
llvm-svn: 331642
Summary:
Passes down the necessary code gen options to the LTO Config to enable
-fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO
backend for a distributed build.
Also, remove warning about not having PGO when the input is IR.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D46464
llvm-svn: 331592
Summary:
http://wg21.link/P0664r2 section "Evolution/Core Issues 24" describes a
proposed change to Coroutines TS that would have any exceptions thrown
after the initial suspend point of a coroutine be caught by the handler
specified by the promise type's 'unhandled_exception' member function.
This commit provides a sample implementation of the specified behavior.
Test Plan: `check-clang`
Reviewers: GorNishanov, EricWF
Reviewed By: GorNishanov
Subscribers: cfe-commits, lewissbaker, eric_niebler
Differential Revision: https://reviews.llvm.org/D45860
llvm-svn: 331519
FunctionProtoType.
We previously re-evaluated the expression each time we wanted to know whether
the type is noexcept or not. We now evaluate the expression exactly once.
This is not quite "no functional change": it fixes a crasher bug during AST
deserialization where we would try to evaluate the noexcept specification in a
situation where we have not deserialized sufficient portions of the AST to
permit such evaluation.
llvm-svn: 331428
Some symbols are not allowed to be used as names on some targets. This patch
tries to unify the emission of the names of LLVM globals so they could be
used on different targets.
llvm-svn: 331358
devices.
If the function is an instantiation|specialization of the template and
is used in the device code, the definitions of such functions should be
emitted for the device.
llvm-svn: 331261
This is not yet part of any C++ working draft, and so is controlled by the flag
-fchar8_t rather than a -std= flag. (The GCC implementation is controlled by a
flag with the same name.)
This implementation is experimental, and will be removed or revised
substantially to match the proposal as it makes its way through the C++
committee.
llvm-svn: 331244
As suggested in the post-commit thread for rL331056, we should match these
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs,
and we can improve the descriptive text in the docs.
So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.
Differential Revision: https://reviews.llvm.org/D46236
llvm-svn: 331209
Emit error messages instead of compiler crashing when the target region
does not exist in the device code + fix crash when the location comes
from macros.
llvm-svn: 331195
When a '>>' token is split into two '>' tokens (in C++11 onwards), or (as an
extension) when we do the same for other tokens starting with a '>', we can't
just use a location pointing to the first '>' as the location of the split
token, because that would result in our miscomputing the length and spelling
for the token. As a consequence, for example, a refactoring replacing 'A<X>'
with something else would sometimes replace one character too many, and
similarly diagnostics highlighting a template-id source range would highlight
one character too many.
Fix this by creating an expansion range covering the first character of the
'>>' token, whose spelling is '>'. For this to work, we generalize the
expansion range of a macro FileID to be either a token range (the common case)
or a character range (used in this new case).
llvm-svn: 331155
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )
We need a way to opt-out of a float-to-int-to-float cast optimization because too much
existing code relies on the platform-specific undefined result of those casts when the
float-to-int overflows.
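For context, code that wants a well-defined result regardless of this optimization can range-check before casting. The following is a minimal sketch, assuming a 32-bit `int`; the helper name `float_to_int_saturating` and the choice of 0 for NaN are illustrative assumptions, not something this patch provides.

```cpp
#include <climits>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

// Hypothetical helper: never performs an out-of-range float-to-int cast,
// so it does not rely on any platform-specific overflow result.
constexpr int float_to_int_saturating(float f) {
  return (f != f) ? 0                      // NaN: 0 is an arbitrary choice
       : (f >= 2147483648.0f) ? INT_MAX    // 2^31 and above does not fit
       : (f < -2147483648.0f) ? INT_MIN    // below -2^31 does not fit
       : static_cast<int>(f);              // in range: truncates toward zero
}
```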
The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951
Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that
catches this problem:
rL330958
Differential Revision: https://reviews.llvm.org/D46135
llvm-svn: 331041
The ACLE spec which describes these intrinsics hasn't been published yet, but
this is based on the final draft which will be published soon, and these have
already been implemented by GCC.
Differential revision: https://reviews.llvm.org/D46109
llvm-svn: 331039
SPIR-V encodes the read_only and write_only access qualifiers of pipes,
so separate LLVM IR types are required to target SPIR-V. Other backends
may also find this useful.
These new types are `opencl.pipe_ro_t` and `opencl.pipe_wo_t`, which
replace `opencl.pipe_t`.
This replaces __get_pipe_num_packets(...) and __get_pipe_max_packets(...)
which took a read_only pipe with separate versions for read_only and
write_only pipes, namely:
* __get_pipe_num_packets_ro(...)
* __get_pipe_num_packets_wo(...)
* __get_pipe_max_packets_ro(...)
* __get_pipe_max_packets_wo(...)
These separate versions exist to avoid needing a bitcast to one of the
two qualified pipe types.
Patch by Stuart Brady.
Differential Revision: https://reviews.llvm.org/D46015
llvm-svn: 331026
function if a function delegates to another function.
Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.
This reapplies r331016, which was reverted in r331019 because it caused
an assertion to fail in EmitDelegateCallArg on a windows bot. I made
changes to EmitDelegateCallArg so that it doesn't try to deactivate
cleanups for structs that have trivial destructors (cleanups for those
structs are never pushed to the cleanup stack in EmitParmDecl).
rdar://problem/39194693
Differential Revision: https://reviews.llvm.org/D45382
llvm-svn: 331020
function if a function delegates to another function.
Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.
rdar://problem/39194693
Differential Revision: https://reviews.llvm.org/D45382
llvm-svn: 331016
This patch is a tweak of changyu's patch: https://reviews.llvm.org/D40381. It differs in that the recognition of the 'concept' token is moved into the machinery that recognizes declaration-specifiers - this allows us to leverage the attribute handling machinery more seamlessly.
See the test file to get a sense of the basic parsing that this patch supports.
There is much more work to be done before concepts are usable...
Thanks Changyu!
llvm-svn: 330794
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor neutral host API which can be implemented on different platforms. Currently there is open source
implementation of HIP runtime on amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).
This patch adds support of input kind and language standard hip.
When a hip file is compiled, both LangOpts.CUDA and LangOpts.HIP are turned on. This allows compilation of a hip program as CUDA
in most cases; LangOpts.HIP is checked only where special handling of hip programs is needed.
This patch also adds support of kernel launching of HIP program using HIP host API.
When -x hip is not specified, there is no behaviour change for CUDA.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44984
llvm-svn: 330790
/usr/local/bin/ld.lld: error: undefined symbol: llvm::createAggressiveInstCombinerPass()
>>> referenced by cc1_main.cpp
>>> tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o:(_GLOBAL__sub_I_cc1_main.cpp)
And so on
The bot coverage is clearly missing.
llvm-svn: 330694
NVPTX target.
When generating the wrapper function for the offloading region, we need
to call the outlined function and cast the arguments correctly to follow
the ABI. Usually, variables captured by value are cast to `uintptr_t`
type. But this should not be performed for the variables with pointer type.
llvm-svn: 330620
If an atomic variable is misaligned (and that suspicion is why Clang emits
libcalls at all) the runtime support library will have to use a lock to safely
access it, with potentially very bad performance consequences. There's a very
good chance this is unintentional so it makes sense to issue a warning.
Also give it a named group so people can promote it to an error, or disable it
if they really don't care.
llvm-svn: 330566
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447
Summary:
By default Clang outputs its version (including git commit hash, in
case of trunk builds) into object and assembly files. It might be
useful to have an option to disable this, especially for debugging
purposes.
This patch implements new command line flags -Qn and -Qy (the names
are chosen for compatibility with GCC). -Qn disables output of
the 'llvm.ident' metadata string and the 'producer' debug info. -Qy
(enabled by default) does the opposite.
Reviewers: faisalv, echristo, aprantl
Reviewed By: aprantl
Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01
Differential Revision: https://reviews.llvm.org/D45255
llvm-svn: 330442
nvcc generates a unique registration function for each object file
that contains relocatable device code. Unique names are achieved
with a module id that is also reflected in the function's name.
Differential Revision: https://reviews.llvm.org/D42922
llvm-svn: 330425
This implements support for the previously ignored flag
`-falign-functions`. This allows the frontend to request alignment on
function definitions in the translation unit where they are not
explicitly requested in code. This is compatible with the GCC behaviour
and the ICC behaviour.
The scalar value passed to `-falign-functions` aligns functions to a
power-of-two boundary. If the flag is used without a value, the functions are aligned to
16-byte boundaries. If the scalar is specified, it must be an integer
less than or equal to 4096. If the value is not a power-of-two, the
driver will round it up to the nearest power of two.
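The rounding rule above can be sketched as follows. This is only an illustration of the arithmetic, not the driver's actual code; the function name `round_up_to_power_of_two` is hypothetical, and inputs are assumed to be in the accepted range of 1 to 4096.

```cpp
// Hypothetical helper: smallest power of two that is >= value.
// Assumes 1 <= value <= 4096, the range the flag accepts; a value of 0
// would yield 1 here.
constexpr unsigned round_up_to_power_of_two(unsigned value) {
  unsigned result = 1;
  while (result < value) result <<= 1;  // double until it covers value
  return result;
}
```

Under this sketch a request of 24 becomes 32, while values that are already powers of two, such as 16 or 4096, are left unchanged.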
llvm-svn: 330378
The force_align_arg_pointer attribute was using a hardcoded 16-byte
alignment value which in combination with -mstack-alignment=32 (or
larger) would produce a misaligned stack which could result in crashes
when accessing stack buffers using aligned AVX load/store instructions.
Fix the issue by using the "stackrealign" function attribute instead
of using a hardcoded 16-byte alignment.
Patch By: Gramner
Differential Revision: https://reviews.llvm.org/D45812
llvm-svn: 330331
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44786
llvm-svn: 330323
have a non-trivial destructor.
This fixes a bug introduced in r328731 where CodeGen emits calls to
synthesized destructors for non-trivial C structs in C++ mode when the
struct passed to EmitCallArg doesn't have a non-trivial destructor.
Under Microsoft's ABI, ASTContext::isParamDestroyedInCallee currently
always returns true, so it's necessary to check whether the struct has a
non-trivial destructor before pushing a cleanup in EmitCallArg.
This fixes PR37146.
llvm-svn: 330304
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
llvm-svn: 330220
to a header file.
This is in preparation for using the visitor classes to warn about
memcpy'ing non-trivial C structs.
See the discussion here:
https://reviews.llvm.org/D45310
rdar://problem/36124208
llvm-svn: 330201
register destructor functions annotated with __attribute__((destructor))
using __cxa_atexit or atexit.
Register destructor functions annotated with __attribute__((destructor))
calling __cxa_atexit in a synthesized constructor function instead of
emitting references to the functions in a special section.
The primary reason for adding this option is that we are planning to
deprecate the __mod_term_funcs section on Darwin in the future. This
feature is enabled by default only on Darwin. Users who do not want this
can use the command line option '-fno-register-global-dtors-with-atexit' to
disable it.
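Conceptually, the new scheme resembles the hand-written code below. This is a sketch only: it uses `std::atexit` in place of `__cxa_atexit`, and the names `cleanup`, `register_cleanup`, and `g_registered` are hypothetical stand-ins for what the compiler would synthesize.

```cpp
#include <cstdlib>

// Hypothetical flag, present only so the effect can be observed.
static bool g_registered = false;

// Stands in for a user function declared with __attribute__((destructor)).
static void cleanup() { /* release resources here */ }

// Stands in for the synthesized constructor: it runs before main and
// registers the destructor function to run at program exit.
__attribute__((constructor)) static void register_cleanup() {
  g_registered = (std::atexit(cleanup) == 0);
}
```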
rdar://problem/33887655
Differential Revision: https://reviews.llvm.org/D45578
llvm-svn: 330199
Summary:
The clang driver option -save-temps was not passed to the LTO config,
so when invoking the ThinLTO backends via clang during distributed
builds there was no way to get LTO to save temp files.
Getting this to work with ThinLTO distributed builds also required
changing the driver to avoid a separate compile step to emit unoptimized
bitcode when the input was already bitcode under -save-temps. Not only is
this unnecessary in general, it is problematic for ThinLTO backends since
the temporary bitcode file to the backend would not match the module path
in the combined index, leading to incorrect ThinLTO backend index-based
optimizations.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D45217
llvm-svn: 330194
Global variables marked as declare target are allowed to be used in map
clauses. Patch fixes the crash of the compiler on the declare target
variables in map clauses.
llvm-svn: 330156
volatile array field is copied.
The crash occurs because method 'visitArray' passes a null FieldDecl to
method 'visit' and some of the methods called downstream expect a
non-null FieldDecl to be passed.
This reapplies r330151 with a fix to the test case.
rdar://problem/33599681
llvm-svn: 330155
framework module SomeKitCore {
...
export_as SomeKit
}
Given the module above, while generating autolink information during
codegen, clang should emit '-framework SomeKitCore' only if SomeKit
was not imported in the relevant TU, otherwise it should use '-framework
SomeKit' instead.
rdar://problem/38269782
llvm-svn: 330152
volatile array field is copied.
The crash occurs because method 'visitArray' passes a null FieldDecl to
method 'visit' and some of the methods called downstream expect a
non-null FieldDecl to be passed.
rdar://problem/33599681
llvm-svn: 330151
When emitting CodeView debug information, compiler-generated thunk routines
should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so
Visual Studio can properly step into the user code. This initial support only
handles standard thunk ordinals.
Differential Revision: https://reviews.llvm.org/D43838
llvm-svn: 330132
Summary:
Clean carriage returns from lib/ and include/. NFC.
(I have to make this change locally in order for `git diff` to show sane output after I edit a file, so I might as well ask for it to be committed. I don't have commit privs myself.)
(Without this patch, `git rebase`ing any change involving SemaDeclCXX.cpp is a real nightmare. :( So while I have no right to ask for this to be committed, geez would it make my workflow easier if it were.)
Here's the command I used to reformat things. (Requires bash and OSX/FreeBSD sed.)
git grep -l $'\r' lib include | xargs sed -i -e $'s/\r//'
find lib include -name '*-e' -delete
Reviewers: malcolm.parsons
Reviewed By: malcolm.parsons
Subscribers: emaste, krytarowski, cfe-commits
Differential Revision: https://reviews.llvm.org/D45591
Patch by Arthur O'Dwyer.
llvm-svn: 330112
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:
- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fno-xray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation
These can be combined either as comma-separated values, or as
repeated flag values.
Reviewers: echristo, kpw, eizan, pelikan
Reviewed By: pelikan
Subscribers: mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44970
llvm-svn: 329985
It means the same thing as -mllvm; there isn't any reason to have two
options which do the same thing.
Differential Revision: https://reviews.llvm.org/D45109
llvm-svn: 329965
Summary:
Protocols that were being referenced but could not be fully realized were being emitted without `properties`/`optional_properties`. Since all v3 protocols must be 9 processor words wide, the lack of these fields is catastrophic for the runtime.
As an example, the runtime cannot know [here](https://github.com/gnustep/libobjc2/blob/master/protocol.c#L73) that `properties` and `optional_properties` are invalid.
Reviewers: rjmccall, theraven
Reviewed By: rjmccall, theraven
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D45305
llvm-svn: 329882
When we enter a __finally block, the CGF's CurCodeDecl will be null
(because CodeGenFunction::StartFunction is given an empty GlobalDecl for
a __finally block), and so the dyn_cast here will result in an assertion
failure. Change it to dyn_cast_or_null to handle this case.
Differential Revision: https://reviews.llvm.org/D45523
llvm-svn: 329836
The current support of the feature produces only 2 lines in report:
-Some general Code Generation Time;
-Total time of Backend Consumer actions.
This patch extends the Clang time report with new lines related to Preprocessor, Include File Search, Parsing, etc.
Differential Revision: https://reviews.llvm.org/D43578
llvm-svn: 329684
Summary:
Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address") disable both ASan and KASan instrumentation. Also remove redundant test.
Patch by Andrey Konovalov
Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D44981
llvm-svn: 329612
I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the input to 32 bits for pmuludq or shl/ashr for pmuldq and use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts.
Differential Revision: https://reviews.llvm.org/D45421
llvm-svn: 329605
Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv
as we operate on unsigned values only (addresses, converted to integers)
llvm-svn: 329411
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
the tail padding is not reused.
We track on the AggValueSlot (and through a couple of other
initialization actions) whether we're dealing with an object that might
share its tail padding with some other object, so that we can avoid
emitting stores into the tail padding if that's the case. We still
widen stores into tail padding when we can do so.
Differential Revision: https://reviews.llvm.org/D45306
llvm-svn: 329342
identifier.
This patch fixes a few places in CGObjCMac.cpp where the class
identifier was used instead of the name specified by objc_runtime_name.
rdar://problem/37910822
Differential Revision: https://reviews.llvm.org/D45101
llvm-svn: 329128
Summary:
Add support for the -fsanitize=shadow-call-stack flag which causes clang
to add ShadowCallStack attribute to functions compiled with that flag
enabled.
Reviewers: pcc, kcc
Reviewed By: pcc, kcc
Subscribers: cryptoad, cfe-commits, kcc
Differential Revision: https://reviews.llvm.org/D44801
llvm-svn: 329122
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.
llvm-svn: 329099
Summary:
The following class hierarchy requires that we be able to emit a
this-adjusting thunk for B::foo in C's vftable:
struct Incomplete;
struct A {
virtual A* foo(Incomplete p) = 0;
};
struct B : virtual A {
void foo(Incomplete p) override;
};
struct C : B { int c; };
This TU is valid, but lacks a definition of 'Incomplete', which makes it
hard to build a thunk for the final overrider, B::foo.
Before this change, Clang gives up attempting to emit the thunk, because
it assumes that if the parameter types are incomplete, it must be
emitting the thunk for optimization purposes. This is untrue for the MS
ABI, where the implementation of B::foo has no idea what thunks C's
vftable may require. Clang needs to emit the thunk without necessarily
having access to the complete prototype of foo.
This change makes Clang emit a musttail variadic call when it needs such
a thunk. I call these "unprototyped" thunks, because they only prototype
the "this" parameter, which must always come first in the MS C++ ABI.
These thunks work, but they create ugly LLVM IR. If the call to the
thunk is devirtualized, it will be a call to a bitcast of a function
pointer. Today, LLVM cannot inline through such a call, but I want to
address that soon, because we also use this pattern for virtual member
pointer thunks.
This change also implements an old FIXME in the code about reusing the
thunk's computed CGFunctionInfo as much as possible. Now we don't end up
computing the thunk's mangled name and arranging its prototype up to
around three times.
Fixes PR25641
Reviewers: rjmccall, rsmith, hans
Subscribers: Prazek, cfe-commits
Differential Revision: https://reviews.llvm.org/D45112
llvm-svn: 329009
This re-lands r328845 with fixes for crbug.com/827810.
The initial motivation was to hoist MethodVFTableLocation to global
scope so it could be forward declared.
In this patch, I noticed that MicrosoftVTableContext uses some risky
patterns. It has methods that return references to data stored in
DenseMaps. I've made some of them return by value for trivial structs
and I've moved some things into separate allocations.
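The risk with returning references can be sketched outside of clang. This
is a minimal illustration, not the MicrosoftVTableContext code: `Location`
and `LocationTable` are hypothetical names, and a std::vector stands in
for a map whose storage may move when it grows.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical trivial struct, standing in for a small location record.
struct Location {
  unsigned table_index;
  unsigned slot;
};

// Hypothetical flat table whose storage may be reallocated on insertion,
// similar in spirit to a hash map that moves its buckets when it grows.
class LocationTable {
  std::vector<std::pair<int, Location>> entries_;

public:
  void insert(int key, Location loc) { entries_.emplace_back(key, loc); }

  // Returns a copy, so the result stays valid across later insertions.
  // Returning `const Location &` could dangle once the storage moves.
  Location lookup(int key) const {
    for (const auto &entry : entries_)
      if (entry.first == key)
        return entry.second;
    assert(false && "key not found");
    return Location{0u, 0u};
  }
};
```

For a trivial struct the copy is cheap, which is why return-by-value is a
reasonable trade here.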
llvm-svn: 329007
CUDA shared variable should be initialized with undef.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44985
llvm-svn: 328994
A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined
coroutine noop_coroutine that does nothing. To implement this feature, we implemented
an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that
does nothing when resumed or destroyed.
This patch adds a builtin __builtin_coro_noop() that maps to llvm.coro.noop intrinsic.
Related llvm change: https://reviews.llvm.org/D45114
llvm-svn: 328993
Summary:
The docs for the LLVM coroutines intrinsic `@llvm.coro.id` state that
"The second argument, if not null, designates a particular alloca instruction
to be a coroutine promise."
However, if the address sanitizer pass is run before the `@llvm.coro.id`
intrinsic is lowered, the `alloca` instruction passed to the intrinsic as its
second argument is converted, as per the
https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm docs, to
an `inttoptr` instruction that accesses the address of the promise.
On optimization levels `-O1` and above, the `-asan` pass is run after
`-coro-early`, `-coro-split`, and `-coro-elide`, and before
`-coro-cleanup`, and so there is no issue. At `-O0`, however, `-asan`
is run in between `-coro-early` and `-coro-split`, which causes an
assertion to be hit when the `inttoptr` instruction is forcibly cast to
an `alloca`.
Rearrange the passes such that the coroutine passes are registered
before the sanitizer passes.
Test Plan:
Compile a simple C++ program that uses coroutines in `-O0` with
`-fsanitize=address`, and confirm no assertion is hit:
`clang++ coro-example.cpp -fcoroutines-ts -g -fsanitize=address -fno-omit-frame-pointer`.
Reviewers: GorNishanov, lewissbaker, EricWF
Reviewed By: GorNishanov
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43927
llvm-svn: 328951
The problem with the previous logic was that there might not be any
explicit copy/move constructor declarations, e.g. if the type is
trivial and we've never type-checked a copy of it. Relying on Sema's
computation seems much more reliable.
Also, I believe Richard's recommendation is exactly the rule we use
now on the Itanium ABI, modulo the trivial_abi attribute (which this
change of course fixes our handling of in Swift).
This does mean that we have a less portable rule for deciding
indirectness for swiftcall. I would prefer it if we just applied the
Itanium rule universally under swiftcall, but in the meantime, I need
to fix this bug.
This only arises when defining functions with class-type arguments
in C++, as we do in the Swift runtime. It doesn't affect normal Swift
operation because we don't import code as C++.
llvm-svn: 328942
variables.
Added emission of the offloading data sections for the variables within
declare target regions + fixes emission of the declare target variables
marked as declare target not within the declare target region.
llvm-svn: 328888
This allows forward declaring it so that we can add it to
MicrosoftMangleContext::mangleVirtualMemPtrThunk without including
VTableBuilder.h. That saves a hashtable lookup when emitting virtual
member pointer functions.
It also shortens a really long type name. This struct has "VFtable" in
the name, so it seems pretty unlikely that someone will assume it is
generally useful for non-MS C++ ABI stuff.
llvm-svn: 328845
This commit generalizes NRVO to cover C structs (both trivial and
non-trivial structs).
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D44968
llvm-svn: 328809
This patch sets target specific calling convention for CUDA kernels in IR.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44747
llvm-svn: 328795
The conversion of operations to bitcode helps to eliminate an additional
store in certain cases. We used to lower these load intrinsics in DAG to
DAG conversion by which time, the "Dead Store Elimination" pass is
already run. There is an associated LLVM patch.
Patch by Sumanth Gundapaneni.
llvm-svn: 328776
ObjC and ObjC++ pass non-trivial structs in a way that is incompatible
with each other. For example:
typedef struct {
id f0;
__weak id f1;
} S;
// this code is compiled in c++.
extern "C" {
void foo(S s);
}
void caller() {
// the caller passes the parameter indirectly and destructs it.
foo(S());
}
// this function is compiled in c.
// 'a' is passed directly and is destructed in the callee.
void foo(S a) {
}
This patch fixes the incompatibility by passing and returning structs
with __strong or __weak fields using the C ABI in C++ mode. __strong and
__weak fields in a struct do not cause the struct to be destructed in
the caller and __strong fields do not cause the struct to be passed
indirectly.
Also, this patch fixes the microsoft ABI bug mentioned here:
https://reviews.llvm.org/D41039?id=128767#inline-364710
rdar://problem/38887866
Differential Revision: https://reviews.llvm.org/D44908
llvm-svn: 328731
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" versions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.
There is a related llvm patch.
Patch by Brendon Cahoon.
llvm-svn: 328725
When the declare target variables are emitted for the device,
constructors|destructors for these variables must be emitted and registered
by the runtime in the offloading sections.
llvm-svn: 328705
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting. This will help in uncovering non-determinism caused by the undefined
sorting order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.
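The idea can be sketched in standalone form. This is only an illustration
of shuffling before sorting, not the wrapper added in r327219:
`shuffling_sort`, `sort_by_key` and the explicit `seed` parameter are
hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shuffling sort wrapper; not LLVM's code.
template <typename Iter, typename Compare>
void shuffling_sort(Iter first, Iter last, Compare comp, unsigned seed) {
  std::mt19937 rng(seed);
  // Randomize the input order so that elements comparing equal may end up
  // in a different relative order for different seeds.
  std::shuffle(first, last, rng);
  std::sort(first, last, comp);
}

// Sorts (key, payload) pairs by key only. The order of payloads that share
// a key is deliberately unspecified, so callers must not depend on it.
inline std::vector<std::pair<int, char>>
sort_by_key(std::vector<std::pair<int, char>> items, unsigned seed) {
  auto by_key = [](const std::pair<int, char> &a,
                   const std::pair<int, char> &b) {
    return a.first < b.first;
  };
  shuffling_sort(items.begin(), items.end(), by_key, seed);
  assert(std::is_sorted(items.begin(), items.end(), by_key));
  return items;
}
```

Code that silently relied on the input order of equal keys would start
producing varying output under such a wrapper, which is the point.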
llvm-svn: 328636
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.
llvm-svn: 328544
Summary:
Disables certain CMP optimizations to improve fuzzing signal under -O1
and -O2.
Switches all fuzzer tests to -O2 except for a few leak tests where the
leak is optimized out under -O2.
Reviewers: kcc, vitalybuka
Reviewed By: vitalybuka
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D44798
llvm-svn: 328384
Need to override convertConstraint to recognise amdgpu specific register names.
Differential Revision: https://reviews.llvm.org/D44533
llvm-svn: 328359
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
Differential Revision: https://reviews.llvm.org/D44696
llvm-svn: 328350
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use a function attribute to communicate to the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D43735
llvm-svn: 328347
The issue was that we were setting hidden visibility if, when
processing a hidden class, we found out that we needed to emit a
reference to a vtable provided by the standard library.
Original message:
Set dso_local on vtables.
llvm-svn: 328288
Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here, which adds the front-end fp16 vector intrinsics for ARM.
Differential revision https://reviews.llvm.org/D43650
llvm-svn: 328277
The difference between CreateRuntimeFunction and CreateBuiltinFunction
is that CreateBuiltinFunction would not set dllimport or dso_local.
To keep the current semantics, just forward to CreateRuntimeFunction
with Local=true so it doesn't add dllimport.
llvm-svn: 328224
Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized.
Reviewers: ABataev, grokos, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D44749
llvm-svn: 328219
Now that LLVM has support for emitting calling conventions in DWARF (see
r328191) have clang emit them.
Patch by: Adrien Guinet
Differential revision: https://reviews.llvm.org/D42351
llvm-svn: 328196
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.
Differential Revision: https://reviews.llvm.org/D44719
llvm-svn: 328158
Summary:
Libc++'s default allocator uses `__builtin_operator_new` and `__builtin_operator_delete` in order to allow the calls to new/delete to be elided. However, libc++ now needs to support over-aligned types in the default allocator. In order to support this without disabling the existing optimization Clang needs to support calling the aligned new overloads from the builtins.
See llvm.org/PR22634 for more information about the libc++ bug.
This patch changes `__builtin_operator_new`/`__builtin_operator_delete` to call any usual `operator new`/`operator delete` function. It does this by performing overload resolution with the arguments passed to the builtin to determine which allocation function to call. If the selected function is not a usual allocation function a diagnostic is issued.
One open issue is if the `align_val_t` overloads should be considered "usual" when `LangOpts::AlignedAllocation` is disabled.
In order to allow libc++ to detect this new behavior the value for `__has_builtin(__builtin_operator_new)` has been updated to `201802`.
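A library could use the feature-test value roughly as follows. This is a
sketch under assumptions, not libc++'s actual allocator code:
`raw_allocate`, `raw_deallocate` and `store_and_load` are hypothetical
helpers, and compilers that lack the builtin (or `__has_builtin` itself)
take the plain `operator new` path.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

#ifndef __has_builtin
#define __has_builtin(x) 0 // no __has_builtin: always use the fallback
#endif

// Hypothetical helper: allocate through the builtin when the compiler
// reports the behavior described above, otherwise call operator new.
inline void *raw_allocate(std::size_t size) {
  assert(size > 0);
#if __has_builtin(__builtin_operator_new) >= 201802L
  return __builtin_operator_new(size);
#else
  return ::operator new(size);
#endif
}

// Hypothetical helper: the matching deallocation for raw_allocate.
inline void raw_deallocate(void *p) {
#if __has_builtin(__builtin_operator_new) >= 201802L
  __builtin_operator_delete(p);
#else
  ::operator delete(p);
#endif
}

// Round-trips one int through the helpers above.
inline int store_and_load(int value) {
  int *slot = static_cast<int *>(raw_allocate(sizeof(int)));
  *slot = value;
  int result = *slot;
  raw_deallocate(slot);
  return result;
}
```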
Reviewers: rsmith, majnemer, aaron.ballman, erik.pilkington, bogner, ahatanak
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43047
llvm-svn: 328134
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Fewer intrinsics to deal with ->
less boilerplate.
Added a bit of tablegen magic to match/replace an intrinsic with a pointer
argument in particular address space with the space-specific instruction
variant.
Updated tests to use non-default address spaces.
Differential Revision: https://reviews.llvm.org/D43268
llvm-svn: 328006
constructs in generic mode.
Fixed codegen for distribute parallel combined constructs. We have to
pass and read the shared lower and upper bound from the distribute
region in the inner parallel region. Patch is for generic mode.
llvm-svn: 327990
If the generic codegen is enabled and private copy of the original
variable escapes the declaration context, this private copy should be
globalized just like it was the original variable.
llvm-svn: 327985
source expressions when iterating over a PseudoObjectExpr's semantic
subexpression list.
Previously the loop in emitPseudoObjectExpr would emit the IR for each
OpaqueValueExpr that was in a PseudoObjectExpr's semantic-form
expression list and use the result when the OpaqueValueExpr later
appeared in other expressions. This caused an assertion failure when
AggExprEmitter tried to copy the result of an OpaqueValueExpr and the
copied type didn't have trivial copy/move constructors or assignment
operators.
This patch adds flag IsUnique to OpaqueValueExpr which indicates it is a
unique reference to its source expression (it is not used in multiple
places). The loop in emitPseudoObjectExpr ignores OpaqueValueExprs that
are unique and CodeGen visitors simply traverse the source expressions
of such OpaqueValueExprs.
rdar://problem/34363596
Differential Revision: https://reviews.llvm.org/D39562
llvm-svn: 327939
The inline assembly generated for the ARC autorelease elision marker
must have a funclet token if it's emitted inside a funclet, otherwise
the inline assembly (and all subsequent code in the funclet) will be
marked unreachable. r324689 fixed this issue for regular inline assembly
blocks.
Note that clang only emits the marker at -O0, so this only fixes that
case. The optimizations case (where the marker is emitted by the
backend) will be fixed in a separate change.
Differential Revision: https://reviews.llvm.org/D44640
llvm-svn: 327892
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
This recommits r327206, which was reverted because it caused
module-enabled builders to fail. I discovered that the
CXXRecordDecl::CanPassInRegisters flag wasn't being set correctly in
some cases after I moved it to RecordDecl.
Thanks to Eric Liu for helping me investigate the bug.
rdar://problem/33599681
https://reviews.llvm.org/D44095
llvm-svn: 327870
For generating NEON intrinsics, this determines the NEON data type, and whether
it should be a half type or an i16 type. I.e., we always pass a half type for
AArch64, this hasn't changed, but now also for ARM but only when FullFP16 is
enabled, and i16 otherwise.
This is intended to be non-functional change, but together with the backend
work in D44538 which adds support for f16 vectors, this enables adding the
AArch32 FP16 (vector) intrinsics.
Differential Revision: https://reviews.llvm.org/D44561
llvm-svn: 327836
Summary:
The codegen for conditions assumes that a normal variable declaration is used in a condition, but this is not the case when a structured binding is used.
This fixes [PR36747](http://llvm.org/pr36747).
Thanks Nicolas Lesser for contributing the patch.
Reviewers: lichray, rsmith
Reviewed By: lichray
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D44534
llvm-svn: 327780
The patch adds nocf_check target independent attribute for disabling checks that were enabled by cf-protection flag.
The attribute can be applied to functions and function pointers.
Attribute name follows GCC's similar attribute name.
Differential Revision: https://reviews.llvm.org/D41880
llvm-svn: 327768
Summary:
Previously we tried too hard to uphold the fiction that destructor
variants work like they do on Itanium throughout the ABI-neutral parts
of clang. This led to MS C++ ABI incompatibilities and other bugs. Now,
-mconstructor-aliases will no longer control this ABI detail, and clang
-cc1's LLVM IR output will be this much closer to the clang driver's.
Based on a patch by Zahira Ammarguellat:
https://reviews.llvm.org/D39063
I've tried to move the logic that Zahira added into MicrosoftCXXABI.cpp.
There is only one ABI-specific detail sticking out, and that is in
CodeGenModule::getAddrOfCXXStructor, where we collapse complete dtors to
base dtors in the MS ABI.
This fixes PR32990.
Reviewers: erichkeane, zahiraam, majnemer, rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D44505
llvm-svn: 327732
The compiler complained about
../tools/clang/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:184:15: error: unused variable 'CSI' [-Werror,-Wunused-variable]
if (auto *CSI = CGF.CapturedStmtInfo) {
^
1 error generated.
I don't know this code but it seems like an easy fix so I push it anyway
to get rid of the warning.
llvm-svn: 327694
If the variable is captured by value and the corresponding parameter in
the outlined function escapes its declaration context, this parameter
must be globalized. To globalize it we need to get the address of the
original parameter, load the value, store it to the global address and
use this global address instead of the original.
Patch improves globalization for parallel|teams regions + functions in
declare target regions.
llvm-svn: 327654
Added initial codegen for device side of declarations inside `omp
declare target` construct + codegen for implicit `declare target`
functions, which are used in the target regions.
llvm-svn: 327636
In this particular case it would be possible to just add an else with
CGM.setDSOLocal(GV), but it seems better to have as many callers as
possible just call setGVProperties so that we can centralize the logic
there.
This patch then makes setGVProperties able to handle null Decls.
llvm-svn: 327543
Recent change r326946 (https://reviews.llvm.org/D34367) causes regression in Eigen due to increased
memory footprint of CallArg.
This patch reduces LValue size from 112 to 96 bytes and reduces inline argument count of CallArgList
from 16 to 8.
It has been verified that this will let the added deep AST tree test pass with r326946.
In the long run, CallArg or LValue memory footprint should be further optimized.
Differential Revision: https://reviews.llvm.org/D44445
llvm-svn: 327515
In C, we'll wait until the end of the scope to clean up aggregate
temporaries used for returns from calls. This means in cases like:
{
// Assuming that `Bar` is large enough to warrant indirect returns
struct Bar b = {};
b = foo(&b);
b = foo(&b);
b = foo(&b);
b = foo(&b);
}
...We'll allocate space for 5 Bars on the stack (`b`, and 4
temporaries). This becomes painful in things like large switch
statements.
If cleaning up sooner is trivial, we should do it.
llvm-svn: 327229
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D44095
llvm-svn: 327206
If CodeGenFunction::EmitCall is:
- asked to emit a call with an indirectly returned value,
- given an invalid return value slot, and
- told the return value of the function it's calling is unused
then it'll make its own temporary, and add lifetime markers so that the
temporary's lifetime ends immediately after the call.
The early lifetime.end becomes problematic when we need to run a
destructor on the result of the function.
Instead of unconditionally saying that results of all calls are used
here (which would be correct, but would also cause us to never emit
lifetime markers for these temporaries), we just build our own temporary
to pass in when a dtor has to be run.
llvm-svn: 327192
If initialization of the task reductions requires pointer to original
variable, which is stored in the threadprivate storage, we used the
address of this pointer instead.
llvm-svn: 327136
Simplify the dispatching for the personality routines. This really had
no test coverage previously, so add test coverage for the various cases.
This turns out to be pretty complicated as the various languages and
models interact to change personalities around.
You really should feel bad for the compiler if you are using exceptions.
There is no reason for this type of cruelty.
llvm-svn: 327105
using.
We may emit the code in wrong order because of incorrect implementation
of the runtime functions for task reductions. Threadprivate storages may
be initialized after real initialization of the reduction items. Patch
fixes this problem.
llvm-svn: 327008
Before this, we'd only emit lifetime.ends for these temps in
non-exceptional paths. This potentially made our stack larger than it
needed to be for any code that follows an EH cleanup. e.g. in
```
struct Foo { char cs[32]; };
void escape(void *);
struct Bar { ~Bar() { char cs[64]; escape(cs); } };
Foo getFoo();
void baz() {
Bar b;
getFoo();
}
```
baz() would require 96 bytes of stack, since the temporary from getFoo()
only had a lifetime.end on the non-exceptional path.
This also makes us keep hold of the Value* returned by
EmitLifetimeStart, so we don't have to remake it later.
llvm-svn: 326988
No effective behavior change, just for cleanliness.
Analysis and typing by me, actual patch mostly by Reid.
Fixes PR36159.
https://reviews.llvm.org/D44223
llvm-svn: 326960
Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner.
Reviewers: ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43625
llvm-svn: 326948
The indirect function argument is in alloca address space in LLVM IR. However,
during Clang codegen for C++, the address space of indirect function argument
should match its address space in the source code, i.e., default addr space, even
for indirect argument. This is because destructor of the indirect argument may
be called in the caller function, and address of the indirect argument may be
taken, in either case the indirect function argument is expected to be in default
addr space, not the alloca address space.
Therefore, the indirect function argument should be mapped to the temp var
cast to default address space. The caller will cast it to alloca addr space
when passing it to the callee. In the callee, the argument is also cast to the
default address space and used.
CallArg is refactored to facilitate this fix.
Differential Revision: https://reviews.llvm.org/D34367
llvm-svn: 326946
OpenCL runtime tracks the invoke function emitted for
any block expression. Due to restrictions on blocks in
OpenCL (v2.0 s6.12.5), it is always possible to know the
block invoke function when emitting call of block expression
or __enqueue_kernel builtin functions. Since __enqueue_kernel
already has an argument for the invoke function, it is redundant
to have invoke function member in the llvm block literal structure.
This patch removes invoke function from the llvm block literal
structure. It also removes the bitcast of block invoke function
to the generic block literal type which is useless for OpenCL.
This will save some space for the kernel argument, and also
eliminate some store instructions.
Differential Revision: https://reviews.llvm.org/D43783
llvm-svn: 326937
We may emit incorrect lifetime info during codegen for loop counters in
OpenMP constructs because of automatic scope cleanup when we needed
temporary locations for private loop counters.
llvm-svn: 326922
EmitLifetimeStart returns a non-null `size` pointer if it actually
emits a lifetime.start. Later in this function, we use `tempSize`'s
nullness to determine whether or not we should emit a lifetime.end.
llvm-svn: 326844
variables.
If the task has reduction construct and this construct for some variable
requires unique threadprivate storage, we may generate different names
for variables used in taskgroup task_reduction clause and in task
in_reduction clause. Patch fixes this problem.
llvm-svn: 326827
Summary:
Currently only calls to mcount were suppressed with
no_instrument_function attribute.
Linux kernel requires that calls to fentry should also not be
generated.
This is an extended fix for PR33515.
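For illustration, a function opts out of entry instrumentation like this.
The sketch assumes a GCC-compatible compiler; the `NO_INSTRUMENT` macro
and both function names are hypothetical, and which profiling call is
emitted depends on build flags that are not shown here.

```cpp
#include <cassert>

#if defined(__GNUC__) || defined(__clang__)
#define NO_INSTRUMENT __attribute__((no_instrument_function))
#else
#define NO_INSTRUMENT // assumption: other compilers ignore the request
#endif

// Hypothetical function that must not receive a profiling call at entry,
// whether the build would otherwise emit mcount or fentry calls.
NO_INSTRUMENT int early_boot_add(int a, int b) {
  assert(a >= 0 && b >= 0);
  return a + b;
}

// Same body without the attribute; an instrumented build may add a
// profiling call at the start of this one.
int traced_add(int a, int b) { return a + b; }
```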
Reviewers: hfinkel, rengolin, srhines, rnk, rsmith, rjmccall, hans
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43995
llvm-svn: 326639
The patch fixes a number of bugs related to parameter indexing in
attributes:
* Parameter indices in some attributes (argument_with_type_tag,
pointer_with_type_tag, nonnull, ownership_takes, ownership_holds,
and ownership_returns) are specified in source as one-origin
including any C++ implicit this parameter, were stored as
zero-origin excluding any this parameter, and were erroneously
printing (-ast-print) and confusingly dumping (-ast-dump) as the
stored values.
* For alloc_size, the C++ implicit this parameter was not subtracted
correctly in Sema, leading to assert failures or to silent failures
of __builtin_object_size to compute a value.
* For argument_with_type_tag, pointer_with_type_tag, and
ownership_returns, the C++ implicit this parameter was not added
back to parameter indices in some diagnostics.
This patch fixes the above bugs and aims to prevent similar bugs in
the future by introducing careful mechanisms for handling parameter
indices in attributes. ParamIdx stores a parameter index and is
designed to hide the stored encoding while providing accessors that
require each use (such as printing) to make explicit the encoding that
is needed. Attribute declarations declare parameter index arguments
as [Variadic]ParamIdxArgument, which are exposed as ParamIdx[*]. This
patch rewrites all attribute arguments that are processed by
checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp to be declared
as [Variadic]ParamIdxArgument. The only exception is xray_log_args's
argument, which is encoded as a count not an index.
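The encoding problem can be sketched independently of clang. The class
below is a hypothetical illustration of hiding the stored encoding behind
explicit accessors; it is not the real ParamIdx interface, and the names
`ParamIndex`, `source_index` and `ast_index` are made up for the example.

```cpp
#include <cassert>

// Hypothetical sketch; clang's real ParamIdx has a different interface.
class ParamIndex {
  unsigned zero_based_; // stored: zero-origin, excluding implicit `this`
  bool has_this_;       // whether the function has an implicit `this`

public:
  // `source_index` is what the user wrote in the attribute: one-origin,
  // counting the implicit `this` parameter (if any) as index 1.
  ParamIndex(unsigned source_index, bool has_implicit_this)
      : zero_based_(source_index - 1u - (has_implicit_this ? 1u : 0u)),
        has_this_(has_implicit_this) {
    assert(source_index >= 1u);
    // Index 1 names `this` itself, which this sketch cannot represent.
    assert(!has_implicit_this || source_index >= 2u);
  }

  // Encoding for printing and diagnostics: matches the source spelling.
  unsigned source_index() const {
    return zero_based_ + 1u + (has_this_ ? 1u : 0u);
  }

  // Encoding for indexing the explicit parameter list.
  unsigned ast_index() const { return zero_based_; }
};
```

Because there is no implicit conversion to an integer, each use has to
pick an accessor and thereby state which encoding it wants.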
Differential Revision: https://reviews.llvm.org/D43248
llvm-svn: 326602
Patch fixes the problem with the functions marked as `declare simd`. If
the canonical declaration does not have associated `declare simd`
construct, we may not generate required code even if other
redeclarations are marked as `declare simd`.
llvm-svn: 326594
This makes it easier to debug crashes and hangs in block functions since
users can easily find out where the block is called from. The option
doesn't disable tail-calls from non-escaping blocks since non-escaping
blocks are not as hard to debug as escaping blocks.
rdar://problem/35758207
Differential Revision: https://reviews.llvm.org/D43841
llvm-svn: 326530
This shouldn't change any results for now, but is more consistent with
how we set dllimport/dllexport and will make future changes easier.
Since clang produces IR as it parses, it can find out mid file that
something is dllimport. When that happens we have to drop
dso_local. This is not a problem right now because
CodeGenModule::setDSOLocal is called from relatively few places at
the moment.
llvm-svn: 326527
Since LLVM r326341, default EmulatedTLS mode is decided in backend
according to target triple. Any front-end should pass -f[no]-emulated-tls
to backend and set up ExplicitEmulatedTLS only when the flags are used.
Differential Revision: https://reviews.llvm.org/D43965
llvm-svn: 326499
So I wrote a clang-tidy check to lint out redundant `isa`, `cast`, and
`dyn_cast`s for fun. This is a portion of what it found for clang; I
plan to do similar cleanups in LLVM and other subprojects when I find
time.
Because of the volume of changes, I explicitly avoided making any change
that wasn't highly local and obviously correct to me (e.g. we still have
a number of foo(cast<Bar>(baz)) that I didn't touch, since overloading
is a thing and the cast<Bar> did actually change the type -- just up the
class hierarchy).
I also tried to leave the types we were cast<>ing to somewhere nearby,
in cases where it wasn't locally obvious what we were dealing with
before.
llvm-svn: 326416
This is the next step in setting dso_local for COFF.
The patch changes setGVProperties to first set dllimport/dllexport
and changes a few cases that were setting dllimport/dllexport
manually. With this a few more GVs are marked dso_local.
llvm-svn: 326397
Differential Revision: https://reviews.llvm.org/D43852
This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.
llvm-svn: 326368
objc_msgSend_stret takes a hidden parameter for the returned structure's
address for the construction. When the function signature is rewritten
for the inalloca passing, the return type is no longer marked as
indirect but rather inalloca stret. This enhances the test for the
indirect return to check for that case as well. This fixes the
incorrect return classification for Windows x86.
llvm-svn: 326362
Binaries for multiple architectures are combined by fatbinary,
so the current code was effectively not needed.
Differential Revision: https://reviews.llvm.org/D43461
llvm-svn: 326342
ARC mode.
Declaring __strong pointer fields in structs was not allowed in
Objective-C ARC until now because that would make the struct non-trivial
to default-initialize, copy/move, and destroy, which is not something C
was designed to do. This patch lifts that restriction.
Special functions for non-trivial C structs are synthesized that are
needed to default-initialize, copy/move, and destroy the structs and
manage the ownership of the objects the __strong pointer fields point
to. Non-trivial structs passed to functions are destructed in the callee
function.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D41228
llvm-svn: 326307
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. This vendor extension to DWARF v5 allows source text to be
embedded directly in the line tables of the debug line section.
Add new flag (-g[no-]embed-source) to Driver and CC1 which indicates
that source should be passed through to LLVM during CodeGen.
Differential Revision: https://reviews.llvm.org/D42766
llvm-svn: 326102
The tests that failed on a windows host have been fixed.
Original message:
Start setting dso_local for COFF.
With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.
llvm-svn: 325940
With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.
llvm-svn: 325915
The value of dso_local can be computed from just IR properties and
global information (object file type, command line options, etc).
With this patch we no longer pass in the Decl. It was almost unused
and making it fully unused guarantees that dso_local is consistent
with the rest of the IR.
llvm-svn: 325846
Differential Revision: https://reviews.llvm.org/D43513
This is a bug fix that removes the emission of reduction support for pragma 'distribute' when found alone or in combinations without simd.
Pragma 'distribute' does not have a reduction clause, but when combined with pragma 'simd' we need to emit the support for simd's reduction clause as part of code generation for distribute. This guard is similar to the one used for reduction support earlier in the same code gen function.
llvm-svn: 325822
Summary:
OpenCL 2.0 specification defines '-cl-uniform-work-group-size' option,
which requires that the global work-size be a multiple of the work-group
size specified to clEnqueueNDRangeKernel and allows optimizations that
are made possible by this restriction.
The patch introduces the support of this option.
To keep information about whether an OpenCL kernel has uniform work
group size or not, clang generates 'uniform-work-group-size' function
attribute for every kernel:
- "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
- "uniform-work-group-size"="true" for OpenCL 2.0 and higher if
'-cl-uniform-work-group-size' option was specified,
- "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no
'-cl-uniform-work-group-size' option was specified.
If the function is not an OpenCL kernel, 'uniform-work-group-size'
attribute isn't generated.
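The attribute rules listed above can be summarized as a small decision function. This is only a sketch of the rule as stated in this message, not clang's actual code: the function name, its parameters, and the version encoding (120 for OpenCL 1.2, 200 for OpenCL 2.0) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical model of the rule described above; not clang's actual code.
// Assumed version encoding: 120 for OpenCL 1.2, 200 for OpenCL 2.0.
// Returns the attribute value, or nullptr when no attribute is generated.
const char *uniformWorkGroupSizeAttr(bool isKernel, int openclVersion,
                                     bool uniformFlagGiven) {
  if (!isKernel)
    return nullptr;  // non-kernel functions get no attribute
  if (openclVersion < 200)
    return "true";   // OpenCL 1.2 and lower
  return uniformFlagGiven ? "true" : "false";  // OpenCL 2.0 and higher
}

// Usage: the three kernel cases from the list, plus a non-kernel function.
void checkUniformWorkGroupSizeAttr() {
  assert(std::strcmp(uniformWorkGroupSizeAttr(true, 120, false), "true") == 0);
  assert(std::strcmp(uniformWorkGroupSizeAttr(true, 200, true), "true") == 0);
  assert(std::strcmp(uniformWorkGroupSizeAttr(true, 200, false), "false") == 0);
  assert(uniformWorkGroupSizeAttr(false, 200, true) == nullptr);
}
```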
Patch by: krisb
Reviewers: yaxunl, Anastasia, b-sumner
Reviewed By: yaxunl, Anastasia
Subscribers: nhaehnle, yaxunl, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D43570
llvm-svn: 325771
When using blocks with C++ on Windows x86, it is possible to have the
block literal be pushed into the inalloca'ed parameters. Teach IRGen to
handle the case properly by extracting the block literal from the
inalloca parameter. This fixes the use of blocks with C++ on Windows
x86.
llvm-svn: 325724
This patch fixes creating TBAA access descriptors for
may_alias-marked access types. Currently, for such types we
generate ordinary descriptors with char as its access type. The
patch changes this to produce proper may-alias descriptors.
Differential Revision: https://reviews.llvm.org/D42366
llvm-svn: 325575
Currently, clang compiles explicit initializers for array
elements into series of store instructions. For large arrays of
built-in types this results in bloated output code and
significant amount of time spent on the instruction selection
phase. This patch fixes the issue by initializing such arrays
with global constants that store the binary image of the
initializer.
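For illustration, the kind of source this affects looks like the sketch below. The function is hypothetical and the array is kept short for readability; whether a given array is actually emitted as a global constant depends on clang's own heuristics, which are not reproduced here. The observable behavior of the program is the same either way.

```cpp
#include <cassert>

// A local array of a built-in type with an explicit initializer for every
// element. Per the message above, such an initializer used to be lowered to
// one store per element; it can instead be copied from a constant global
// that holds the binary image of the initializer.
int sumOfSquares() {
  int squares[16] = {0,  1,  4,   9,   16,  25,  36,  49,
                     64, 81, 100, 121, 144, 169, 196, 225};
  int sum = 0;
  for (int v : squares)
    sum += v;
  return sum;
}

// Usage: the result does not depend on how the initializer is lowered.
void checkSumOfSquares() { assert(sumOfSquares() == 1240); }
```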
Differential Revision: https://reviews.llvm.org/D43181
llvm-svn: 325478
Summary:
Gold plugin does not add pass to ThinLTO modules without useful symbols.
In this case ThinLTO can't create the corresponding index file, and some features, like CFI,
cannot be processed correctly by the backend without an index.
Given that we don't need the backend output, we can request that it avoid
processing the module. This is implemented by this patch using the new
"SkipModuleByDistributedBackend" flag.
Reviewers: pcc, tejohnson
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D42995
llvm-svn: 325411
Summary:
ThinLTO compilation may decide not to split a module and keep it as regular LTO.
In this case the module has already been processed during indexing and is already part of
the merged object file, so here we can just skip it.
Reviewers: pcc, tejohnson
Reviewed By: tejohnson
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D42680
llvm-svn: 325410
Codegen for the ordered construct with doacross might produce incorrect code
because of a missing cleanup scope for the construct. Without this scope
the final runtime function call could be emitted in the wrong order,
leading to incorrect codegen.
llvm-svn: 325304
The following test case causes an issue with codegen of __enqueue_block:
void (^block)(void) = ^{ callee(id, out); };
enqueue_kernel(queue, 0, ndrange, block);
Clang first does codegen for block expression in the first line and deletes its block info.
Clang then tries to do codegen for the same block expression again for the second line,
and fails because the block info is gone.
The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to
record llvm block invoke function and llvm block literal emitted for each AST block
expression, and use the recorded information for generating the wrapper kernel.
The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen
of blocks.
Another minor issue is that a cleanup AST expression is generated for blocks
with captures, which can be stripped by IgnoreImplicit.
Differential Revision: https://reviews.llvm.org/D43240
llvm-svn: 325264
Added support in clang for GCC function attribute 'artificial'. This attribute
is used to control stepping behavior of debugger with respect to inline
functions.
Patch By: Elizabeth Andrews (eandrews)
Differential Revision: https://reviews.llvm.org/D43259
llvm-svn: 325081
Summary:
This patch also adds the 'DW_AT_artificial' flag to the generated variable.
Addresses the issues mentioned in http://llvm.org/PR30553.
Reviewers: CarlosAlbertoEnciso, probinson, aprantl
Reviewed By: aprantl
Subscribers: JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D43189
llvm-svn: 324988
As reported here: https://bugs.llvm.org/show_bug.cgi?id=36301
The issue is that the 'use' causes the plain declaration to emit
the attributes to LLVM-IR. However, if the definition added it
later, these would silently disappear.
This commit extracts that logic to its own function in CodeGenModule,
and has the attribute-applications done during 'definition' update
the attributes properly.
Differential Revision: https://reviews.llvm.org/D43095
llvm-svn: 324907
Summary:
Right now clang is skipping array cookie poisoning for any operator
new[] which is not part of the set of replaceable global allocation
functions.
This commit adds a flag to tell clang to poison all operator new[]
cookies.
A previous review was poisoning all array cookies unconditionally, but
there is an edge case which would stop working under ASan (a custom
operator new[] saves whatever pointer it returned, and then accesses
it).
This newer revision adds a command line argument to toggle this feature.
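The edge case mentioned above can be pictured with a class-specific operator new[] that remembers the pointer it returned. The sketch below is illustrative only: the type and helper names are hypothetical, and the cookie layout is an ABI detail, so the code only measures how far the first element sits from the raw allocation. Code that later reads through the saved raw pointer would touch the cookie bytes, which is what stops working once those bytes are poisoned under ASan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Illustrative type, not taken from the patch. The user-provided destructor
// makes array new store a cookie (the element count) in front of the array
// on the Itanium C++ ABI.
struct Tracked {
  ~Tracked() {}
  int value;

  static void *lastRaw;  // pointer most recently returned by operator new[]

  // A custom operator new[] that saves whatever pointer it returns.
  static void *operator new[](std::size_t bytes) {
    void *p = std::malloc(bytes);
    if (!p)
      throw std::bad_alloc();
    lastRaw = p;
    return p;
  }
  static void operator delete[](void *p) noexcept { std::free(p); }
};
void *Tracked::lastRaw = nullptr;

// Returns how many bytes precede the first element in the raw allocation.
std::ptrdiff_t cookieBytes(std::size_t count) {
  Tracked *arr = new Tracked[count];
  std::ptrdiff_t offset = reinterpret_cast<char *>(arr) -
                          static_cast<char *>(Tracked::lastRaw);
  delete[] arr;
  return offset;
}

// Usage: the exact size is ABI-specific, so only the sign is checked here.
void checkCookieBytes() { assert(cookieBytes(4) >= 0); }
```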
Original revision: https://reviews.llvm.org/D41301
Compiler-rt test revision with an explanation of the edge case: https://reviews.llvm.org/D41664
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43013
llvm-svn: 324884
Summary:
This change avoids the overhead of storing, and later crawling,
an initializer list of all zeros for arrays. When LLVM
visits this (llvm/IR/Constants.cpp) ConstantArray::getImpl()
it will scan the list looking for an array of all zero.
We can avoid the store, and short-cut the scan, by detecting
all zeros when clang builds-up the initialization representation.
This was brought to my attention when investigating PR36030
Reviewers: majnemer, rjmccall
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42549
llvm-svn: 324776
Summary:
Fixes PR36247, which is where WinEHPrepare replaces inline asm in
funclets with unreachable.
Make getBundlesForFunclet return by value to simplify some call sites.
Reviewers: smeenai, majnemer
Subscribers: eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D43033
llvm-svn: 324689
Summary:
This patch is a fix for following issue:
https://bugs.llvm.org/show_bug.cgi?id=31362 The problem was caused by front end
lowering C calling conventions without taking into account calling conventions
enforced by attribute. In this case win64cc was not correctly lowered on targets
other than Windows.
Reviewed By: rnk (Reid Kleckner)
Differential Revision: https://reviews.llvm.org/D43016
Author: belickim <mateusz.belicki@intel.com>
llvm-svn: 324594
The difference from the previous try is that we no longer directly
access function declarations from position independent executables. It
should work, but currently doesn't with some linkers.
It now includes a fix to not mark available_externally definitions as
dso_local.
Original message:
Start setting dso_local in clang.
This starts adding dso_local to clang.
The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.
This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.
llvm-svn: 324535
This reverts commit r324500.
The bots found two failures:
ThreadSanitizer-x86_64 :: Linux/pie_no_aslr.cc
ThreadSanitizer-x86_64 :: pie_test.cc
when using gold. The issue is a limitation in gold when building pie
binaries. I will investigate how to work around it.
llvm-svn: 324505
It now includes a fix to not mark available_externally definitions as
dso_local.
Original message:
Start setting dso_local in clang.
This starts adding dso_local to clang.
The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.
This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.
llvm-svn: 324500
I found this while looking at the ppc failures caused by the dso_local
change.
The issue was that the patch would produce the wrong answer for
available_externally. Having ForDefinition_t available in places where
the code can just check the linkage is a bit of a foot gun.
This patch removes the ForDefinition_t argument in places where the
linkage is already known.
llvm-svn: 324499
This patch:
* fixes an incorrect sign-extension of unsigned values, when emitting
debug info metadata for enumerators
* the enumerators metadata is created with a flag, which determines
interpretation of the value bits (signed or unsigned)
* the enumerations metadata contains the underlying integer type and a
flag, indicating whether this is a C++ "fixed enum"
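To see why the signedness flag matters, consider an enumerator whose 32-bit pattern has the top bit set. The sketch below is a hypothetical model of the two interpretations (the enum and helper names are invented here); it is not the debug info emission code itself.

```cpp
#include <cassert>
#include <cstdint>

// An enum with an unsigned fixed underlying type and an enumerator that
// uses the top bit.
enum Flags : std::uint32_t { AllBits = 0xFFFFFFFFu };

// Interprets the 32 value bits as signed: the top bit is sign-extended.
std::int64_t widenSignExtended(std::uint32_t bits) {
  std::int64_t wide = static_cast<std::int64_t>(bits);
  if (bits & 0x80000000u)
    wide -= (std::int64_t(1) << 32);
  return wide;
}

// Interprets the 32 value bits as unsigned: the value is preserved.
std::int64_t widenZeroExtended(std::uint32_t bits) {
  return static_cast<std::int64_t>(bits);
}

// Usage: the same bits give two different values; only the unsigned
// interpretation matches the declared enumerator.
void checkWidening() {
  assert(widenSignExtended(AllBits) == -1);
  assert(widenZeroExtended(AllBits) == 4294967295LL);
}
```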
Differential Revision: https://reviews.llvm.org/D42736
llvm-svn: 324490
This adds the frontend support required to support the use of the
comment pragma to enable auto linking on ELFish targets. This is a
generic ELF extension supported by LLVM. We need to change the handling
for the "dependentlib" in order to accommodate the previously discussed
encoding for the dependent library descriptor. Without the custom
handling of the PCK_Lib directive, the -l prefixed option would be
encoded into the resulting object (which is treated as a frontend
error).
llvm-svn: 324438
This change reduces the live range of the loaded function pointer,
resulting in a slight code size decrease (~10KB in clang), and also
improves the security of CFI for virtual calls by making it less
likely that the function pointer will be spilled, and ensuring that
it is not spilled across a function call boundary.
Fixes PR35353.
Differential Revision: https://reviews.llvm.org/D42725
llvm-svn: 324286
The 'trivial_abi' attribute can be applied to a C++ class, struct, or
union. It makes the special functions of the annotated class (the destructor
and copy/move constructors) trivial for the purpose of calls and,
as a result, enables the annotated class or containing classes to be
passed or returned using the C ABI for the underlying type.
When a type that is considered trivial for the purpose of calls despite
having a non-trivial destructor (which happens only when the class type
or one of its subobjects is a 'trivial_abi' class) is passed to a
function, the callee is responsible for destroying the object.
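A minimal sketch of how the attribute might be used follows. The macro, type, and function names are hypothetical; the attribute is clang-specific, so the macro expands to nothing on other compilers, where the type is simply passed the usual way. The destructor counter only shows that the argument has been destroyed by the time the call statement completes; it does not show which side ran the destructor.

```cpp
#include <cassert>

// The attribute is clang-specific; expand to nothing elsewhere.
#if defined(__clang__)
#define EXAMPLE_TRIVIAL_ABI __attribute__((trivial_abi))
#else
#define EXAMPLE_TRIVIAL_ABI
#endif

static int destroyedCount = 0;

// Non-trivial copy constructor and destructor, but annotated so that the
// type is treated as trivial for the purpose of calls.
struct EXAMPLE_TRIVIAL_ABI Handle {
  explicit Handle(int id) : id(id) {}
  Handle(const Handle &other) : id(other.id) {}
  ~Handle() { ++destroyedCount; }
  int id;
};

// With trivial_abi in effect, the callee is responsible for destroying 'h'.
int consume(Handle h) { return h.id; }

// Returns how many Handle destructors ran during one by-value call.
int destructorsRunAfterCall() {
  destroyedCount = 0;
  int id = consume(Handle(7));
  return id == 7 ? destroyedCount : -1;
}

// Usage: at least the parameter object has been destroyed after the call.
void checkTrivialAbi() { assert(destructorsRunAfterCall() >= 1); }
```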
For more background, see the discussions that took place on the mailing
list:
http://lists.llvm.org/pipermail/cfe-dev/2017-November/055955.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180101/thread.html#214043
rdar://problem/35204524
Differential Revision: https://reviews.llvm.org/D41039
llvm-svn: 324269
Summary:
Previously, Clang only emitted label names in assert builds.
However there is a CC1 option -discard-value-names that should have been used to control emission instead.
This patch removes the NDEBUG preprocessor block and instead allows LLVM to handle removing the names in accordance with the option.
Reviewers: erichkeane, aaron.ballman, majnemer
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42829
llvm-svn: 324127
This starts adding dso_local to clang.
The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.
This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.
llvm-svn: 324107
When trying to track down a different bug, we discovered
that calling __builtin_va_arg on a vec3f type caused
the SROA pass to issue a warning that there was an illegal
access.
Further research showed that the vec3f type is
alloca'ed as size '12', but the __builtin_va_arg code
on x86_64 was always loading this out of registers as
{double, double}. Thus, the 2nd store into the vec3f
was storing in bytes 12-15!
This patch alters the original implementation which always
assumed {double, double} to use the actual coerced type
instead, so the LLVM-IR generated is a load/GEP/store of
a <2 x float> and a float, rather than a double and a double.
Tests were added for all combinations I could think of that
would fit in 2 FP registers, and all work exactly as expected.
Differential Revision: https://reviews.llvm.org/D42811
llvm-svn: 324098
This fixes building Qt as shared libraries with clang in MinGW
mode; previously subclasses of the QObjectData class (in other
DLLs than the base DLL) failed to find the typeinfo symbols
(that neither were emitted in the base DLL nor in the DLL
containing the subclass).
If the virtual destructor in the newly added testcase wouldn't
be pure (or if there'd be another non-pure virtual method),
it'd be a key function and things would work out even before this
change. Make sure to locally emit the typeinfo for these classes
as well.
This matches what GCC does in this specific testcase.
This fixes the root issue that spawned PR35146. (The difference
to GCC that is initially described in that bug still is present
though.)
Differential Revision: https://reviews.llvm.org/D42641
llvm-svn: 324059
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
This should implement:
Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
This patch fixes a bug in CGRecordLowering::accumulateBitFields where it
unconditionally starts a new run and emits a storage field when it sees
a zero-sized bitfield, which causes an assertion in insertPadding to
fail when -fno-bitfield-type-align is used.
It shouldn't emit new storage if UseZeroLengthBitfieldAlignment and
UseBitFieldTypeAlignment are both false.
rdar://problem/36762205
llvm-svn: 323943
Summary:
This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
Reviewers: rjmccall
Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits
Differential Revision: https://reviews.llvm.org/D41677
llvm-svn: 323617
constructor.
Previously, clang would emit an over-aligned (16-byte) store to
initialize B::x in B's base constructor when compiling the following
code:
struct A {
__attribute__((aligned(16))) double data1;
};
struct B : public virtual A {
B() : x(123) {}
double a;
int x;
};
struct C : public virtual B {};
void test() { B b; C c; }
This was happening because the code in IRGen that does member
initialization was using the alignment of a complete object instead of
the non-virtual alignment.
This commit fixes the bug.
rdar://problem/36382481
Differential Revision: https://reviews.llvm.org/D42521
llvm-svn: 323578
The MSVC runtime library does not provide a definition of wmemcmp,
so we need an inline implementation.
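For reference, the semantics such an inline implementation has to provide can be sketched as below. This is a hypothetical stand-in written for illustration, not the code emitted by this patch: it compares the two buffers element by element and reports the ordering of the first mismatch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in with wmemcmp semantics: returns a negative value,
// zero, or a positive value depending on the first differing element.
int inlineWmemcmp(const wchar_t *lhs, const wchar_t *rhs, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (lhs[i] != rhs[i])
      return lhs[i] < rhs[i] ? -1 : 1;
  }
  return 0;  // all 'count' elements compared equal
}

// Usage: ordering of the first mismatch, and a prefix that compares equal.
void checkInlineWmemcmp() {
  assert(inlineWmemcmp(L"abc", L"abd", 3) < 0);
  assert(inlineWmemcmp(L"abd", L"abc", 3) > 0);
  assert(inlineWmemcmp(L"abc", L"abc", 3) == 0);
  assert(inlineWmemcmp(L"abc", L"abd", 2) == 0);
}
```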
Differential Revision: https://reviews.llvm.org/D42441
llvm-svn: 323362
Hidden visibility is almost the opposite of dllimport. We were
producing them before (dllimport wins in the existing llvm
implementation), but now the llvm verifier produces an error.
llvm-svn: 323361
These symbols are supposed to be preserved even by the linker. Use the
`llvm.used` to ensure that the symbols are not removed by DCE in the
linker. This should be a no-op change on MachO since the symbols are
annotated as `no_dead_strip`.
llvm-svn: 323247
Pass and return _Float16 as if it were an int or float for ARM, but with the
top 16 bits unspecified, similar to what we already do for __fp16.
We will implement proper half-precision function argument lowering in the ARM
backend soon, but want to use this workaround in the mean time.
Differential Revision: https://reviews.llvm.org/D42318
llvm-svn: 323185
When a function taking transparent union is declared as taking one of
union members earlier in the translation unit, clang would hit an
"Invalid cast" assertion during EmitFunctionProlog. This case
corresponds to function f1 in test/CodeGen/transparent-union-redecl.c.
We decided to cast i32 to union because after merging function
declarations function parameter type becomes int,
CGFunctionInfo::ArgInfo type matches with ABIArgInfo type, so we decide
it is a trivial case. But these types should also be castable to
parameter declaration type which is not the case here.
Now the fix is in converting from ABIArgInfo type to VarDecl type and using
argument demotion when necessary.
Additional tests in Sema/transparent-union.c capture current behavior and make
sure there are no regressions.
rdar://problem/34949329
Reviewers: rjmccall, rafael
Reviewed By: rjmccall
Subscribers: aemerson, cfe-commits, kristof.beyls, ahatanak
Differential Revision: https://reviews.llvm.org/D41311
llvm-svn: 323156
The standard says:
[expr.static.cast] p11: "If the prvalue of type “pointer to cv1 B” points to a B
that is actually a subobject of an object of type D, the resulting pointer points
to the enclosing object of type D. Otherwise, the behavior is undefined."
Therefore, the GEP must be inbounds.
This should solve the failure to optimize away a null check shown in PR35909:
https://bugs.llvm.org/show_bug.cgi?id=35909
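The cast in question is the base-to-derived static_cast, which adjusts the pointer by the offset of the base subobject. A minimal sketch with hypothetical types follows; it is only valid because the Base2 pointer really does point into a Derived object, which is exactly the precondition the quoted paragraph relies on.

```cpp
#include <cassert>

struct Base1 { int a; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { int c; };

// Base-to-derived static_cast: moves the pointer from the Base2 subobject
// back to the start of the enclosing Derived object. Behavior is undefined
// if 'base' is not actually a subobject of a Derived.
Derived *toDerived(Base2 *base) { return static_cast<Derived *>(base); }

// Returns true when the cast recovers the original object address.
bool roundTripsToSameObject() {
  Derived d{};
  Base2 *asBase = &d;  // implicit derived-to-base conversion
  return toDerived(asBase) == &d;
}

// Usage
void checkRoundTrip() { assert(roundTripsToSameObject()); }
```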
Differential Revision: https://reviews.llvm.org/D42249
llvm-svn: 322950
Firstly, each offloading entry must have a unique name or the
linker will complain if there are multiple files with target
regions. Secondly, the compiler must not introduce padding so
mark the struct with a PackedAttr.
Differential Revision: https://reviews.llvm.org/D42168
llvm-svn: 322858
When parsing C++ type construction expressions with list initialization,
forward the locations of the braces to Sema.
Without these locations, the code coverage pass crashes on the given test
case, because the pass relies on getLocEnd() returning a valid location.
Here is what this patch does in more detail:
- Forwards init-list brace locations to Sema (ParseExprCXX),
- Builds an InitializationKind with these locations (SemaExprCXX), and
- Uses these locations for constructor initialization (SemaInit).
The remaining changes fall out of introducing a new overload for
creating direct-list InitializationKinds.
Testing: check-clang, and a stage2 coverage-enabled build of clang with
asserts enabled.
Differential Revision: https://reviews.llvm.org/D41921
llvm-svn: 322729
simd`.
Added host codegen + codegen for devices with default codegen for
`#pragma omp target teams distribute parallel for simd` directive.
llvm-svn: 322515
RISCVABIInfo is implemented in terms of XLen, supporting both RV32 and RV64.
Unfortunately we need to count argument registers in the frontend in order to
determine when to emit signext and zeroext attributes. Integer scalars are
extended according to their type up to 32-bits and then sign-extended to XLen
when passed in registers, but are anyext when passed on the stack. This patch
only implements the base integer (soft float) ABIs.
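The two-step extension rule for integers passed in registers can be modelled for RV64 as follows. This is a sketch under stated assumptions, not the RISCVABIInfo code: XLen is fixed at 64, the helper names are hypothetical, and only the register case is modelled (values passed on the stack are anyext per the description above).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model for XLen = 64. The input is the value after step 1
// (extension to 32 bits according to its type); step 2 sign-extends those
// 32 bits to the full register width.
std::uint64_t rv64RegisterImage(std::uint32_t valueAfterStep1) {
  std::uint64_t wide = valueAfterStep1;
  if (valueAfterStep1 & 0x80000000u)
    wide |= 0xFFFFFFFF00000000ull;
  return wide;
}

// Unsigned 8-bit: zero-extended to 32 bits, then sign-extended to 64.
std::uint64_t passU8(std::uint8_t v) { return rv64RegisterImage(v); }

// Signed 8-bit: sign-extended to 32 bits, then sign-extended to 64.
std::uint64_t passI8(std::int8_t v) {
  return rv64RegisterImage(
      static_cast<std::uint32_t>(static_cast<std::int32_t>(v)));
}

// Unsigned 32-bit: already 32 bits wide, so only the sign extension applies.
std::uint64_t passU32(std::uint32_t v) { return rv64RegisterImage(v); }

// Usage: note that an unsigned 32-bit value with the top bit set ends up
// with all upper register bits set.
void checkRv64Extension() {
  assert(passU8(0xFF) == 0xFFull);
  assert(passI8(-1) == 0xFFFFFFFFFFFFFFFFull);
  assert(passU32(0xFFFFFFFFu) == 0xFFFFFFFFFFFFFFFFull);
  assert(passU32(1u) == 1ull);
}
```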
For more information on the RISC-V ABI, see [the ABI
doc](https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md),
my [golden model](https://github.com/lowRISC/riscv-calling-conv-model), and
the [LLVM RISC-V calling convention
patch](https://reviews.llvm.org/D39898#2d1595b4) (specifically the comment
documenting frontend expectations).
Differential Revision: https://reviews.llvm.org/D40023
llvm-svn: 322494
Summary:
kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them by operating on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this pattern and turn it into a concat_vector operation.
I think it makes more sense to keep the IR implementation closer to vector operations on vXi1, given that we expect these builtins to be used around other builtins that operate on k-registers, which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1, leaving only the vector operations.
Reviewers: RKSimon, spatel, zvi, jina.nahias
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42016
llvm-svn: 322461
This alignment can be less than 4 on certain embedded targets, which may
not even be able to deal with 4-byte alignment on the stack.
Patch by Jacob Young!
llvm-svn: 322406
As @rjmccall suggested in D40023, we can get rid of
ABIInfo::shouldSignExtUnsignedType (used to handle cases like the Mips calling
convention where 32-bit integers are always sign extended regardless of the
sign of the type) by adding a SignExt field to ABIArgInfo. In the common case,
this new field is set automatically by ABIArgInfo::getExtend based on the sign
of the type. For targets that want greater control, they can use
ABIArgInfo::getSignExtend or ABIArgInfo::getZeroExtend when necessary. This
change also cleans up logic in CGCall.cpp.
There is no functional change intended in this patch, and all tests pass
unchanged. As noted in D40023, Mips might want to sign-extend unsigned 32-bit
integer return types. A future patch might modify
MipsABIInfo::classifyReturnType to use MipsABIInfo::extendType.
Differential Revision: https://reviews.llvm.org/D41999
llvm-svn: 322396
getAssociatedStmt() returns the outermost captured statement for the
OpenMP directive. It may return the wrong region for combined
constructs. Reworked the code to reduce the number of calls of
getAssociatedStmt() and to use getInnermostCapturedStmt() and
getCapturedStmt() instead.
For firstprivate variables, it may lead to extra allocas being
generated for private copies even if the variable is passed by value
into the outlined function and could be used directly as the private copy.
llvm-svn: 322393
While updating clang tests for having clang set dso_local I noticed
that:
- There are *a lot* of tests to update.
- Many of the updates are redundant.
They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.
llvm-svn: 322318
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
The original patch didn't have the lit.local.cfg file that restricts the new
test to x86, thus the new test was failing on the non-x86 bots.
Differential Revision: https://reviews.llvm.org/D40531
This reverts r322008, which was a revert of r322005.
This reverts commit a05b89f9aca70597dc79fe97bc49b50b51f525ba.
llvm-svn: 322136
GCOV in the old pass manager also strips debug info (if debug info is
disabled/only produced for profiling anyway) after the GCOV pass runs.
I think the strip pass hasn't been ported to the new pass manager, so it
might take me a little while to wire that up.
llvm-svn: 322126
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc.
For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.
Differential Revision: https://reviews.llvm.org/D40478
Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea
llvm-svn: 322063
r322028 attempted to remove something from the "Manglings"
list when it was no longer valid, and did so with 'erase'.
However, StringRefs to these were stored, so these became
dangling references. This patch changes to using 'remove' instead
of 'erase' to keep the strings valid.
llvm-svn: 322052
GCC's attribute 'target', in addition to being an optimization hint,
also allows function multiversioning. We currently have the former
implemented, this is the latter's implementation.
This works by enabling functions with the same name/signature to coexist,
so that they can all be emitted. Multiversion state is stored in the
FunctionDecl itself, and SemaDecl manages the definitions.
Note that it ends up having to permit redefinition of functions so
that they can all be emitted. Additionally, all versions of the function
must be emitted, so this also manages that.
Note that this includes some additional rules that GCC does not, since
defining something as a MultiVersion function after a usage has been made illegal.
The only 'history rewriting' that happens is if a function is emitted before
it has been converted to a multiversion'ed function, at which point its name
needs to be changed.
Function templates and virtual functions are NOT yet supported (not supported
in GCC either).
Additionally, constructors/destructors are disallowed, but the former is
planned.
llvm-svn: 322028
The new test fails on the Hexagon bot. Reverting while I investigate.
This reverts https://reviews.llvm.org/rL322005
This reverts commit b7e0026b4385180c378edc658ec91a39566f2942.
llvm-svn: 322008
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
Differential Revision: https://reviews.llvm.org/D40531
llvm-svn: 322005
Adds the -fstack-size-section flag to enable the .stack_sizes section. The flag defaults to on for the PS4 triple.
Differential Revision: https://reviews.llvm.org/D40712
llvm-svn: 321992
These are just overloads for _Float128. They're supported by GCC 7 and used
by glibc. APFloat support is already there so just add the overloads.
__builtin_copysignf128
__builtin_fabsf128
__builtin_huge_valf128
__builtin_inff128
__builtin_nanf128
__builtin_nansf128
This is the same support that GCC has, according to the documentation,
but limited to _Float128.
llvm-svn: 321948
As discussed in the mail thread
<https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/T64_dW3WKUk>
"Calling noexcept function throug non-noexcept pointer is undefined behavior?",
such a call should not be UB.
However, Clang currently warns about it.
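The kind of call under discussion looks like the sketch below (function names are hypothetical). The pointer type carries no exception specification while the pointee is declared noexcept; the call itself is well-defined.

```cpp
#include <cassert>

// A function declared noexcept.
int addOne(int x) noexcept { return x + 1; }

// Calls it through a pointer whose type has no exception specification.
int callThroughPlainPointer(int x) {
  int (*fp)(int) = addOne;  // non-noexcept pointer to a noexcept function
  return fp(x);
}

// Usage
void checkCallThroughPlainPointer() {
  assert(callThroughPlainPointer(41) == 42);
}
```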
This change removes exception specifications from the function types recorded
for -fsanitize=function, both in the functions themselves and at the call sites.
That means that calling a non-noexcept function through a noexcept pointer will
also not be flagged as UB. In the review of this change, that was deemed
acceptable, at least for now. (See the "TODO" in compiler-rt
test/ubsan/TestCases/TypeCheck/Function/function.cpp.)
To remove exception specifications from types, the existing internal
ASTContext::getFunctionTypeWithExceptionSpec was made public, and some places
otherwise unrelated to this change have been adapted to call it, too.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40720
llvm-svn: 321859
This implements the DWARF 5 feature described at
http://www.dwarfstd.org/ShowIssue.php?issue=141215.1
This allows a consumer to understand whether a composite data type is
trivially copyable and thus should be passed by value instead of by
reference. The canonical example is being able to distinguish the
following two types:
// S is not trivially copyable because of the explicit destructor.
struct S {
~S() {}
};
// T is a POD type.
struct T {
~T() = default;
};
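The distinction between the two types can also be checked at compile time with the standard type traits. The snippet below is only an illustration of the language-level property; it says nothing about how the DWARF flag itself is encoded.

```cpp
#include <type_traits>

// Same two types as above.
struct S {
  ~S() {}  // user-provided destructor: not trivial
};
struct T {
  ~T() = default;  // defaulted on first declaration: still trivial
};

static_assert(!std::is_trivially_copyable<S>::value,
              "S has a user-provided destructor");
static_assert(std::is_trivially_copyable<T>::value,
              "T keeps a trivial destructor");
```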
<rdar://problem/36034993>
Differential Revision: https://reviews.llvm.org/D41039
llvm-svn: 321845
If the reduction required shuffle in the NVPTX codegen, we may need to
cast the reduced value to the integer type. This casting was implemented
incorrectly and may cause compiler crash. Patch fixes this problem.
llvm-svn: 321818
r320902 fixed the IRGen for some types of checked multiplications. It
did not handle unsigned overflow correctly in the case where the signed
operand is negative (PR35750).
Eli pointed out that on overflow, the result must be equal to the unique
value that is equivalent to the mathematically-correct result modulo two
raised to the k power, where k is the number of bits in the result type.
This patch fixes the specialized IRGen from r320902 accordingly.
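The rule Eli describes can be observed directly with the checked-arithmetic builtin. The sketch below assumes a compiler that provides __builtin_mul_overflow (GCC or clang); the struct and helper names are hypothetical. On overflow the stored value is the mathematically correct product reduced modulo 2^k, where k is the width of the result type.

```cpp
#include <cassert>
#include <climits>

// Result of a checked signed-by-unsigned multiplication.
struct CheckedProduct {
  bool overflowed;
  unsigned value;
};

// Assumes GCC or clang: multiplies a signed by an unsigned operand and
// stores the result in an unsigned, reporting whether it fit.
CheckedProduct mulSignedByUnsigned(int lhs, unsigned rhs) {
  CheckedProduct r;
  r.overflowed = __builtin_mul_overflow(lhs, rhs, &r.value);
  return r;
}

// Usage: a negative signed operand with an unsigned result type.
void checkMulSignedByUnsigned() {
  CheckedProduct a = mulSignedByUnsigned(-1, 1u);
  assert(a.overflowed && a.value == UINT_MAX);  // -1 modulo 2^k
  CheckedProduct b = mulSignedByUnsigned(3, 4u);
  assert(!b.overflowed && b.value == 12u);
  CheckedProduct c = mulSignedByUnsigned(-1, 0u);
  assert(!c.overflowed && c.value == 0u);  // the product 0 is representable
}
```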
Testing: Apart from check-clang, I modified the test harness from
r320902 to validate the results of all multiplications -- not just the
ones which don't overflow:
https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081
llvm.org/PR35750, rdar://34963321
Differential Revision: https://reviews.llvm.org/D41717
llvm-svn: 321771
When a type is only used as a template parameter and that type is the
only type imported from another #include'd module, no skeleton CU for
that module is generated, so a consumer doesn't know where to find the
type definition. By emitting an import declaration, we can force a
skeleton CU to be generated for each imported module.
rdar://problem/36266156
llvm-svn: 321754
Summary:
The C++ Itanium ABI says:
No cookie is required if the new operator being used is ::operator new[](size_t, void*).
We should only avoid poisoning the cookie if we're calling this
operator, not others. This is dealt with before the call to
InitializeArrayCookie.
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D41301
llvm-svn: 321645
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
...when such an operation is done on an object during con-/destruction.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40295
llvm-svn: 321519
Now that in the new TBAA format we allow access types to be of
any object types, including aggregate ones, it becomes critical
to specify types of all sub-objects such aggregates comprise as
their members. In order to meet this requirement, this patch
enables generation of field descriptors for members of array
types.
Differential Revision: https://reviews.llvm.org/D41399
llvm-svn: 321352
Now that the MDBuilder helpers generating TBAA type and access
descriptors in the new format are in place, we can teach clang to
use them when requested.
Differential Revision: https://reviews.llvm.org/D41394
llvm-svn: 321351
When a function taking a transparent union is declared earlier in the
translation unit as taking one of the union members, clang would hit an
"Invalid cast" assertion during EmitFunctionProlog. This case
corresponds to function f1 in test/CodeGen/transparent-union-redecl.c.
We decided to cast i32 to the union because, after merging function
declarations, the function parameter type becomes int and the
CGFunctionInfo::ArgInfo type matches the ABIArgInfo type, so we treat
it as a trivial case. But these types should also be castable to the
parameter declaration type, which is not the case here.
The fix is to check, for the trivial case, whether the ABIArgInfo type matches
the parameter declaration type. This exposed an inconsistency: we check
hasScalarEvaluationKind for different types in EmitParmDecl and
EmitFunctionProlog, and a comment says they should match.
Additional tests in Sema/transparent-union.c capture current behavior and make
sure there are no regressions.
rdar://problem/34949329
Reviewers: rjmccall, rafael
Reviewed By: rjmccall
Subscribers: aemerson, cfe-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D41311
llvm-svn: 321296
The new format requires specifying both the type of the access
and its size. This patch fixes setting access sizes for TBAA tags
that denote accesses to structure members. This fix affects all
future TBAA metadata tests for the new format, so I guess we
don't need any special tests for this fix.
Differential Revision: https://reviews.llvm.org/D41452
llvm-svn: 321250
Diagnose 'unreachable' UB when a noreturn function returns.
1. Insert a check at the end of functions marked noreturn.
2. A decl may be marked noreturn in the caller TU, but not marked in
the TU where it's defined. To diagnose this scenario, strip away the
noreturn attribute on the callee and insert check after calls to it.
Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700
rdar://33660464
Differential Revision: https://reviews.llvm.org/D40698
llvm-svn: 321231
Summary: Very similar to AddressSanitizer, with the exception of the error type encoding.
Reviewers: kcc, alekseyshl
Subscribers: cfe-commits, kubamracek, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D41417
llvm-svn: 321203
Summary: Plant an inline version of "((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))" instead.
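The formula in the summary can be sketched in source form as below. This is only an illustration of the arithmetic for (a+ib)/(c+id); the function name `limited_range_div` is hypothetical, and the sketch does none of the scaling or infinity/NaN recovery that a careful library routine would perform.

```cpp
#include <complex>

// Hypothetical helper: divides x = a + ib by y = c + id using the textbook
// formula ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd)).
std::complex<double> limited_range_div(std::complex<double> x,
                                       std::complex<double> y) {
  const double a = x.real(), b = x.imag();
  const double c = y.real(), d = y.imag();
  const double denom = c * c + d * d;
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}
```

For example, (4+2i)/(2+0i) gives denom = 4, a real part of 8/4 = 2 and an imaginary part of 4/4 = 1.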
Patch by Paul Walker.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D40299
llvm-svn: 321183
Fixes regression from r320533.
This fixes the undefined behavior, but I'm not sure it's really right...
I think we end up with missing coverage for code in modules.
Differential Revision: https://reviews.llvm.org/D41374
llvm-svn: 321052
At least <http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/6013/steps/annotate/logs/stdio>
complains about
__ubsan::__ubsan_handle_function_type_mismatch_abort (compiler-rt
lib/ubsan/ubsan_handlers.cc) returning now despite being declared 'noreturn', so
looks like a different approach is needed for the function_type_mismatch check
to be called also in cases that may ultimately succeed.
llvm-svn: 320982
As discussed in the mail thread
<https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/T64_dW3WKUk>
"Calling noexcept function throug non-noexcept pointer is undefined behavior?",
such a call should not be UB.
However, Clang currently warns about it.
There is no cheap check whether two function type_infos only differ in noexcept,
so pass those two type_infos as additional data to the function_type_mismatch
handler (with the optimization of passing a null "static callee type" info when
that is already noexcept, so the additional check can be avoided anyway). For
the Itanium ABI (which appears to be the only one that happens to be used on
platforms that support -fsanitize=function, and which appears to only record
noexcept information for pointer-to-function type_infos, not for function
type_infos themselves), we then need to check the mangled names for occurrence
of "Do" representing "noexcept".
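The kind of call discussed above looks roughly like the following sketch. The function names are made up for illustration. The point is only that a noexcept function is invoked through a pointer whose type does not carry noexcept, which should be well-defined.

```cpp
// A function whose type includes noexcept (as of C++17).
int answer() noexcept { return 42; }

// Calls the noexcept function through a pointer type without noexcept.
int call_through_plain_pointer() {
  int (*fp)() = answer;  // implicit conversion drops noexcept from the type
  return fp();           // valid call; should not be flagged as a mismatch
}
```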
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40720
llvm-svn: 320978
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the
interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math.
This was mostly already done - we just need to translate the flag as a codegen option.
The test file is complicated because there are many potential combinations of flags here.
Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.
2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and
corresponding test.
For the motivating example from PR27372:
float foo(float a, float x) { return ((a + x) - x); }
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm | egrep 'fadd|fsub'
%add = fadd nnan ninf nsz arcp contract float %0, %1
%sub = fsub nnan ninf nsz arcp contract float %add, %2
So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch).
This case now works as expected end-to-end although the underlying logic is still wrong:
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm | grep fadd
%add = fadd reassoc float %0, %1
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
Differential Revision: https://reviews.llvm.org/D39812
llvm-svn: 320920
This patch introduces a specialized way to lower overflow-checked
multiplications with mixed-sign operands. This fixes link failures and
ICEs on code like this:
  void mul(int64_t a, uint64_t b) {
    int64_t res;
    __builtin_mul_overflow(a, b, &res);
  }
The generic checked-binop irgen would use a 65-bit multiplication
intrinsic here, which requires runtime support for _muloti4 (128-bit
multiplication), and therefore fails to link on i386. To get an ICE
on x86_64, change the example to use __int128_t / __uint128_t.
Adding runtime and backend support for 65-bit or 129-bit checked
multiplication on all of our supported targets is infeasible.
This patch solves the problem by using simpler, specialized irgen for
the mixed-sign case.
llvm.org/PR34920, rdar://34963321
Testing: Apart from check-clang, I compared the output from this fairly
comprehensive test driver using unpatched & patched clangs:
https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081
Differential Revision: https://reviews.llvm.org/D41149
llvm-svn: 320902
Previously the attributes were emitted only for function definitions.
Patch adds emission of the attributes for function declarations.
llvm-svn: 320826
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.
Differential Revision: https://reviews.llvm.org/D41256
llvm-svn: 320750
Summary:
InterlockedCompareExchange128 is a bit more complicated than the other
InterlockedCompareExchange functions, so it requires a bit more work. It
doesn't directly refer to 128bit ints, instead it takes pointers to
64bit ints for Destination and ComparandResult, and exchange is taken as
two 64bit ints (high & low). The previous value is written to
ComparandResult, and success is returned. This implementation does the
following in order to produce a cmpxchg instruction:
1. Casts everything to 128bit ints or int pointers, and glues together
the Exchange values
2. Reads from ComparandResult to get the comparand
3. Calls cmpxchg volatile (on X86 this will produce a lock cmpxchg16b
instruction)
4. Result 0 (previous value) is written back to ComparandResult
5. Result 1 (success bool) is zext'ed to a uchar and returned
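The observable behaviour described in the steps above can be modelled with the plain, non-atomic sketch below. It is not the intrinsic and not the patch's codegen. The function name `cmpxchg128_model` is hypothetical, and placing the low half at index 0 is an assumption made for this example.

```cpp
#include <cstdint>

// Non-atomic reference model (hypothetical, for illustration only).
// dest and comparand_result each point at two 64-bit integers that together
// form one 128-bit value; index 0 is assumed to hold the low half.
unsigned char cmpxchg128_model(std::int64_t *dest, std::int64_t exchange_high,
                               std::int64_t exchange_low,
                               std::int64_t *comparand_result) {
  const std::int64_t prev_low = dest[0];
  const std::int64_t prev_high = dest[1];
  const bool success =
      prev_low == comparand_result[0] && prev_high == comparand_result[1];
  if (success) {
    dest[0] = exchange_low;
    dest[1] = exchange_high;
  }
  // The previous value is always written back to the comparand.
  comparand_result[0] = prev_low;
  comparand_result[1] = prev_high;
  return success ? 1 : 0;
}
```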
Resolves bug https://llvm.org/PR35251
Patch by Colden Cullen!
Reviewers: rnk, agutowski
Reviewed By: rnk
Subscribers: majnemer, cfe-commits
Differential Revision: https://reviews.llvm.org/D41032
llvm-svn: 320730
Adding the new enumerator forced a bunch more changes into this patch than I
would have liked. The -Wtautological-compare warning was extended to properly
check the new comparison operator, clang-format needed updating because it uses
precedence levels as weights for determining where to break lines (and several
operators increased their precedence levels with this change), thread-safety
analysis needed changes to build its own IL properly for the new operator.
All "real" semantic checking for this operator has been deferred to a future
patch. For now, we use the relational comparison rules and arbitrarily give
the builtin form of the operator a return type of 'void'.
llvm-svn: 320707
Under the Microsoft ABI, it is possible for an object not to have
a virtual table pointer of its own if all of its virtual functions
were introduced by virtual bases. In that case, we need to load the
vtable pointer from one of the virtual bases and perform the type
check using its type.
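A class shape of the kind described above might look like the following sketch. The type names are hypothetical. `Derived` declares no virtual functions of its own, so every virtual function it has was introduced by its virtual base.

```cpp
// Hypothetical types for illustration only.
struct Base {
  virtual int id() const { return 1; }
};

// Inherits virtually and introduces no new virtual functions.
struct Derived : virtual Base {
  int payload = 7;
};

// Dispatches through the virtual base, which is where the vtable pointer for
// id() is reachable.
int call_via_virtual_base(const Derived &d) {
  const Base &b = d;
  return b.id();
}
```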
Differential Revision: https://reviews.llvm.org/D41036
llvm-svn: 320638
Summary:
The backend should only emit data sharing code for the cases where it is needed.
A new function attribute is used by Clang to enable data sharing only for the cases where OpenMP semantics require it and there are variables that need to be shared.
Reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: cfe-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D41123
llvm-svn: 320527
This adds a new command line option -mprefer-vector-width to specify a preferred vector width for the vectorizers. Valid values are 'none' and unsigned integers. The driver will check that it meets those constraints. Specific supported integers will be managed by the targets in the backend.
Clang will take the value and add it as a new function attribute during CodeGen.
This represents the alternate direction proposed by Sanjay in this RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html
The syntax here matches gcc, though gcc treats it as an x86 specific command line argument. gcc only allows values of 128, 256, and 512. I'm not having clang check any values.
Differential Revision: https://reviews.llvm.org/D40230
llvm-svn: 320419
This commit fixes a bug in IRGen where it generates completely broken
code for __fp16 vectors on X86. For example when the following code is
compiled:
  half4 hv0, hv1, hv2; // these are vectors of __fp16.
  void foo221() {
    hv0 = hv1 + hv2;
  }
clang generates the following IR, in which two i16 vectors are added:
@hv1 = common global <4 x i16> zeroinitializer, align 8
@hv2 = common global <4 x i16> zeroinitializer, align 8
@hv0 = common global <4 x i16> zeroinitializer, align 8
define void @foo221() {
%0 = load <4 x i16>, <4 x i16>* @hv1, align 8
%1 = load <4 x i16>, <4 x i16>* @hv2, align 8
%add = add <4 x i16> %0, %1
store <4 x i16> %add, <4 x i16>* @hv0, align 8
ret void
}
To fix the bug, this commit uses the code committed in r314056, which
modified clang to promote and truncate __fp16 vectors to and from float
vectors in the AST. It also fixes another IRGen bug where a short value
is assigned to an __fp16 variable without any integer-to-floating-point
conversion, as shown in the following example:
  __fp16 a;
  short b;
  void foo1() {
    a = b;
  }
@b = common global i16 0, align 2
@a = common global i16 0, align 2
define void @foo1() #0 {
%0 = load i16, i16* @b, align 2
store i16 %0, i16* @a, align 2
ret void
}
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D40112
llvm-svn: 320215
This is a follow-up to r320128. Eli pointed out that there is some gray
area in the language standard about whether the constant size is exact,
or a lower bound.
https://reviews.llvm.org/D40940
llvm-svn: 320185
There is no way to apply sanitizer suppressions to ObjC blocks. A
reasonable default is to have blocks inherit their parent's sanitizer
options.
rdar://32769634
Differential Revision: https://reviews.llvm.org/D40668
llvm-svn: 320132
CreateCoercedLoad/CreateCoercedStore assumes pointer argument of
memcpy is in addr space 0, which is not correct and causes invalid
bitcasts for triple amdgcn---amdgiz.
It is fixed by using alloca addr space instead.
Differential Revision: https://reviews.llvm.org/D40806
llvm-svn: 320000
The adjustment is calculated with CreatePtrDiff() which returns
the difference in (base) elements. This is passed to CreateGEP()
so make sure that the GEP base has the correct pointer type:
It needs to be a pointer to the base type, not a pointer to a
constant sized array.
Differential Revision: https://reviews.llvm.org/D40911
llvm-svn: 319931
Commit 7ac28eb0a5 / r310911 ("[OpenCL] Allow targets to select address
space per type", 2017-08-15) made Basic depend on AST, introducing a
circular dependency. Break this dependency by adding the
OpenCLTypeKind enum in Basic and map from AST types to this enum in
ASTContext.
Differential Revision: https://reviews.llvm.org/D40838
llvm-svn: 319883
Although a dependent iteration space in OpenMP loops is incorrect from
the point of view of the OpenMP standard, the compiler should not crash.
This patch fixes the crash.
llvm-svn: 319700
There are 20 LLVM math intrinsics that correspond to mathlib calls according to the LangRef:
http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics
We were only converting 3 mathlib calls (sqrt, fma, pow) and 12 builtin calls (ceil, copysign,
fabs, floor, fma, fmax, fmin, nearbyint, pow, rint, round, trunc) to their intrinsic-equivalents.
This patch pulls the transforms together and handles all 20 cases. The switch is guarded by a
check for const-ness to make sure we're not doing the transform if errno could possibly be set by
the libcall or builtin.
Differential Revision: https://reviews.llvm.org/D40044
llvm-svn: 319593
Previously we emitted `__tgt_target_teams` only for standalone teams
directives. This patch allows emitting this function for all teams-based
directives.
llvm-svn: 319585
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.
This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.
Differential Revision: https://reviews.llvm.org/D40674
llvm-svn: 319505
The basic idea behind this patch is that, since in strict aliasing
mode all accesses to union members require their outermost
enclosing union objects to be specified explicitly, for a
pair of given accesses to union members of the form
p->a.b.c...
q->x.y.z...
it is known they can only alias if both p and q point to the same
union type and offset ranges of members a.b.c... and x.y.z...
overlap. Note that the actual types of the members do not matter.
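To illustrate the rule, the sketch below performs two member accesses through pointers to different union types. The type and function names are hypothetical. Under the scheme described above, the two stores could not alias because `p` and `q` do not point to the same union type, even though the members have the same type. Whether a compiler actually exploits this depends on its TBAA implementation.

```cpp
// Two distinct union types with identically typed members (hypothetical).
union First {
  int i;
  float f;
};
union Second {
  int i;
  float f;
};

// Stores through both pointers, then reloads the first member.
int store_through_both(First *p, Second *q) {
  p->i = 1;
  q->i = 2;     // access through a different union type than p
  return p->i;  // may be folded to 1 if the accesses are known not to alias
}
```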
Specifically, in this patch we do the following:
* Make unions valid TBAA base access types. This enables
generation of TBAA type descriptors for unions.
* Encode union types as structures with a single member of a
special "union member" type. Currently we do not encode
information about sizes of types, but conceptually such union
members are considered to be of the size of the whole union.
* Encode accesses to direct and indirect union members, including
member arrays, as accesses to these special members. All
accesses to members of a union thus get the same offset, which
is the offset of the union they are part of. This means the
existing LLVM TBAA machinery is able to handle such accesses
with no changes.
While this is already an improvement compared to the current
situation, that is, representing all union accesses as may-alias
ones, there are further changes planned to complete the support
for unions. One of them is storing information about access sizes
so we can distinguish accesses to non-overlapping union members,
including accesses to different elements of member arrays.
Another change is encoding type sizes in order to make it
possible to compute offsets within constant-indexed array
elements. These enhancements will be addressed with separate
patches.
Differential Revision: https://reviews.llvm.org/D39455
llvm-svn: 319413
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus does not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit just the user-provided custom
events while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions and must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
Emit a gap area starting after the r-paren location and ending at the
start of the body for the braces-optional statements (for, for-each,
while, etc). The count for the gap area is equal to the body's count. This
extends the fix in r317758.
Fixes PR35387, rdar://35570345
Testing: stage2 coverage-enabled build of clang, check-clang
llvm-svn: 319373
Fixes regression introduced by r319297. MSVC environments still use SEH
unwind opcodes but they should use the Microsoft C++ EH personality, not
the mingw one.
llvm-svn: 319363
This is a re-apply of r319294.
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absence of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
-fno-exceptions has a higher priority than specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macro definitions out of various targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319297
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absence of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
clang cc1 assumes dwarf is the default if none is passed
and -fno-exceptions has a higher priority than specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macro definitions out of various targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319294
Currently CodeGen is calling std::sort on the features vector in TargetOptions for every function, but I don't think CodeGen should be modifying TargetOptions.
Differential Revision: https://reviews.llvm.org/D40228
llvm-svn: 319195
These functions were defined as static members of TemplateSpecializationType.
Now they are moved to namespace level. Previously there were different
implementations for lists containing TemplateArgument and TemplateArgumentLoc,
now these implementations share the same code.
This change is a result of refactoring patch D40508. NFC.
llvm-svn: 319178
The information about access and type sizes is necessary for
producing TBAA metadata in the new size-aware format. With this
patch, D39955 and D39956 in place we should be able to change
CodeGenTBAA::createScalarTypeNode() and
CodeGenTBAA::getBaseTypeInfo() to generate metadata in the new
format under the -new-struct-path-tbaa command-line option. For
now, this new information remains unused.
Differential Revision: https://reviews.llvm.org/D40176
llvm-svn: 319012
In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)
Differential Revision: https://reviews.llvm.org/D40354
llvm-svn: 318836
This clang patch changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits
Differential revision: https://reviews.llvm.org/D40281
llvm-svn: 318789
This is an instrumentation flag that's similar to
-finstrument-functions, but it only inserts calls on function entry, the
calls are inserted post-inlining, and they don't take any arguments.
This is intended for users who want to instrument function entry with
minimal overhead.
(-pg would be another alternative, but forces frame pointer emission and
affects link flags, so is probably best left alone to be used for
generating gcov data.)
Differential revision: https://reviews.llvm.org/D40276
llvm-svn: 318785
OpenMP 5.0 introduces asynchronous data update/dependencies clauses on
target data directives. Patch adds initial support for outer task
regions to use task-based codegen for future async target data
directives.
llvm-svn: 318781