Commit Graph

12554 Commits

Author SHA1 Message Date
Richard Smith 3501895863 Revert r345562: "PR23833, DR2140: an lvalue-to-rvalue conversion on a glvalue of type"
This exposes a (known) CodeGen bug: it can't cope with emitting lvalue
expressions that denote non-odr-used but usable-in-constant-expression
variables. See PR39528 for a testcase.

Reverted for now until that issue can be fixed.

llvm-svn: 346065
2018-11-03 02:23:33 +00:00
Mandeep Singh Grang 7fa07e554d [COFF, ARM64] Implement InterlockedExchange*_* builtins
Summary: Windows SDK needs these intrinsics to be proper builtins.  This is second in a series of patches to move intrinsic defintions out of intrin.h.

Reviewers: rnk, mstorsjo, efriedma, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, chrib, jfb, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D54046

llvm-svn: 346044
2018-11-02 21:18:23 +00:00
Mandeep Singh Grang 6cef4e5c87 [COFF, ARM64] Change setjmp for AArch64 Windows to use Intrinsic.sponentry
Summary: ARM64 setjmp expects sp on entry instead of framepointer.

Patch by: Yin Ma (yinma@codeaurora.org)

Reviewers: mgrang, eli.friedman, ssijaric, mstorsjo, rnk, compnerd

Reviewed By: mgrang

Subscribers: efriedma, javed.absar, kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D53998

llvm-svn: 346024
2018-11-02 18:10:07 +00:00
Erik Pilkington e3b7144e6a [CodeGen] Fix a crash when updating a designated initializer
We need to handle the ConstantAggregateZero case here too.

rdar://45691981

Differential revision: https://reviews.llvm.org/D54010

llvm-svn: 346004
2018-11-02 17:36:58 +00:00
Filipe Cabecinhas 0eb5008352 Change -fsanitize-address-poison-class-member-array-new-cookie to -fsanitize-address-poison-custom-array-cookie
Handle it in the driver and propagate it to cc1

Reviewers: rjmccall, kcc, rsmith

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D52615

llvm-svn: 346001
2018-11-02 17:29:04 +00:00
Alexey Bataev 1fc1f8e819 [OPENMP][NVPTX]Use __kmpc_data_sharing_coalesced_push_stack function.
Coalesced memory access requires use of the new function
`__kmpc_data_sharing_coalesced_push_stack` instead of the
`__kmpc_data_sharing_push_stack`.

llvm-svn: 345991
2018-11-02 16:08:31 +00:00
Alexey Bataev 2dc07d0cf3 [OPENMP]Change the mapping type for lambda captures.
The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`.

llvm-svn: 345982
2018-11-02 15:25:06 +00:00
Alexey Bataev e40901806f [OPENMP][NVPTX]Improve emission of the globalized variables for
target/teams/distribute regions.

Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.

llvm-svn: 345978
2018-11-02 14:54:07 +00:00
Tim Northover 314fbfa1c4 Reapply Logging: make os_log buffer size an integer constant expression.
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.

This also moves the OSLog and other FormatString helpers from
libclangAnalysis to libclangAST to avoid a circular dependency.

llvm-svn: 345971
2018-11-02 13:14:11 +00:00
Patrick Lyster 7a2a27c4a4 Add support for 'atomic_default_mem_order' clause on 'requires' directive. Also renamed test files relating to 'requires'. Differntial review: https://reviews.llvm.org/D53513
llvm-svn: 345967
2018-11-02 12:18:11 +00:00
Volodymyr Sapsai 8b286f99d4 [CodeGen] Fix assertion on referencing constexpr Obj-C object with ARC.
Failed assertion is
> Assertion failed: ((ND->isUsed(false) || !isa<VarDecl>(ND) || !E->getLocation().isValid()) && "Should not use decl without marking it used!"), function EmitDeclRefLValue, file llvm-project/clang/lib/CodeGen/CGExpr.cpp, line 2437.

`EmitDeclRefLValue` mentions
> // A DeclRefExpr for a reference initialized by a constant expression can
> // appear without being odr-used. Directly emit the constant initializer.

The fix is to use the similar approach for non-references as for references. It
is achieved by trying to emit a constant before we attempt to load non-odr-used
variable as LValue.

rdar://problem/40650504

Reviewers: ahatanak, rjmccall

Reviewed By: rjmccall

Subscribers: dexonsmith, erik.pilkington, cfe-commits

Differential Revision: https://reviews.llvm.org/D53674

llvm-svn: 345903
2018-11-01 22:50:08 +00:00
Volodymyr Sapsai ef1899b01d [CodeGen] Move `emitConstant` from ScalarExprEmitter to CodeGenFunction. NFC.
The goal is to use `emitConstant` in more places. Didn't move
`ComplexExprEmitter::emitConstant` because it returns a different type.

Reviewers: rjmccall, ahatanak

Reviewed By: rjmccall

Subscribers: dexonsmith, erik.pilkington, cfe-commits

Differential Revision: https://reviews.llvm.org/D53725

llvm-svn: 345897
2018-11-01 21:57:05 +00:00
Reid Kleckner 4dc0b1ac60 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

llvm-svn: 345882
2018-11-01 19:54:45 +00:00
Mandeep Singh Grang 5c39b6ab7f Revert "[COFF, ARM64] Change setjmp for AArch64 Windows to use Intrinsic.sponentry"
This reverts commit 619111f5ccf349b635e4987ec02d15777c571495.

llvm-svn: 345872
2018-11-01 18:38:26 +00:00
Tim Northover eedc0f0f1a Revert "Reapply Logging: make os_log buffer size an integer constant expression."
Still more dependency hell.

llvm-svn: 345871
2018-11-01 18:37:42 +00:00
Tim Northover c1ac697ab7 Reapply Logging: make os_log buffer size an integer constant expression.
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.

This also moves the OSLog helpers from libclangAnalysis to libclangAST
to avoid a circular dependency.

llvm-svn: 345866
2018-11-01 18:04:49 +00:00
Tim Northover d686dbbc7c Revert "Logging: make os_log buffer size an integer constant expression.
This also reverts a couple of follow-up commits trying to fix the
dependency issues. Latest revision added a cyclic dependency that can't
just be patched up in 5 minutes.

llvm-svn: 345846
2018-11-01 16:15:24 +00:00
Erich Keane 9e94c18abe CPU-Dispatch- Fix type of a member function, prevent deferrals
The member type creation for a cpu-dispatch function was not correctly
including the 'this' parameter, so ensure that the type is properly
determined. Also, disable defer in the cases of emitting the functoins,
as it can end up resulting in the wrong version being emitted.

Change-Id: I0b8fc5e0b0d1ae1a9d98fd54f35f27f6e5d5d083
llvm-svn: 345838
2018-11-01 15:11:41 +00:00
Tim Northover a94ecc619b Logging: make os_log buffer size an integer constant expression.
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.

llvm-svn: 345828
2018-11-01 13:49:54 +00:00
Erich Keane 44731c5300 CPU-Dispatch-- Fix conflict between 'generic' and 'pentium'
When a dispatch function was being emitted that had both a generic and a
pentium configuration listed, we would assert.  This is because neither
configuration has any 'features' associated with it so they were both
considered the 'default' version.  'pentium' lacks any features because
we implement it in terms of __builtin_cpu_supports (instead of Intel
proprietary checks), which is unable to decern between the two.

The fix for this is to omit the 'generic' version from the dispatcher if
both are present. This permits existing code to compile, and still will
choose the 'best' version available (since 'pentium' is technically
better than 'generic').

Change-Id: I4b69f3e0344e74cbdbb04497845d5895dd05fda0
llvm-svn: 345826
2018-11-01 12:50:37 +00:00
Roman Lebedev 1bb9aea56b [clang][CodeGen] ImplicitIntegerSignChangeSanitizer: actually ignore NOP casts.
I fully expected for that to be handled by the canonical type check,
but it clearly wasn't. Sadly, somehow it hide until now.

Reported by Eli Friedman.

llvm-svn: 345816
2018-11-01 08:56:51 +00:00
Mandeep Singh Grang be0e78e017 [COFF, ARM64] Implement llvm.addressofreturnaddress intrinsic
llvm-svn: 345808
2018-11-01 01:35:34 +00:00
Thomas Lively 6940328d02 [WebAssembly] Fix type names in truncation builtins
Summary: Use the same convention as all the other WebAssembly builtin names.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D53724

llvm-svn: 345804
2018-11-01 01:03:17 +00:00
Mandeep Singh Grang e7c7934a11 [COFF, ARM64] Change setjmp for AArch64 Windows to use Intrinsic.sponentry
Summary: ARM64 setjmp expects sp on entry instead of framepointer.

Reviewers: mgrang, rnk, TomTan, compnerd, mstorsjo, efriedma

Reviewed By: mstorsjo

Subscribers: javed.absar, kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D53684

llvm-svn: 345792
2018-10-31 23:17:36 +00:00
Eli Friedman b262d1631e [ARM64] [Windows] Implement _InterlockedExchangeAdd*_* builtins.
These apparently need to be proper builtins to handle the Windows
SDK.

Differential Revision: https://reviews.llvm.org/D53916

llvm-svn: 345779
2018-10-31 21:31:09 +00:00
Richard Smith 3ad0636e0a Part of PR39508: Emit an @llvm.invariant.start after storing to
__tls_guard.

__tls_guard can only ever transition from 0 to 1, and only once. This
permits LLVM to remove repeated checks for TLS initialization and
repeated initialization code in cases like:

  int g();
  thread_local int n = g();
  int a = n + n;

where we could not prove that __tls_guard was still 'true' when checking
it for the second reference to 'n' in the initializer of 'a'.

llvm-svn: 345774
2018-10-31 20:39:26 +00:00
Reid Kleckner 08f64e9083 Re-land r345676 "[Win64] Handle passing i128 by value"
Fix the unintended switch/case fallthrough to avoid changing long double
behavior.

llvm-svn: 345748
2018-10-31 17:43:55 +00:00
Bill Wendling 7c44da279e Create ConstantExpr class
A ConstantExpr class represents a full expression that's in a context where a
constant expression is required. This class reflects the path the evaluator
took to reach the expression rather than the syntactic context in which the
expression occurs.

In the future, the class will be expanded to cache the result of the evaluated
expression so that it's not needlessly re-evaluated

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D53475

llvm-svn: 345692
2018-10-31 03:48:47 +00:00
Richard Trieu 4ff6697b7e Revert r345676 due to test failure.
This was causing CodeGen/mingw-long-double.c to start failing.

llvm-svn: 345691
2018-10-31 02:10:51 +00:00
Reid Kleckner 0897caad30 [Win64] Handle passing i128 by value
For arguments, pass it indirectly, since the ABI doc says pretty clearly
that arguments larger than 8 bytes are passed indirectly. This makes
va_list handling easier, anyway.

When returning, GCC returns in XMM0, and we match them.

Fixes PR39492.

llvm-svn: 345676
2018-10-30 23:58:41 +00:00
Richard Trieu 161121fc9d Silence unused variable warnings. NFC
llvm-svn: 345669
2018-10-30 23:01:15 +00:00
Roman Lebedev 62debd8055 [clang][ubsan] Implicit Conversion Sanitizer - integer sign change - clang part
This is the second half of Implicit Integer Conversion Sanitizer.
It completes the first half, and finally makes the sanitizer
fully functional! Only the bitfield handling is missing.

Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:

```
void consume(unsigned int val);

void test(int val) {
  consume(val);
  // The 'val' is `signed int`, but `consume()` takes `unsigned int`.
  // If val is negative, then consume() will be operating on a large
  // unsigned value, and you may or may not have a bug.

  // But yes, sometimes this is intentional.
  // Making the conversion explicit silences the sanitizer.
  consume((unsigned int)val);
}
```

Yes, there is a `-Wsign-conversion`` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, likely there are cases where it does **not** warn.

The actual detection is pretty easy. We just need to check each of the values
whether it is negative, and equality-compare the results of those comparisons.
The unsigned value is obviously non-negative. Zero is non-negative too.
https://godbolt.org/g/w93oj2

We do not have to emit the check *always*, there are obvious situations
where we can avoid emitting it, since it would **always** get optimized-out.
But i do think the tautological IR (`icmp ult %x, 0`, which is always false)
should be emitted, and the middle-end should cleanup it.

This sanitizer is in the `-fsanitize=implicit-conversion` group,
and is a logical continuation of D48958 `-fsanitize=implicit-integer-truncation`.
As for the ordering, i'we opted to emit the check **after**
`-fsanitize=implicit-integer-truncation`. At least on these simple 16 test cases,
this results in 1 of the 12 emitted checks being optimized away,
as compared to 0 checks being optimized away if the order is reversed.

This is a clang part.
The compiler-rt part is D50251.

Finishes fixing [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Finishes partially fixing [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Finishes fixing https://github.com/google/sanitizers/issues/940.

Only the bitfield handling is missing.

Reviewers: vsk, rsmith, rjmccall, #sanitizers, erichkeane

Reviewed By: rsmith

Subscribers: chandlerc, filcab, cfe-commits, regehr

Tags: #sanitizers, #clang

Differential Revision: https://reviews.llvm.org/D50250

llvm-svn: 345660
2018-10-30 21:58:56 +00:00
Erik Pilkington fa98390b3c NFC: Remove the ObjC1/ObjC2 distinction from clang (and related projects)
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.

Differential revision: https://reviews.llvm.org/D53547

llvm-svn: 345637
2018-10-30 20:31:30 +00:00
Alexey Bataev 6070542296 [OPENMP] Support for mapping of the lambdas in target regions.
Added support for mapping of lambdas in the target regions. It scans all
the captures by reference in the lambda, implicitly maps those variables
in the target region and then later reinstate the addresses of
references in lambda to the correct addresses of the captured|privatized
variables.

llvm-svn: 345609
2018-10-30 15:50:12 +00:00
Bruno Ricci 023b1d19f3 [AST] Only store data for the NRVO candidate in ReturnStmt if needed
Only store the NRVO candidate if needed in ReturnStmt.
A good chuck of all of the ReturnStmt have no NRVO candidate
(more than half when parsing all of Boost). For all of them
this saves one pointer. This has no impact on children().

Differential Revision: https://reviews.llvm.org/D53716

Reviewed By: rsmith

llvm-svn: 345605
2018-10-30 14:40:49 +00:00
Roman Lebedev a32a2e3443 [clang] Move two utility functions into SourceManager
Summary: So we can keep that not-so-great logic in one place.

Reviewers: rsmith, aaron.ballman

Reviewed By: rsmith

Subscribers: nemanjai, kbarton, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D53837

llvm-svn: 345594
2018-10-30 12:37:16 +00:00
Bjorn Pettersson 6c2d83b46d [OPENMP] Fix for "error: unused variable 'CED'"
Quick fix to make code compile with -Werror,-Wunused-variable.

llvm-svn: 345573
2018-10-30 08:49:26 +00:00
Richard Smith d2e69dfddb PR23833, DR2140: an lvalue-to-rvalue conversion on a glvalue of type
nullptr_t does not access memory.

We now reuse CK_NullToPointer to represent a conversion from a glvalue
of type nullptr_t to a prvalue of nullptr_t where necessary.

llvm-svn: 345562
2018-10-30 02:02:49 +00:00
John McCall d2bfe4b73e In swiftcall, don't merge FP/vector types within a chunk.
llvm-svn: 345536
2018-10-29 20:32:36 +00:00
Gheorghe-Teodor Bercea e92567601b [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for
Summary: This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53448

llvm-svn: 345509
2018-10-29 15:45:47 +00:00
Gheorghe-Teodor Bercea 669dbde7a5 [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.
Summary: This patch enables the choosing of the default schedule for parallel for loops even in non-SPMD cases.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53443

llvm-svn: 345507
2018-10-29 15:23:23 +00:00
Alexey Bataev 6ab5bb115a [OPENMP] Do not capture private loop counters.
If the loop counter is not declared in the context of the loop and it is
private, such loop counters should not be captured in the outlined
regions.

llvm-svn: 345505
2018-10-29 15:01:58 +00:00
Bruno Ricci 17ff026b73 [AST] Refactor PredefinedExpr
Make the following changes to PredefinedExpr:

1. Move PredefinedExpr below StringLiteral so that it can use its definition.
2. Rename IdentType to IdentKind to be more in line with clang's conventions,
   and propagate the change to its users.
3. Move the location and the IdentKind into the newly available space of
   the bit-fields of Stmt.
4. Only store the function name when needed. When parsing all of Boost,
   of the 1357 PredefinedExpr 919 have no function name.

Differential Revision: https://reviews.llvm.org/D53605

Reviewed By: rjmccall

llvm-svn: 345460
2018-10-27 19:21:19 +00:00
Leonard Chan eebecb3214 Revert "[PassManager/Sanitizer] Enable usage of ported AddressSanitizer passes with -fsanitize=address"
This reverts commit 8d6af840396f2da2e4ed6aab669214ae25443204 and commit
b78d19c287b6e4a9abc9fb0545de9a3106d38d3d which causes slower build times
by initializing the AddressSanitizer on every function run.

The corresponding revisions are https://reviews.llvm.org/D52814 and
https://reviews.llvm.org/D52739.

llvm-svn: 345433
2018-10-26 22:51:51 +00:00
Bjorn Pettersson b25340236c [Fixed Point Arithmetic] Refactor fixed point casts
Summary:
- Added names for some emitted values (such as "tobool" for
  the result of a cast to boolean).
- Replaced explicit IRBuilder request for doing sext/zext/trunc
  by using CreateIntCast instead.
- Simplify code for emitting satuation into one if-statement
  for clamping to max, and one if-statement for clamping to min.

Reviewers: leonardchan, ebevhan

Reviewed By: leonardchan

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D53707

llvm-svn: 345398
2018-10-26 16:12:12 +00:00
Richard Smith 7c7e531f97 PR31978: Don't crash if CodeGen sees a top-level BindingDecl.
llvm-svn: 345362
2018-10-26 03:21:20 +00:00
Saleem Abdulrasool 98ac9984b0 CodeGen: correct the case for swift 4.2, 5.0
This corrects the leader for the swift names.  The encoding for 4.2 and
5.0 differ by a single bit on the second character and were swapped.

llvm-svn: 345360
2018-10-26 03:16:16 +00:00
Eli Friedman 540be6d0bb [AArch64] Support Windows stack probe command-line arguments.
Adds support for -mno-stack-arg-probe and -mstack-probe-size.

(Not really happy copy-pasting code, but that's what we do for all the
other Windows targets.)

Differential Revision: https://reviews.llvm.org/D53617

llvm-svn: 345354
2018-10-26 01:31:57 +00:00
Bryan Chan 223307b3dc [AArch64] Implement FP16FML intrinsics
Generate the FP16FML intrinsics into arm_neon.h (AArch64 only for now).
Add two new type modifiers to NeonEmitter to handle the new prototypes.
Define __ARM_FEATURE_FP16FML when +fp16fml is enabled and guard the
intrinsics with the macro in arm_neon.h.

Based on a patch by Gao Yiling.

Differential Revision: https://reviews.llvm.org/D53633

llvm-svn: 345344
2018-10-25 23:47:00 +00:00
Erich Keane 85822b304e Change keep-static-consts to work on static storage duration, not
storage class.

To be more in line with what GCC does, switch the condition to be based
on the Static Storage duration instead of the storage class.

Change-Id: I8e959d762433cda48855099353bf3c950b9d54b8
llvm-svn: 345302
2018-10-25 19:13:46 +00:00
Thomas Lively d4bf99a540 [WebAssembly] Bitselect and min/max builtins
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Differential Revision: https://reviews.llvm.org/D53685

llvm-svn: 345301
2018-10-25 19:11:41 +00:00
Thomas Lively 535b4df75a [WebAssembly] Lower to target-independent saturating add
Summary: Goes along with D53721.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Differential Revision: https://reviews.llvm.org/D53722

llvm-svn: 345300
2018-10-25 19:06:15 +00:00
Erich Keane 19a8adc9bd Implement Function Multiversioning for Non-ELF Systems.
Similar to how ICC handles CPU-Dispatch on Windows, this patch uses the
resolver function directly to forward the call to the proper function.
This is not nearly as efficient as IFuncs of course, but is still quite
useful for large functions specifically developed for certain
processors.

This is unfortunately still limited to x86, since it depends on
__builtin_cpu_supports and __builtin_cpu_is, which are x86 builtins.

The naming for the resolver/forwarding function for cpu-dispatch was
taken from ICC's implementation, which uses the unmodified name for this
(no mangling additions).  This is possible, since cpu-dispatch uses '.A'
for the 'default' version.

In 'target' multiversioning, this function keeps the '.resolver'
extension in order to keep the default function keeping the default
mangling.

Change-Id: I4731555a39be26c7ad59a2d8fda6fa1a50f73284

Differential Revision: https://reviews.llvm.org/D53586

llvm-svn: 345298
2018-10-25 18:57:19 +00:00
Saleem Abdulrasool 0c16a13fb9 CodeGen: alter CFConstantString class name for swift 5.0
Swift 5.0 has changed the name decoration for swift symbols, using a 'S' sigil
rather than 's' as in 4.2.  Adopt the new convention.

llvm-svn: 345291
2018-10-25 17:52:13 +00:00
Luke Cheeseman a8a24aa042 [AArch64] Branch Protection and Return Address Signing B Key Support
- Add support for -mbranch-protection=<type>[+<type>]* where
  - <type> ::= [standard, none, bti, pac-ret[+b-key,+leaf]*]
- The protection emits relevant function attributes
  - sign-return-address=<scope>
  - sign-return-address-key=<key>
  - branch-protection

llvm-svn: 345273
2018-10-25 15:23:49 +00:00
Craig Topper 00897e9b7b [CodeGen] Always emit the 'min-legal-vector-width' attribute even when the value is 0.
The X86 backend will need to see the attribute to make decisions. If it isn't present the backend will have to assume large vectors may be present.

llvm-svn: 345237
2018-10-25 05:04:35 +00:00
Saleem Abdulrasool 81a650ee87 Driver,CodeGen: introduce support for Swift CFString layout
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation.  This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field.  This type differs
between swift 4.1 and 4.2+.

Valid values for the new option include:
  - objc [default behaviour] - enable ObjectiveC interoperability
  - swift-4.1 - enable interoperability with swift 4.1
  - swift-4.2 - enable interoperability with swift 4.2
  - swift-5.0 - enable interoperability with swift 5.0
  - swift [alias] - target the latest swift ABI

Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability.  In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.

In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.

Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.

llvm-svn: 345222
2018-10-24 23:28:28 +00:00
Alexey Bataev ac6e4de714 Do not always request an implicit taskgroup region inside the kmpc_taskloop function
Summary:
For the following code:
```
    int i;
    #pragma omp taskloop
    for (i = 0; i < 100; ++i)
    {}

    #pragma omp taskloop nogroup
    for (i = 0; i < 100; ++i)
    {}
```

Clang emits the following LLVM IR:

```
 ...
  call void @__kmpc_taskgroup(%struct.ident_t* @0, i32 %0)
  %2 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %2, i32 1, i64* %8, i64* %9, i64 %13, i32 0, i32 0, i64 0, i8* null)
  call void @__kmpc_end_taskgroup(%struct.ident_t* @0, i32 %0)

  ...
  %15 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %15, i32 1, i64* %21, i64* %22, i64 %26, i32 0, i32 0, i64 0, i8* null)

```

The first set of instructions corresponds to the first taskloop construct. It is important to note that the implicit taskgroup region associated with the taskloop construct has been materialized in our IR:  the `__kmpc_taskloop` occurs inside a taskgroup region. Note also that this taskgroup region does not exist in our second taskloop because we are using the `nogroup` clause.

The issue here is the 4th argument of the kmpc_taskloop call, starting from the end,  is always a zero. Checking the LLVM OpenMP RT implementation, we see that this argument corresponds to the nogroup parameter:

```
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val,
                     kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup,
                     int sched, kmp_uint64 grainsize, void *task_dup);
```

So basically we always tell to the RT to do another taskgroup region. For the first taskloop, this means that we create two taskgroup regions. For the second example, it means that despite the fact we had a nogroup clause we are going to have a taskgroup region, so we unnecessary wait until all descendant tasks have been executed.

Reviewers: ABataev

Reviewed By: ABataev

Subscribers: rogfer01, cfe-commits

Differential Revision: https://reviews.llvm.org/D53636

llvm-svn: 345180
2018-10-24 19:06:37 +00:00
Craig Topper 3113ec3dc7 [CodeGen] Update min-legal-vector width based on function argument and return types
This is a continuation of my patches to inform the X86 backend about what the largest IR types are in the function so that we can restrict the backend type legalizer to prevent 512-bit vectors on SKX when -mprefer-vector-width=256 is specified if no explicit 512 bit vectors were specified by the user.

This patch updates the vector width based on the argument and return types of the current function and from the types of any functions it calls. This is intended to make sure the backend type legalizer doesn't disturb any types that are required for ABI.

Differential Revision: https://reviews.llvm.org/D52441

llvm-svn: 345168
2018-10-24 17:42:17 +00:00
Saleem Abdulrasool d5a27884b1 CodeGen: extract some local variables in CFConstantString creation (NFC)
Extract the reference to the ASTContext and Triple and use them throughout the
function.  This is simply a cosmetic cleanup while in the area.  NFC.

llvm-svn: 345160
2018-10-24 16:56:36 +00:00
Erich Keane dafdd049fc Remove a pair of unused dispatch multiversion declarations.
These declarations somehow survived a cleanup that combined them with the target
multiversioning functions.  This patch removes them as they are no
longer necessary or used.

Change-Id: I318286401ace63bef1aa48018dabb25be0117ca0
llvm-svn: 345145
2018-10-24 14:33:30 +00:00
Adrian Prantl ba6fdc57b4 Debug Info (-gmodules): emit full types for non-anchored template specializations
Before this patch, clang would emit a (module-)forward declaration for
template instantiations that are not anchored by an explicit template
instantiation, but still are guaranteed to be available in an imported
module. Unfortunately detecting the owning module doesn't reliably
work when local submodule visibility is enabled and the template is
inside a cross-module namespace.

This make clang debuggable again with -gmodules and LSV enabled.

rdar://problem/41552377

llvm-svn: 345109
2018-10-24 00:06:02 +00:00
Leonard Chan b4ba467da8 [Fixed Point Arithmetic] Fixed Point to Boolean Cast
This patch is a part of https://reviews.llvm.org/D48456 in an attempt to split
the casting logic up into smaller patches. This contains the code for casting
from fixed point types to boolean types.

Differential Revision: https://reviews.llvm.org/D53308

llvm-svn: 345063
2018-10-23 17:55:35 +00:00
Andrew Savonichev b555b76ed3 [OpenCL][NFC] Unify ZeroToOCL* cast types
Reviewers: Anastasia, yaxunl

Reviewed By: Anastasia

Subscribers: asavonic, cfe-commits

Differential Revision: https://reviews.llvm.org/D52654

llvm-svn: 345038
2018-10-23 15:19:20 +00:00
Hans Wennborg 022cb5bd25 Revert r345009 "[DebugInfo] Generate debug information for labels. (After fix PR39094)"
This broke the Chromium build. See
https://bugs.chromium.org/p/chromium/issues/detail?id=898152#c1 for the
reproducer.

> Generate DILabel metadata and call llvm.dbg.label after label
> statement to associate the metadata with the label.
>
> After fixing PR37395.
> After fixing problems in LiveDebugVariables.
> After fixing NULL symbol problems in AddressPool when enabling
> split-dwarf-file.
> After fixing PR39094.
>
> Differential Revision: https://reviews.llvm.org/D45045

llvm-svn: 345026
2018-10-23 13:17:13 +00:00
Hsiangkai Wang 63b099050c [DebugInfo] Generate debug information for labels. (After fix PR39094)
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

After fixing PR37395.
After fixing problems in LiveDebugVariables.
After fixing NULL symbol problems in AddressPool when enabling
split-dwarf-file.
After fixing PR39094.

Differential Revision: https://reviews.llvm.org/D45045

llvm-svn: 345009
2018-10-23 08:06:21 +00:00
Richard Trieu 9b36a9c8da [CodeGen] Attach InlineHint to more functions
For instantiated functions, search the template pattern to see if it marked
inline to determine if InlineHint attribute should be added to the function.

llvm-svn: 344987
2018-10-23 01:26:28 +00:00
Vlad Tsyrklevich ca1c9791e3 Revert "Ensure sanitizer check function calls have a !dbg location"
This reverts commit r344915. It was causing exceptions on the
x86_64-linux-ubsan bot.

llvm-svn: 344961
2018-10-22 21:51:58 +00:00
Erich Keane 7ef210d053 Give Multiversion-inline functions linkonce linkage
Since multiversion variant functions can be inline, in C they become
available-externally linkage.  This ends up causing the variants to not
be emitted, and not available to the linker.

The solution is to make sure that multiversion functions are always
emitted by marking them linkonce.

Change-Id: I897aa37c7cbba0c1eb2c57ee881d5000a2113b75
llvm-svn: 344957
2018-10-22 21:20:45 +00:00
Adrian Prantl 5f5b910495 Ensure sanitizer check function calls have a !dbg location
Function calls without a !dbg location inside a function that has a
DISubprogram make it impossible to construct inline information and
are rejected by the verifier. This patch ensures that sanitizer check
function calls have a !dbg location, by carrying forward the location
of the preceding instruction or by inserting an artificial location if
necessary.

This fixes a crash when compiling the attached testcase with -Os.

rdar://problem/45311226

Differential Revision: https://reviews.llvm.org/D53459

llvm-svn: 344915
2018-10-22 16:27:41 +00:00
Fangrui Song 3117b17bc5 Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC
llvm-svn: 344859
2018-10-20 17:53:42 +00:00
Akira Hatanaka 32e0a584f8 [CodeGen] Use the mangle context owned by CodeGenModule to correctly
mangle types of lambda objects captured by a block instead of creating a
new mangle context everytime a captured field type is mangled.

This fixes a bug in IRGen's block helper merging code that was
introduced in r339438 where two blocks capturing two distinct lambdas
would end up sharing helper functions and the block descriptor. This
happened because the ID number used to distinguish lambdas defined
in the same context is reset everytime a mangled context is created.

rdar://problem/45314494

llvm-svn: 344833
2018-10-20 05:45:01 +00:00
Craig Topper 4d8ced1807 [X86] Add support for more than 32 features for __builtin_cpu_is
libgcc supports more than 32 features by adding a new 32-bit variable __cpu_features2.

This adds the clang support for checking these feature bits.

Patches for compiler-rt and llvm to support this are coming as well.

Probably still need an additional patch for target multiversioning in clang.

Differential Revision: https://reviews.llvm.org/D53458

llvm-svn: 344832
2018-10-20 03:51:52 +00:00
Craig Topper 9c8f3c9654 [X86] When checking the bits in cpu_features for function multiversioning dispatcher in the resolver, make sure all the required bits are set. Not just one of them
Summary:
The multiversioning code repurposed the code from __builtin_cpu_supports for checking if a single feature is enabled. That code essentially performed (_cpu_features & (1 << C)) != 0. But with the multiversioning path, the mask is no longer guaranteed to be a power of 2. So we return true anytime any one of the bits in the mask is set not just all of the bits.

The correct check is (_cpu_features & mask) == mask

Reviewers: erichkeane, echristo

Reviewed By: echristo

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D53460

llvm-svn: 344824
2018-10-20 01:30:00 +00:00
Richard Smith b3d203ff7f PR24164, PR39336: init-captures are not distinct full-expressions.
Rather, they are subexpressions of the enclosing lambda-expression, and
any temporaries in them are destroyed at the end of that
full-expression, or when the corresponding lambda-expression is
destroyed if they are lifetime-extended.

llvm-svn: 344801
2018-10-19 19:01:34 +00:00
Mandeep Singh Grang 2147b1af95 [COFF, ARM64] Add _ReadStatusReg and_WriteStatusReg intrinsics
Reviewers: rnk, compnerd, mstorsjo, efriedma, TomTan, haripul, javed.absar

Reviewed By: efriedma

Subscribers: dmajor, kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D53115

llvm-svn: 344765
2018-10-18 23:35:35 +00:00
Kristina Brooks 7f569b7c4f Add support for -mno-tls-direct-seg-refs to Clang
This patch exposes functionality added in rL344723 to the Clang driver/frontend
as a flag and adds appropriate metadata.

Driver tests pass:
```
ninja check-clang-driver
-snip-
  Expected Passes    : 472
  Expected Failures  : 3
  Unsupported Tests  : 65
```

Odd failure in CodeGen tests but unrelated to this:
```
ninja check-clang-codegen
-snip-
/SourceCache/llvm-trunk-8.0/tools/clang/test/CodeGen/builtins-wasm.c:87:10:
error: cannot compile this builtin function yet
-snip-
Failing Tests (1):
    Clang :: CodeGen/builtins-wasm.c

  Expected Passes    : 1250
  Expected Failures  : 2
  Unsupported Tests  : 120
  Unexpected Failures: 1
```

Original commit:
[X86] Support for the mno-tls-direct-seg-refs flag
Allows to disable direct TLS segment access (%fs or %gs). GCC supports a
similar flag, it can be useful in some circumstances, e.g. when a thread
context block needs to be updated directly from user space. More info and
specific use cases: https://bugs.llvm.org/show_bug.cgi?id=16145

Patch by nruslan (Ruslan Nikolaev).

Differential Revision: https://reviews.llvm.org/D53102

llvm-svn: 344739
2018-10-18 14:07:02 +00:00
Chandler Carruth 4aaaaabe87 [TI removal] Test predicate rather than casting to detect a terminator
and use the range based successor API.

llvm-svn: 344730
2018-10-18 08:16:20 +00:00
Leonard Chan ebd10a24f4 [PassManager/Sanitizer] Enable usage of ported AddressSanitizer passes with -fsanitize=address
Enable usage of `AddressSanitizer` and `AddressModuleSanitizer` ported from the
legacy to the new PassManager.

This patch depends on https://reviews.llvm.org/D52739.

Differential Revision: https://reviews.llvm.org/D52814

llvm-svn: 344699
2018-10-17 15:38:22 +00:00
Takuto Ikuta 8aa53700ff NFC: Remove trailing space from CodeGenModule.cpp
llvm-svn: 344668
2018-10-17 04:29:56 +00:00
Yaxun Liu aae1e87f4b AMDGPU: add __builtin_amdgcn_update_dpp
Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and
__builtin_amdgcn_update_dpp. The first argument to
llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp.

Differential Revision: https://reviews.llvm.org/D52320

llvm-svn: 344665
2018-10-17 02:32:26 +00:00
Alexey Bataev 93a38d60be [OPENMP][NVPTX]Increment iterator only when it is used, NFC.
llvm-svn: 344574
2018-10-16 00:09:06 +00:00
Leonard Chan 99bda375a1 [Fixed Point Arithmetic] FixedPointCast
This patch is a part of https://reviews.llvm.org/D48456 in an attempt to
split them up. This contains the code for casting between fixed point types
and other fixed point types.

The method for converting between fixed point types is based off the convert()
method in APFixedPoint.

Differential Revision: https://reviews.llvm.org/D50616

llvm-svn: 344530
2018-10-15 16:07:02 +00:00
Sean Fertile d900dd0c23 Revert "[CodeGenCXX] Treat 'this' as noalias in constructors"
This reverts commit https://reviews.llvm.org/rL344150 which causes
MachineOutliner related failures on the ppc64le multistage buildbot.

llvm-svn: 344526
2018-10-15 15:43:00 +00:00
Chandler Carruth e303c87e19 [TI removal] Make `getTerminator()` return a generic `Instruction`.
This removes the primary remaining API producing `TerminatorInst` which
will reduce the rate at which code is introduced trying to use it and
generally make it much easier to remove the remaining APIs across the
codebase.

Also clean up some of the stragglers that the previous mechanical update
of variables missed.

Users of LLVM and out-of-tree code generally will need to update any
explicit variable types to handle this. Replacing `TerminatorInst` with
`Instruction` (or `auto`) almost always works. Most of these edits were
made in prior commits using the perl one-liner:
```
perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g'
```

This also my break some rare use cases where people overload for both
`Instruction` and `TerminatorInst`, but these should be easily fixed by
removing the `TerminatorInst` overload.

llvm-svn: 344504
2018-10-15 10:42:50 +00:00
Alexey Bataev 4ac58d1a4b [OPENMP][NVPTX]Reduce memory usage in target region.
Additional reduction of the global memory usage in the target regions
without parallel regions.

llvm-svn: 344413
2018-10-12 20:19:59 +00:00
Erik Pilkington 92d8a29ca0 [CodeGen] Handle extern references to OBJC_CLASS_$_*
Some ObjC users declare a extern variable named OBJC_CLASS_$_Foo, then use it's
address as a Class. I.e., one could define isInstanceOfF:

BOOL isInstanceOfF(id c) {
  extern void OBJC_CLASS_$_F;
  return [c class] == (Class)&OBJC_CLASS_$_F;
}

This leads to asserts in clang CodeGen if there is an @implementation of F in
the same TU as an instance of this pattern, because CodeGen assumes that a
variable named OBJC_CLASS_$_* has the right type. This commit fixes the problem
by RAUWing the old (incorrectly typed) global with a new global, then removing
the old global.

rdar://45077269

Differential revision: https://reviews.llvm.org/D53154

llvm-svn: 344373
2018-10-12 17:22:10 +00:00
Alexey Bataev 9bfe91da3d [OPENMP][NVPTX]Reduce memory usage in orphaned functions.
if the function has globalized variables and called in context of
target/teams/distribute regions, it does not need to globalize 32
copies of the same variables for memory coalescing, it is enough to
have just one copy, because there is parallel region.
Patch does this by adding call for `__kmpc_parallel_level` function and
checking its return value. If the code sees that the parallel level is
0, then only one variable is allocated, not 32.

llvm-svn: 344356
2018-10-12 16:04:20 +00:00
Alexey Bataev ff23bb6622 [OPENMP][NVPTX]Reduce memory use for globalized vars in
target/teams/distribute regions.

Previously introduced globalization scheme that uses memory coalescing
scheme may increase memory usage fr the variables that are devlared in
target/teams/distribute contexts. We don't need 32 copies of such
variables, just 1. Patch reduces memory use in this case.

llvm-svn: 344273
2018-10-11 18:30:31 +00:00
Patrick Lyster 3fe9e396f4 Add support for 'dynamic_allocators' clause on 'requires' directive. Differential Revision: https://reviews.llvm.org/D53079
llvm-svn: 344249
2018-10-11 14:41:10 +00:00
Roman Lebedev dd403575a2 [clang][ubsan] Split Implicit Integer Truncation Sanitizer into unsigned and signed checks
Summary:
As per IRC disscussion, it seems we really want to have more fine-grained `-fsanitize=implicit-integer-truncation`:
* A check when both of the types are unsigned.
* Another check for the other cases (either one of the types is signed, or both of the types is signed).

This is clang part.
Compiler-rt part is D50902.

Reviewers: rsmith, vsk, Sanitizers

Reviewed by: rsmith

Differential Revision: https://reviews.llvm.org/D50901

llvm-svn: 344230
2018-10-11 09:09:50 +00:00
Thomas Lively 07ce6df879 [WebAssembly] Saturating float-to-int builtins
Summary: Depends on D53007 and D53004.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D53009

llvm-svn: 344205
2018-10-11 00:07:55 +00:00
Richard Smith 8654ae52b0 Add a flag to remap manglings when reading profile data information.
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.

The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.

Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.

Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.

llvm-svn: 344199
2018-10-10 23:13:35 +00:00
Anton Bikineev cc7e74753a [CodeGenCXX] Treat 'this' as noalias in constructors
This is currently a clang extension and a resolution
of the defect report in the C++ Standard.

Differential Revision: https://reviews.llvm.org/D46441

llvm-svn: 344150
2018-10-10 16:14:51 +00:00
Martin Storsjo 3cd67c9e3b [MinGW] Fix passing a sanitizer lib name as dependent lib
Differential Revision: https://reviews.llvm.org/D52990

llvm-svn: 344125
2018-10-10 09:01:00 +00:00
Ed Maste 8bddfdd59c clang: Allow ifunc resolvers to accept arguments
When ifunc support was added to Clang (r265917) it did not allow
resolvers to take function arguments.  This was based on GCC's
documentation, which states resolvers return a pointer and take no
arguments.

However, GCC actually allows resolvers to take arguments, and glibc (on
non-x86 platforms) and FreeBSD (on x86 and arm64) pass some CPU
identification information as arguments to ifunc resolvers.  I believe
GCC's documentation is simply incorrect / out-of-date.

FreeBSD already removed the prohibition in their in-tree Clang copy.

Differential Revision:	https://reviews.llvm.org/D52703

llvm-svn: 344100
2018-10-10 00:34:17 +00:00
Alexey Bataev 9ea3c38597 [OPENMP][NVPTX] Support memory coalescing for globalized variables.
Added support for memory coalescing for better performance for
globalized variables. From now on all the globalized variables are
represented as arrays of 32 elements and each thread accesses these
elements using `tid & 31` as index.

llvm-svn: 344049
2018-10-09 14:49:00 +00:00
Mandeep Singh Grang df7929676d [COFF, ARM64] Add _InterlockedAdd intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: efriedma, kristof.beyls, chrib, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D52811

llvm-svn: 343894
2018-10-05 21:57:41 +00:00
Vedant Kumar 5931b4e5b5 [DebugInfo] Add support for DWARF5 call site-related attributes
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.

Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:

  https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf

This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.

Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.

rdar://42001377

Differential Revision: https://reviews.llvm.org/D49887

llvm-svn: 343883
2018-10-05 20:37:17 +00:00
Mandeep Singh Grang 15e0f7fa28 [COFF, ARM64] Add _InterlockedCompareExchangePointer_nf intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, efriedma

Reviewed By: efriedma

Subscribers: efriedma, kristof.beyls, chrib, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D52807

llvm-svn: 343881
2018-10-05 19:49:36 +00:00
Artem Belevich 93552b39c9 [CUDA] Use all 64 bits of GUID in __nv_module_id
getGUID() returns an uint64_t and "%x" only prints 32 bits of it.
Use PRIx64 format string to print all 64 bits.

Differential Revision: https://reviews.llvm.org/D52938

llvm-svn: 343875
2018-10-05 18:39:58 +00:00
Alexey Bataev 6bc2732f71 [OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD
mode.

__kmpc_global_thread_num() should be called before initialization of the
runtime.

llvm-svn: 343857
2018-10-05 15:27:47 +00:00
Alexey Bataev fd006c44ee [OPENMP] Fix emission of the __kmpc_global_thread_num.
Fixed emission of the __kmpc_global_thread_num() so that it is not
messed up with alloca instructions anymore. Plus, fixes emission of the
__kmpc_global_thread_num() functions in the target outlined regions so
that they are not called before runtime is initialized.

llvm-svn: 343856
2018-10-05 15:08:53 +00:00
Thomas Lively d2a293c562 [WebAssembly] abs and sqrt builtins
Summary: Depends on D52910.

Reviewers: aheejin, dschuff, craig.topper

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D52913

llvm-svn: 343838
2018-10-05 01:02:54 +00:00
Thomas Lively 291d75b0de [WebAssembly] any_true and all_true builtins
Summary: Depends on D52858.

Reviewers: aheejin, dschuff, craig.topper

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D52910

llvm-svn: 343837
2018-10-05 00:59:37 +00:00
Thomas Lively 9034a47e79 [WebAssembly] saturating arithmetic builtins
Summary: Depends on D52856.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D52858

llvm-svn: 343836
2018-10-05 00:58:56 +00:00
Thomas Lively a347436f09 [WebAssembly] __builtin_wasm_replace_lane_* builtins
Summary: Depends on D52852.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D52856

llvm-svn: 343835
2018-10-05 00:58:07 +00:00
Thomas Lively d6792c0c28 [WebAssembly] __builtin_wasm_extract_lane_* builtins
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits

Differential Revision: https://reviews.llvm.org/D52852

llvm-svn: 343834
2018-10-05 00:54:44 +00:00
Mandeep Singh Grang ecc82ef0c2 [COFF, ARM64] Add __getReg intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: peter.smith, efriedma, kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D52838

llvm-svn: 343824
2018-10-04 22:32:42 +00:00
Patrick Lyster 6bdf63bd32 [OPENMP] Add reverse_offload clause to requires directive
llvm-svn: 343711
2018-10-03 20:07:58 +00:00
Matthew Voss 2016536304 Add template type and value parameter metadata nodes to template variable specializations
Summary: Add an optional attribute referring to a tuple of type and value template parameter nodes to the DIGlobalVariable node. This allows us to record the parameters of template variable specializations.

Reviewers: dblaikie, aprantl, probinson, JDevlieghere, clayborg, jingham

Reviewed By: JDevlieghere

Subscribers: cfe-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D52058

llvm-svn: 343707
2018-10-03 18:45:04 +00:00
Mandeep Singh Grang aef87980a9 [COFF, ARM64] Add _ReadWriteBarrier intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, javed.absar

Reviewed By: rnk

Subscribers: kristof.beyls, chrib, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D52809

llvm-svn: 343699
2018-10-03 17:24:21 +00:00
Jonas Hahnfeld 3ca4701d35 [OpenMP][NVPTX] Simplify codegen for orphaned parallel, NFCI.
Worker threads fork off to the compiler generated worker function
directly after entering the kernel function. Hence, there is no
need to check whether the current thread is the master if we are
outside of a parallel region (neither SPMD nor parallel_level > 0).

Differential Revision: https://reviews.llvm.org/D52732

llvm-svn: 343618
2018-10-02 19:12:54 +00:00
Jonas Hahnfeld 5aaaecea7c [OpenMP] Simplify code for reductions on distribute directives, NFC.
Only need to care about the 'distribute simd' case, all other composite
directives are handled elsewhere. This was already reflected in the
outer 'if' condition, so all other inner conditions could never be true.

Differential Revision: https://reviews.llvm.org/D52731

llvm-svn: 343617
2018-10-02 19:12:47 +00:00
Yaxun Liu 9767089d00 [HIP] Support early finalization of device code for -fno-gpu-rdc
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.

The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.

This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.

Differential Revision: https://reviews.llvm.org/D52377

llvm-svn: 343611
2018-10-02 17:48:54 +00:00
Sven van Haastregt da3b632057 Revert r326937 "[OpenCL] Remove block invoke function from emitted block literal struct"
This reverts r326937 as it broke block argument handling in OpenCL.
See the discussion on https://reviews.llvm.org/D43783 .

The next commit will add a test case that revealed the issue.

llvm-svn: 343582
2018-10-02 13:02:24 +00:00
Akira Hatanaka 9d34307788 [CodeGen] Before entering the loop that copies a non-trivial array field
of a non-trivial C struct, copy the preceding trivial fields that
haven't been copied.

This commit fixes a bug where the instructions used to copy the
preceding trivial fields were emitted inside the loop body.

rdar://problem/44185064

llvm-svn: 343556
2018-10-02 01:00:44 +00:00
Akira Hatanaka 8e57b07f66 Distinguish `__block` variables that are captured by escaping blocks
from those that aren't.

This patch changes the way __block variables that aren't captured by
escaping blocks are handled:

- Since non-escaping blocks on the stack never get copied to the heap
  (see https://reviews.llvm.org/D49303), Sema shouldn't error out when
  the type of a non-escaping __block variable doesn't have an accessible
  copy constructor.

- IRGen doesn't have to use the specialized byref structure (see
  https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
  non-escaping __block variable anymore. Instead IRGen can emit the
  variable as a normal variable and copy the reference to the block
  literal. Byref copy/dispose helpers aren't needed either.

This reapplies r343518 after fixing a use-after-free bug in function
Sema::ActOnBlockStmtExpr where the BlockScopeInfo was dereferenced after
it was popped and deleted.

rdar://problem/39352313

Differential Revision: https://reviews.llvm.org/D51564

llvm-svn: 343542
2018-10-01 21:51:28 +00:00
Akira Hatanaka 3197484701 Revert r343518.
Bots are still failing.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/24420
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/12958

llvm-svn: 343531
2018-10-01 20:29:34 +00:00
Akira Hatanaka 2bf09ccfd5 Distinguish `__block` variables that are captured by escaping blocks
from those that aren't.

This patch changes the way __block variables that aren't captured by
escaping blocks are handled:

- Since non-escaping blocks on the stack never get copied to the heap
  (see https://reviews.llvm.org/D49303), Sema shouldn't error out when
  the type of a non-escaping __block variable doesn't have an accessible
  copy constructor.

- IRGen doesn't have to use the specialized byref structure (see
  https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
  non-escaping __block variable anymore. Instead IRGen can emit the
  variable as a normal variable and copy the reference to the block
  literal. Byref copy/dispose helpers aren't needed either.

This reapplies r341754, which was reverted in r341757 because it broke a
couple of bots. r341754 was calling markEscapingByrefs after the call to
PopFunctionScopeInfo, which caused the popped function scope to be
cleared out when the following code was compiled, for example:

$ cat test.m
struct A {
  id data[10];
};

void foo() {
  __block A v;
  ^{ (void)v; };
}

This commit calls markEscapingByrefs before calling PopFunctionScopeInfo
to prevent that from happening.

rdar://problem/39352313

Differential Revision: https://reviews.llvm.org/D51564

llvm-svn: 343518
2018-10-01 18:50:14 +00:00
Alexey Bataev e79451a517 [OPENMP][NVPTX] Handle `requires datasharing` flag correctly with
lightweight runtime.

The datasharing flag must be set to `1` when executing SPMD-mode compatible directive with reduction|lastprivate clauses.

llvm-svn: 343492
2018-10-01 16:20:57 +00:00
Alexey Bataev 4da1d5c33f [OPENMP] Simplify code, NFC.
llvm-svn: 343483
2018-10-01 14:40:06 +00:00
Alexey Bataev 94c5064701 [OPENMP] Fix enum identifier, NFC.
llvm-svn: 343479
2018-10-01 14:26:31 +00:00
Patrick Lyster 4a370b9f63 Add support for unified_shared_memory clause on requires directive
llvm-svn: 343472
2018-10-01 13:47:43 +00:00
Fangrui Song 1d38c13f6e Use the container form llvm::sort(C, ...)
There are a few leftovers of rC343147 that are not (\w+)\.begin but in
the form of ([-[:alnum:]>.]+)\.begin or spanning two lines. Change them
to use the container form in this commit. The 12 occurrences have been
inspected manually for safety.

llvm-svn: 343425
2018-09-30 21:41:11 +00:00
Richard Smith 8baa50013c [cxx2a] P0614R1: Support init-statements in range-based for loops.
We don't yet support this for the case where a range-based for loop is
implicitly rewritten to an ObjC for..in statement.

llvm-svn: 343350
2018-09-28 18:44:09 +00:00
Gheorghe-Teodor Bercea 8233af90e1 [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing
Summary: Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.

Reviewers: ABataev, Hahnfeld, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52629

llvm-svn: 343260
2018-09-27 20:29:00 +00:00
Gheorghe-Teodor Bercea 02650d4c2c [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing
Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52434

llvm-svn: 343253
2018-09-27 19:22:56 +00:00
Vitaly Buka 84ffd06f8b Revert "[DebugInfo] Generate debug information for labels."
This reverts commit r343148.

It crashes on sanitizer-x86_64-linux-autoconf.

llvm-svn: 343183
2018-09-27 08:15:24 +00:00
Hsiangkai Wang 705121aaae [DebugInfo] Generate debug information for labels.
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

After fixing PR37395.
After fixing problems in LiveDebugVariables.
After fixing NULL symbol problems in AddressPool when enabling
split-dwarf-file.

Differential Revision: https://reviews.llvm.org/D45045

llvm-svn: 343148
2018-09-26 22:18:45 +00:00
Fangrui Song 55fab260ca llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: rsmith, #clang, dblaikie

Reviewed By: rsmith, #clang

Subscribers: mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52576

llvm-svn: 343147
2018-09-26 22:16:28 +00:00
Craig Topper fb5d9f2849 [X86] For lzcnt/tzcnt intrinsics use cttz/ctlz intrinsics with zero_undef flag set to false.
Previously we used a select and the zero_undef=true intrinsic. In -O2 this pattern will get optimized to zero_undef=false. But in -O0 this optimization won't happen. This results in a compare and cmov being wrapped around a tzcnt/lzcnt instruction.

By using the zero_undef=false intrinsic directly without the select, we can improve the -O0 codegen to just an lzcnt/tzcnt instruction.

Differential Revision: https://reviews.llvm.org/D52392

llvm-svn: 343126
2018-09-26 17:01:44 +00:00
Kelvin Li 1408f91a25 [OPENMP] Add support for OMP5 requires directive + unified_address clause
Add support for OMP5.0 requires directive and unified_address clause.
Patches to follow will include support for additional clauses.

Differential Revision: https://reviews.llvm.org/D52359

llvm-svn: 343063
2018-09-26 04:28:39 +00:00
Kristina Brooks 34e24d5b6a Reland "[Clang][CodeGen][ObjC]: Fix CoreFoundation on ELF with `-fconstant-cfstrings`"
Relanding rL342883 with more fragmented tests to test ELF-specific
section emission separately from broad-scope CFString tests. Now this
tests the following separately

1). CoreFoundation builds and linkage for ELF while building it.
2). CFString ELF section emission outside CF in assembly output.
3). Broad scope `cfstring3.c` tests which cover all object formats at
    bitcode level and assembly level (including ELF). 

This fixes non-bridged CoreFoundation builds on ELF targets
that use -fconstant-cfstrings. The original changes from differential 
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.

This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.

Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.

Differential Revision: https://reviews.llvm.org/D52344

llvm-svn: 343038
2018-09-25 22:27:40 +00:00
Calixte Denizet fcd661d278 [CodeGen] Revert commit https://reviews.llvm.org/rL342717
llvm-svn: 342912
2018-09-24 18:24:18 +00:00
Benjamin Kramer 0181e7a6bd Fix the type of 1<<31 integer constants.
Shifting into the sign bit is technically undefined behavior. No known
compiler exploits it though.

llvm-svn: 342909
2018-09-24 17:51:15 +00:00
Kristina Brooks a6398cdcfc Revert "rL342883: [Clang][CodeGen][ObjC]: Fix CoreFoundation on ELF with `-fconstant-cfstrings`."
Seems to be causing buildbot failures, need to look into it.

llvm-svn: 342893
2018-09-24 15:26:08 +00:00
Kristina Brooks 7c142a52e2 [Clang][CodeGen][ObjC]: Fix CoreFoundation on ELF with `-fconstant-cfstrings`.
[Clang][CodeGen][ObjC]: Fix non-bridged CoreFoundation builds on ELF targets
that use `-fconstant-cfstrings`. The original changes from differential 
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.

This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.

Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.

Differential Revision: https://reviews.llvm.org/D52344

llvm-svn: 342883
2018-09-24 14:06:47 +00:00
Richard Trieu 5061e83295 Make compare function in r342648 have strict weak ordering.
Comparison functions used in sorting algorithms need to have strict weak
ordering.  Remove the assert and allow comparisons on all lists.

llvm-svn: 342774
2018-09-21 21:20:33 +00:00
Caroline Tice 62279730e2 Add necessary support for storing code-model to module IR.
Currently the code-model does not get saved in the module IR, so if a
code model is specified when compiling with LTO, it gets lost and is
not propagated properly to LTO. This patch does what is necessary in
the front end to pass the code-model to the module, so that the back
end can store it in the Module .

Differential Revision: https://reviews.llvm.org/D52323

llvm-svn: 342758
2018-09-21 18:34:59 +00:00
JF Bastien 72c475276e [NFC] remove unused variable
It was causing a warning.

llvm-svn: 342750
2018-09-21 17:38:41 +00:00
Alexey Bataev 2adecff1aa [OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs.
Previously we could not use lastprivates in SPMD constructs, patch
allows supporting lastprivates in SPMD with uninitialized runtime.

llvm-svn: 342738
2018-09-21 14:22:53 +00:00
JF Bastien 2f0582fcc7 NFC: deduplicate isRepeatedBytePattern from clang to LLVM's isBytewiseValue
Summary:
This code was in CGDecl.cpp and really belongs in LLVM. It happened to have isBytewiseValue which served a very similar purpose but wasn't as powerful as clang's version. Remove the clang version, and augment isBytewiseValue to be as powerful so that clang does the same thing it used to.

LLVM part of this patch: D51751

Subscribers: dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D51752

llvm-svn: 342734
2018-09-21 13:54:09 +00:00
Calixte Denizet 5713db4c4a [CodeGen] Add to emitted DebugLoc information about coverage when it's required
Summary:
Some lines have a hit counter where they should not have one.
Cleanup stuff is located to the last line of the body which is most of the time a '}'.
And Exception stuff is added at the beginning of a function and at the end (represented by '{' and '}').
So in such cases, the DebugLoc used in GCOVProfiling.cpp must be marked as not covered.
This patch is a followup of https://reviews.llvm.org/D49915.
Tests in projects/compiler_rt are fixed by: https://reviews.llvm.org/D49917

Reviewers: marco-c, davidxl

Reviewed By: marco-c

Subscribers: dblaikie, cfe-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D49916

llvm-svn: 342717
2018-09-21 09:17:06 +00:00
Mike Rice 0ed4666d85 [OPENMP] Fix spelling of getLoopCounter (NFC)
llvm-svn: 342666
2018-09-20 17:19:41 +00:00
Alexey Bataev e82445f5a9 [OPENMP] Add support for mapping memory pointed by member pointer.
Added support for map(s, s.ptr[0:1]) kind of mapping.

llvm-svn: 342648
2018-09-20 13:54:02 +00:00
QingShan Zhang accb65b994 [PowerPC] [Clang] Add vector int128 pack/unpack builtins
unsigned long long builtin_unpack_vector_int128 (vector int128_t, int);
vector int128_t builtin_pack_vector_int128 (unsigned long long, unsigned long long);

Builtins should behave the same way as in GCC.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52074

llvm-svn: 342614
2018-09-20 05:04:57 +00:00
Reid Kleckner 78381c63ae [MS] Defer dllexport inline friend functions like other inline methods
This special case was added in r264841, but the code breaks our
invariants by calling EmitTopLevelDecl without first creating a
HandlingTopLevelDeclRAII scope.

This fixes the PCH crash in https://crbug.com/884427. I was never able
to make a satisfactory reduction, unfortunately. I'm not very worried
about this regressing since this change makes the code simpler while
passing the existing test that shows we do emit dllexported friend
function definitions. Now we just defer their emission until the tag is
fully complete, which is generally good.

llvm-svn: 342516
2018-09-18 23:16:30 +00:00
Dean Michael Berris 05cf443463 [XRay][clang] Emit "never-instrument" attribute
Summary:
Before this change, we only emit the XRay attributes in LLVM IR when the
-fxray-instrument flag is provided. This may cause issues with thinlto
when the final binary is being built/linked with -fxray-instrument, and
the constitutent LLVM IR gets re-lowered with xray instrumentation.

With this change, we can honour the "never-instrument "attributes
provided in the source code and preserve those in the IR. This way, even
in thinlto builds, we retain the attributes which say whether functions
should never be XRay instrumented.

This change addresses llvm.org/PR38922.

Reviewers: mboerger, eizan

Subscribers: mehdi_amini, dexonsmith, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D52015

llvm-svn: 342200
2018-09-14 01:59:12 +00:00
Erich Keane f353ae1848 [NFC]Refactor MultiVersion Resolver Emission to combine types
Previously, both types (plus the future target-clones) of
multiversioning had a separate ResolverOption structure and emission
function.  This patch combines the two, at the expense of a slightly
more expensive sorting function.

llvm-svn: 342152
2018-09-13 16:58:24 +00:00
Alexey Bataev e6aa4694de [OPENMP] Fix PR38903: Crash on instantiation of the non-dependent
declare reduction.

If the declare reduction construct with the non-dependent type is
defined in the template construct, the compiler might crash on the
template instantition. Reworked the whole instantiation scheme for the
declare reduction constructs to fix this problem correctly.

llvm-svn: 342151
2018-09-13 16:54:05 +00:00
Oliver Stannard 95ac65bc32 [AArch64] Enable return address signing for static ctors
Functions generated by clang and included in the .init_array section (such as
static constructors) do not follow the usual code path for adding
target-specific function attributes, so we have to add the return address
signing attribute here too, as is currently done for the sanitisers.

Differential revision: https://reviews.llvm.org/D51418

llvm-svn: 342126
2018-09-13 10:25:36 +00:00
David Green be0c5b6d3c [CodeGen] Align rtti and vtable data
Previously the alignment on the newly created rtti/typeinfo data was largely
not set, meaning that DataLayout::getPreferredAlignment was free to overalign
it to 16 bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51416

llvm-svn: 342053
2018-09-12 14:09:06 +00:00
Mikhail Maltsev e04ab4fe97 [CodeGen][ARM] Coerce FP16 vectors to integer vectors when needed
Summary:
On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:

  void foo(int, ...);

  typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
  void bar(float16x4_t x) {
    foo(42, x);
  }

According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.

Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.

One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).

Reviewers: eli.friedman, olista01, SjoerdMeijer, javed.absar, efriedma

Reviewed By: javed.absar, efriedma

Subscribers: efriedma, kristof.beyls, cfe-commits, chrib

Differential Revision: https://reviews.llvm.org/D50507

llvm-svn: 342034
2018-09-12 09:19:19 +00:00
Adrian Prantl 05a623eb87 Remove all uses of DIFlagBlockByrefStruct
This patch removes the last reason why DIFlagBlockByrefStruct from
Clang by directly implementing the drilling into the member type done
in DwarfDebug::DbgVariable::getType() into the frontend.

rdar://problem/31629055

Differential Revision: https://reviews.llvm.org/D51807

llvm-svn: 341842
2018-09-10 16:14:28 +00:00
Akira Hatanaka 9bd2452708 Revert r341754.
The commit broke a couple of bots:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/12347
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/7310

llvm-svn: 341757
2018-09-09 05:22:49 +00:00
Akira Hatanaka 2e00b98027 Distinguish `__block` variables that are captured by escaping blocks
from those that aren't.

This patch changes the way __block variables that aren't captured by
escaping blocks are handled:

- Since non-escaping blocks on the stack never get copied to the heap
  (see https://reviews.llvm.org/D49303), Sema shouldn't error out when
  the type of a non-escaping __block variable doesn't have an accessible
  copy constructor.

- IRGen doesn't have to use the specialized byref structure (see
  https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
  non-escaping __block variable anymore. Instead IRGen can emit the
  variable as a normal variable and copy the reference to the block
  literal. Byref copy/dispose helpers aren't needed either.

rdar://problem/39352313

Differential Revision: https://reviews.llvm.org/D51564

llvm-svn: 341754
2018-09-08 20:03:00 +00:00
Richard Smith da3729d1e6 Do not use optimized atomic libcalls for misaligned atomics.
Summary:
The optimized (__atomic_foo_<n>) libcalls assume that the atomic object
is properly aligned, so should never be called on an underaligned
object.

This addresses one of several problems identified in PR38846.

Reviewers: jyknight, t.p.northover

Subscribers: jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D51817

llvm-svn: 341734
2018-09-07 23:57:54 +00:00
Richard Smith edb9fbb78a Make -Watomic-alignment say whether the atomic operation was oversized
or misaligned.

llvm-svn: 341710
2018-09-07 21:24:27 +00:00
Craig Topper ecf2e2fe31 [X86] Custom emit __builtin_rdtscp so we can emit an explicit store for the out parameter
This is the clang side of D51803. The llvm intrinsic now returns two results. So we need to emit an explicit store in IR for the out parameter. This is similar to addcarry/subborrow/rdrand/rdseed.

Differential Revision: https://reviews.llvm.org/D51805

llvm-svn: 341699
2018-09-07 19:14:24 +00:00
Craig Topper 52a61fc2ac [X86] Modify addcarry/subborrow builtins to emit an 2 result and intrinsic and an store instruction.
This is the clang side of D51769. The llvm intrinsics now return two results instead of using an out parameter.

Differential Revision: https://reviews.llvm.org/D51771

llvm-svn: 341678
2018-09-07 16:58:57 +00:00
Alexander Potapenko d49c32ce3f [MSan] add KMSAN support to Clang driver
Boilerplate code for using KMSAN instrumentation in Clang.

We add a new command line flag, -fsanitize=kernel-memory, with a
corresponding SanitizerKind::KernelMemory, which, along with
SanitizerKind::Memory, maps to the memory_sanitizer feature.

KMSAN is only supported on x86_64 Linux.

It's incompatible with other sanitizers, but supports code coverage
instrumentation.

llvm-svn: 341641
2018-09-07 09:21:09 +00:00
Stephen Kelly 96160587f9 Remove deprecated API
Reviewers: teemperor!

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50353

llvm-svn: 341573
2018-09-06 18:26:30 +00:00
Reid Kleckner 7a36896864 Re-land r334417 "[MS] Use mangled names and comdats for string merging with ASan"
The issue with -fprofile-generate was fixed and the dependent CL
relanded in r340232.

llvm-svn: 341572
2018-09-06 18:25:39 +00:00
Sam McCall 026d8a20ec Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)"
This reverts commit r341519, which generates debug info that causes
backend crashes. (with -split-dwarf-file)

Details in https://reviews.llvm.org/D50495

llvm-svn: 341549
2018-09-06 14:27:40 +00:00
Hsiangkai Wang 0a875b2f15 [DebugInfo] Generate debug information for labels. (Fix PR37395)
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

After fixing PR37395.
After fixing problems in LiveDebugVariables.

Differential Revision: https://reviews.llvm.org/D45045

llvm-svn: 341519
2018-09-06 06:03:36 +00:00
Chandler Carruth 664aa868f5 [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative
Load Hardening.

Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.

Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.

While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.

This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.

Differential Revision: https://reviews.llvm.org/D51157

llvm-svn: 341363
2018-09-04 12:38:00 +00:00
Argyrios Kyrtzidis adc178ef2c Add header guards to some headers that are missing them
llvm-svn: 341324
2018-09-03 16:26:36 +00:00
Craig Topper d88f76a891 [X86] Add ktest intrinsics to match gcc and icc.
These aren't documented in the Intel Intrinsics Guide, but are supported by gcc and icc.

Includes these intrinsics:
_ktestc_mask8_u8, _ktestz_mask8_u8, _ktest_mask8_u8
_ktestc_mask16_u8, _ktestz_mask16_u8, _ktest_mask16_u8
_ktestc_mask32_u8, _ktestz_mask32_u8, _ktest_mask32_u8
_ktestc_mask64_u8, _ktestz_mask64_u8, _ktest_mask64_u8

llvm-svn: 341265
2018-08-31 22:29:56 +00:00
Craig Topper 42a4d0822e [X86] Add k-mask conversion and load/store instrinsics to match gcc and icc.
This adds:
_cvtmask8_u32, _cvtmask16_u32, _cvtmask32_u32, _cvtmask64_u64
_cvtu32_mask8, _cvtu32_mask16, _cvtu32_mask32, _cvtu64_mask64
_load_mask8, _load_mask16, _load_mask32, _load_mask64
_store_mask8, _store_mask16, _store_mask32, _store_mask64

These are currently missing from the Intel Intrinsics Guide webpage.

llvm-svn: 341251
2018-08-31 20:41:06 +00:00
Craig Topper 2aa8efc820 [X86] Add kshift intrinsics to match gcc and icc.
This adds the following intrinsics:
_kshiftli_mask8
_kshiftli_mask16
_kshiftli_mask32
_kshiftli_mask64
_kshiftri_mask8
_kshiftri_mask16
_kshiftri_mask32
_kshiftri_mask64

llvm-svn: 341234
2018-08-31 18:22:52 +00:00
Alexey Bataev 80e1b5eb34 [DEBUGINFO] Add support for emission of the debug directives only.
Summary:
Added option -gline-directives-only to support emission of the debug directives
only. It behaves very similar to -gline-tables-only, except that it sets
llvm debug info emission kind to
llvm::DICompileUnit::DebugDirectivesOnly.

Reviewers: echristo

Subscribers: aprantl, fedor.sergeev, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D51177

llvm-svn: 341212
2018-08-31 13:56:14 +00:00
Alexey Bataev bd8ff9bd70 [OPENMP] Fix PR38710: static functions are not emitted as implicitly
'declare target'.

All the functions, referenced in implicit|explicit target regions must
be emitted during code emission for the device.

llvm-svn: 341093
2018-08-30 18:56:11 +00:00
Alexey Bataev 80a9a61ded [OPENMP][NVPTX] Add options -f[no-]openmp-cuda-force-full-runtime.
Added options -f[no-]openmp-cuda-force-full-runtime to [not] force use
of the full runtime for OpenMP offloading to CUDA devices.

llvm-svn: 341073
2018-08-30 14:45:24 +00:00
Alexey Bataev b4dd6d24d7 [OPENMP] Do not create offloading entry for declare target variables
declarations.

We should not create offloading entries for declare target var
declarations as it causes compiler crash.

llvm-svn: 340968
2018-08-29 20:41:37 +00:00
Alexey Bataev 8d8e1235ab [OPENMP][NVPTX] Add support for lightweight runtime.
If the target construct can be executed in SPMD mode + it is a loop
based directive with static scheduling, we can use lightweight runtime
support.

llvm-svn: 340953
2018-08-29 18:32:21 +00:00
Martin Storsjo 5ff7a8e67b [MinGW] Don't mark external variables as DSO local
Since MinGW supports automatically importing external variables from
DLLs even without the DLLImport attribute, we shouldn't mark them
as DSO local unless we actually know them to be local for sure.

Keep marking thread local variables as DSO local.

Differential Revision: https://reviews.llvm.org/D51382

llvm-svn: 340941
2018-08-29 17:26:58 +00:00
Mike Rice e1ca7b614f [OPENMP] Create non-const ident_t objects.
Currently ident_t objects are created const when debug info is not
enabled, but the libittnotify libray in the OpenMP runtime writes to
the reserved_2 field (See __kmp_itt_region_forking in
openmp/runtime/src/kmp_itt.inl).  Now create ident_t objects non-const.

Differential Revision: https://reviews.llvm.org/D51331

llvm-svn: 340934
2018-08-29 15:45:11 +00:00
Craig Topper a65bf65e0b [X86] Add kadd intrinsics to match gcc and icc.
This adds the following intrinsics:
_kadd_mask64
_kadd_mask32
_kadd_mask16
_kadd_mask8

These are missing from the Intel Intrinsics Guide, but are implemented by both gcc and icc.

llvm-svn: 340879
2018-08-28 22:32:14 +00:00
Craig Topper cb5fd56c7f [X86] Add kortest intrinsics for 8, 32, and 64 bit masks. Add new intrinsic names for 16 bit masks.
This matches gcc and icc despite not being documented in the Intel Intrinsics Guide.

llvm-svn: 340798
2018-08-28 06:28:25 +00:00
Craig Topper c330ca8611 [X86] Add intrinsics for kand/kandn/knot/kor/kxnor/kxor with 8, 32, and 64-bit mask registers.
This also adds a second intrinsic name for the 16-bit mask versions.

These intrinsics match gcc and icc. They just aren't published in the Intel Intrinsics Guide so I only recently found they existed.

llvm-svn: 340719
2018-08-27 06:20:22 +00:00
Eli Friedman 53591233c2 [LTO] Fix -save-temps with LTO and unnamed globals.
If all LLVM passes are disabled, we can't emit a summary because there
could be unnamed globals in the IR.

Differential Revision: https://reviews.llvm.org/D51198

llvm-svn: 340640
2018-08-24 19:31:52 +00:00
Elizabeth Andrews 6593df241a Currently clang does not emit unused static constants. GCC emits these
constants by default when there is no optimization.

GCC's option -fno-keep-static-consts can be used to not emit
unused static constants.

In Clang, since default behavior does not keep unused static constants, 
-fkeep-static-consts can be used to emit these if required. This could be 
useful for producing identification strings like SVN identifiers 
inside the object file even though the string isn't used by the program.

Differential Revision: https://reviews.llvm.org/D40925

llvm-svn: 340439
2018-08-22 19:05:19 +00:00
Akira Hatanaka 2a5e4639ea [CodeGen] Look at the type of a block capture field rather than the type
of the captured variable when determining whether the capture needs
special handing when the block is copied or disposed.

This fixes bugs in the handling of variables captured by a block that is
nested inside a lambda that captures the variables by reference.

rdar://problem/43540889

Differential Revision: https://reviews.llvm.org/D51025

llvm-svn: 340408
2018-08-22 13:41:19 +00:00
David Green ecc698712c [AArch64] Add Tiny Code Model for AArch64
Adds a tiny code model to Clang along side rL340397.

Differential Revision: https://reviews.llvm.org/D49674

llvm-svn: 340398
2018-08-22 11:34:28 +00:00
Nico Weber 14a577bfd1 Eliminate instances of `EmitScalarExpr(E->getArg(n))` in EmitX86BuiltinExpr().
EmitX86BuiltinExpr() emits all args into Ops at the beginning, so don't do that
work again.

This changes behavior: If e.g. ++a was passed as an arg, we incremented a twice
previously. This change fixes that bug.

https://reviews.llvm.org/D50979

llvm-svn: 340348
2018-08-21 22:19:55 +00:00
Martin Storsjo d39d53b0d1 [CodeGen] Implicitly set stackrealign on the main function, if custom stack alignment is used
If using a custom stack alignment, one is expected to make sure
that all callers provide such alignment, or realign the stack in
all entry points (and callbacks).

Despite this, the compiler can assume that the main function will
need realignment in these cases, since the startup routines calling
the main function most probably won't provide the custom alignment.

This matches what GCC does in similar cases; if compiling with
-mincoming-stack-boundary=X -mpreferred-stack-boundary=X, GCC normally
assumes such alignment on entry to a function, but specifically for
the main function still does realignment.

Differential Revision: https://reviews.llvm.org/D51026

llvm-svn: 340334
2018-08-21 20:41:17 +00:00
Erik Pilkington 5a559e64a9 Add a new flag and attributes to control static destructor registration
This commit adds the flag -fno-c++-static-destructors and the attributes
[[clang::no_destroy]] and [[clang::always_destroy]]. no_destroy specifies that a
specific static or thread duration variable shouldn't have it's destructor
registered, and is the default in -fno-c++-static-destructors mode.
always_destroy is the opposite, and is the default in -fc++-static-destructors
mode.

A variable whose destructor is disabled (either because of
-fno-c++-static-destructors or [[clang::no_destroy]]) doesn't count as a use of
the destructor, so we don't do any access checking or mark it referenced. We
also don't emit -Wexit-time-destructors for these variables.

rdar://21734598

Differential revision: https://reviews.llvm.org/D50994

llvm-svn: 340306
2018-08-21 17:24:06 +00:00
David Blaikie 658645241b DebugInfo: Add the ability to disable DWARF name tables entirely
This changes the current default behavior (from emitting pubnames by
default, to not emitting them by default) & moves to matching GCC's
behavior* with one significant difference: -gno(-gnu)-pubnames disables
pubnames even in the presence of -gsplit-dwarf (though -gsplit-dwarf
still by default enables -ggnu-pubnames). This allows users to disable
pubnames (& the new DWARF5 accelerated access tables) when they might
not be worth the size overhead.

* GCC's behavior is that -ggnu-pubnames and -gpubnames override each
other, and that -gno-gnu-pubnames and -gno-pubnames act as synonyms and
disable either kind of pubnames if they come last. (eg: -gpubnames
-gno-gnu-pubnames causes no pubnames (neither gnu or standard) to be
emitted)

llvm-svn: 340206
2018-08-20 20:14:08 +00:00
Alexey Bataev 7f792cab12 [OPENMP] Fix crash on the emission of the weak function declaration.
If the function is actually a weak reference, it should not be marked as
deferred definition as this is only a declaration. Patch adds checks for
the definitions if they must be emitted. Otherwise, only declaration is
emitted.

llvm-svn: 340191
2018-08-20 18:03:40 +00:00
Alexey Bataev 7b1a7bd5ba [OPENMP][BLOCKS]Fix PR38923: reference to a global variable is captured
by a block.

Added checks for capturing of the variable in the block when trying to
emit correct address for the variable with the reference type. This
extra check allows correctly identify the variables that are not
captured in the block context.

llvm-svn: 340181
2018-08-20 16:00:22 +00:00
Sanjay Patel ad82390d3f [CodeGen] add rotate builtins that map to LLVM funnel shift
This is a partial retry of rL340137 (reverted at rL340138 because of gcc host compiler crashing)
with 1 change:
Remove the changes to make microsoft builtins also use the LLVM intrinsics.
 
This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang
(when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other
existing art (eg, the microsoft ops) that we want to replicate, we can change the names.

The funnel shift intrinsics were added here:
https://reviews.llvm.org/D49242

With improved codegen in:
https://reviews.llvm.org/rL337966
https://reviews.llvm.org/rL339359

And basic IR optimization added in:
https://reviews.llvm.org/rL338218
https://reviews.llvm.org/rL340022

...so these are expected to produce asm output that's equal or better to the multi-instruction
alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

llvm-svn: 340141
2018-08-19 16:50:30 +00:00
Sanjay Patel a09ae4b8a6 revert r340137: [CodeGen] add rotate builtins
At least a couple of bots (gcc host compiler on PPC only?) are showing the compiler dying while trying to compile.

llvm-svn: 340138
2018-08-19 15:31:42 +00:00
Sanjay Patel 446529b0d9 [CodeGen] add/fix rotate builtins that map to LLVM funnel shift (retry)
This is a retry of rL340135 (reverted at rL340136 because of gcc host compiler crashing)
with 2 changes:
1. Move the code into a helper to reduce code duplication (and hopefully work-around the crash).
2. The original commit had a formatting bug in the docs (missing an underscore).

Original commit message:

This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang
(when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other
existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate,
we can change the names.

The funnel shift intrinsics were added here:
https://reviews.llvm.org/D49242

With improved codegen in:
https://reviews.llvm.org/rL337966
https://reviews.llvm.org/rL339359

And basic IR optimization added in:
https://reviews.llvm.org/rL338218
https://reviews.llvm.org/rL340022

...so these are expected to produce asm output that's equal or better to the multi-instruction
alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

llvm-svn: 340137
2018-08-19 14:44:47 +00:00
Sanjay Patel 39b4dd2da7 revert r340135: [CodeGen] add rotate builtins
At least a couple of bots (PPC only?) are showing the compiler dying while trying to compile:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/11065/steps/build%20stage%201/logs/stdio
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18267/steps/build%20stage%201/logs/stdio

llvm-svn: 340136
2018-08-19 13:48:06 +00:00
Sanjay Patel 9116f0438c [CodeGen] add rotate builtins
This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang 
(when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other 
existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate, 
we can change the names.

The funnel shift intrinsics were added here:
D49242

With improved codegen in:
rL337966
rL339359

And basic IR optimization added in:
rL338218
rL340022

...so these are expected to produce asm output that's equal or better to the multi-instruction 
alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

llvm-svn: 340135
2018-08-19 13:12:40 +00:00
Alex Lorenz b111da14ad [ObjC] Error out when using forward-declared protocol in a @protocol
expression

Clang emits invalid protocol metadata when a @protocol expression is used with a
forward-declared protocol. The protocol metadata is missing protocol conformance
list of the protocol since we don't have access to the definition of it in the
compiled translation unit. The linker then might end up picking the invalid
metadata when linking which will lead to incorrect runtime protocol conformance
checks.

This commit makes sure that Clang fails to compile code that uses a @protocol
expression with a forward-declared protocol. This ensures that Clang does not
emit invalid protocol metadata. I added an extra assert in CodeGen to ensure
that this kind of issue won't happen in other places.

rdar://32787811

Differential Revision: https://reviews.llvm.org/D49462

llvm-svn: 340102
2018-08-17 22:18:08 +00:00
Reid Kleckner 59320fc9d7 Update comments in CGDebugInfo to reflect changes in the MS mangler, NFC
I've tried to elaborate on the purpose of these type identifiers and why
and when clang uses them.

llvm-svn: 340080
2018-08-17 20:59:52 +00:00
Yaxun Liu 94ff57f5b1 [HIP] Make __hip_gpubin_handle hidden to avoid being merged across different shared libraries
Different shared libraries contain different fat binary, which is stored in a global variable
__hip_gpubin_handle. Since different compilation units share the same fat binary, this
variable has linkonce linkage. However, it should not be merged across different shared
libraries.

This patch set the visibility of the global variable to be hidden, which will make it invisible
in the shared library, therefore preventing it from being merged.

Differential Revision: https://reviews.llvm.org/D50596

llvm-svn: 340056
2018-08-17 17:47:31 +00:00
Nico Weber b2c53d3393 Make __shiftleft128 / __shiftright128 real compiler built-ins.
r337619 added __shiftleft128 / __shiftright128 as functions in intrin.h.
Microsoft's STL plans on using these functions, and they're using intrin0.h
which just has declarations of built-ins to not pull in the huge intrin.h
header in the standard library headers. That requires that these functions are
real built-ins.

https://reviews.llvm.org/D50907

llvm-svn: 340048
2018-08-17 17:19:06 +00:00
Akira Hatanaka 2ec36f08a6 [CodeGen] Merge identical block descriptor global variables.
Currently, clang generates a new block descriptor global variable for
each new block literal. This commit merges block descriptors that are
identical inside and across translation units using the same approach
taken in r339438.

To enable merging identical block descriptors, the size and signature of
the block and information about the captures are encoded into the name
of the block descriptor variable. Also, the block descriptor variable is
marked as linkonce_odr and unnamed_addr.

rdar://problem/42640703

Differential Revision: https://reviews.llvm.org/D50783

llvm-svn: 340041
2018-08-17 15:46:07 +00:00
Luke Cheeseman 0ac44c18b7 [AArch64] - return address signing
- Add a command line options -msign-return-address to enable return address
  signing
- Armv8.3a added instructions to sign the return address to help mitigate
  against ROP attacks
- This patch adds command line options to generate function attributes that
  signal to the back whether return address signing instructions should be
  added

Differential revision: https://reviews.llvm.org/D49793

llvm-svn: 340019
2018-08-17 12:55:05 +00:00
David Blaikie 9982bb8739 Disable pubnames in NVPTX debug info using metadata
llvm-svn: 339968
2018-08-16 23:56:32 +00:00
Vedant Kumar ee6c233ae0 [InstrProf] Use atomic profile counter updates for TSan
Thread sanitizer instrumentation fails to skip all loads and stores to
profile counters. This can happen if profile counter updates are merged:

  %.sink = phi i64* ...
  %pgocount5 = load i64, i64* %.sink
  %27 = add i64 %pgocount5, 1
  %28 = bitcast i64* %.sink to i8*
  call void @__tsan_write8(i8* %28)
  store i64 %27, i64* %.sink

To suppress TSan diagnostics about racy counter updates, make the
counter updates atomic when TSan is enabled. If there's general interest
in this mode it can be surfaced as a clang/swift driver option.

Testing: check-{llvm,clang,profile}

rdar://40477803

Differential Revision: https://reviews.llvm.org/D50867

llvm-svn: 339955
2018-08-16 22:24:47 +00:00
David Blaikie 19763d93fd Update for LLVM API change
llvm-svn: 339941
2018-08-16 21:30:24 +00:00
Craig Topper 72a7606433 [X86] Remove masking from the 512-bit paddus/psubus builtins. Use a select builtin instead.
llvm-svn: 339845
2018-08-16 07:28:06 +00:00
Alexey Bataev d01b74974b [OPENMP] FIx processing of declare target variables.
The compiler may produce unexpected error messages/crashes when declare
target variables were used. Patch fixes problems with the declarations
marked as declare target to or link.

llvm-svn: 339805
2018-08-15 19:45:12 +00:00
Craig Topper 2a87314e75 [InlineAsm] Update the min-legal-vector-width function attribute based on inputs and outputs to inline assembly
Summary:
Another piece of my ongoing to work for prefer-vector-width.

min-legal-vector-width will eventually be used by the X86 backend to know whether it needs to make 512 bits type legal when prefer-vector-width=256. If the user used inline assembly that passed in/out a 512-bit register, we need to make sure 512 bits are considered legal. Otherwise we'll get an assert failure when we try to wire up the inline assembly to the rest of the code.

This patch just checks the LLVM IR types to see if they are vectors and then updates the attribute based on their total width. I'm not sure if this is the best way to do this or if there's any subtlety I might have missed. So if anyone has other opinions on how to do this I'm open to suggestions.

Reviewers: chandlerc, rsmith, rnk

Reviewed By: rnk

Subscribers: eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D50678

llvm-svn: 339721
2018-08-14 20:21:05 +00:00
Alexey Bataev 97b722121e [OPENMP] Fix processing of declare target construct.
The attribute marked as inheritable since OpenMP 5.0 supports it +
additional fixes to support new functionality.

llvm-svn: 339704
2018-08-14 18:31:20 +00:00
David Chisnall c66d480bce [gnu-objc] Make selector order deterministic.
Summary:
This probably fixes PR35277, though there may be other sources of
nondeterminism (this was the only case of iterating over a DenseMap).

It's difficult to provide a test case for this, because it shows up only
on systems with ASLR enabled.

Reviewers: rjmccall

Reviewed By: rjmccall

Subscribers: bmwiedemann, mgrang, cfe-commits

Differential Revision: https://reviews.llvm.org/D50559

llvm-svn: 339668
2018-08-14 10:05:25 +00:00
Tomasz Krupa e8cf972d86 [X86] Lowering addus/subus intrinsics to native IR
Summary: This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations.

Reviewers: craig.topper, spatel, RKSimon

Reviewed By: craig.topper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D46892

llvm-svn: 339651
2018-08-14 08:01:38 +00:00
Akira Hatanaka 936240c77a [CodeGen] Before returning a copy/dispose helper function, bitcast it to
a void pointer type.

This fixes a bug introduced in r339438.

llvm-svn: 339633
2018-08-14 00:15:42 +00:00
Akira Hatanaka 4a6f190e19 Convert if/else to a switch. NFC.
llvm-svn: 339613
2018-08-13 20:59:57 +00:00
Alexey Bataev f138fda5ed [OPENMP] Fix emission of the loop doacross constructs.
The number of loops associated with the OpenMP loop constructs should
not be considered as the number loops to collapse.

llvm-svn: 339603
2018-08-13 19:04:24 +00:00
Alexey Bataev 23647171ea Revert "[OPENMP] Fix emission of the loop doacross constructs."
This reverts commit r339568 because of the problems with the buildbots.

llvm-svn: 339574
2018-08-13 14:42:18 +00:00
Alexey Bataev 0ce6360e0e [OPENMP] Fix emission of the loop doacross constructs.
The number of loops associated with the OpenMP loop constructs should
not be considered as the number loops to collapse.

llvm-svn: 339568
2018-08-13 14:05:43 +00:00
Akira Hatanaka 9978da3615 [CodeGen] Merge equivalent block copy/helper functions.
Clang generates copy and dispose helper functions for each block literal
on the stack. Often these functions are equivalent for different blocks.
This commit makes changes to merge equivalent copy and dispose helper
functions and reduce code size.

To enable merging equivalent copy/dispose functions, the captured object
infomation is encoded into the helper function name. This allows IRGen
to check whether an equivalent helper function has already been emitted
and reuse the function instead of generating a new helper function
whenever a block is defined. In addition, the helper functions are
marked as linkonce_odr to enable merging helper functions that have the
same name across translation units and marked as unnamed_addr to enable
the linker's deduplication pass to merge functions that have different
names but the same content.

rdar://problem/42640608

Differential Revision: https://reviews.llvm.org/D50152

llvm-svn: 339438
2018-08-10 15:09:24 +00:00
David Chisnall b3c11504bd Fix a deprecated warning in the last commit.
Done as a separate commit to make it easier to cherry pick the changes
to the release branch.

llvm-svn: 339429
2018-08-10 12:53:18 +00:00
David Chisnall 93ce018f3d Add Windows support for the GNUstep Objective-C ABI V2.
Summary:
Introduces funclet-based unwinding for Objective-C and fixes an issue
where global blocks can't have their isa pointers initialised on
Windows.

After discussion with Dustin, this changes the name mangling of
Objective-C types to prevent a C++ catch statement of type struct X*
from catching an Objective-C object of type X*.

Reviewers: rjmccall, DHowett-MSFT

Reviewed By: rjmccall, DHowett-MSFT

Subscribers: mgrang, mstorsjo, smeenai, cfe-commits

Differential Revision: https://reviews.llvm.org/D50144

llvm-svn: 339428
2018-08-10 12:53:13 +00:00
Hans Wennborg a912e3e6be clang-cl: Support /guard:cf,nochecks
This extension emits the guard cf table without inserting the
instrumentation. Currently that's what clang-cl does with /guard:cf
anyway, but this allows a user to request that explicitly.

Differential Revision: https://reviews.llvm.org/D50513

llvm-svn: 339420
2018-08-10 09:49:21 +00:00
Stephen Kelly 40922db37c Mark up deprecated methods as such
Reviewers: teemperor!

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50352

llvm-svn: 339403
2018-08-09 22:45:38 +00:00
Stephen Kelly 1c301dcbc4 Port getLocEnd -> getEndLoc
Reviewers: teemperor!

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50351

llvm-svn: 339386
2018-08-09 21:09:38 +00:00
Stephen Kelly f2ceec4811 Port getLocStart -> getBeginLoc
Reviewers: teemperor!

Subscribers: jholewinski, whisperity, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D50350

llvm-svn: 339385
2018-08-09 21:08:08 +00:00
Stephen Kelly a6e4358f07 Port getStartLoc -> getBeginLoc
Reviewers: teemperor!

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50349

llvm-svn: 339384
2018-08-09 21:05:56 +00:00
Stephen Kelly 3cffc4c76a Add getBeginLoc API to replace getStartLoc
Reviewers: teemperor!

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50347

llvm-svn: 339373
2018-08-09 20:05:18 +00:00
David Chisnall c5a458cc53 Correctly initialise global blocks on Windows.
Summary:
Windows does not allow globals to be initialised to point to globals in
another DLL.  Exported globals may be referenced only from code.  Work
around this by creating an initialiser that runs in early library
initialisation and sets the isa pointer.

Reviewers: rjmccall

Reviewed By: rjmccall

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D50436

llvm-svn: 339317
2018-08-09 08:02:42 +00:00
Craig Topper 0a4f6be443 [Builtins] Implement __builtin_clrsb to be compatible with gcc
gcc defines an intrinsic called __builtin_clrsb which counts the number of extra sign bits on a number. This is equivalent to counting the number of leading zeros on a positive number or the number of leading ones on a negative number and subtracting one from the result. Since we can't count leading ones we need to invert negative numbers to count zeros.

This patch will cause the builtin to be expanded inline while gcc uses a call to a function like clrsbdi2 that is implemented in libgcc. But this is similar to what we already do for popcnt. And I don't think compiler-rt supports clrsbdi2.

Differential Revision: https://reviews.llvm.org/D50168

llvm-svn: 339282
2018-08-08 19:55:52 +00:00
Craig Topper 2a92a0efc7 [CodeGen][Timers] Enable llvm::TimePassesIsEnabled when -ftime-report is specified
r330571 added a new FrontendTimesIsEnabled variable and replaced many usages of llvm::TimePassesIsEnabled. Including the place that set llvm::TimePassesIsEnabled for -ftime-report. The effect of this is that -ftime-report now only contains the timers specifically referenced in CodeGenAction.cpp and none of the timers in the backend.

This commit adds back the assignment, but otherwise leaves everything else unchanged.

llvm-svn: 339281
2018-08-08 19:14:23 +00:00
Scott Linder 58df0e4d2c [DebugInfo][OpenCL] Address post-commit review for r338299
NFC refactor of code to generate debug info for OpenCL 2.X blocks.

Differential Revision: https://reviews.llvm.org/D50099

llvm-svn: 339265
2018-08-08 15:56:12 +00:00
Simon Pilgrim 04c5a34f4f [CGObjCGNU] Rename GetSelector helper method to fix -Woverloaded-virtual warning (PR38210)
As suggested by @theraven on PR38210, this patch fixes the gcc -Woverloaded-virtual warnings by renaming the extra CGObjCGNU::GetSelector method to CGObjCGNU::GetTypedSelector

Differential Revision: https://reviews.llvm.org/D50448

llvm-svn: 339264
2018-08-08 15:53:14 +00:00
Balaji V. Iyer 749e8285a5 [CodeGen] IncompleteArray Support
Added code to support ArrayType that is not ConstantArray.

https://reviews.llvm.org/D49952
rdar://42476155

llvm-svn: 339207
2018-08-08 00:01:21 +00:00
JF Bastien b9cc1fcf6b [NFC] CGDecl factor out constant emission
The code is cleaner this way, and with some changes I'm playing with it makes sense to split it out so we can reuse it.

llvm-svn: 339191
2018-08-07 21:55:13 +00:00
Alexey Bataev bf8fe71b91 [OPENMP] Mark variables captured in declare target region as implicitly
declare target.

According to OpenMP 5.0, variables captured in lambdas in declare target
regions must be considered as implicitly declare target.

llvm-svn: 339152
2018-08-07 16:14:36 +00:00
Scott Linder f8b3df4dec [OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-return
Always emit alloca in entry block for enqueue_kernel builtin.

Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC
later because it is not in the entry block.

llvm-svn: 339150
2018-08-07 15:52:49 +00:00
David Chisnall 9e31036302 [objc-gnustep] Don't emit .guess ivar offset vars.
These were intended to allow non-fragile and fragile ABI code to be
mixed, as long as the fragile classes were higher up the hierarchy than
the non-fragile ones.  Unfortunately:

 - No one actually wants to do this.
 - Recent versions of Linux's run-time linker break it.

llvm-svn: 339128
2018-08-07 12:02:46 +00:00
Hsiangkai Wang ea1b0e0960 Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)"
Build failed in
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/27258

In lib/CodeGen/LiveDebugVariables.cpp:589, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. however, the previous instruction may be
also a debug instruction.

llvm-svn: 338992
2018-08-06 07:07:18 +00:00
Hsiangkai Wang 3bec3abf38 [DebugInfo] Generate debug information for labels. (Fix PR37395)
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

After fixing PR37395.

Differential Revision: https://reviews.llvm.org/D45045

llvm-svn: 338989
2018-08-06 05:58:59 +00:00
Hsiangkai Wang e7b3da2dc5 [DebugInfo] Use DbgVariableIntrinsic as the base class of variables.
After refactoring DbgInfoIntrinsic class hierarchy, we use
DbgVariableIntrinsic as the base class of variable debug info.

In resolveTopLevelMetadata() in CGVTables.cpp, we only care about
dbg.value, so we try to cast the instructions to DbgVariableIntrinsic
before resolving variables.

Differential Revision: https://reviews.llvm.org/D50226

llvm-svn: 338985
2018-08-06 04:00:08 +00:00
Richard Smith aa140bf164 Avoid creating conditional cleanup blocks that contain only @llvm.lifetime.end calls
When a non-extended temporary object is created in a conditional branch, the
lifetime of that temporary ends outside the conditional (at the end of the
full-expression). If we're inserting lifetime markers, this means we could end
up generating

  if (some_cond) {
    lifetime.start(&tmp);
    Tmp::Tmp(&tmp);
  }
  // ...
  if (some_cond) {
    lifetime.end(&tmp);
  }

... for a full-expression containing a subexpression of the form `some_cond ?
Tmp().x : 0`. This patch moves the lifetime start for such a temporary out of
the conditional branch so that we don't need to generate an additional basic
block to hold the lifetime end marker.

This is disabled if we want precise lifetime markers (for asan's
stack-use-after-scope checks) or of the temporary has a non-trivial destructor
(in which case we'd generate an extra basic block anyway to hold the destructor
call).

Differential Revision: https://reviews.llvm.org/D50286

llvm-svn: 338945
2018-08-04 01:25:06 +00:00
Sergey Dmitriev bde9cf942b [OpenMP] Encode offload target triples into comdat key for offload initialization code
Encoding offload target triples onto comdat group key for offload initialization
code guarantees that it will be executed once per each unique combination of
offload targets.

Differential Revision: https://reviews.llvm.org/D50218

llvm-svn: 338916
2018-08-03 20:19:28 +00:00
Erich Keane ed69e1bc98 [NFC] Initialize a variable to prevent future invalid deref.
Found by KlockWorks, this variable is properly protected, however
the conditions in the test that initializes it and the one that uses
it could diverge, it seems to me that this is a 'free' init that will
prevent issues if one of the conditions is ever modified without the other.

llvm-svn: 338909
2018-08-03 18:08:36 +00:00
Vlad Tsyrklevich c7d3d34b98 Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin"
This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast.

llvm-svn: 338904
2018-08-03 17:47:58 +00:00
Scott Linder 91f578467c [OpenCL] Always emit alloca in entry block for enqueue_kernel builtin
Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC
later because it is not in the entry block.

Differential Revision: https://reviews.llvm.org/D50104

llvm-svn: 338899
2018-08-03 15:50:52 +00:00
Michael Kruse cba47b4978 [CodeGen] Emit parallel_loop_access for each loop in the loop stack.
Summary:
Emit !llvm.mem.parallel_loop_access metadata for memory accesses even if the parallel loop is not the top on the loop stack.

Fixes llvm.org/PR37558.

Reviewers: ABataev, hfinkel, amusman, tyler.nowicki

Reviewed By: hfinkel

Subscribers: Meinersbur, hfinkel, cfe-commits

Differential Revision: https://reviews.llvm.org/D48808

llvm-svn: 338810
2018-08-03 04:42:52 +00:00
Heejin Ahn 00aa81b4df [WebAssembly] Support for atomic.wait / atomic.wake builtins
Summary:
Add support for atomic.wait / atomic.wake builtins based on the Wasm
thread proposal.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Differential Revision: https://reviews.llvm.org/D49396

llvm-svn: 338771
2018-08-02 21:44:40 +00:00
Matt Arsenault c65f966d76 Try to make builtin address space declarations not useless
The way address space declarations for builtins currently work
is nearly useless. The code assumes the address spaces used for
builtins is a confusingly named "target address space" from user
code using __attribute__((address_space(N))) that matches
the builtin declaration. There's no way to use this to declare
a builtin that returns a language specific address space.
The terminology used is highly cofusing since it has nothing
to do with the the address space selected by the target to use
for a language address space.

This feature is essentially unused as-is. AMDGPU and NVPTX
are the only in-tree targets attempting to use this. The AMDGPU
builtins certainly do not behave as intended (i.e. all of the
builtins returning pointers can never compile because the numbered
address space never matches the expected named address space).

The NVPTX builtins are missing tests for some, and the others
seem to rely on an implicit addrspacecast.

Change the used address space for builtins based on a target
hook to allow using a language address space for a builtin.
This allows the same builtin declaration to be used for multiple
languages with similarly purposed address spaces (e.g. the same
AMDGPU builtin can be used in OpenCL and CUDA even though the
constant address spaces are arbitarily different).

This breaks the possibility of using arbitrary numbered
address spaces alongside the named address spaces for builtins.
If this is an issue we probably need to introduce another builtin
declaration character to distinguish language address spaces from
so-called "target address spaces".

llvm-svn: 338707
2018-08-02 12:14:28 +00:00
David Green c8e3924b3b [UnrollAndJam] Add unroll_and_jam pragma handling
This adds support for the unroll_and_jam pragma, to go with the recently
added unroll and jam pass. The name of the pragma is the same as is used
in the Intel compiler, and most of the code works the same as for unroll.

#pragma clang loop unroll_and_jam has been separated into a different
patch. This part adds #pragma unroll_and_jam with an optional count, and
#pragma no_unroll_and_jam to disable the transform.

Differential Revision: https://reviews.llvm.org/D47267

llvm-svn: 338566
2018-08-01 14:36:12 +00:00
Alexey Bataev 62a4cb06a9 [OPENMP] Change linkage of offloading symbols to support dropping
offload targets.

Changed the linkage of omp_offloading.img_start.<triple> and omp_offloading.img_end.<triple> symbols from external to external weak to allow dropping of some targets during linking.

llvm-svn: 338413
2018-07-31 18:27:42 +00:00
Alexey Bataev 3823514b56 [OPENMP] Prevent problems with linking of the static variables.
No need to change the linkage, we can avoid the problem using special variable. That points to the original variable and, thus, prevent some of the optimizations that might break the compilation.

llvm-svn: 338399
2018-07-31 16:40:15 +00:00
Eric Christopher a2227a3f9a Revert "Add a definition for FieldSize that seems to make sense here."
This reverts commit r338327, the problem was previously fixed in r338321.

llvm-svn: 338328
2018-07-30 23:21:51 +00:00
Eric Christopher 8f02a9ed3c Add a definition for FieldSize that seems to make sense here.
This could be sunk out of the if statements, but fix the warning for now.

llvm-svn: 338327
2018-07-30 23:17:27 +00:00
Scott Linder a7c4568583 Fix use of uninitialized variable in r338299
llvm-svn: 338321
2018-07-30 22:52:07 +00:00
Scott Linder 2b5cf04180 [DebugInfo][OpenCL] Generate correct block literal debug info for OpenCL
OpenCL block literal structs have different fields which are now correctly
identified in the debug info.

Differential Revision: https://reviews.llvm.org/D49930

llvm-svn: 338299
2018-07-30 20:31:11 +00:00
Fangrui Song 6907ce2f8f Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338291
2018-07-30 19:24:48 +00:00
Roman Lebedev b69ba22773 [clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part
Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:

```
unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.

    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }

  // But yes, sometimes this is intentional.
  // You can either make the conversion explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}
```

Yes, there is a `-Wconversion`` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, there are cases where it does **not** warn.
So a Sanitizer is needed. I don't have any motivational numbers, but i know
i had this kind of problem 10-20 times, and it was never easy to track down.

The logic to detect whether an truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to it's original width, and equality-compare it with the original value.

The most non-trivial thing here is the logic to detect whether this
`ImplicitCastExpr` AST node is **actually** an implicit conversion, //or//
part of an explicit cast. Because the explicit casts are modeled as an outer
`ExplicitCastExpr` with some `ImplicitCastExpr`'s as **direct** children.
https://godbolt.org/g/eE1GkJ

Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set
on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`.
So if that flag is **not** set, then it is an actual implicit conversion.

As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in sign change; implicit conversion
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.

One thing i know isn't handled is bitfields.

This is a clang part.
The compiler-rt part is D48959.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions)

Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane

Reviewed By: rsmith, vsk, erichkeane

Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D48958

llvm-svn: 338288
2018-07-30 18:58:30 +00:00
Momchil Velikov 20208cc046 [ARM, AArch64]: Use unadjusted alignment when passing composites as arguments
The "Procedure Call Procedure Call Standard for the ARM® Architecture"
(https://static.docs.arm.com/ihi0042/f/IHI0042F_aapcs.pdf), specifies that
composite types are passed according to their "natural alignment", i.e. the
alignment before alignment adjustment on the entire composite is applied.

The same applies for AArch64 ABI.

Clang, however, used the adjusted alignment.

GCC already implements the ABI correctly. With this patch Clang becomes
compatible with GCC and passes such arguments in accordance with AAPCS.

Differential Revision: https://reviews.llvm.org/D46013

llvm-svn: 338279
2018-07-30 17:48:23 +00:00
Stefan Maksimovic b9da8a5dff [mips64][clang] Provide the signext attribute for i32 return values
Additional info: see r338019.

Differential Revision: https://reviews.llvm.org/D49289

llvm-svn: 338239
2018-07-30 10:44:46 +00:00
Chandler Carruth 1f82d9ba6e Revert r337456: [CodeGen] Disable aggressive structor optimizations at -O0, take 3
This commit increases the number of sections and overall output size of
.o files by 10% and sometimes a bit more. This alone is challenging for
some users, but it also appears to trigger an as-yet unexplained
behavior in the Gold linker where the memory usage increases
considerably more than 10% (we think).

The increase is also frustrating because in many (if not all) cases we
end up with almost all of the growth coming from the ELF overhead of
-ffunction-sections and such, not from actual extra code being emitted.

Richard Smith and Eric Christopher are both going to investigate this
and try to get to the bottom of what is triggering this and whether the
kinds of increases here are sustainable or what options we might have to
minimize the impact they have. However, this is currently breaking
a pretty large number of our users' builds so reverting it while we sort
out how to make progress here. I've seen a longer and more detailed
update to the commit thread.

llvm-svn: 338209
2018-07-29 03:05:07 +00:00
Serge Pavlov 376051820d [UBSan] Strengthen pointer checks in 'new' expressions
With this change compiler generates alignment checks for wider range
of types. Previously such checks were generated only for the record types
with non-trivial default constructor. So the types like:

    struct alignas(32) S2 { int x; };
    typedef __attribute__((ext_vector_type(2), aligned(32))) float float32x2_t;

did not get checks when allocated by 'new' expression.

This change also optimizes the checks generated for the arrays created
in 'new' expressions. Previously the check was generated for each
invocation of type constructor. Now the check is generated only once
for entire array.

Differential Revision: https://reviews.llvm.org/D49589

llvm-svn: 338199
2018-07-28 15:33:03 +00:00
Yaxun Liu a4005e13f7 [CUDA][HIP] Allow function-scope static const variable
CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__
function, only __shared__ variables or variables without any device
memory qualifiers may be declared with static storage class.

It is unclear how a function-scope non-const static variable
without device memory qualifier is implemented, therefore only static
const variable without device memory qualifier is allowed, which
can be emitted as a global variable in constant address space.

Currently clang only allows function-scope static variable with
__shared__ qualifier.

This patch also allows function-scope static const variable without
device memory qualifier and emits it as a global variable in constant
address space.

Differential Revision: https://reviews.llvm.org/D49931

llvm-svn: 338188
2018-07-28 03:05:25 +00:00
George Karpenkov 39e5137f43 [AST] Add a convenient getter from QualType to RecordDecl
Differential Revision: https://reviews.llvm.org/D49951

llvm-svn: 338187
2018-07-28 02:16:13 +00:00
Sanjin Sijaric 56391d6f84 [ARM64] [Windows] Follow MS X86_64 C++ ABI when passing structs
Summary: Microsoft's C++ object model for ARM64 is the same as that for X86_64.
For example, small structs with non-trivial copy constructors or virtual
function tables are passed indirectly.  Currently, they are passed in registers
when compiled with clang.

Reviewers: rnk, mstorsjo, TomTan, haripul, javed.absar

Reviewed By: rnk, mstorsjo

Subscribers: kristof.beyls, chrib, llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D49770

llvm-svn: 338076
2018-07-26 22:18:28 +00:00
Mandeep Singh Grang 2a153101bf [COFF, ARM64] Decide when to mark struct returns as SRet
Summary:
Refer the MS ARM64 ABI Convention for the behavior for struct returns:
https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions#return-values

Reviewers: mstorsjo, compnerd, rnk, javed.absar, yinma, efriedma

Reviewed By: rnk, efriedma

Subscribers: haripul, TomTan, yinma, efriedma, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D49464

llvm-svn: 338050
2018-07-26 18:07:59 +00:00
Ana Pazos 1eee1b771f [RISCV] Add support for interrupt attribute
Summary:
Clang supports the GNU style ``__attribute__((interrupt))`` attribute  on RISCV targets.
Permissible values for this parameter are user, supervisor, and machine.
If there is no parameter, then it defaults to machine.
Reference: https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html
Based on initial patch by Zhaoshi Zheng.

Reviewers: asb, aaron.ballman

Reviewed By: asb, aaron.ballman

Subscribers: rkruppe, the_o, aaron.ballman, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, cfe-commits

Differential Revision: https://reviews.llvm.org/D48412

llvm-svn: 338045
2018-07-26 17:37:45 +00:00
Akira Hatanaka cb6a933c9b [CodeGen][ObjC] Make block copy/dispose helper functions exception-safe.
When an exception is thrown in a block copy helper function, captured
objects that have previously been copied should be destructed or
released. Similarly, captured objects that are yet to be released should
be released when an exception is thrown in a dispose helper function.

rdar://problem/42410255

Differential Revision: https://reviews.llvm.org/D49718

llvm-svn: 338041
2018-07-26 16:51:21 +00:00
Alexey Bataev 8521ff6ec4 [OPENMP] ThreadId in serialized parallel regions is 0.
The first argument for the parallel outlined functions, called as
serialized parallel regions, should be a pointer to the global thread id
that always is 0.

llvm-svn: 337957
2018-07-25 20:03:01 +00:00
JF Bastien 6508929da9 CodeGen: use non-zero memset when possible for automatic variables
Summary:
Right now automatic variables are either initialized with bzero followed by a few stores, or memcpy'd from a synthesized global. We end up encountering a fair amount of code where memcpy of non-zero byte patterns would be better than memcpy from a global because it touches less memory and generates a smaller binary. The optimizer could reason about this, but it's not really worth it when clang already knows.

This code could definitely be more clever but I'm not sure it's worth it. In particular we could track a histogram of bytes seen and figure out (as we do with bzero) if a memset could be followed by a handful of stores. Similarly, we could tune the heuristics for GlobalSize, but using the same as for bzero seems conservatively OK for now.

<rdar://problem/42563091>

Reviewers: dexonsmith

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D49771

llvm-svn: 337887
2018-07-25 04:29:03 +00:00
Shiva Chen 0ed11a9792 Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)"
This reverts commit 4288dd3bf082482e02c8a044c611c18168cb0180.

llvm-svn: 337803
2018-07-24 02:57:11 +00:00
Shiva Chen c50fbb9da7 [DebugInfo] Generate debug information for labels. (Fix PR37395)
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

After fixing PR37395.

Differential Revision: https://reviews.llvm.org/D45045

Patch by Hsiangkai Wang.

llvm-svn: 337800
2018-07-24 02:23:59 +00:00
Thomas Anderson b6d87cfe5f Borrow visibility from __fundamental_type_info for generated fundamental type infos
This is necessary so the clang gives hidden visibility to fundamental types when
-fvisibility=hidden is passed. Fixes
https://bugs.llvm.org/show_bug.cgi?id=35066

Differential Revision: https://reviews.llvm.org/D49109

llvm-svn: 337788
2018-07-24 00:43:47 +00:00
Richard Smith f66e4f7dbd Support lifetime-extension of conditional temporaries.
llvm-svn: 337767
2018-07-23 22:56:45 +00:00
Aaron Smith 044326c7fc [CodeGen] Record if a C++ record is a trivial type
Summary: This has a dependence on D45122

Reviewers: rnk, zturner, llvm-commits, aleksandr.urakov

Reviewed By: rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D45124

llvm-svn: 337736
2018-07-23 20:49:07 +00:00
Ivan A. Kosarev 8264bb8d34 [NEON] Fix support for vrndi_f32(), vrndiq_f32() and vrndns_f32() intrinsics
This patch adds support for vrndi_f32() and vrndiq_f32()
intrinsics in AArch32 mode and for vrndns_f32() intrinsic in
AArch64 mode.

Differential Revision: https://reviews.llvm.org/D48829

llvm-svn: 337690
2018-07-23 13:26:37 +00:00
Yaxun Liu e1bfbc589f [HIP] Support -fcuda-flush-denormals-to-zero for amdgcn
Differential Revision: https://reviews.llvm.org/D48287

llvm-svn: 337639
2018-07-21 02:02:22 +00:00
JF Bastien ed92f608bb [NFC] CodeGen: rename memset to bzero
The optimization looks for opportunities to emit bzero, not memset. Rename the functions accordingly (and clang-format the diff) because I want to add a fallback optimization which actually tries to generate memset. bzero is still better and it would confuse the code to merge both.

llvm-svn: 337636
2018-07-20 23:37:12 +00:00
Yaxun Liu f99752b66b [HIP] Register/unregister device fat binary only once
HIP generates one fat binary for all devices after linking. However, for each compilation
unit a ctor function is emitted which register the same fat binary. Measures need to be
taken to make sure the fat binary is only registered once.

Currently each ctor function calls __hipRegisterFatBinary and stores the returned value
to __hip_gpubin_handle. This patch changes the linkage of __hip_gpubin_handle to be linkonce
so that they are shared between LLVM modules. Then this patch adds check of value of
__hip_gpubin_handle to make sure __hipRegisterFatBinary is only called once. The code
is equivalent to

void *_gpubin_handle;
void ctor() {
  if (__hip_gpubin_handle == 0) {
    __hip_gpubin_handle = __hipRegisterFatBinary(...);
  }
  // register kernels and variables.
}
The patch also does similar change to dtors so that __hipUnregisterFatBinary
is called once.

Differential Revision: https://reviews.llvm.org/D49083

llvm-svn: 337631
2018-07-20 22:45:24 +00:00
Reid Kleckner 891b2714bc [codeview] Don't emit variable templates as class members
MSVC doesn't, so neither should we.

Fixes PR38004, which is a crash that happens when we try to emit debug
info for a still-dependent partial variable template specialization.

As a follow-up, we should review what we're doing for function and class
member templates. It looks like we don't filter those out, but I can't
seem to get clang to emit any.

llvm-svn: 337616
2018-07-20 20:55:00 +00:00
Akira Hatanaka dbfa453e41 [CodeGen][ObjC] Make copying and disposing of a non-escaping block
no-ops.

A non-escaping block on the stack will never be called after its
lifetime ends, so it doesn't have to be copied to the heap. To prevent
a non-escaping block from being copied to the heap, this patch sets
field 'isa' of the block object to NSConcreteGlobalBlock and sets the
BLOCK_IS_GLOBAL bit of field 'flags', which causes the runtime to treat
the block as if it were a global block (calling _Block_copy on the block
just returns the original block and calling _Block_release is a no-op).

Also, a new flag bit 'BLOCK_IS_NOESCAPE' is added, which allows the
runtime or tools to distinguish between true global blocks and
non-escaping blocks.

rdar://problem/39352313

Differential Revision: https://reviews.llvm.org/D49303

llvm-svn: 337580
2018-07-20 17:10:32 +00:00
Erich Keane 3efe00206f Implement cpu_dispatch/cpu_specific Multiversioning
As documented here: https://software.intel.com/en-us/node/682969 and
https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning
is an ICC feature that provides for function multiversioning.

This feature is implemented with two attributes: First, cpu_specific,
which specifies the individual function versions. Second, cpu_dispatch,
which specifies the location of the resolver function and the list of
resolvable functions.

This is valuable since it provides a mechanism where the resolver's TU
can be specified in one location, and the individual implementions
each in their own translation units.

The goal of this patch is to be source-compatible with ICC, so this
implementation diverges from the ICC implementation in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to
properly dispatch functions. This is is a valuable performance benefit
over the ICC implementation. A future patch will be provided to enable
this feature on Windows, but it will obviously more closely fit ICC's
implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify
the feature list of the host processor. This patch uses the cpu_supports
functionality in order to better align with 'target' multiversioning.
1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function
marked cpu_dispatch be an empty definition. This patch supports that as well,
however declarations are also permitted, since the linker will solve the
issue of multiple emissions.

Differential Revision: https://reviews.llvm.org/D47474

llvm-svn: 337552
2018-07-20 14:13:28 +00:00
Fangrui Song 99337e246c Change \t to spaces
llvm-svn: 337530
2018-07-20 08:19:20 +00:00
Richard Smith 4c6568869e Fix typo causing assert in self-host.
llvm-svn: 337508
2018-07-19 23:24:41 +00:00
Richard Smith 83497d9ead When we choose to use zeroinitializer for a trailing portion of an array
constant, don't convert the rest into a packed struct.

If an array constant has a large non-zero portion and a large zero
portion, we want to emit the first part as an array and the rest as a
zeroinitializer if possible. This fixes a memory usage regression from
r333141 when compiling PHP.

llvm-svn: 337498
2018-07-19 21:38:56 +00:00
Nico Weber f29044536d fix typo in comment
llvm-svn: 337480
2018-07-19 18:59:38 +00:00
Erich Keane e69755a55f Fix unused variable warning.
llvm-svn: 337473
2018-07-19 17:19:16 +00:00
Alexey Bataev b363813543 The patch adds support for the new map interface between clang and libomptarget. The changes in the interface are the following:
device IDs are now 64-bit integers (as opposed to 32-bit)
map flags are 64-bit long (used to be 32-bit)
mappings for partially mapped structs are now calculated at compile time and members of partially mapped structs are flagged using the MEMBER_OF field
Support for is_device_ptr on struct members was dropped - this functionality is not supported by the OpenMP standard and its implementation is technically infeasible (however, use_device_ptr on struct members works as a non-standard extension of the compiler)

llvm-svn: 337468
2018-07-19 16:34:13 +00:00
Pavel Labath 45a8dfacf4 [CodeGen] Disable aggressive structor optimizations at -O0, take 3
The previous version of this patch (r332839) was reverted because it was
causing "definition with same mangled name as another definition" errors
in some module builds. This was caused by an unrelated bug in module
importing which it exposed. The importing problem was fixed in r336240,
so this recommits the original patch (r332839).

Differential Revision: https://reviews.llvm.org/D46685

llvm-svn: 337456
2018-07-19 14:05:22 +00:00
Nemanja Ivanovic 2600b839d5 NFC: Remove extraneous semicolons as pointed out in the differential review
The commit for
https://reviews.llvm.org/D49424
missed the comment about the extraneous semicolons. Remove them.

llvm-svn: 337451
2018-07-19 12:49:27 +00:00
Nemanja Ivanovic 1ac56bd33f [PowerPC] Handle __builtin_xxpermdi the same way as GCC does
The codegen for this builtin was initially implemented to match GCC.
However, due to interest from users GCC changed behaviour to account for the
big endian bias of the instruction and correct it. This patch brings the
handling inline with GCC.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38192

Differential Revision: https://reviews.llvm.org/D49424

llvm-svn: 337449
2018-07-19 12:44:15 +00:00
Manoj Gupta da08f6ac16 [clang]: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in as the function attribute
"null-pointer-is-valid"="true".
This CL only adds the attribute on the function.
It also strips "nonnull" attributes from function arguments but
keeps the related warnings unchanged.

Corresponding LLVM change rL336613 already updated the
optimizations to not treat null pointer dereferencing
as undefined if the attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: jyknight

Subscribers: drinkcat, xbolva00, cfe-commits

Differential Revision: https://reviews.llvm.org/D47894

llvm-svn: 337433
2018-07-19 00:44:52 +00:00
Erich Keane 7963e8bebb Add support for __declspec(code_seg("segname"))
This patch uses CodeSegAttr to represent __declspec(code_seg) rather than 
building on the existing support for #pragma code_seg.
The code_seg declspec is applied on functions and classes. This attribute 
enables the placement of code into separate named segments, including compiler-
generated codes and template instantiations.

For more information, please see the following:
https://msdn.microsoft.com/en-us/library/dn636922.aspx

This patch fixes the regression for the support for attribute ((section).
746b78de78

Patch by Soumi Manna (Manna)
Differential Revision: https://reviews.llvm.org/D48841

llvm-svn: 337420
2018-07-18 20:04:48 +00:00
Peter Collingbourne 14b468bab6 Re-land r337333, "Teach Clang to emit address-significance tables.",
which was reverted in r337336.

The problem that required a revert was fixed in r337338.

Also added a missing "REQUIRES: x86-registered-target" to one of
the tests.

Original commit message:
> Teach Clang to emit address-significance tables.
>
> By default, we emit an address-significance table on all ELF
> targets when the integrated assembler is enabled. The emission of an
> address-significance table can be controlled with the -faddrsig and
> -fno-addrsig flags.
>
> Differential Revision: https://reviews.llvm.org/D48155

llvm-svn: 337339
2018-07-18 00:27:07 +00:00
Peter Collingbourne 35c6996b68 Revert r337333, "Teach Clang to emit address-significance tables."
Causing multiple failures on sanitizer bots due to TLS symbol errors,
e.g.

/usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o

llvm-svn: 337336
2018-07-17 23:56:30 +00:00
Peter Collingbourne 27242c0402 Teach Clang to emit address-significance tables.
By default, we emit an address-significance table on all ELF
targets when the integrated assembler is enabled. The emission of an
address-significance table can be controlled with the -faddrsig and
-fno-addrsig flags.

Differential Revision: https://reviews.llvm.org/D48155

llvm-svn: 337333
2018-07-17 23:17:16 +00:00
Richard Smith 7027ffa85f Replace LLVM_ALIGNAS with just alignas.
Various places in Clang and LLVM are already using alignas; it seems
our minimum host configuration now requires it.

llvm-svn: 337330
2018-07-17 22:24:11 +00:00
Mandeep Singh Grang 0054f48b44 [COFF] Add more missing MSVC ARM64 intrinsics
Summary:
Added the following intrinsics:
_BitScanForward, _BitScanReverse, _BitScanForward64, _BitScanReverse64
_InterlockedAnd64, _InterlockedDecrement64, _InterlockedExchange64,
_InterlockedExchangeAdd64, _InterlockedExchangeSub64,
_InterlockedIncrement64, _InterlockedOr64, _InterlockedXor64.

Reviewers: compnerd, mstorsjo, rnk, javed.absar

Reviewed By: mstorsjo

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D49445

llvm-svn: 337327
2018-07-17 22:03:24 +00:00
Alexey Bataev c52f01d1d7 [OPENMP] Fix checks for declare target link entries.
If the declare target link entries are created but not used, the
compiler will produce an error message. Patch improves handling of such
situations + improves checks for possibly lost declare target variables.

llvm-svn: 337207
2018-07-16 20:05:25 +00:00
Alexey Bataev 7f01d20993 [OPENMP] Fix syntactic errors in error messages.
Fixed spelling of the offloading error messages.

llvm-svn: 337196
2018-07-16 18:12:18 +00:00
Alexey Bataev 3dd1f9d61d [OPENMP, NVPTX] Globalize only captured variables.
Sometimes we can try to globalize non-variable declarations, which may
lead to compiler crash.

llvm-svn: 337191
2018-07-16 16:49:20 +00:00
Teresa Johnson b1d17f64e5 Restore "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commit r337082, restoring r337051, since the LLVM side
patch has been restored.

llvm-svn: 337185
2018-07-16 15:30:36 +00:00
Teresa Johnson 70993d37e8 Revert "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commit r337051.

llvm-svn: 337082
2018-07-14 01:50:14 +00:00
Teresa Johnson 9fe8af7e00 [ThinLTO] Ensure we always select the same function copy to import
Clang change to reflect the FunctionsToImportTy type change
in the llvm changes for D48670.

llvm-svn: 337051
2018-07-13 21:35:58 +00:00
JF Bastien 9aab85a6a0 CodeGen: specify alignment + inbounds for automatic variable initialization
Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds.

Subscribers: dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D49209

llvm-svn: 337041
2018-07-13 20:33:23 +00:00
Gheorghe-Teodor Bercea ad4e579407 [OpenMP] Initialize data sharing stack for SPMD case
Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D49188

llvm-svn: 337015
2018-07-13 16:18:24 +00:00
Petr Pavlu a934f9da41 Fix setting of empty implicit-section-name attribute
Code in `CodeGenModule::SetFunctionAttributes()` could set an empty
attribute `implicit-section-name` on a function that is affected by
`#pragma clang text="section"`. This is incorrect because the attribute
should contain a valid section name. If the function additionally also
used `__attribute__((section("section")))` then this could result in
emitting the function in a section with an empty name.

The patch fixes the issue by removing the problematic code that sets
empty `implicit-section-name` from
`CodeGenModule::SetFunctionAttributes()` because it is sufficient to set
this attribute only from a similar code in `setNonAliasAttributes()`
when the function is emitted.

Differential Revision: https://reviews.llvm.org/D48916

llvm-svn: 336842
2018-07-11 20:17:54 +00:00
JF Bastien f014bdc199 [NFC] typo
llvm-svn: 336840
2018-07-11 19:51:40 +00:00
Erich Keane be65e874fe [NFC] Switch CodeGenFunction to use value init instead of member init lists
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.

Note: This is what was intended to be committed in r336726
llvm-svn: 336729
2018-07-10 21:07:50 +00:00
Erich Keane 9960b8f13a Revert -r336726, which included more files than intended.
llvm-svn: 336727
2018-07-10 20:51:41 +00:00
Erich Keane 7b8c12e7cc [NFC] Switch CodeGenFunction to use value init instead of member init lists
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.

llvm-svn: 336726
2018-07-10 20:46:46 +00:00
Bjorn Pettersson 404f414ee1 Patch to fix pragma metadata for do-while loops
Summary:
Make sure that loop metadata only is put on the backedge
when expanding a do-while loop.
Previously we added the loop metadata also on the branch
in the pre-header. That could confuse optimization passes
and result in the loop metadata being associated with the
wrong loop.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38011

Committing on behalf of deepak2427 (Deepak Panickal)

Reviewers: #clang, ABataev, hfinkel, aaron.ballman, bjope

Reviewed By: bjope

Subscribers: bjope, rsmith, shenhan, zzheng, xbolva00, lebedev.ri, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D48721

llvm-svn: 336717
2018-07-10 19:55:02 +00:00
Craig Topper 1faf953d75 [X86] Remove custom handling for __builtin_ia32_divss_round_mask and __builtin_ia32_divsd_round_mask.
llvm-svn: 336628
2018-07-10 00:50:03 +00:00
Craig Topper 638426fc36 [X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that is suitable for use in scalar mask intrinsics.
This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)" and ternary operator used in some of intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.

This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.

I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle.

llvm-svn: 336622
2018-07-10 00:37:25 +00:00
Craig Topper 74c10e3236 [Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.

Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.

To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.

To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.

To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.

To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.

There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.

Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.

Differential Revision: https://reviews.llvm.org/D48617

llvm-svn: 336583
2018-07-09 19:00:16 +00:00
Alexey Bataev b99dcb5f31 [OPENMP, NVPTX] Do not globalize local variables in parallel regions.
In generic data-sharing mode we are allowed to not globalize local
variables that escape their declaration context iff they are declared
inside of the parallel region. We can do this because L2 parallel
regions are executed sequentially and, thus, we do not need to put
shared local variables in the global memory.

llvm-svn: 336567
2018-07-09 17:43:58 +00:00
Craig Topper 8a8d72794f [X86] Add new scalar fma intrinsics with rounding mode that use f32/f64 types.
This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma

llvm-svn: 336507
2018-07-08 01:10:47 +00:00
Craig Topper f89f62a680 [X86] When creating a select for scalar masked sqrt and div builtins make sure we optimize the all ones mask case.
This case occurs in the intrinsic headers so we should avoid emitting the mask in those cases.

Factor the code into a helper function to make this easy.

llvm-svn: 336472
2018-07-06 22:46:52 +00:00
Craig Topper be4c2933a2 [X86] Implement _builtin_ia32_vfmaddss and _builtin_ia32_vfmaddsd with native IR using llvm.fma intrinsic.
This generates some extra zeroing currently, but we should be able to quickly address that with some isel patterns.

llvm-svn: 336417
2018-07-06 07:14:47 +00:00
Craig Topper 284c5f342c [X86] Use shufflevector instead of a select with a constant mask for fmaddsub/fmsubadd IR emission.
Shufflevector is easier to generate and matches what the backend pattern matches without relying on constant selects being turned into shuffles.

While I was there I also made the IR regular expressions a little stricter to ensure operand order on the shuffle.

llvm-svn: 336388
2018-07-05 20:38:31 +00:00
Gabor Buella 9679eb6527 [X86] Fix some vector cmp builtins - TRUE/FALSE predicates
This patch removes on optimization used with the TRUE/FALSE
predicates, as was suggested in https://reviews.llvm.org/D45616
for r335339.
The optimization was buggy, since r335339 used it also
for *_mask builtins, without actually applying the mask -- the
mask argument was just ignored.

Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D48715

llvm-svn: 336355
2018-07-05 14:26:56 +00:00
Lei Huang 449252d2ad [Power9] Update fp128 as a valid homogenous aggregate base type
Update clang to treat fp128 as a valid base type for homogeneous aggregate
passing and returning.

Differential Revision: https://reviews.llvm.org/D48044

llvm-svn: 336308
2018-07-05 04:32:01 +00:00
Piotr Padlewski 0705829e40 [CodeGenCXX] Emit strip.invariant.group with -fstrict-vtable-pointers
Summary:
Emmiting new intrinsic that strips invariant.groups to make
devirtulization sound, as described in RFC: Devirtualization v2.

Reviewers: rjmccall, rsmith, amharc, kuhar

Subscribers: llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D47299

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336137
2018-07-02 19:21:36 +00:00
Craig Topper 8bf793fb35 [X86] Remove masking from the avx512 packed sqrt builtins. Use select builtins instead.
llvm-svn: 335945
2018-06-29 05:43:33 +00:00
Artem Belevich 5ce0a08cf6 [CUDA] Place all CUDA sections in __NV_CUDA segment on Mac.
That's where CUDA binaries appear to put them.

Differential Revision: https://reviews.llvm.org/D48615

llvm-svn: 335880
2018-06-28 17:15:52 +00:00
Jonas Devlieghere 9ef4965a0d [DebugInfo] Follow-up commit to improve consistency. NFC
Follow-up commit for r335757 to address some inconsistencies.

llvm-svn: 335834
2018-06-28 10:56:40 +00:00
Artem Belevich c66d254ded [CUDA] Use atexit() to call module destructor.
This matches the way NVCC does it. Doing module cleanup at global
destructor phase used to work, but is, apparently, too late for
the CUDA runtime in CUDA-9.2, which ends up crashing with double-free.

Differential Revision: https://reviews.llvm.org/D48613

llvm-svn: 335763
2018-06-27 18:32:51 +00:00
Jonas Devlieghere d8ba8ae875 [DebugInfo] Emit ObjC methods as part of interface
As brought up during the discussion of the DWARF5 accelerator tables,
there is currently no way to associate Objective-C methods with the
interface they belong to, other than the .apple_objc accelerator table.

After due consideration we came to the conclusion that it makes more
sense to follow Pavel's suggestion of just emitting this information in
the .debug_info section. One concern was that categories were
emitted in the .apple_names as well, but it turns out that LLDB doesn't
rely on the accelerator tables for this information.

This patch changes the codegen behavior to emit subprograms for
structure types, like we do for C++. This will result in the
DW_TAG_subprogram being nested as a child under its
DW_TAG_structure_type. This behavior is only enabled for DWARF5 and
later, so we can have a unique code path in LLDB with regards to
obtaining the class methods.

This was tested on the LLDB side and doesn't lead to a regression.
There's already code in place to deal with member functions in C++,
which deals with this transparently.

For more background please refer to the discussion on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123986.html

Differential revision: https://reviews.llvm.org/D48241

llvm-svn: 335757
2018-06-27 17:31:59 +00:00
Craig Topper 851f363691 [X86] Rename llvm.x86.avx512.mask.fpclass.p* to exclude 'mask.' from the name to match llvm.
llvm-svn: 335745
2018-06-27 15:57:57 +00:00
Ivan A. Kosarev a9f484ac4a [NEON] Support vldNq intrinsics in AArch32 (Clang part)
This patch reworks the support for dup NEON intrinsics as
described in D48439.

Differential Revision: https://reviews.llvm.org/D48440

llvm-svn: 335734
2018-06-27 13:58:43 +00:00
Richard Smith bf5bcf2c15 Diagnose missing 'template' keywords in more cases.
We track when we see a name-shaped expression followed by a '<' token
and parse the '<' as a comparison. Then:

 * if we see a token sequence that cannot possibly be an expression but
   can be a template argument (in particular, a type-id) that follows
   either a ',' or the '<', diagnose that the '<' was supposed to start
   a template argument list, and
 * if we see '>()', diagnose that the '<' was supposed to start a
   template argument list.

This only changes the diagnostic for error cases, and in practice
appears to catch the most common cases where a missing 'template'
keyword leads to parse errors within a template.

Differential Revision: https://reviews.llvm.org/D48571

llvm-svn: 335687
2018-06-26 23:20:26 +00:00
Evgeniy Stepanov c69e067668 Revert "[MS] Use mangled names and comdats for string merging with ASan"
Depends on r334313, which has been reverted in r335681.

llvm-svn: 335684
2018-06-26 23:10:48 +00:00
Peter Collingbourne 7a17a8ba1e Compile CodeGenModule.cpp with /bigobj.
Apparently we're now hitting an object file section limit on this
file with expensive checks enabled.

llvm-svn: 335636
2018-06-26 17:45:26 +00:00
Alexey Bataev 91433f6877 [OPENMP, NVPTX] Reduce the number of the globalized variables.
Patch tries to make better analysis of the variables that should be
globalized. From now, instead of all parallel directives it will check
only distribute parallel .. directives and check only for
firstprivte/lastprivate variables if they must be globalized.

llvm-svn: 335632
2018-06-26 17:24:03 +00:00
Peter Collingbourne e44acadf6a Implement CFI for indirect calls via a member function pointer.
Similarly to CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible.  The basic way that it works is as follows, where `C`
is the name of the class being defined or the target of a call and
the function type is assumed to be `void()`.

For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
  (not the functions themselves) of type `void (B::*)()` for each `B`
  that is a recursive dynamic base class of `C`, including `C` itself.
  This type metadata has an annotation that the type is for virtual
  calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
  pointer in the vtable has type `void (C::*)()`.

For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
  can be taken with a member function pointer. The type of a function
  in class `C` of type `void()` is each of the types `void (B::*)()`
  where `B` is a most-base class of `C`. A most-base class of `C`
  is defined as a recursive base class of `C`, including `C` itself,
  that does not have any bases.
- At the call site, check that the function pointer has one of the types
  `void (B::*)()` where `B` is a most-base class of `C`.

Differential Revision: https://reviews.llvm.org/D47567

llvm-svn: 335569
2018-06-26 02:15:47 +00:00
Craig Topper 4ef61aecbd [X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction.
Additional IR is emitted to convert between scalar and vXi1 type to match the expected software inferface for the builtin that clang exposes.

llvm-svn: 335564
2018-06-26 00:44:02 +00:00
Sam Clegg 6fd7d680b0 [WebAssembly] Add no-prototype attribute to prototype-less C functions
The WebAssembly backend in particular benefits from being
able to distinguish between varargs functions (...) and prototype-less
C functions.

Differential Revision: https://reviews.llvm.org/D48443

llvm-svn: 335510
2018-06-25 18:47:32 +00:00
Alexey Bataev 96edb2e37e [OPENMP] Do not consider address constant vars as possibly
threadprivate.

Do not delay emission of the address constant variables in OpenMP mode
as they cannot be defined as threadprivate.

llvm-svn: 335483
2018-06-25 15:32:05 +00:00
Igor Kudrin eff8f9d178 [CodeGen] Provide source locations for UBSan type checks when emitting constructor calls.
Differential Revision: https://reviews.llvm.org/D48531

llvm-svn: 335445
2018-06-25 05:48:04 +00:00
Brian Gesiak 12728474b3 [Coroutines] Less IR for noexcept await_resume
Summary:
In his review of https://reviews.llvm.org/D45860, @GorNishanov suggested
avoiding generating additional exception-handling IR in the case that
the resume function was marked as 'noexcept', and exceptions could not
occur. This implements that suggestion.

Test Plan: `check-clang`

Reviewers: GorNishanov, EricWF

Reviewed By: GorNishanov

Subscribers: cfe-commits, GorNishanov

Differential Revision: https://reviews.llvm.org/D47673

llvm-svn: 335422
2018-06-23 18:57:26 +00:00
Tobias Edler von Koch 7609cb83e6 Re-land "[LTO] Enable module summary emission by default for regular LTO"
Since we are now producing a summary also for regular LTO builds, we
need to run the NameAnonGlobals pass in those cases as well (the
summary cannot handle anonymous globals).

See https://reviews.llvm.org/D34156 for details on the original change.

This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.

llvm-svn: 335385
2018-06-22 20:23:21 +00:00
Alexey Bataev 12c62908b5 [OPENMP, NVPTX] Fix reduction of the big data types/structures.
If the shuffle is required for the reduced structures/big data type,
current code may cause compiler crash because of the loading of the
aggregate values. Patch fixes this problem.

llvm-svn: 335377
2018-06-22 19:10:38 +00:00
Gabor Buella 716863c820 [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR
Summary:
Lowering some vector comparision builtins to fcmp IR instructions.
This ignores the signaling behaviour specified in the predicate
argument of said builtins.

Affected AVX512 builtins:

__builtin_ia32_cmpps128_mask
__builtin_ia32_cmpps256_mask
__builtin_ia32_cmpps512_mask
__builtin_ia32_cmppd128_mask
__builtin_ia32_cmppd256_mask
__builtin_ia32_cmppd512_mask

Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma

Reviewed By: craig.topper, spatel, efriedma

Differential Revision: https://reviews.llvm.org/D45616

llvm-svn: 335339
2018-06-22 11:59:16 +00:00
Craig Topper 342b095689 [X86] Update handling in CGBuiltin to be tolerant of out of range immediates.
D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled.

This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled.

llvm-svn: 335308
2018-06-21 23:39:47 +00:00
Evgeniy Stepanov fb762b27f2 Ignore blacklist when generating __cfi_check_fail.
Summary: Fixes PR37898.

Reviewers: pcc, vlad.tsyrklevich

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D48454

llvm-svn: 335305
2018-06-21 23:22:37 +00:00
Tobias Edler von Koch e597a2cf81 Revert "[LTO] Enable module summary emission by default for regular LTO"
This is breaking a couple of buildbots. We need to run the
NameAnonGlobal pass for regular LTO now as well (since we're producing a
summary). I'll post a separate patch for review to make this happen and
then re-commit.

This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.

llvm-svn: 335291
2018-06-21 21:24:30 +00:00
Alexey Bataev 4065b9ae48 [OPENMP, NVPTX] Fix globalization of the variables passed to orphaned
parallel region.

If the current construct requires sharing of the local variable in the
inner parallel region, this variable must be globalized to avoid
runtime crash.

llvm-svn: 335285
2018-06-21 20:26:33 +00:00
Tobias Edler von Koch 9a8be606f3 [LTO] Enable module summary emission by default for regular LTO
Summary:
With D33921, we gained the ability to have module summaries in regular
LTO modules without triggering ThinLTO compilation. Module summaries in
regular LTO allow garbage collection (dead stripping) before LTO
compilation and thus open up additional optimization opportunities.

This patch enables summary emission in regular LTO for all targets
except ld64-based ones (which use the legacy LTO API).

Reviewers: pcc, tejohnson, mehdi_amini

Subscribers: inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D34156

llvm-svn: 335284
2018-06-21 20:20:41 +00:00
Anastasis Grammenos dfe8fe503c [DebugInfo] Inline for without DebugLocation
Summary:
This test is a strip down version of a function inside the
amalgamated sqlite source. When converted to IR clang produces
a phi instruction without debug location.

This patch fixes the above issue.

Differential Revision: https://reviews.llvm.org/D47720

llvm-svn: 335255
2018-06-21 16:53:48 +00:00
Leonard Chan db01c3adc6 [Fixed Point Arithmetic] Fixed Point Precision Bits and Fixed Point Literals
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.

Fixed point literals are declared using the suffixes

```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values

```
unsigned short _Accum u_short_accum = 256.0uhk;   // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```

Differential Revision: https://reviews.llvm.org/D46915

llvm-svn: 335148
2018-06-20 17:19:40 +00:00
Peter Collingbourne d914fd2163 IRgen: Mark aliases of ctors and dtors as unnamed_addr.
This is not only semantically correct but ensures that they will not
be marked as address-significant once D48155 lands.

Differential Revision: https://reviews.llvm.org/D48206

llvm-svn: 334982
2018-06-18 20:58:54 +00:00
Tomasz Krupa 83ba6fa98d Fix a bug introduced by rL334850
Summary: All *_sqrt_round_s[s|d] intrinsics should execute a square root on
zeroth element from B (Ops[1]) and insert in to A (Ops[0]), not the other way around.

Reviewers: itaraban, craig.topper

Reviewed By: craig.topper

Subscribers: craig.topper, cfe-commits

Differential Revision: https://reviews.llvm.org/D48288

llvm-svn: 334964
2018-06-18 17:57:05 +00:00
Alexey Bataev 7b55d2d554 [OPENMP, NVPTX] Emit simple reduction if requested.
If simple reduction is requested, use the simple reduction instead of
the runtime functions calls.

llvm-svn: 334962
2018-06-18 17:11:45 +00:00
Yaxun Liu cbd80f49d5 Call CreateTempAllocaWithoutCast for ActiveFlag
This is partial re-commit of r332982.

llvm-svn: 334879
2018-06-16 01:20:52 +00:00
Tomasz Krupa f1792bb3d6 [X86] Lowering sqrt intrinsics to native IR
Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k

Reviewed By: craig.topper

Subscribers: tkrupa, cfe-commits

Differential Revision: https://reviews.llvm.org/D41168

llvm-svn: 334850
2018-06-15 18:05:59 +00:00
Yaxun Liu aefdb8ed34 [NFC] Add CreateMemTempWithoutCast and CreateTempAllocaWithoutCast
This is partial re-commit of r332982

llvm-svn: 334837
2018-06-15 15:33:22 +00:00
Luke Geeson da2b2e8c26 [AArch64] Reverted rC334696 with Clang VCVTA test fix
llvm-svn: 334820
2018-06-15 10:10:45 +00:00
Craig Topper 31730ae761 [X86] Rename __builtin_ia32_pslldqi128 to __builtin_ia32_pslldqi128_byteshift and similar for other sizes. Remove the multiply by 8 from the header files.
The previous names took the shift amount in bits to match gcc and required a multiply by 8 in the header. This creates a misleading error message when we check the range of the immediate to the builtin since the allowed range also got multiplied by 8.

This commit changes the builtins to use a byte shift amount to match the underlying instruction and the Intel intrinsic.

Fixes the remaining issue from PR37795.

llvm-svn: 334773
2018-06-14 22:02:35 +00:00
Tomasz Krupa 82aa42af49 [X86] Lowering Mask Scalar intrinsics to native IR (Clang part)
Summary: Lowering add, sub, mul, and div mask scalar intrinsic calls
to native IR.

Reviewers: craig.topper, RKSimon, spatel, sroland

Reviewed By: craig.topper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47979

llvm-svn: 334741
2018-06-14 17:36:23 +00:00
Leonard Chan ab80f3c8b7 [Fixed Point Arithmetic] Addition of the remaining fixed point types and their saturated equivalents
This diff includes changes for the remaining _Fract and _Sat fixed point types.

```
signed short _Fract s_short_fract;
signed _Fract s_fract;
signed long _Fract s_long_fract;
unsigned short _Fract u_short_fract;
unsigned _Fract u_fract;
unsigned long _Fract u_long_fract;

// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
short _Fract short_fract;
_Fract fract;
long _Fract long_fract;

// Saturated fixed point types
_Sat signed short _Accum sat_s_short_accum;
_Sat signed _Accum sat_s_accum;
_Sat signed long _Accum sat_s_long_accum;
_Sat unsigned short _Accum sat_u_short_accum;
_Sat unsigned _Accum sat_u_accum;
_Sat unsigned long _Accum sat_u_long_accum;
_Sat signed short _Fract sat_s_short_fract;
_Sat signed _Fract sat_s_fract;
_Sat signed long _Fract sat_s_long_fract;
_Sat unsigned short _Fract sat_u_short_fract;
_Sat unsigned _Fract sat_u_fract;
_Sat unsigned long _Fract sat_u_long_fract;

// Aliased saturated fixed point types
_Sat short _Accum sat_short_accum;
_Sat _Accum sat_accum;
_Sat long _Accum sat_long_accum;
_Sat short _Fract sat_short_fract;
_Sat _Fract sat_fract;
_Sat long _Fract sat_long_fract;
```

This diff only allows for declaration of these fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches.

Differential Revision: https://reviews.llvm.org/D46911

llvm-svn: 334718
2018-06-14 14:53:51 +00:00
Luke Geeson bb399f8013 [AArch64] reverting rC334693 due to build failures
llvm-svn: 334696
2018-06-14 08:59:33 +00:00
Luke Geeson 010bbbf390 [AArch64] Added support for the vcvta_u16_f16 instrinsic for FP16 Armv8.2-A
llvm-svn: 334693
2018-06-14 08:28:56 +00:00
Mandeep Singh Grang 2d28383097 [COFF] Add ARM64 intrinsics: __yield, __wfe, __wfi, __sev, __sevl
Summary: These intrinsics result in hint instructions. They are provided here for MSVC ARM64 compatibility.

Reviewers: mstorsjo, compnerd, javed.absar

Reviewed By: mstorsjo

Subscribers: kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D48132

llvm-svn: 334639
2018-06-13 18:49:35 +00:00
Piotr Padlewski e368de364e Add -fforce-emit-vtables
Summary:
 In many cases we can't devirtualize
 because definition of vtable is not present. Most of the
 time it is caused by inline virtual function not beeing
 emitted. Forcing emitting of vtable adds a reference of these
 inline virtual functions.
 Note that GCC was always doing it.

Reviewers: rjmccall, rsmith, amharc, kuhar

Subscribers: llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D47108

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 334600
2018-06-13 13:55:42 +00:00
Richard Smith 8fa638ae6f Fix crash emitting transparent list initializer for a large aggregate.
llvm-svn: 334565
2018-06-13 02:06:28 +00:00
Yaxun Liu 6c10a66ec7 [CUDA][HIP] Set kernel calling convention before arrange function
Currently clang set kernel calling convention for CUDA/HIP after
arranging function, which causes incorrect kernel function type since
it depends on calling convention.

This patch moves setting kernel convention before arranging
function.

Differential Revision: https://reviews.llvm.org/D47733

llvm-svn: 334457
2018-06-12 00:16:33 +00:00
Craig Topper 201b9dd334 [X86] Fix operand order in the shuffle created for blend builtins.
This was broken when the builtin was added in r334249.

llvm-svn: 334422
2018-06-11 17:06:01 +00:00
Reid Kleckner 3513fdcc0f [MS] Use mangled names and comdats for string merging with ASan
This should reduce the binary size penalty of ASan on Windows. After
r334313, ASan will add red zones to globals in comdats, so we will still
find OOB accesses to string literals.

llvm-svn: 334417
2018-06-11 16:49:43 +00:00
Craig Topper 3cce6a7ed9 [X86] Use target independent masked expandload and compressstore intrinsics to implement expandload/compressstore builtins.
Summary: We've had these target independent intrinsics for at least a year and a half. Looks like they do exactly what we need here and the backend already supports them.

Reviewers: RKSimon, delena, spatel, GBuella

Reviewed By: RKSimon

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D47693

llvm-svn: 334366
2018-06-10 17:27:05 +00:00
Ivan A. Kosarev 73c76c35a5 [NEON] Support VST1xN intrinsics in AArch32 mode (Clang part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47446

llvm-svn: 334362
2018-06-10 09:28:10 +00:00
Craig Topper 7c89d046ea Use SmallPtrSet instead of SmallSet in places where we iterate over the set.
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.

These places were found by hiding the begin/end methods in the SmallSet forwarding.

llvm-svn: 334339
2018-06-09 00:30:45 +00:00
Craig Topper 88097d9355 [X86] Add back some masked vector truncate builtins. Custom IRgen a a few others.
I'd like to make the select builtins require an avx512f, avx512bw, or avx512vl fature to match what is normally required to get masking. Truncate is special in that there are instructions with a 128/256-bit masked result even without avx512vl.

By using special buitlins we can emit a select without using the 128/256-bit select builtins.

llvm-svn: 334331
2018-06-08 21:50:08 +00:00
Craig Topper 5f50f33806 [X86] Fold masking into subvector extract builtins.
I'm looking into making the select builtins require avx512f, avx512bw, or avx512vl since masking operations generally require those features.

The extract builtins are funny because the 512-bit versions return a 128 or 256 bit vector with masking even when avx512vl is not supported.

llvm-svn: 334330
2018-06-08 21:50:07 +00:00
Craig Topper 03f4f04b91 [X86] Add builtins for vpermq/vpermpd instructions to enable target feature checking.
llvm-svn: 334311
2018-06-08 18:00:25 +00:00
Jonas Hahnfeld 3b9cbba9a8 [CUDA] Fix emission of constant strings in sections
CGM.GetAddrOfConstantCString() sets the adress of the created GlobalValue
to unnamed. When emitting the object file LLVM will mark the surrounding
section as SHF_MERGE iff the string is nul-terminated and contains no
other nuls (see IsNullTerminatedString). This results in problems when
saving temporaries because LLVM doesn't set an EntrySize, so reading in
the serialized assembly file fails.
This never happened for the GPU binaries because they usually contain
a nul-character somewhere. Instead this only affected the module ID
when compiling relocatable device code.

However, this points to a potentially larger problem: If we put a
constant string into a named section, we really want the data to end
up in that section in the object file. To avoid LLVM merging sections
this patch unmarks the GlobalVariable's address as unnamed which also
fixes the problem of invalid serialized assembly files when saving
temporaries.

Differential Revision: https://reviews.llvm.org/D47902

llvm-svn: 334281
2018-06-08 11:17:08 +00:00
Craig Topper 422a1bbb84 [X86] Add builtins for shufps and shufpd to enable target feature and immediate range checking.
llvm-svn: 334266
2018-06-08 07:18:33 +00:00
Craig Topper 03de166ccd [X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking.
llvm-svn: 334265
2018-06-08 06:13:16 +00:00
Craig Topper 3428beeb2f [X86] Add subvector insert and extract builtins to enable target feature checking and immediate range checking.
Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc.

llvm-svn: 334261
2018-06-08 03:24:47 +00:00
Craig Topper acf5601961 [X86] Add builtins for vpermilps/pd instructions to enable target feature checking.
llvm-svn: 334256
2018-06-08 00:59:27 +00:00
Shoaib Meenai a5fc603379 [CodeGen] Always use MSVC personality for windows-msvc targets
The windows-msvc target is meant to be ABI compatible with MSVC,
including the exception handling. Ensure that a windows-msvc triple
always equates to the MSVC personality being used.

This mostly affects the GNUStep and ObjFW Obj-C runtimes. To the best of
my knowledge, those are normally not used with windows-msvc triples. I
believe WinObjC is based on GNUStep (or it at least uses libobjc2), but
that also takes the approach of wrapping Obj-C exceptions in C++
exceptions, so the MSVC personality function is the right one to use
there as well.

Differential Revision: https://reviews.llvm.org/D47862

llvm-svn: 334253
2018-06-08 00:41:01 +00:00
Craig Topper 7d17d7278b [X86] Add builtins for blend with immediate control to enforce target feature requirements and check immediate range.
llvm-svn: 334249
2018-06-08 00:00:21 +00:00
Craig Topper 9392136414 [X86] Add builtins for shuff32x4/shuff64x2/shufi32x4/shuff64x2 to enable target feature checking and immediate range checking.
llvm-svn: 334244
2018-06-07 23:03:08 +00:00
Reid Kleckner aa46ed9278 [MS] Re-add support for the ARM interlocked bittest intrinscs
Adds support for these intrinsics, which are ARM and ARM64 only:
  _interlockedbittestandreset_acq
  _interlockedbittestandreset_rel
  _interlockedbittestandreset_nf
  _interlockedbittestandset_acq
  _interlockedbittestandset_rel
  _interlockedbittestandset_nf

Refactor the bittest intrinsic handling to decompose each intrinsic into
its action, its width, and its atomicity.

llvm-svn: 334239
2018-06-07 21:39:04 +00:00
Craig Topper e56819eb69 [X86] Add builtins for VALIGNQ/VALIGND to enable proper target feature checking.
We still emit shufflevector instructions we just do it from CGBuiltin.cpp now. This ensures the intrinsics that use this are only available on CPUs that support the feature.

I also added range checking to the immediate, but only checked it is 8 bits or smaller. We should maybe be stricter since we never use all 8 bits, but gcc doesn't seem to do that.

llvm-svn: 334237
2018-06-07 21:27:41 +00:00
Craig Topper d3623155a2 [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics.
We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.

This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.

llvm-svn: 334208
2018-06-07 17:28:03 +00:00
Gabor Buella 1a83d06768 [CodeGen] Improve diagnostics related to target attributes
Summary:
When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.

This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:

```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}

int main(void)
{
            x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
        x();
    ^
1 error generated.
```

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338

Reviewers: craig.topper, echristo, dblaikie

Reviewed By: craig.topper, echristo

Differential Revision: https://reviews.llvm.org/D46541

llvm-svn: 334174
2018-06-07 08:48:36 +00:00
Craig Topper b92c77d176 [X86] Add back _mask, _maskz, and _mask3 builtins for some 512-bit fmadd/fmsub/fmaddsub/fmsubadd builtins.
Summary:
We recently switch to using a selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over expanding macros. Or if its something we can CSE later it would show up multiple times when it shouldn't.

I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim.

The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform.

I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it.

Reviewers: tkrupa, RKSimon, spatel, GBuella

Reviewed By: tkrupa

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47724

llvm-svn: 334159
2018-06-07 02:46:02 +00:00
Reid Kleckner 11c99ed05f [MS][ARM64]: Promote _setjmp to_setjmpex as there is no _setjmp in the ARM64 libvcruntime.lib
Factor out the common setjmp call emission code.

Based on a patch by Chris January

Differential Revision: https://reviews.llvm.org/D47784

llvm-svn: 334112
2018-06-06 18:39:47 +00:00
Reid Kleckner 05df851327 Fix std::tuple errors
llvm-svn: 334060
2018-06-06 01:44:10 +00:00
Reid Kleckner 368d52b7e0 Implement bittest intrinsics generically for non-x86 platforms
I tested these locally on an x86 machine by disabling the inline asm
codepath and confirming that it does the same bitflips as we do with the
inline asm.

Addresses code review feedback.

llvm-svn: 334059
2018-06-06 01:35:08 +00:00
Craig Topper f3914b74c1 [X86] Add builtins for vector element insert and extract for different 128 and 256 bit vector types. Use them to implement the extract and insert intrinsics.
Previously we were just using extended vector operations in the header file.

This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load.

By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count.

User code still has the option to use extended vector operations themselves if they need non-constant indexing.

llvm-svn: 334057
2018-06-06 00:24:55 +00:00
Craig Topper 6b5b5ce06c [X86] Implement __builtin_ia32_vec_ext_v2si correctly even though we only use it with an index of 0.
This builtin takes an index as its second operand, but the codegen hardcodes an index of 0 and doesn't use the operand. The only use of the builtin in the header file passes 0 to the operand so this works for that usage. But its more correct to use the real operand.

llvm-svn: 334054
2018-06-05 22:40:03 +00:00
Yaxun Liu 6328f9a988 [CUDA][HIP] Do not emit type info when compiling for device
CUDA/HIP does not support RTTI on device side, therefore there
is no point of emitting type info when compiling for device.

Emitting type info for device not only clutters the IR with useless
global variables, but also causes undefined symbol at linking
since vtable for cxxabiv1::class_type_info has external linkage.

Differential Revision: https://reviews.llvm.org/D47694

llvm-svn: 334021
2018-06-05 15:11:02 +00:00
Reid Kleckner 1d9c249db5 Reimplement the bittest intrinsic family as builtins with inline asm
We need to implement _interlockedbittestandset as a builtin for
windows.h, so we might as well do the whole family. It reduces code
duplication anyway.

Fixes PR33188, a long standing bug in our bittest implementation
encountered by Chakra.

llvm-svn: 333978
2018-06-05 01:33:40 +00:00
Reid Kleckner 89fbd55145 Revert r333791 "Cap "voluntary" vector alignment at 16 for all Darwin platforms."
Adding __attribute__((aligned(32))) to __m256 breaks the implementation
of _mm256_loadu_ps on Windows. On Windows, alignment attributes have
higher precedence than packing attributes.

We also might want to carefully consider the consequences of changing
our vector typedefs, since many users copy them and invent their own
new, non-Intel specific vector type names.

llvm-svn: 333958
2018-06-04 21:39:20 +00:00
David Blaikie 181a61307b Update for an LLVM header file move
llvm-svn: 333955
2018-06-04 21:23:29 +00:00
Heejin Ahn 0083179f06 Remove llvm::Triple argument from get***Personality() functions. NFC.
Summary:
Because `llvm::Triple` can be derived from `TargetInfo`, it is simpler
to take only `TargetInfo` argument.

Reviewers: sbc100

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47620

llvm-svn: 333938
2018-06-04 18:23:00 +00:00
Leonard Chan f921d85422 This diff includes changes for supporting the following types.
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;

// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent _Fract types will also be added in future patches.

The tests included are for asserting that we can declare these types.

Fixed the test that was failing by not checking for dso_local on some
targets.

Differential Revision: https://reviews.llvm.org/D46084

llvm-svn: 333923
2018-06-04 16:07:52 +00:00
Craig Topper 6fb26f93ef [X86] Replace __builtin_ia32_vbroadcastf128_pd256 and __builtin_ia32_vbroadcastf128_ps256 with an unaligned load intrinsics and a __builtin_shufflevector call.
llvm-svn: 333853
2018-06-03 19:42:59 +00:00
Craig Topper f886b44693 [X86] Pass ArrayRef instead of SmallVectorImpl& to the X86 builtin helper functions. NFC
llvm-svn: 333851
2018-06-03 19:02:57 +00:00
Craig Topper 8508c1db98 Revert r333848 "[X86] Pass ArrayRef instead of SmallVectorImpl& to the X86 builtin helper functions. NFC"
Looks like I missed some changes to make this work.

llvm-svn: 333850
2018-06-03 18:41:22 +00:00
Craig Topper d4a610f6f7 [X86] Pass ArrayRef instead of SmallVectorImpl& to the X86 builtin helper functions. NFC
llvm-svn: 333848
2018-06-03 18:08:37 +00:00
Craig Topper 21f56f5b9c [X86] When emitting masked loads/stores don't check for all ones mask.
This seems like a premature optimization. It's unlikely a user would pass something the frontend can tell is all ones to the masked load/store intrinsics.

We do this optimization for emitting select for masking because we have builtin calls in header files that pass an all ones mask in. Though at this point we may not longer have any builtins that emit some IR and a select. We may only have the select builtins so maybe we can remove that optimization too.

llvm-svn: 333847
2018-06-03 18:08:36 +00:00
Ivan A. Kosarev 9c40c0ad0c [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47121

llvm-svn: 333829
2018-06-02 17:42:59 +00:00
Leonard Chan 0d485dbb40 Revert "This diff includes changes for supporting the following types."
This reverts commit r333814, which fails for a test checking the bit
width on ubuntu.

llvm-svn: 333815
2018-06-02 03:27:13 +00:00
Leonard Chan db55d8331e This diff includes changes for supporting the following types.
```

// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;

// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;

```

This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent `_Fract` types will also be added in future patches.

The tests included are for asserting that we can declare these types.

Differential Revision: https://reviews.llvm.org/D46084

llvm-svn: 333814
2018-06-02 02:58:51 +00:00
John McCall 280c656031 Cap "voluntary" vector alignment at 16 for all Darwin platforms.
This fixes two major problems:
- We were not capping vector alignment as desired on 32-bit ARM.
- We were using different alignments based on the AVX settings on
  Intel, so we did not have a consistent ABI.

This is an ABI break, but we think we can get away with it because
vectors tend to be used mostly in inline code (which is why not having
a consistent ABI has not proven disastrous on Intel).

Intel's AVX types are specified as having 32-byte / 64-byte alignment,
so align them explicitly instead of relying on the base ABI rule.
Note that this sort of attribute is stripped from template arguments
in template substitution, so there's a possibility that code templated
over vectors will produce inadequately-aligned objects.  The right
long-term solution for this is for alignment attributes to be
interpreted as true qualifiers and thus preserved in the canonical type.

llvm-svn: 333791
2018-06-01 21:34:26 +00:00
Heejin Ahn 1eb074d76e [WebAssembly] Hide new Wasm EH behind its feature flag
Summary:
clang's current wasm EH implementation is a non-MVP feature in progress.
We had a `-mexception-handling` wasm feature but were not using it. This
patch hides the non-MVP wasm EH behind a flag, so it does not affect
other code for now.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Differential Revision: https://reviews.llvm.org/D47614

llvm-svn: 333716
2018-06-01 01:01:37 +00:00
Vedant Kumar d781d97ed4 [Coverage] End deferred regions before labels, fixes PR35867
A deferred region should end before the start of a label, and should not
extend to the start of the label sub-statement.

Fixes llvm.org/PR35867.

llvm-svn: 333715
2018-06-01 00:37:13 +00:00
Dan Gohman 9f8ee03772 [WebAssembly] Update to the new names for the memory builtin functions.
The WebAssembly committee has decided on the names `memory.size` and
`memory.grow` for the memory intrinsics, so update the clang builtin
functions to follow those names, keeping both sets of old names in place
for compatibility.

llvm-svn: 333712
2018-06-01 00:05:51 +00:00
Heejin Ahn c647919933 [WebAssembly] Use Windows EH instructions for Wasm EH
Summary:
Because wasm control flow needs to be structured, using WinEH
instructions to support wasm EH brings several benefits. This patch
makes wasm EH uses Windows EH instructions, with some changes:
1. Because wasm uses a single catch block to catch all C++ exceptions,
   this merges all catch clauses into a single catchpad, within which we
   test the EH selector as in Itanium EH.
2. Generates a call to `__clang_call_terminate` in case a cleanup
   throws. Wasm does not have a runtime to handle this.
3. In case there is no catch-all clause, inserts a call to
   `__cxa_rethrow` at the end of a catchpad in order to unwind to an
   enclosing EH scope.

Reviewers: majnemer, dschuff

Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits

Differential Revision: https://reviews.llvm.org/D44931

llvm-svn: 333703
2018-05-31 22:18:13 +00:00
Reid Kleckner 26fc531dbc Fix null MSInheritanceAttr deref in CXXRecordDecl::getMSInheritanceModel()
Ensure latest MPT decl has a MSInheritanceAttr when instantiating
templates, to avoid null MSInheritanceAttr deref in
CXXRecordDecl::getMSInheritanceModel().

See PR#37399 for repo / details.

Patch by Andrew Rogers!

Differential Revision: https://reviews.llvm.org/D46664

llvm-svn: 333680
2018-05-31 18:42:29 +00:00
Peter Collingbourne 3aa30e8062 IRGen: Write .dwo files when -split-dwarf-file is used together with -fthinlto-index.
Differential Revision: https://reviews.llvm.org/D47597

llvm-svn: 333677
2018-05-31 18:25:59 +00:00
Vedant Kumar 61763b65af [Coverage] Discard the last uncompleted deferred region in a decl
Discard the last uncompleted deferred region in a decl, if one exists.
This prevents lines at the end of a function containing only whitespace
or closing braces from being marked as uncovered, if they follow a
region terminator (return/break/etc).

The previous behavior was to heuristically complete deferred regions at
the end of a decl. In practice this ended up being too brittle for too
little gain. Users would complain that there was no way to reach full
code coverage because whitespace at the end of a function would be
marked uncovered.

rdar://40238228

Differential Revision: https://reviews.llvm.org/D46918

llvm-svn: 333609
2018-05-30 23:35:44 +00:00
Peter Collingbourne ac94ca54c5 IRGen: Rename bitsets -> type metadata. NFC.
"Type metadata" is the term that we've been using for the CFI-related
information on vtables for a while now.

llvm-svn: 333602
2018-05-30 22:29:08 +00:00
Gabor Buella 70d8d51073 [X86] Lowering FMA intrinsics to native IR (Clang part)
This patch replaces all packed (and scalar without rounding
mode) fused intrinsics with fmadd/fmaddsub variations.
Then fmadd/fmaddsub are lowered to native IR.

Patch by tkrupa

Reviewers: craig.topper, sroland, spatel, RKSimon

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D47444

llvm-svn: 333555
2018-05-30 15:27:49 +00:00
Simon Tatham 89e31fa7fc Support __iso_volatile_load8 etc on aarch64-win32.
These intrinsics are used by MSVC's header files on AArch64 Windows as
well as AArch32, so we should support them for both targets. I've
factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate
functions that EmitAArch64BuiltinExpr can call as well.

Reviewers: javed.absar, mstorsjo

Reviewed By: mstorsjo

Subscribers: kristof.beyls, cfe-commits

Differential Revision: https://reviews.llvm.org/D47476

llvm-svn: 333513
2018-05-30 07:54:05 +00:00
Richard Smith b534510cd5 Make the mangled name collision diagnostic a bit more useful by listing the mangling.
This helps especially when the collision is for a template specialization,
where the template arguments are not available from anywhere else in the
diagnostic, and are likely relevant to the problem.

llvm-svn: 333489
2018-05-30 01:52:16 +00:00
Richard Smith 6ca999baf2 Revert r332839.
This is causing miscompiles and "definition with same mangled name as another
definition" errors.

llvm-svn: 333482
2018-05-30 00:45:10 +00:00
Akira Hatanaka 1da9dbbc25 [CodeGen][Darwin] Set the calling-convention of thread-local variable
initialization functions to 'cxx_fast_tlscc'.

This fixes a bug where instructions calling initialization functions for
thread-local static members of c++ template classes were using calling
convention 'cxx_fast_tlscc' while the called functions weren't annotated
with the calling convention.

rdar://problem/40447463

Differential Revision: https://reviews.llvm.org/D47354

llvm-svn: 333447
2018-05-29 18:28:49 +00:00
Paul Robinson 76178632a2 Revert "[DebugInfo] Don't bother with MD5 checksums of preprocessed files."
This reverts commit d734f2aa3f76fbf355ecd2bbe081d0c1f49867ab.
Also known as r333311.  A very small but nonzero number of bots fail.

llvm-svn: 333319
2018-05-25 22:35:59 +00:00
Bob Wilson fa84fc916c Support Swift calling convention for PPC64 targets
This adds basic support for the Swift calling convention with PPC64 targets.
Patch provided by Atul Sowani in bug report #37223

llvm-svn: 333316
2018-05-25 21:26:03 +00:00
Paul Robinson 638d606f83 [DebugInfo] Don't bother with MD5 checksums of preprocessed files.
The checksum will not reflect the real source, so there's no clear
reason to include them in the debug info.  Also this was causing a
crash on the DWARF side.

Differential Revision: https://reviews.llvm.org/D47260

llvm-svn: 333311
2018-05-25 20:59:29 +00:00
Alexey Bataev 0baba9e728 [OPENMP, NVPTX] Fixed codegen for orphaned parallel region.
If orphaned parallel region is found, the next code must be emitted:
```
if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid))
  Serialized execution.
else if (IsMasterThread())
  Prepare and signal worker.
else
  Outined function call.
```

llvm-svn: 333301
2018-05-25 20:16:03 +00:00
Richard Smith 3e268632cf Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.

This re-commits r333044 with a fix for PR37560.

llvm-svn: 333141
2018-05-23 23:41:38 +00:00
Hans Wennborg 156349fa10 Revert r333044 "Use zeroinitializer for (trailing zero portion of) large array initializers"
It caused asserts, see PR37560.

> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166

llvm-svn: 333067
2018-05-23 08:24:01 +00:00
Craig Topper f2043b08b4 [X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
llvm-svn: 333061
2018-05-23 04:51:54 +00:00
Richard Smith 9062bbf419 Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.

Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.

In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.

Differential Revision: https://reviews.llvm.org/D47166

llvm-svn: 333044
2018-05-23 00:09:29 +00:00
Sanjay Patel 74c7fb002f [CodeGen] use nsw negation for builtin abs
The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."

That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.

Differential Revision: https://reviews.llvm.org/D47202

llvm-svn: 333038
2018-05-22 23:02:13 +00:00
Craig Topper 8e3689c066 [X86] Remove mask argument from some builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
llvm-svn: 333027
2018-05-22 20:48:24 +00:00
Peter Collingbourne 91d02844a3 Reland r332885, "CodeGen, Driver: Start using direct split dwarf emission in clang."
As well as two follow-on commits r332906, r332911 with a fix for
test clang/test/CodeGen/split-debug-filename.c.

llvm-svn: 333013
2018-05-22 18:52:37 +00:00
Yaxun Liu 00ddbed298 Revert r332982 Call CreateTempMemWithoutCast for ActiveFlag
Due to regression on arm.

llvm-svn: 332991
2018-05-22 16:13:07 +00:00
Sanjay Patel 1ff6b27940 [CodeGen] produce the LLVM canonical form of abs
We chose the 'slt' form as canonical in IR with:
rL332819
...so we should generate that form directly for efficiency.

llvm-svn: 332989
2018-05-22 15:36:50 +00:00
Yaxun Liu 8a60e5db70 Call CreateTempMemWithoutCast for ActiveFlag
Introduced CreateMemTempWithoutCast and CreateTemporaryAllocaWithoutCast to emit alloca
without casting to default addr space.

ActiveFlag is a temporary variable emitted for clean up. It is defined as AllocaInst* type and there is
a cast to AlllocaInst in SetActiveFlag. An alloca casted to generic pointer causes assertion in
SetActiveFlag.

Since there is only load/store of ActiveFlag, it is safe to use the original alloca, therefore use
CreateMemTempWithoutCast is called.

Differential Revision: https://reviews.llvm.org/D47099

llvm-svn: 332982
2018-05-22 14:36:26 +00:00
Brock Wyma 8557ec5d64 [CodeView] Enable debugging of captured variables within C++ lambdas
This change will help Visual Studio resolve forward references to C++ lambda
routines used by captured variables.

Differential Revision: https://reviews.llvm.org/D45438

llvm-svn: 332975
2018-05-22 12:41:19 +00:00
Amara Emerson f528bcc32a Revert "CodeGen, Driver: Start using direct split dwarf emission in clang."
This reverts commit r332885 as it broke several greendragon buildbots.

llvm-svn: 332973
2018-05-22 11:18:58 +00:00
Amara Emerson b6aa52a1c4 Revert "Fix another make_unique ambiguity."
This reverts commit r332906 as a dependency to revert r332885.

llvm-svn: 332972
2018-05-22 11:18:49 +00:00
David Chisnall 48a7afa5a1 [objc-gnustep2] Use unsigned char to avoid potential UB in isalnum.
Suggested by Gabor Buella.

llvm-svn: 332966
2018-05-22 10:13:17 +00:00
David Chisnall 88e754f57d [objc-gnustep2] Use isalnum instead of a less efficient and nonportable equivalent.
Patch by Hans Wennborg!

llvm-svn: 332964
2018-05-22 10:13:11 +00:00
David Chisnall 404bbcbdcb Revert "Revert r332955 "GNUstep Objective-C ABI version 2""
llvm-svn: 332963
2018-05-22 10:13:06 +00:00
Bjorn Pettersson 844663353d Revert r332955 "GNUstep Objective-C ABI version 2"
Reverted due to buildbot failures.
Seems like isnumber() is some Apple addition to cctype.

llvm-svn: 332957
2018-05-22 08:16:45 +00:00
David Chisnall 7c6cd52698 Add cctype include.
This appears to leak in already on libc++ platforms, but is breaking on
some other targets.

llvm-svn: 332955
2018-05-22 07:22:50 +00:00
David Chisnall 79356eefc0 GNUstep Objective-C ABI version 2
Summary:
This includes initial support for the (hopefully final) updated Objective-C ABI, developed here:

https://github.com/davidchisnall/clang-gnustep-abi-2

It also includes some cleanups and refactoring from older GNU ABIs.

The current version is ELF only, other formats to follow.

Reviewers: rjmccall, DHowett-MSFT

Reviewed By: rjmccall

Subscribers: smeenai, cfe-commits

Differential Revision: https://reviews.llvm.org/D46052

llvm-svn: 332950
2018-05-22 06:09:23 +00:00
Peter Collingbourne 0322e3cbf5 Fix another make_unique ambiguity.
llvm-svn: 332906
2018-05-21 21:48:17 +00:00
Craig Topper 288bd2e5a0 [X86] Remove masking from pternlog llvm intrinsics and use a select instruction instead.
Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions.

To avoid that just generate IR directly in CGBuiltin.cpp.

Differential Revision: https://reviews.llvm.org/D47125

llvm-svn: 332891
2018-05-21 20:58:23 +00:00
Richard Smith 3f1d6de4f7 Revert r332847; it caused us to miscompile certain forms of reference initialization.
llvm-svn: 332886
2018-05-21 20:36:58 +00:00
Peter Collingbourne 47bc01786d CodeGen, Driver: Start using direct split dwarf emission in clang.
Fixes PR37466.

Differential Revision: https://reviews.llvm.org/D47093

llvm-svn: 332885
2018-05-21 20:31:59 +00:00
Peter Collingbourne 9a45114b3c CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881
2018-05-21 20:16:41 +00:00
Richard Smith bbb2655de0 Revert r332028; see PR37545 for details.
llvm-svn: 332879
2018-05-21 20:10:54 +00:00
Daniil Fukalov 1b14a3ad3d [AMDGPU] fixes for lds f32 builtins
1. added restrictions to memory scope, order and volatile parameters
2. added custom processing for these builtins - currently is not used code,
   needed to switch off GCCBuiltin link to the builtins (ongoing change to llvm
   tree)
3. builtins renamed as requested

Differential Revision: https://reviews.llvm.org/D43281

llvm-svn: 332848
2018-05-21 16:18:07 +00:00
Serge Pavlov 9f8068420a [CodeGen] Recognize more cases of zero initialization
If a variable has an initializer, codegen tries to build its value. If
the variable is large in size, building its value requires substantial
resources. It causes strange behavior from user viewpoint: compilation
of huge zero initialized arrays like:

    char data_1[2147483648u] = { 0 };

consumes enormous amount of time and memory.

With this change codegen tries to determine if variable initializer is
equivalent to zero initializer. In this case variable value is not
constructed.

This change fixes PR18978.

Differential Revision: https://reviews.llvm.org/D46241

llvm-svn: 332847
2018-05-21 16:09:54 +00:00
Pavel Labath b3b0255df9 [CodeGen] Disable aggressive structor optimizations at -O0, take 2
The first version of the patch (r332228) was flawed because it was
putting structors into C5/D5 comdats very eagerly. This is correct only
if we can ensure the comdat contains all required versions of the
structor (which wasn't the case). This version uses a more nuanced
approach:
- for local structor symbols we use an alias because we don't have to
  worry about comdats or other compilation units.
- linkonce symbols are emitted separately, as we cannot guarantee we
  will have all symbols we need to form a comdat (they are emitted
  lazily, only when referenced).
- available_externally symbols are also emitted separately, as the code
  seemed to be worried about emitting an alias in this case.
- other linkage types are not affected by the optimization level. They
  either get put into a comdat (weak) or get aliased (external).

Reviewers: rjmccall, aprantl

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D46685

llvm-svn: 332839
2018-05-21 11:47:45 +00:00
JF Bastien 6b23972f07 CodeGen: block capture shouldn't ICE
When a lambda capture captures a __block in the same statement, the compiler asserts out because isCapturedBy assumes that an Expr can only be a BlockExpr, StmtExpr, or if it's a Stmt then all the statement's children are expressions. That's wrong, we need to visit all sub-statements even if they're not expressions to see if they also capture.

Fix this issue by pulling out the isCapturedBy logic to use RecursiveASTVisitor.

<rdar://problem/39926584>

llvm-svn: 332801
2018-05-19 04:21:26 +00:00
Yaxun Liu 29155b01c1 [HIP] Support offloading by linker script
To support linking device code in different source files, it is necessary to
embed fat binary at host linking stage.

This patch emits an external symbol for fat binary in host codegen, then
embed the fat binary by lld through a linker script.

Differential Revision: https://reviews.llvm.org/D46472

llvm-svn: 332724
2018-05-18 15:07:56 +00:00
Peter Collingbourne 070777dbdd Support: Add a raw_ostream::write_zeros() function. NFCI.
This will eventually replace MCObjectWriter::WriteZeros.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47033

llvm-svn: 332675
2018-05-17 22:11:43 +00:00
Reid Kleckner 138ab4947c Fix a mangling failure on clang-cl C++17
MethodVFTableLocations in MigrosoftVTableContext contains canonicalized
decl. But, it's sometimes asked to lookup for non-canonicalized decl,
and that causes assertion failure, and compilation failure.

Fixes PR37481.

Patch by Taiju Tsuiki!

Differential Revision: https://reviews.llvm.org/D46929

llvm-svn: 332639
2018-05-17 18:12:18 +00:00
Yaxun Liu a2a9cfab83 CodeGen: Fix invalid bitcast for lifetime.start/end
lifetime.start/end expects pointer argument in alloca address space.
However in C++ a temporary variable is in default address space.

This patch changes API CreateMemTemp and CreateTempAlloca to
get the original alloca instruction and pass it lifetime.start/end.

It only affects targets with non-zero alloca address space.

Differential Revision: https://reviews.llvm.org/D45900

llvm-svn: 332593
2018-05-17 11:16:35 +00:00
Alexey Bataev 4ac68a210c [OPENMP] DO not crash on combined constructs in declare target
functions.

If the combined construct is specified in the declare target function
and the device code is emitted, the compiler crashes because of the
incorrectly chosen captured stmt. We should choose the innermost
captured statement, not the outermost.

llvm-svn: 332477
2018-05-16 15:08:32 +00:00
Alexey Bataev 673110d5d5 [OPENMP, NVPTX] Add check for SPMD mode in orphaned parallel directives.
If the orphaned directive is executed in SPMD mode, we need to emit the
check for the SPMD mode and run the orphaned parallel directive in
sequential mode.

llvm-svn: 332467
2018-05-16 13:36:30 +00:00
Benjamin Kramer 651d0bf9dc Move helper classes into anonymous namespaces. NFCI.
llvm-svn: 332400
2018-05-15 21:26:47 +00:00
Akira Hatanaka 852829792b Address post-commit review comments after r328731. NFC.
- Define a function (canPassInRegisters) that determines whether a
record can be passed in registers based on language rules and
target-specific ABI rules.

- Set flag RecordDecl::ParamDestroyedInCallee to true in MSVC mode and
remove ASTContext::isParamDestroyedInCallee, which is no longer needed.

- Use the same type (unsigned) for RecordDecl's bit-field members.

For more background, see the following discussions that took place on
cfe-commits.

http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180326/223498.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180402/223688.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180409/224754.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180423/226494.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180507/227647.html

llvm-svn: 332397
2018-05-15 21:00:30 +00:00
Alexey Bataev 2a3320a928 [OPENMP, NVPTX] Do not globalize variables with reference/pointer types.
In generic data-sharing mode we do not need to globalize
variables/parameters of reference/pointer types. They already are placed
in the global memory.

llvm-svn: 332380
2018-05-15 18:01:01 +00:00
Nicola Zaghen 3538b39ed5 [clang] Update uses of DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM

Explicitly avoided changing the strings in the clang-format tests.

Differential Revision: https://reviews.llvm.org/D44975

llvm-svn: 332350
2018-05-15 13:30:56 +00:00
Brock Wyma 3db2b108c3 [CodeView] Improve debugging of virtual base class member variables
Initial support for passing the virtual base pointer offset to CodeViewDebug.

https://reviews.llvm.org/D46271

llvm-svn: 332296
2018-05-14 21:21:22 +00:00
Yaxun Liu daceb1ea0f CodeGen: Emit string literal in constant address space
Some targets have constant address space (e.g. amdgcn). For them string literal should be
emitted in constant address space then casted to default address space.

Differential Revision: https://reviews.llvm.org/D46643

llvm-svn: 332279
2018-05-14 19:20:12 +00:00
Pavel Labath c370f26251 Revert "[CodeGen] Disable aggressive structor optimizations at -O0"
It breaks the sanitizer build
<http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/23739>

This reverts commit r332228.

llvm-svn: 332232
2018-05-14 11:35:44 +00:00
Pavel Labath f639eb0553 [CodeGen] Disable aggressive structor optimizations at -O0
Summary:
Removing the full structor and replacing all usages with the base one
can degrade debug quality as it will leave the debugger unable to locate
the full object structor. This is apparent when evaluating an expression
in the debugger which requires constructing an object of class which has
had this optimization applied to it.  When compiling the expression, we
pretend that the class and its methods have been defined in another
compilation unit, so the expression compiler assumes the structor
definition must be available. This didn't use to be the case for
structors with internal linkage. Less aggressive optimizations like
emitting the full structor as an alias remain in place, as they do not
cause the structor symbol to disappear completely.

This improves debug quality on non-darwin platforms (darwin does not
have -mconstructor-aliases on by default, so it is spared these
problems) and enable us to remove some workarounds from LLDB which attempt to
mitigate this issue.

Reviewers: rjmccall, aprantl

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D46685

llvm-svn: 332228
2018-05-14 11:02:23 +00:00
Elena Demikhovsky d31327d505 Added atomic_fetch_min, max, umin, umax intrinsics to clang.
These intrinsics work exactly as all other atomic_fetch_* intrinsics and allow to create *atomicrmw* with ordering.
Updated the clang-extensions document.

Differential Revision: https://reviews.llvm.org/D46386

llvm-svn: 332193
2018-05-13 07:45:58 +00:00
Alexey Bataev df093e7b45 [OPENMP, NVPTX] Do not use SPMD mode for target simd and target teams
distribute simd directives.

Directives `target simd` and `target teams distribute simd` must be
executed in non-SPMD mode.

llvm-svn: 332129
2018-05-11 19:45:14 +00:00
Eric Fiselier 85ba3321c6 [Itanium] Emit type info names with external linkage.
Summary:
The Itanium ABI requires that the type info for pointer-to-incomplete types to have internal linkage, so that it doesn't interfere with the type info once completed.  Currently it also marks the type info name as internal as well. However, this causes a bug with the STL implementations, which use the type info name pointer to perform ordering and hashing of type infos.
For example:

```
// header.h
struct T;
extern std::type_info const& Info;

// tu_one.cpp
#include "header.h"
std::type_info const& Info = typeid(T*);

// tu_two.cpp
#include "header.h"
struct T {};
int main() {
  auto &TI1 = Info;
  auto &TI2 = typeid(T*);
  assert(TI1 == TI2); // Fails
  assert(TI1.hash_code() == TI2.hash_code()); // Fails
}
```

This patch fixes the STL bug by emitting the type info name as linkonce_odr when the type-info is for a pointer-to-incomplete type.

Note that libc++ could fix this without a compiler change, but the quality of fix would be poor. The library would either have to:

(A) Always perform strcmp/string hashes.
(B) Determine if we have a pointer-to-incomplete type, and only do strcmp then. This would require an ABI break for libc++.


Reviewers: rsmith, rjmccall, majnemer, vsapsai

Reviewed By: rjmccall

Subscribers: smeenai, cfe-commits

Differential Revision: https://reviews.llvm.org/D46665

llvm-svn: 332028
2018-05-10 19:51:56 +00:00
Julie Hockett 96fbe58b0f Reland '[clang] Adding CharacteristicKind to PPCallbacks::InclusionDirective'
This commit relands r331904.

Adding a SrcMgr::CharacteristicKind parameter to the InclusionDirective
in PPCallbacks, and updating calls to that function. This will be useful
in https://reviews.llvm.org/D43778 to determine which includes are
system
headers.

Differential Revision: https://reviews.llvm.org/D46614

llvm-svn: 332021
2018-05-10 19:05:36 +00:00
Alexey Bataev bf5c84861c [OPENMP, NVPTX] Initial support for L2 parallelism in SPMD mode.
Added initial support for L2 parallelism in SPMD mode. Note, though,
that the orphaned parallel directives are not currently supported in
SPMD mode.

llvm-svn: 332016
2018-05-10 18:32:08 +00:00
Strahinja Petrovic 0f274c0111 This patch provides that bitfields are splitted even in case
when current field is not legal integer type.

Differential Revision: https://reviews.llvm.org/D39053

llvm-svn: 331979
2018-05-10 12:31:12 +00:00
Eric Fiselier 864ef70eb0 Revert "[Itanium] Emit type info names with external linkage."
This reverts commit r331957. It seems to be causing failures
on ppc64le-linux.

llvm-svn: 331963
2018-05-10 08:10:57 +00:00
Craig Topper 74ac0eda68 [X86] Change the implementation of scalar masked load/store intrinsics to not use a 512-bit intermediate vector.
This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs.

Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well.

llvm-svn: 331958
2018-05-10 05:43:43 +00:00
Eric Fiselier ae56a957af [Itanium] Emit type info names with external linkage.
Summary:
The Itanium ABI requires that the type info for pointer-to-incomplete types to have internal linkage, so that it doesn't interfere with the type info once completed.  Currently it also marks the type info name as internal as well. However, this causes a bug with the STL implementations, which use the type info name pointer to perform ordering and hashing of type infos.
For example:

```
// header.h
struct T;
extern std::type_info const& Info;

// tu_one.cpp
#include "header.h"
std::type_info const& Info = typeid(T*);

// tu_two.cpp
#include "header.h"
struct T {};
int main() {
  auto &TI1 = Info;
  auto &TI2 = typeid(T*);
  assert(TI1 == TI2); // Fails
  assert(TI1.hash_code() == TI2.hash_code()); // Fails
}
```

This patch fixes the STL bug by emitting the type info name as linkonce_odr when the type-info is for a pointer-to-incomplete type.

Note that libc++ could fix this without a compiler change, but the quality of fix would be poor. The library would either have to:

(A) Always perform strcmp/string hashes.
(B) Determine if we have a pointer-to-incomplete type, and only do strcmp then. This would require an ABI break for libc++.


Reviewers: rsmith, rjmccall, majnemer, vsapsai

Reviewed By: rjmccall

Subscribers: smeenai, cfe-commits

Differential Revision: https://reviews.llvm.org/D46665

llvm-svn: 331957
2018-05-10 05:25:15 +00:00
Craig Topper 2b248849ae [Builtins] Improve the IR emitted for MSVC compatible rotr/rotl builtins to match what the middle and backends understand
Previously we emitted something like

rotl(x, n) {
  n &= bitwidth-1;
  return n != 0 ? ((x << n) | (x >> (bitwidth - n)) : x;
}

We use a select to avoid the undefined behavior on the (bitwidth - n) shift.

The middle and backend don't really recognize this as a rotate and end up emitting a cmov or control flow because of the select.

A better pattern is (x << (n & mask)) | (x << (-n & mask)) where mask is bitwidth - 1.

Fixes the main complaint in PR37387. There's still some work to be done if the user writes that sequence directly on a short or char where type promotion rules can prevent it from being recognized. The builtin is emitting direct IR with unpromoted types so that isn't a problem for it.

Differential Revision: https://reviews.llvm.org/D46656

llvm-svn: 331943
2018-05-10 00:05:13 +00:00
Julie Hockett b524d5e553 Revert "[clang] Adding CharacteristicKind to PPCallbacks::InclusionDirective"
This reverts commit r331904 because of a memory leak.

llvm-svn: 331932
2018-05-09 22:25:47 +00:00
Manoj Gupta 4fbf84c173 [Clang] Implement function attribute no_stack_protector.
Summary:
This attribute tells clang to skip this function from stack protector
when -stack-protector option is passed.
GCC option for this is:
__attribute__((__optimize__("no-stack-protector"))) and the
equivalent clang syntax would be: __attribute__((no_stack_protector))

This is used in Linux kernel to selectively disable stack protector
in certain functions.

Reviewers: aaron.ballman, rsmith, rnk, probinson

Reviewed By: aaron.ballman

Subscribers: probinson, srhines, cfe-commits

Differential Revision: https://reviews.llvm.org/D46300

llvm-svn: 331925
2018-05-09 21:41:18 +00:00
Julie Hockett 36d94ab8f0 [clang] Adding CharacteristicKind to PPCallbacks::InclusionDirective
Adding a SrcMgr::CharacteristicKind parameter to the InclusionDirective
in PPCallbacks, and updating calls to that function. This will be useful
in https://reviews.llvm.org/D43778 to determine which includes are system
headers.

Differential Revision: https://reviews.llvm.org/D46614

llvm-svn: 331904
2018-05-09 18:27:33 +00:00
Alexey Bataev c15ea70b1f [OPENMP] Generate unique names for offloading regions id.
It is required to emit unique names for offloading regions ids. Required
to support compilation and linking of several compilation units.

llvm-svn: 331899
2018-05-09 18:02:37 +00:00
Yaxun Liu 3cab24aa4f [OpenCL] Fix typos in emitted enqueue kernel function names
Two typos: 
vaarg => vararg
get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple

Differential Revision: https://reviews.llvm.org/D46601

llvm-svn: 331895
2018-05-09 17:07:06 +00:00
Alexey Bataev e253f2f886 [OPENMP] Mark global tors/dtors as used.
If the global variables are marked as declare target and they need
ctors/dtors, these ctors/dtors are emitted and then invoked by the
offloading runtime library. They are not explicitly used in the emitted
code and thus can be optimized out. Patch marks these functions as used,
so the optimizer cannot remove these function during the optimization
phase.

llvm-svn: 331879
2018-05-09 14:15:18 +00:00
Hans Wennborg ef2f6948be Revert r331843 "[DebugInfo] Generate debug information for labels."
It broke the Chromium build (see reply on the review).

> Generate DILabel metadata and call llvm.dbg.label after label
> statement to associate the metadata with the label.
>
> Differential Revision: https://reviews.llvm.org/D45045
>
> Patch by Hsiangkai Wang.

This doesn't revert the change to backend-unsupported-error.ll
that seems to correspond to an llvm-side change.

llvm-svn: 331861
2018-05-09 09:29:58 +00:00
Shiva Chen 667fbe2cb0 [DebugInfo] Generate debug information for labels.
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

Differential Revision: https://reviews.llvm.org/D45045

Patch by Hsiangkai Wang.

llvm-svn: 331843
2018-05-09 02:41:56 +00:00
Adrian Prantl 9fc8faf9e6 Remove \brief commands from doxygen comments.
This is similar to the LLVM change https://reviews.llvm.org/D46290.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46320

llvm-svn: 331834
2018-05-09 01:00:01 +00:00
Alexey Bataev 9a70017537 [OPENMP, NVPTX] Fix linkage of the global entries.
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.

llvm-svn: 331768
2018-05-08 14:16:57 +00:00
Simon Pilgrim 3366dcfe1f Fix 'not all control paths return a value' MSVC warnings. NFCI.
llvm-svn: 331753
2018-05-08 09:40:32 +00:00
Eric Fiselier c5fb858053 [C++2a] Implement operator<=>: Address bugs and post-commit review comments after r331677.
This patch addresses some mostly trivial post-commit review comments received
on r331677.

Additionally, this patch fixes an assertion in `getNarrowingKind` caused by
the use of an uninitialized value from `checkThreeWayNarrowingConversion`.

llvm-svn: 331707
2018-05-08 00:52:19 +00:00
Eric Fiselier 0683c0e68d [C++2a] Implement operator<=> CodeGen and ExprConstant
Summary:
This patch tackles long hanging fruit for the builtin operator<=> expressions. It is currently needs some cleanup before landing, but I want to get some initial feedback.

The main changes are:

* Lookup, build, and store the required standard library types and expressions in `ASTContext`. By storing them in ASTContext we don't need to store (and duplicate) the required expressions in the BinaryOperator AST nodes. 

* Implement [expr.spaceship] checking, including diagnosing narrowing conversions. 

* Implement `ExprConstant` for builtin spaceship operators.

* Implement builitin operator<=> support in `CodeGenAgg`. Initially I emitted the required comparisons using `ScalarExprEmitter::VisitBinaryOperator`, but this caused the operand expressions to be emitted once for every required cmp.

* Implement [builtin.over] with modifications to support the intent of P0946R0. See the note on `BuiltinOperatorOverloadBuilder::addThreeWayArithmeticOverloads` for more information about the workaround.




Reviewers: rsmith, aaron.ballman, majnemer, rnk, compnerd, rjmccall

Reviewed By: rjmccall

Subscribers: rjmccall, rsmith, aaron.ballman, junbuml, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D45476

llvm-svn: 331677
2018-05-07 21:07:10 +00:00
Alexey Bataev 504fc2d0cd [OPENMP, NVPTX] Codegen for critical construct.
Added correct codegen for the critical construct on NVPTX devices.

llvm-svn: 331652
2018-05-07 17:23:05 +00:00
Alexey Bataev d7ff6d647f [OPENMP, NVPTX] Added support for L2 parallelism.
Added initial codegen for level 2, 3 etc. parallelism. Currently, all
the second, the third etc. parallel regions will run sequentially.

llvm-svn: 331642
2018-05-07 14:50:05 +00:00
Teresa Johnson 66744f8137 [ThinLTO] Support opt remarks options with distributed ThinLTO backends
Summary:
Passes down the necessary code ge options to the LTO Config to enable
-fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO
backend for a distributed build.

Also, remove warning about not having PGO when the input is IR.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D46464

llvm-svn: 331592
2018-05-05 14:37:29 +00:00
Brian Gesiak ea9144e818 [Coroutines] Catch exceptions in await_resume
Summary:
http://wg21.link/P0664r2 section "Evolution/Core Issues 24" describes a
proposed change to Coroutines TS that would have any exceptions thrown
after the initial suspend point of a coroutine be caught by the handler
specified by the promise type's 'unhandled_exception' member function.
This commit provides a sample implementation of the specified behavior.

Test Plan: `check-clang`

Reviewers: GorNishanov, EricWF

Reviewed By: GorNishanov

Subscribers: cfe-commits, lewissbaker, eric_niebler

Differential Revision: https://reviews.llvm.org/D45860

llvm-svn: 331519
2018-05-04 14:02:37 +00:00
Craig Topper 274506dab8 [CodeGenFunction] Use the StringRef::split function that takes a char separator instead of StringRef separator. NFC
The char separator version should be a little better optimized.

llvm-svn: 331482
2018-05-03 21:01:33 +00:00
Piotr Padlewski 5dde809404 Rename invariant.group.barrier to launder.invariant.group
Summary:
This is one of the initial commit of "RFC: Devirtualization v2" proposal:
https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing

Reviewers: rsmith, amharc, kuhar, sanjoy

Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45111

llvm-svn: 331448
2018-05-03 11:03:01 +00:00
Richard Smith eaf11ad709 Track the result of evaluating a computed noexcept specification on the
FunctionProtoType.

We previously re-evaluated the expression each time we wanted to know whether
the type is noexcept or not. We now evaluate the expression exactly once.

This is not quite "no functional change": it fixes a crasher bug during AST
deserialization where we would try to evaluate the noexcept specification in a
situation where we have not deserialized sufficient portions of the AST to
permit such evaluation.

llvm-svn: 331428
2018-05-03 03:58:32 +00:00
Alexey Bataev fac26cf4ca [OPENMP] Add support for reductions on simd directives in target
regions.

Added codegen for `simd reduction()` constructs in target directives.

llvm-svn: 331393
2018-05-02 20:03:27 +00:00
Alexey Bataev 6d94410914 [OPENMP] Support C++ member functions in the device constructs.
Added correct emission of the C++ member functions for the device
function when they are used in the device constructs.

llvm-svn: 331365
2018-05-02 15:45:28 +00:00
Alexey Bataev 18fa2323b6 [OPENMP] Emit names of the globals depending on target.
Some symbols are not allowed to be used as names on some targets. Patch
ries to unify the emission of the names of LLVM globals so they could be
used on different targets.

llvm-svn: 331358
2018-05-02 14:20:50 +00:00
Alexey Bataev fb38828cb2 [OPENMP] Emit template instatiation|specialization functions for
devices.

If the function is an instantiation|specialization of the template and
is used in the device code, the definitions of such functions should be
emitted for the device.

llvm-svn: 331261
2018-05-01 14:09:46 +00:00
Richard Smith 3a8244df6f Implement P0482R2, support for char8_t type.
This is not yet part of any C++ working draft, and so is controlled by the flag
-fchar8_t rather than a -std= flag. (The GCC implementation is controlled by a
flag with the same name.)

This implementation is experimental, and will be removed or revised
substantially to match the proposal as it makes its way through the C++
committee.

llvm-svn: 331244
2018-05-01 05:02:45 +00:00
Craig Topper 13d759f87e [CodeGen] Fix typo in comment form->from. NFC
llvm-svn: 331231
2018-04-30 22:02:48 +00:00
Sanjay Patel c81450e29b [Driver, CodeGen] rename options to disable an FP cast optimization
As suggested in the post-commit thread for rL331056, we should match these 
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs, 
and we can improve the descriptive text in the docs.

So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.

Differential Revision: https://reviews.llvm.org/D46236

llvm-svn: 331209
2018-04-30 18:19:03 +00:00
Alexey Bataev dadf2d1238 [OPENMP] Do not crash on codegen for CXX member functions.
Non-static member functions should not be emitted as a standalone
functions, this leads to compiler crash.

llvm-svn: 331206
2018-04-30 18:09:40 +00:00
Alexey Bataev 64e62dcfff [OPENMP] Do not crash on incorrect input data.
Emit error messages instead of compiler crashing when the target region
does not exist in the device code + fix crash when the location comes
from macros.

llvm-svn: 331195
2018-04-30 16:26:57 +00:00
Richard Smith b5f8171a1b PR37189 Fix incorrect end source location and spelling for a split '>>' token.
When a '>>' token is split into two '>' tokens (in C++11 onwards), or (as an
extension) when we do the same for other tokens starting with a '>', we can't
just use a location pointing to the first '>' as the location of the split
token, because that would result in our miscomputing the length and spelling
for the token. As a consequence, for example, a refactoring replacing 'A<X>'
with something else would sometimes replace one character too many, and
similarly diagnostics highlighting a template-id source range would highlight
one character too many.

Fix this by creating an expansion range covering the first character of the
'>>' token, whose spelling is '>'. For this to work, we generalize the
expansion range of a macro FileID to be either a token range (the common case)
or a character range (used in this new case).

llvm-svn: 331155
2018-04-30 05:25:48 +00:00
Sanjay Patel d175476566 [Driver, CodeGen] add options to enable/disable an FP cast optimization
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )

We need a way to opt-out of a float-to-int-to-float cast optimization because too much 
existing code relies on the platform-specific undefined result of those casts when the 
float-to-int overflows.

The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951

Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that 
catches this problem:
rL330958

Differential Revision: https://reviews.llvm.org/D46135

llvm-svn: 331041
2018-04-27 14:22:48 +00:00
Oliver Stannard 2fcee8bd52 [ARM,AArch64] Add intrinsics for dot product instructions
The ACLE spec which describes these intrinsics hasn't been published yet, but
this is based on the final draft which will be published soon, and these have
already been implemented by GCC.

Differential revision: https://reviews.llvm.org/D46109

llvm-svn: 331039
2018-04-27 14:03:32 +00:00
Sven van Haastregt 4700faa28e [OpenCL] Add separate read_only and write_only pipe IR types
SPIR-V encodes the read_only and write_only access qualifiers of pipes,
so separate LLVM IR types are required to target SPIR-V.  Other backends
may also find this useful.

These new types are `opencl.pipe_ro_t` and `opencl.pipe_wo_t`, which
replace `opencl.pipe_t`.

This replaces __get_pipe_num_packets(...) and __get_pipe_max_packets(...)
which took a read_only pipe with separate versions for read_only and
write_only pipes, namely:

 * __get_pipe_num_packets_ro(...)
 * __get_pipe_num_packets_wo(...)
 * __get_pipe_max_packets_ro(...)
 * __get_pipe_max_packets_wo(...)

These separate versions exist to avoid needing a bitcast to one of the
two qualified pipe types.

Patch by Stuart Brady.

Differential Revision: https://reviews.llvm.org/D46015

llvm-svn: 331026
2018-04-27 10:37:04 +00:00
Akira Hatanaka ccda3d2970 [CodeGen] Avoid destructing a callee-destructued struct type in a
function if a function delegates to another function.

Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.

This reapplies r331016, which was reverted in r331019 because it caused
an assertion to fail in EmitDelegateCallArg on a windows bot. I made
changes to EmitDelegateCallArg so that it doesn't try to deactivate
cleanups for structs that have trivial destructors (cleanups for those
structs are never pushed to the cleanup stack in EmitParmDecl).

rdar://problem/39194693

Differential Revision: https://reviews.llvm.org/D45382

llvm-svn: 331020
2018-04-27 06:57:00 +00:00
Akira Hatanaka b4f3637cec Revert "[CodeGen] Avoid destructing a callee-destructued struct type in a"
This reverts commit r331016, which broke a windows bot.

http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11727

llvm-svn: 331019
2018-04-27 05:56:55 +00:00
Akira Hatanaka e712374496 [CodeGen] Avoid destructing a callee-destructued struct type in a
function if a function delegates to another function.

Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.

rdar://problem/39194693

Differential Revision: https://reviews.llvm.org/D45382

llvm-svn: 331016
2018-04-27 04:21:51 +00:00
Chandler Carruth 16429acacb [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics
The LLVM commit introduces a crash in LLVM's instruction selection.

I filed http://llvm.org/PR37260 with the test case.

llvm-svn: 330997
2018-04-26 21:46:01 +00:00
Faisal Vali a534f07f8c Revert rC330794 and some dependent tiny bug fixes
See Richard's humbling feedback here: 
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180423/226482.html
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180423/226486.html

Wish I'd had the patience to solicit the feedback prior to committing :)

Sorry for the noise guys.

Thank you Richard for being the steward that clang deserves!

llvm-svn: 330888
2018-04-26 00:42:40 +00:00
Faisal Vali 936de9d666 [c++2a] [concepts] Add rudimentary parsing support for template concept declarations
This patch is a tweak of changyu's patch: https://reviews.llvm.org/D40381. It differs in that the recognition of the 'concept' token is moved into the machinery that recognizes declaration-specifiers - this allows us to leverage the attribute handling machinery more seamlessly.

See the test file to get a sense of the basic parsing that this patch supports. 

There is much more work to be done before concepts are usable...

Thanks Changyu!

llvm-svn: 330794
2018-04-25 02:42:26 +00:00
Yaxun Liu 887c569bcb [HIP] Add hip input kind and codegen for kernel launching
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor neutral host API which can be implemented on different platforms. Currently there is open source
implementation of HIP runtime on amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).

This patch adds support of input kind and language standard hip.

When hip file is compiled, both LangOpts.CUDA and LangOpts.HIP is turned on. This allows compilation of hip program as CUDA
in most cases and only special handling of hip program is needed LangOpts.HIP is checked.

This patch also adds support of kernel launching of HIP program using HIP host API.

When -x hip is not specified, there is no behaviour change for CUDA.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44984

llvm-svn: 330790
2018-04-25 01:10:37 +00:00
Roman Lebedev e931b02616 Link to AggressiveInstCombine in a few places. Unbreaks build for me.
/usr/local/bin/ld.lld: error: undefined symbol: llvm::createAggressiveInstCombinerPass()
>>> referenced by cc1_main.cpp
>>>               tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o:(_GLOBAL__sub_I_cc1_main.cpp)

And so on

The bot coverage is clearly missing.

llvm-svn: 330694
2018-04-24 08:40:44 +00:00
David Blaikie d2a57220ac Fix build break due to content moving from Scalar.h to InstCombine.h in LLVM
llvm-svn: 330671
2018-04-24 00:59:22 +00:00
Alexey Bataev 2091ca6c97 [OPENMP] Do not cast captured by value variables with pointer types in
NVPTX target.

When generating the wrapper function for the offloading region, we need
to call the outlined function and cast the arguments correctly to follow
the ABI. Usually, variables captured by value are casted to `uintptr_t`
type. But this should not performed for the variables with pointer type.

llvm-svn: 330620
2018-04-23 17:33:41 +00:00
Mikhail Maltsev 4a4e7a31ad [CodeGen] Reland r330442: Add an option to suppress output of llvm.ident
The test case in the original patch was overly contrained and
failed on PPC targets.

llvm-svn: 330575
2018-04-23 10:08:46 +00:00
Andrew V. Tischenko 8ab2c9cd1e Use special new Clang flag 'FrontendTimesIsEnabled' instead of 'llvm::TimePassesIsEnabled' inside -ftime-report feature.
Differential Revision: https://reviews.llvm.org/D45619

llvm-svn: 330571
2018-04-23 09:22:30 +00:00
Tim Northover 9dc1d0c74e [Atomics] warn about atomic accesses using libcalls
If an atomic variable is misaligned (and that suspicion is why Clang emits
libcalls at all) the runtime support library will have to use a lock to safely
access it, with potentially very bad performance consequences. There's a very
good chance this is unintentional so it makes sense to issue a warning.

Also give it a named group so people can promote it to an error, or disable it
if they really don't care.

llvm-svn: 330566
2018-04-23 08:16:24 +00:00
Mikhail Maltsev 42b2a0e162 Revert r330442, CodeGen/no-ident-version.c is failing on PPC
llvm-svn: 330451
2018-04-20 17:14:39 +00:00
Yaxun Liu 4306f2086f [CUDA] Set LLVM calling convention for CUDA kernel
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.

It only affects amdgcn target.

Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D45223

llvm-svn: 330447
2018-04-20 17:01:03 +00:00
Mikhail Maltsev 6550c13912 [CodeGen] Add an option to suppress output of llvm.ident
Summary:
By default Clang outputs its version (including git commit hash, in
case of trunk builds) into object and assembly files. It might be
useful to have an option to disable this, especially for debugging
purposes.
This patch implements new command line flags -Qn and -Qy (the names
are chosen for compatibility with GCC). -Qn disables output of
the 'llvm.ident' metadata string and the 'producer' debug info. -Qy
(enabled by default) does the opposite.

Reviewers: faisalv, echristo, aprantl

Reviewed By: aprantl

Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01

Differential Revision: https://reviews.llvm.org/D45255

llvm-svn: 330442
2018-04-20 16:29:03 +00:00
Jonas Hahnfeld f5527c2381 [CUDA] Register relocatable GPU binaries
nvcc generates a unique registration function for each object file
that contains relocatable device code. Unique names are achieved
with a module id that is also reflected in the function's name.

Differential Revision: https://reviews.llvm.org/D42922

llvm-svn: 330425
2018-04-20 13:04:45 +00:00
Alexey Sotkin 3858e26f22 [OpenCL] Add 'denorms-are-zero' function attribute
Summary:
Generate attribute 'denorms-are-zero'='true' if '-cl-denorms-are-zero'
compile option was specified and 'denorms-are-zero'='false' otherwise.

Patch by krisb

Reviewers: Anastasia, yaxunl

Reviewed By:  yaxunl 

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D45808

llvm-svn: 330404
2018-04-20 08:08:04 +00:00
Saleem Abdulrasool 3fe5b7a497 Implement proper support for `-falign-functions`
This implements support for the previously ignored flag
`-falign-functions`.  This allows the frontend to request alignment on
function definitions in the translation unit where they are not
explicitly requested in code.  This is compatible with the GCC behaviour
and the ICC behaviour.

The scalar value passed to `-falign-functions` aligns functions to a
power-of-two boundary.  If flag is used, the functions are aligned to
16-byte boundaries.  If the scalar is specified, it must be an integer
less than or equal to 4096.  If the value is not a power-of-two, the
driver will round it up to the nearest power of two.

llvm-svn: 330378
2018-04-19 23:14:57 +00:00
Erich Keane b127a39404 Fix __attribute__((force_align_arg_pointer)) misalignment bug
The force_align_arg_pointer attribute was using a hardcoded 16-byte
alignment value which in combination with -mstack-alignment=32 (or
larger) would produce a misaligned stack which could result in crashes
when accessing stack buffers using aligned AVX load/store instructions.

Fix the issue by using the "stackrealign" function attribute instead
of using a hardcoded 16-byte alignment.

Patch By: Gramner

Differential Revision: https://reviews.llvm.org/D45812

llvm-svn: 330331
2018-04-19 14:27:05 +00:00
Alexander Ivchenko d96ddccdb4 Lowering x86 adds/addus/subs/subus intrinsics (clang)
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D44786

llvm-svn: 330323
2018-04-19 12:15:11 +00:00
Akira Hatanaka 4ce0e5a892 [CodeGen] Do not push a destructor cleanup for a struct that doesn't
have a non-trivial destructor.

This fixes a bug introduced in r328731 where CodeGen emits calls to
synthesized destructors for non-trivial C structs in C++ mode when the
struct passed to EmitCallArg doesn't have a non-trivial destructor.
Under Microsoft's ABI, ASTContext::isParamDestroyedInCallee currently
always returns true, so it's necessary to check whether the struct has a
non-trivial destructor before pushing a cleanup in EmitCallArg.

This fixes PR37146.

llvm-svn: 330304
2018-04-18 23:33:15 +00:00
Reid Kleckner 54a33d7a27 [MS] Fix unprototyped thunk emission for incomplete return types
Fixes PR37161

llvm-svn: 330303
2018-04-18 23:21:32 +00:00
Artem Belevich 0ae8590354 [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.

Differential Revision: https://reviews.llvm.org/D45068

llvm-svn: 330296
2018-04-18 21:51:48 +00:00
Keith Wyss f437e35671 [XRay] Add clang builtin for xray typed events.
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.

This change depends on D45633 for the intrinsic definition.

Reviewers: dberris, pelikan, rnk, eizan

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D45716

llvm-svn: 330220
2018-04-17 21:32:43 +00:00
Akira Hatanaka 52a84e750a Move the visitor classes that are used to traverse non-trivial C structs
to a header file.

This is in preparation for using the visitor classes to warn about
memcpy'ing non-trivial C structs.

See the discussion here:
https://reviews.llvm.org/D45310

rdar://problem/36124208

llvm-svn: 330201
2018-04-17 19:05:17 +00:00
Akira Hatanaka 617e26152d Add a command line option 'fregister_global_dtors_with_atexit' to
register destructor functions annotated with __attribute__((destructor))
using __cxa_atexit or atexit.

Register destructor functions annotated with __attribute__((destructor))
calling __cxa_atexit in a synthesized constructor function instead of
emitting references to the functions in a special section.

The primary reason for adding this option is that we are planning to
deprecate the __mod_term_funcs section on Darwin in the future. This
feature is enabled by default only on Darwin. Users who do not want this
can use command line option 'fno_register_global_dtors_with_atexit' to
disable it.

rdar://problem/33887655

Differential Revision: https://reviews.llvm.org/D45578

llvm-svn: 330199
2018-04-17 18:41:52 +00:00
Teresa Johnson 9e4321c12d [ThinLTO] Pass -save-temps to LTO backend for distributed ThinLTO builds
Summary:
The clang driver option -save-temps was not passed to the LTO config,
so when invoking the ThinLTO backends via clang during distributed
builds there was no way to get LTO to save temp files.

Getting this to work with ThinLTO distributed builds also required
changing the driver to avoid a separate compile step to emit unoptimized
bitcode when the input was already bitcode under -save-temps. Not only is
this unnecessary in general, it is problematic for ThinLTO backends since
the temporary bitcode file to the backend would not match the module path
in the combined index, leading to incorrect ThinLTO backend index-based
optimizations.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D45217

llvm-svn: 330194
2018-04-17 16:39:25 +00:00
Aaron Ballman fe93546b11 Add modifiers for unsigned char and signed char field printing for __builtin_dump_struct.
Patch by Paul Semel.

llvm-svn: 330188
2018-04-17 14:00:06 +00:00
Aaron Ballman b6a7702297 Add checks for format specifiers used by __builtin_dump_struct and added a new specifier for null-terminated constant strings.
Patch by Paul Semel.

llvm-svn: 330185
2018-04-17 11:57:47 +00:00
Alexey Bataev 2c1dffece6 [OPENMP] Allow to use declare target variables in map clauses
Global variables marked as declare target are allowed to be used in map
clauses. Patch fixes the crash of the compiler on the declare target
variables in map clauses.

llvm-svn: 330156
2018-04-16 20:34:41 +00:00
Akira Hatanaka 7c55265c2d [CodeGen] Fix a crash that occurs when a non-trivial C struct with a
volatile array field is copied.

The crash occurs because method 'visitArray' passes a null FieldDecl to
method 'visit' and some of the methods called downstream expect a
non-null FieldDecl to be passed.

This reapplies r330151 with a fix to the test case.

rdar://problem/33599681

llvm-svn: 330155
2018-04-16 20:23:52 +00:00
Alexey Bataev 9ff8083d98 [OPENMP] General code improvements.
llvm-svn: 330154
2018-04-16 20:16:21 +00:00
Akira Hatanaka d55675d068 Revert "[CodeGen] Fix a crash that occurs when a non-trivial C struct with a"
This reverts commit r330151, which caused bots to fail.

llvm-svn: 330153
2018-04-16 19:53:59 +00:00
Bruno Cardoso Lopes a3b5f71eaa Use export_as for autolinking frameworks
framework module SomeKitCore {
  ...
  export_as SomeKit
}

Given the module above, while generting autolink information during
codegen, clang should to emit '-framework SomeKitCore' only if SomeKit
was not imported in the relevant TU, otherwise it should use '-framework
SomeKit' instead.

rdar://problem/38269782

llvm-svn: 330152
2018-04-16 19:42:32 +00:00
Akira Hatanaka 1c3bd2ff0c [CodeGen] Fix a crash that occurs when a non-trivial C struct with a
volatile array field is copied.

The crash occurs because method 'visitArray' passes a null FieldDecl to
method 'visit' and some of the methods called downstream expect a
non-null FieldDecl to be passed.

rdar://problem/33599681

llvm-svn: 330151
2018-04-16 19:38:00 +00:00
Alexey Bataev a4fa0b880a [OPENMP] General code improvements.
llvm-svn: 330140
2018-04-16 17:59:34 +00:00
Brock Wyma 94ece8fbc9 [CodeView] Initial support for emitting S_THUNK32 symbols for compiler...
When emitting CodeView debug information, compiler-generated thunk routines
should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so
Visual Studio can properly step into the user code.  This initial support only
handles standard thunk ordinals.

Differential Revision: https://reviews.llvm.org/D43838

llvm-svn: 330132
2018-04-16 16:53:57 +00:00
Malcolm Parsons fab3680990 Clean carriage returns from lib/ and include/. NFC.
Summary:
Clean carriage returns from lib/ and include/. NFC.
(I have to make this change locally in order for `git diff` to show sane output after I edit a file, so I might as well ask for it to be committed. I don't have commit privs myself.)
(Without this patch, `git rebase`ing any change involving SemaDeclCXX.cpp is a real nightmare. :( So while I have no right to ask for this to be committed, geez would it make my workflow easier if it were.)

Here's the command I used to reformat things. (Requires bash and OSX/FreeBSD sed.)

    git grep -l $'\r' lib include | xargs sed -i -e $'s/\r//'
    find lib include -name '*-e' -delete

Reviewers: malcolm.parsons

Reviewed By: malcolm.parsons

Subscribers: emaste, krytarowski, cfe-commits

Differential Revision: https://reviews.llvm.org/D45591

Patch by Arthur O'Dwyer.

llvm-svn: 330112
2018-04-16 08:31:08 +00:00
Andrey Konovalov 1ba9d9c6ca hwasan: add -fsanitize=kernel-hwaddress flag
This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables
-hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff.

Differential Revision: https://reviews.llvm.org/D45046

llvm-svn: 330044
2018-04-13 18:05:21 +00:00
Alexey Bataev 43a919f667 [OPENMP] Replace push_back by emplace_back, NFC.
llvm-svn: 330042
2018-04-13 17:48:43 +00:00
Alexey Bataev ddf3db9b5e [OPENMP] Code cleanup + formatting, NFC.
llvm-svn: 330040
2018-04-13 17:31:06 +00:00
Ivan A. Kosarev 9cdb2c75d9 [NEON] Support vrndns_f32 intrinsic
Differential Revision: https://reviews.llvm.org/D45515

llvm-svn: 330012
2018-04-13 12:46:02 +00:00
Dean Michael Berris 488f7c2b67 [XRay][clang] Add flag to choose instrumentation bundles
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:

- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fnoxray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation

These can be combined either as comma-separated values, or as
repeated flag values.

Reviewers: echristo, kpw, eizan, pelikan

Reviewed By: pelikan

Subscribers: mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D44970

llvm-svn: 329985
2018-04-13 02:31:58 +00:00
Eli Friedman 01d349bab1 Remove -cc1 option "-backend-option".
It means the same thing as -mllvm; there isn't any reason to have two
options which do the same thing.

Differential Revision: https://reviews.llvm.org/D45109

llvm-svn: 329965
2018-04-12 22:21:36 +00:00
Erich Keane cf3c4a9f24 [NFC] Fix terrible formatting of CGRecordLower constructor.
llvm-svn: 329952
2018-04-12 20:46:31 +00:00
David Chisnall 10e590e950 ObjCGNU: Fix empty v3 protocols being emitted two fields short
Summary:
Protocols that were being referenced but could not be fully realized were being emitted without `properties`/`optional_properties`. Since all v3 protocols must be 9 processor words wide, the lack of these fields is catastrophic for the runtime.

As an example, the runtime cannot know [here](https://github.com/gnustep/libobjc2/blob/master/protocol.c#L73) that `properties` and `optional_properties` are invalid.

Reviewers: rjmccall, theraven

Reviewed By: rjmccall, theraven

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D45305

llvm-svn: 329882
2018-04-12 06:46:15 +00:00
Shoaib Meenai 34aa13169b [CodeGen] Handle __func__ inside __finally
When we enter a __finally block, the CGF's CurCodeDecl will be null
(because CodeGenFunction::StartFunction is given an empty GlobalDecl for
a __finally block), and so the dyn_cast here will result in an assertion
failure. Change it to dyn_cast_or_null to handle this case.

Differential Revision: https://reviews.llvm.org/D45523

llvm-svn: 329836
2018-04-11 18:17:35 +00:00
Aaron Ballman 0652534131 Introduce a new builtin, __builtin_dump_struct, that is useful for dumping structure contents at runtime in circumstances where debuggers may not be easily available (such as in kernel work).
Patch by Paul Semel.

llvm-svn: 329762
2018-04-10 21:58:13 +00:00
Alexey Bataev c0f879bcec [OPENMP] Additional attributes for the pointer parameters.
Added attributes for better optimization of the OpenMP code.

llvm-svn: 329751
2018-04-10 20:10:53 +00:00
Nico Weber ade321e7dd Revert r329684 (and follow-ups 329693, 329714). See discussion on https://reviews.llvm.org/D43578.
llvm-svn: 329739
2018-04-10 18:53:28 +00:00
Andrew V. Tischenko c88deb100f -ftime-report switch support in Clang.
The current support of the feature produces only 2 lines in report:
 -Some general Code Generation Time;
 -Total time of Backend Consumer actions.
This patch extends Clang time report with new lines related to Preprocessor, Include Filea Search, Parsing, etc.
Differential Revision: https://reviews.llvm.org/D43578

llvm-svn: 329684
2018-04-10 10:34:13 +00:00
Vitaly Buka 69a2e18b4a asan: kernel: make no_sanitize("address") attribute work with -fsanitize=kernel-address
Summary:
Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address")  disable both ASan and KASan instrumentation. Also remove redundant test.

Patch by Andrey Konovalov

Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D44981

llvm-svn: 329612
2018-04-09 20:10:29 +00:00
Craig Topper 304edc1e75 [X86] Emit native IR for pmuldq/pmuludq builtins.
I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the input to 32 bits for pmuludg or shl/ashr for pmuldq and use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts.

Differential Revision: https://reviews.llvm.org/D45421

llvm-svn: 329605
2018-04-09 19:17:54 +00:00
John McCall bfbc05e2f5 Generalize the swiftcall API since being passed indirectly isn't
C++-specific anymore.

llvm-svn: 329513
2018-04-07 20:16:47 +00:00
Nico Weber 727f22bff2 Make CodeGen depend just once on clangAnalysis.
llvm-svn: 329477
2018-04-07 03:29:47 +00:00
Alexey Bataev e290ec02c7 [OPENMP, NVPTX] Fix codegen for the teams reduction.
Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv
as we operate on unsigned values only (addresses, converted to integers)

llvm-svn: 329411
2018-04-06 16:03:36 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Krzysztof Parzyszek 49fb6b5ecf [Hexagon] Remove default values from lambda parameters
llvm-svn: 329394
2018-04-06 13:51:48 +00:00
Richard Smith e78fac5126 PR36992: do not store beyond the dsize of a class object unless we know
the tail padding is not reused.

We track on the AggValueSlot (and through a couple of other
initialization actions) whether we're dealing with an object that might
share its tail padding with some other object, so that we can avoid
emitting stores into the tail padding if that's the case. We still
widen stores into tail padding when we can do so.

Differential Revision: https://reviews.llvm.org/D45306

llvm-svn: 329342
2018-04-05 20:52:58 +00:00
Akira Hatanaka 0c194461b5 [ObjC] Use the name specified by objc_runtime_name instead of the class
identifier.

This patch fixes a few places in CGObjCMac.cpp where the class
identifier was used instead of the name specified by objc_runtime_name.

rdar://problem/37910822

Differential Revision: https://reviews.llvm.org/D45101

llvm-svn: 329128
2018-04-03 22:50:16 +00:00
Vlad Tsyrklevich e55aa03ad4 Add the -fsanitize=shadow-call-stack flag
Summary:
Add support for the -fsanitize=shadow-call-stack flag which causes clang
to add ShadowCallStack attribute to functions compiled with that flag
enabled.

Reviewers: pcc, kcc

Reviewed By: pcc, kcc

Subscribers: cryptoad, cfe-commits, kcc

Differential Revision: https://reviews.llvm.org/D44801

llvm-svn: 329122
2018-04-03 22:33:53 +00:00
Artem Belevich 55ebd6cc26 Revert "Set calling convention for CUDA kernel"
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.

llvm-svn: 329099
2018-04-03 18:29:31 +00:00
Reid Kleckner 399d96e39c [MS] Emit vftable thunks for functions with incomplete prototypes
Summary:
The following class hierarchy requires that we be able to emit a
this-adjusting thunk for B::foo in C's vftable:

  struct Incomplete;
  struct A {
    virtual A* foo(Incomplete p) = 0;
  };
  struct B : virtual A {
    void foo(Incomplete p) override;
  };
  struct C : B { int c; };

This TU is valid, but lacks a definition of 'Incomplete', which makes it
hard to build a thunk for the final overrider, B::foo.

Before this change, Clang gives up attempting to emit the thunk, because
it assumes that if the parameter types are incomplete, it must be
emitting the thunk for optimization purposes. This is untrue for the MS
ABI, where the implementation of B::foo has no idea what thunks C's
vftable may require. Clang needs to emit the thunk without necessarily
having access to the complete prototype of foo.

This change makes Clang emit a musttail variadic call when it needs such
a thunk. I call these "unprototyped" thunks, because they only prototype
the "this" parameter, which must always come first in the MS C++ ABI.

These thunks work, but they create ugly LLVM IR. If the call to the
thunk is devirtualized, it will be a call to a bitcast of a function
pointer. Today, LLVM cannot inline through such a call, but I want to
address that soon, because we also use this pattern for virtual member
pointer thunks.

This change also implements an old FIXME in the code about reusing the
thunk's computed CGFunctionInfo as much as possible. Now we don't end up
computing the thunk's mangled name and arranging it's prototype up to
around three times.

Fixes PR25641

Reviewers: rjmccall, rsmith, hans

Subscribers: Prazek, cfe-commits

Differential Revision: https://reviews.llvm.org/D45112

llvm-svn: 329009
2018-04-02 20:20:33 +00:00
Reid Kleckner cbec0269ba Fix some DenseMap use-after-rehash bugs and hoist MethodVFTableLocation
This re-lands r328845 with fixes for crbug.com/827810.

The initial motiviation was to hoist MethodVFTableLocation to global
scope so it could be forward declared.

In this patch, I noticed that MicrosoftVTableContext uses some risky
patterns. It has methods that return references to data stored in
DenseMaps. I've made some of them return by value for trivial structs
and I've moved some things into separate allocations.

llvm-svn: 329007
2018-04-02 20:00:39 +00:00
Richard Smith 866dee4ea0 Add helper to determine if a field is a zero-length bitfield.
llvm-svn: 328999
2018-04-02 18:29:43 +00:00
Yaxun Liu a64a491e7b [CUDA] Let device-side shared variables be initialized with undef
CUDA shared variable should be initialized with undef.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44985

llvm-svn: 328994
2018-04-02 17:38:24 +00:00
Gor Nishanov 2a78fa5209 [coroutines] Add __builtin_coro_noop => llvm.coro.noop
A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined
coroutine noop_coroutine that does nothing. To implement this feature, we implemented
an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that
does nothing when resumed or destroyed.

This patch adds a builtin __builtin_coro_noop() that maps to llvm.coro.noop intrinsic.

Related llvm change: https://reviews.llvm.org/D45114

llvm-svn: 328993
2018-04-02 17:35:37 +00:00
Brian Gesiak 91a4b5af3a [Coroutines] Schedule coro-split before asan
Summary:
The docs for the LLVM coroutines intrinsic `@llvm.coro.id` state that
"The second argument, if not null, designates a particular alloca instruction
to be a coroutine promise."

However, if the address sanitizer pass is run before the `@llvm.coro.id`
intrinsic is lowered, the `alloca` instruction passed to the intrinsic as its
second argument is converted, as per the
https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm docs, to
an `inttoptr` instruction that accesses the address of the promise.

On optimization levels `-O1` and above, the `-asan` pass is run after
`-coro-early`, `-coro-split`, and `-coro-elide`, and before
`-coro-cleanup`, and so there is no issue. At `-O0`, however, `-asan`
is run in between `-coro-early` and `-coro-split`, which causes an
assertion to be hit when the `inttoptr` instruction is forcibly cast to
an `alloca`.

Rearrange the passes such that the coroutine passes are registered
before the sanitizer passes.

Test Plan:
Compile a simple C++ program that uses coroutines in `-O0` with
`-fsanitize-address`, and confirm no assertion is hit:
`clang++ coro-example.cpp -fcoroutines-ts -g -fsanitize=address -fno-omit-frame-pointer`.

Reviewers: GorNishanov, lewissbaker, EricWF

Reviewed By: GorNishanov

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D43927

llvm-svn: 328951
2018-04-01 23:55:21 +00:00
John McCall 4fcd9ef673 Fix a major swiftcall ABI bug with trivial C++ class types.
The problem with the previous logic was that there might not be any
explicit copy/move constructor declarations, e.g. if the type is
trivial and we've never type-checked a copy of it.  Relying on Sema's
computation seems much more reliable.

Also, I believe Richard's recommendation is exactly the rule we use
now on the Itanium ABI, modulo the trivial_abi attribute (which this
change of course fixes our handling of in Swift).

This does mean that we have a less portable rule for deciding
indirectness for swiftcall.  I would prefer it if we just applied the
Itanium rule universally under swiftcall, but in the meantime, I need
to fix this bug.

This only arises when defining functions with class-type arguments
in C++, as we do in the Swift runtime.  It doesn't affect normal Swift
operation because we don't import code as C++.

llvm-svn: 328942
2018-04-01 21:04:30 +00:00
Nico Weber e7c7d70278 Revert r328845, it caused crbug.com/827810.
llvm-svn: 328922
2018-03-31 18:26:25 +00:00
Alexey Bataev 03f270c900 [OPENMP] Added emission of offloading data sections for declare target
variables.

Added emission of the offloading data sections for the variables within
declare target regions + fixes emission of the declare target variables
marked as declare target not within the declare target region.

llvm-svn: 328888
2018-03-30 18:31:07 +00:00
Reid Kleckner 9e3eb9f9d2 Hoist MethodVFTableLocation out of MicrosoftVTableContext, NFC
This allows forward declaring it so that we can add it to
MicrosoftMangleContext::mangleVirtualMemPtrThunk without including
VTableBuilder.h. That saves a hashtable lookup when emitting virtual
member pointer functions.

It also shortens a really long type name. This struct has "VFtable" in
the name, so it seems pretty unlikely that someone will assume it is
generally useful for non-MS C++ ABI stuff.

llvm-svn: 328845
2018-03-29 22:42:24 +00:00
Rafael Espindola b2c47fbf94 Set dso_local on cfi_slowpath.
llvm-svn: 328836
2018-03-29 22:08:01 +00:00
Rafael Espindola 54d44bf14c Mark __cfi_check as dso_local.
llvm-svn: 328825
2018-03-29 20:51:30 +00:00
Akira Hatanaka 673af7a688 Generalize NRVO to cover C structs.
This commit generalizes NRVO to cover C structs (both trivial and
non-trivial structs).

rdar://problem/33599681

Differential Revision: https://reviews.llvm.org/D44968

llvm-svn: 328809
2018-03-29 17:56:24 +00:00
Rafael Espindola c9643d8fc8 Set dso_local when clearing dllimport.
llvm-svn: 328801
2018-03-29 16:45:18 +00:00
Yaxun Liu b2f2bb26e4 Set calling convention for CUDA kernel
This patch sets target specific calling convention for CUDA kernels in IR.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44747

llvm-svn: 328795
2018-03-29 15:02:08 +00:00
Yaxun Liu b0eee29c74 Disable emitting static extern C aliases for amdgcn target for CUDA
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44987

llvm-svn: 328793
2018-03-29 14:50:00 +00:00
Krzysztof Parzyszek 790e422be9 [Hexagon] Aid bit-reverse load intrinsics lowering with bitcode
The conversion of operatios to bitcode helps to eliminate an additional
store in certain cases. We used to lower these load intrinsics in DAG to
DAG conversion by which time, the "Dead Store Elimination" pass is
already run. There is an associated LLVM patch.
    
Patch by Sumanth Gundapaneni.

llvm-svn: 328776
2018-03-29 13:54:31 +00:00
Akira Hatanaka fcbe17c6be [ObjC++] Make parameter passing and function return compatible with ObjC
ObjC and ObjC++ pass non-trivial structs in a way that is incompatible
with each other. For example:
    
typedef struct {
  id f0;
  __weak id f1;
} S;
    
// this code is compiled in c++.
extern "C" {
  void foo(S s);
}
    
void caller() {
  // the caller passes the parameter indirectly and destructs it.
  foo(S());
}
    
// this function is compiled in c.
// 'a' is passed directly and is destructed in the callee.
void foo(S a) {
}
    
This patch fixes the incompatibility by passing and returning structs
with __strong or weak fields using the C ABI in C++ mode. __strong and
__weak fields in a struct do not cause the struct to be destructed in
the caller and __strong fields do not cause the struct to be passed
indirectly.
    
Also, this patch fixes the microsoft ABI bug mentioned here:
    
https://reviews.llvm.org/D41039?id=128767#inline-364710
    
rdar://problem/38887866
    
Differential Revision: https://reviews.llvm.org/D44908

llvm-svn: 328731
2018-03-28 21:13:14 +00:00
Krzysztof Parzyszek 1ef2a1f414 [Hexagon] Add support for "new" circular buffer intrinsics
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" vesrions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.

There is a related llvm patch.

Patch by Brendon Cahoon.

llvm-svn: 328725
2018-03-28 19:40:57 +00:00
David Blaikie c133b1e387 Fix for LLVM header changes
llvm-svn: 328718
2018-03-28 17:45:10 +00:00
Alexey Bataev 34f8a7043b [OPENMP] Codegen for ctor|dtor of declare target variables.
When the declare target variables are emitted for the device,
constructors|destructors for these variables must emitted and registered
by the runtime in the offloading sections.

llvm-svn: 328705
2018-03-28 14:28:54 +00:00
Mandeep Singh Grang c205d8cc8d [clang] Change std::sort to llvm::sort in response to r327219
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting.  This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.

llvm-svn: 328636
2018-03-27 16:50:00 +00:00
Alexey Bataev 92327c50d3 [OPENMP] Codegen for declare target with link clause.
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.

llvm-svn: 328544
2018-03-26 16:40:55 +00:00
Matt Morehouse 5317f2e4c9 [libFuzzer] Use OptForFuzzing attribute with -fsanitize=fuzzer.
Summary:
Disables certain CMP optimizations to improve fuzzing signal under -O1
and -O2.

Switches all fuzzer tests to -O2 except for a few leak tests where the
leak is optimized out under -O2.

Reviewers: kcc, vitalybuka

Reviewed By: vitalybuka

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D44798

llvm-svn: 328384
2018-03-23 23:35:28 +00:00
David Blaikie 4e1ae83b3c Change for an LLVM header file move
llvm-svn: 328380
2018-03-23 22:16:59 +00:00
Yaxun Liu ac1263cd54 [AMDGPU] Fix codegen for inline assembly
Need to override convertConstraint to recognise amdgpu specific register names.

Differential Revision: https://reviews.llvm.org/D44533

llvm-svn: 328359
2018-03-23 19:43:42 +00:00
Tony Tye 68e11a6eca [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG)
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.

Differential Revision: https://reviews.llvm.org/D44696

llvm-svn: 328350
2018-03-23 18:51:45 +00:00
Tony Tye 1a3f3a2d14 [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG)
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use a function attribute to communicate to the AMDGPU backend.

Differential Revision: https://reviews.llvm.org/D43735

llvm-svn: 328347
2018-03-23 18:43:15 +00:00
Rafael Espindola fe9a55a3f1 Bring r328238 back with a fix.
The issues was that we were setting hidden visibility if, when
processing a hidden class, we found out that we needed to emit a
reference to a vtable provided by the standard library.

Original message:

Set dso_local on vtables.

llvm-svn: 328288
2018-03-23 01:36:23 +00:00
Abderrazek Zaafrani b5ac56fb81 [ARM] Add ARMv8.2-A FP16 vector intrinsic
Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here which the front-end fp16 vector intrinsic for ARM.

Differential revision https://reviews.llvm.org/D43650

llvm-svn: 328277
2018-03-23 00:08:40 +00:00