Commit Graph

12258 Commits

Author SHA1 Message Date
Simon Pilgrim a7bcd72c0a [X86] Replace VPCOM/VPCOMU with generic integer comparisons (clang)
These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.

Noticed while cleaning up vector integer comparison costs for PR40376.

A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.

llvm-svn: 351687
2019-01-20 16:40:33 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Johannes Doerfert ac991bbb44 Emit !callback metadata and introduce the callback attribute
With commit r351627, LLVM gained the ability to apply (existing) IPO
  optimizations on indirections through callbacks, or transitive calls.
  The general idea is that we use an abstraction to hide the middle man
  and represent the callback call in the context of the initial caller.
  It is described in more detail in the commit message of the LLVM patch
  r351627, the llvm::AbstractCallSite class description, and the
  language reference section on callback-metadata.

  This commit enables clang to emit !callback metadata that is
  understood by LLVM. It does so in three different cases:
    1) For known broker functions declarations that are directly
       generated, e.g., __kmpc_fork_call for the OpenMP pragma parallel.
    2) For known broker functions that are identified by their name and
       source location through the builtin detection, e.g.,
       pthread_create from the POSIX thread API.
    3) For user annotated functions that carry the "callback(callee, ...)"
       attribute. The attribute has to include the name, or index, of
       the callback callee and how the passed arguments can be
       identified (as many as the callback callee has). See the callback
       attribute documentation for detailed information.

Differential Revision: https://reviews.llvm.org/D55483

llvm-svn: 351629
2019-01-19 05:36:54 +00:00
Zola Bridges 826ef59568 [clang][slh] add Clang attr no_speculative_load_hardening
Summary:
This attribute will allow users to opt specific functions out of
speculative load hardening. This compliments the Clang attribute
named speculative_load_hardening. When this attribute or the attribute
speculative_load_hardening is used in combination with the flags
-mno-speculative-load-hardening or -mspeculative-load-hardening,
the function level attribute will override the default during LLVM IR
generation. For example, in the case, where the flag opposes the
function attribute, the function attribute will take precendence.
The sticky inlining behavior of the speculative_load_hardening attribute
may cause a function with the no_speculative_load_hardening attribute
to be tagged with the speculative_load_hardening tag in
subsequent compiler phases which is desired behavior since the
speculative_load_hardening LLVM attribute is designed to be maximally
conservative.

If both attributes are specified for a function, then an error will be
thrown.

Reviewers: chandlerc, echristo, kristof.beyls, aaron.ballman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54909

llvm-svn: 351565
2019-01-18 17:20:46 +00:00
Richard Smith 0444006fff Fix cleanup registration for lambda captures.
Lambda captures should be destroyed if an exception is thrown only if
the construction of the complete lambda-expression has not completed.
(If the lambda-expression has been fully constructed, any exception will
invoke its destructor, which will destroy the captures.)

This is directly modeled after how we handle the equivalent situation in
InitListExprs.

Note that EmitLambdaLValue was unreachable because in C++11 onwards the
frontend never creates the awkward situation where a prvalue expression
(such as a lambda) is used in an lvalue context (such as the left-hand
side of a class member access).

llvm-svn: 351487
2019-01-17 22:05:50 +00:00
Erik Pilkington 2ff012df81 [CodeGenObjC] Use a constant value for non-fragile ivar offsets when possible
If a class inherits from NSObject and has an implementation, then we
can assume that ivar offsets won't need to be updated by the runtime.
This allows us to index into the object using a constant value and
avoid loading from the ivar offset variable.

This patch was adapted from one written by Pete Cooper.

rdar://problem/10132568

Differential revision: https://reviews.llvm.org/D56802

llvm-svn: 351461
2019-01-17 18:18:53 +00:00
Vlad Tsyrklevich c93390b5c5 TLS: Respect visibility for thread_local variables on Darwin (PR40327)
Summary:
Teach clang to mark thread wrappers for thread_local variables with
hidden visibility when the original variable is marked with hidden
visibility. This is necessary on Darwin which exposes the thread wrapper
instead of the thread variable. The thread wrapper would previously
always be created with default visibility unless it had
linkonce*/weak_odr linkage.

Reviewers: rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D56818

llvm-svn: 351457
2019-01-17 17:53:45 +00:00
Anton Korobeynikov 81cff31ccf CodeGen: Cast llvm.flt.rounds result to match __builtin_flt_rounds
llvm.flt.rounds returns an i32, but the builtin expects an integer. 
On targets where integers are not 32-bits clang tries to bitcast the result, causing an assertion failure.

The patch enables newlib build for msp430.

Patch by Edward Jones!

Differential Revision: https://reviews.llvm.org/D24461

llvm-svn: 351449
2019-01-17 15:21:55 +00:00
Philip Pfaffe 88a13b9159 [NewPM] Add -fsanitize={memory,thread} handling to clang
Summary: This is the missing bit to drive thread and memory sanitizers through clang using the new PassManager.

Reviewers: chandlerc, fedor.sergeev, vitalybuka, leonardchan

Subscribers: bollu, llvm-commits

Differential Revision: https://reviews.llvm.org/D56831

llvm-svn: 351423
2019-01-17 10:10:47 +00:00
Craig Topper 015585abb2 [X86] Add custom emission for the avx512 scatter builtins to convert from scalar integer to vXi1 for the mask arguments to the intrinsics.
llvm-svn: 351408
2019-01-17 00:34:19 +00:00
Craig Topper 931779761e Recommit r351160 "[X86] Make _xgetbv/_xsetbv on non-windows platforms"
V8 has been fixed now.

llvm-svn: 351391
2019-01-16 22:56:25 +00:00
Craig Topper bb5b06603b [X86] Add versions of the avx512 gather intrinsics that take the mask as a vXi1 vector instead of a scalar
We need to custom handle these so we can turn the scalar mask into a vXi1 vector.

Differential Revision: https://reviews.llvm.org/D56530

llvm-svn: 351390
2019-01-16 22:34:33 +00:00
Leonard Chan 837da5d3ec [Fixed Point Arithmetic] Fixed Point Subtraction
This patch covers subtraction between fixed point types and other fixed point
types or integers, using the conversion rules described in 4.1.4 of N1169.

Differential Revision: https://reviews.llvm.org/D55844

llvm-svn: 351371
2019-01-16 19:53:50 +00:00
Leonard Chan 86285d2e17 [Fixed Point Arithmetic] Add APFixedPoint to APValue
This adds APFixedPoint to the union of values that can be represented with an APValue.

Differential Revision: https://reviews.llvm.org/D56746

llvm-svn: 351368
2019-01-16 18:53:05 +00:00
Leonard Chan 2044ac89aa [Fixed Point Arithmetic] Fixed Point Addition
This patch covers addition between fixed point types and other fixed point
types or integers, using the conversion rules described in 4.1.4 of N1169.

Usual arithmetic rules do not apply to binary operations when one of the
operands is a fixed point type, and the result of the operation must be
calculated with the full precision of the operands, so we should not perform
any casting to a common type.

This patch does not include constant expression evaluation for addition of
fixed point types. That will be addressed in another patch since I think this
one is already big enough.

Differential Revision: https://reviews.llvm.org/D53738

llvm-svn: 351364
2019-01-16 18:13:59 +00:00
Anton Korobeynikov 383e827121 [MSP430] Improve support of 'interrupt' attribute
* Accept as an argument constants in range 0..63 (aligned with TI headers and linker scripts provided with TI GCC toolchain).
* Emit function attribute 'interrupt'='xx' instead of aliases (used in the backend to create a section for particular interrupt vector).
* Add more diagnostics.

Patch by Kristina Bessonova!

Differential Revision: https://reviews.llvm.org/D56663

llvm-svn: 351344
2019-01-16 13:44:01 +00:00
Philip Pfaffe 685c76d7a3 [NewPM][TSan] Reiterate the TSan port
Summary:
Second iteration of D56433 which got reverted in rL350719. The problem
in the previous version was that we dropped the thunk calling the tsan init
function. The new version keeps the thunk which should appease dyld, but is not
actually OK wrt. the current semantics of function passes. Hence, add a
helper to insert the functions only on the first time. The helper
allows hooking into the insertion to be able to append them to the
global ctors list.

Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan

Subscribers: hiraditya, bollu, llvm-commits

Differential Revision: https://reviews.llvm.org/D56538

llvm-svn: 351314
2019-01-16 09:28:01 +00:00
Sanjin Sijaric cfa2a2afa6 [SEH] Pass the frame pointer from SEH finally to finally functions
Pass the frame pointer that the first finally block receives onto the nested
finally block, instead of generating it using localaddr.

Differential Revision: https://reviews.llvm.org/D56463

llvm-svn: 351302
2019-01-16 07:39:44 +00:00
Eli Friedman c4c43b2bad [EH] Rename llvm.x86.seh.recoverfp intrinsic to llvm.eh.recoverfp
This is the clang counterpart to D56747.

Patch by Mandeep Singh Grang.

Differential Revision: https://reviews.llvm.org/D56748

llvm-svn: 351284
2019-01-16 00:50:44 +00:00
Peter Collingbourne e9f521069f CodeGen: Remove debug printf unintentionally added in r351228.
llvm-svn: 351241
2019-01-15 20:59:59 +00:00
Anton Korobeynikov 93165d648f [MSP430] Provide a toolchain description
This is an initial implementation for msp430 toolchain including
-mmcu option support
-mhwmult options support
-integrated-as by default

The toolchain uses msp430-elf-as as a linker and supports msp430-gcc toolchain tree.

Patch by Kristina Bessonova!

Differential Revision: https://reviews.llvm.org/D56658

llvm-svn: 351228
2019-01-15 19:44:05 +00:00
Benjamin Kramer 9c53890833 Revert "[X86] Make _xgetbv/_xsetbv on non-windows platforms"
This reverts commit r351160. Breaks building v8.

llvm-svn: 351210
2019-01-15 17:23:36 +00:00
Roman Lebedev bd1c087019 [clang][UBSan] Sanitization for alignment assumptions.
Summary:
UB isn't nice. It's cool and powerful, but not nice.
Having a way to detect it is nice though.
[[ https://wg21.link/p1007r3 | P1007R3: std::assume_aligned ]] / http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r2.pdf says:
```
We propose to add this functionality via a library function instead of a core language attribute.
...
If the pointer passed in is not aligned to at least N bytes, calling assume_aligned results in undefined behaviour.
```

This differential teaches clang to sanitize all the various variants of this assume-aligned attribute.

Requires D54588 for LLVM IRBuilder changes.
The compiler-rt part is D54590.

This is a second commit, the original one was r351105,
which was mass-reverted in r351159 because 2 compiler-rt tests were failing.

Reviewers: ABataev, craig.topper, vsk, rsmith, rnk, #sanitizers, erichkeane, filcab, rjmccall

Reviewed By: rjmccall

Subscribers: chandlerc, ldionne, EricWF, mclow.lists, cfe-commits, bkramer

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D54589

llvm-svn: 351177
2019-01-15 09:44:25 +00:00
Craig Topper 69aed7c364 [X86] Make _xgetbv/_xsetbv on non-windows platforms
Summary:
This patch attempts to redo what was tried in r278783, but was reverted.

These intrinsics should be available on non-windows platforms with "xsave" feature check. But on Windows platforms they shouldn't have feature check since that's how MSVC behaves.

To accomplish this I've added a MS builtin with no feature check. And a normal gcc builtin with a feature check. When _MSC_VER is not defined _xgetbv/_xsetbv will be macros pointing to the gcc builtin name.

I've moved the forward declarations from intrin.h to immintrin.h to match the MSDN documentation and used that as the header file for the MS builtin.

I'm not super happy with this implementation, and I'm open to suggestions for better ways to do it.

Reviewers: rnk, RKSimon, spatel

Reviewed By: rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D56686

llvm-svn: 351160
2019-01-15 05:03:18 +00:00
Vlad Tsyrklevich 86e68fda3b Revert alignment assumptions changes
Revert r351104-6, r351109, r351110, r351119, r351134, and r351153. These
changes fail on the sanitizer bots.

llvm-svn: 351159
2019-01-15 03:38:02 +00:00
Roman Lebedev 7892c37455 [clang][UBSan] Sanitization for alignment assumptions.
Summary:
UB isn't nice. It's cool and powerful, but not nice.
Having a way to detect it is nice though.
[[ https://wg21.link/p1007r3 | P1007R3: std::assume_aligned ]] / http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r2.pdf says:
```
We propose to add this functionality via a library function instead of a core language attribute.
...
If the pointer passed in is not aligned to at least N bytes, calling assume_aligned results in undefined behaviour.
```

This differential teaches clang to sanitize all the various variants of this assume-aligned attribute.

Requires D54588 for LLVM IRBuilder changes.
The compiler-rt part is D54590.

Reviewers: ABataev, craig.topper, vsk, rsmith, rnk, #sanitizers, erichkeane, filcab, rjmccall

Reviewed By: rjmccall

Subscribers: chandlerc, ldionne, EricWF, mclow.lists, cfe-commits, bkramer

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D54589

llvm-svn: 351105
2019-01-14 19:09:27 +00:00
Dan Gohman 51532a524e [WebAssembly] Remove old builtins
This removes the old grow_memory and mem.grow-style builtins, leaving just
the memory.grow-style builtins.

Differential Revision: https://reviews.llvm.org/D56645

llvm-svn: 351089
2019-01-14 18:28:10 +00:00
Anastasia Stulova d1986d1b5a [OpenCL] Set generic addr space of 'this' in special class members.
Set address spaces of 'this' param correctly for implicit special
class members.

This also changes initialization conversion sequence to separate
address space conversion from other qualifiers in case of binding
reference to a temporary. In this case address space conversion  
should happen after the binding (unlike for other quals). This is
needed to materialize it correctly in the alloca address space.

Initial patch by Mikael Nilssoni!

Differential Revision: https://reviews.llvm.org/D56066

llvm-svn: 351053
2019-01-14 11:44:22 +00:00
Sam McCall e60151c915 [AST] RecursiveASTVisitor visits lambda classes when implicit visitation is on.
Summary:
This fixes ASTContext's parent map for nodes in such classes (e.g. operator()).
https://bugs.llvm.org/show_bug.cgi?id=39949

This also changes the observed shape of the AST for implicit RAVs.
- this includes AST MatchFinder: cxxRecordDecl() now matches lambda classes,
functionDecl() matches the call operator, and the parent chain is body -> call
operator -> lambda class -> lambdaexpr rather than body -> lambdaexpr.
- this appears not to matter for the ASTImporterLookupTable builder
- this doesn't matter for the other RAVs in-tree.

In order to do this, we remove the TraverseLambdaBody hook. The problem is it's
hard/weird to ensure this hook is called when traversing via the implicit class.
There were just two users of this hook in-tree, who use it to skip bodies.
I replaced these with explicitly traversing the captures only. Another approach
would be recording the bodies when the lambda is visited, and then recognizing
them later.
I'd be open to suggestion on how to preserve this hook, instead.

Reviewers: aaron.ballman, JonasToth

Subscribers: cfe-commits, rsmith, jdennett

Differential Revision: https://reviews.llvm.org/D56444

llvm-svn: 351047
2019-01-14 10:31:42 +00:00
Craig Topper 49488407aa [X86] Remove mask parameter from avx512 pmultishiftqb intrinsics. Use select in IR instead.
Fixes PR40259

llvm-svn: 351036
2019-01-14 08:46:51 +00:00
Craig Topper 689b3b71af [X86] Remove mask parameter from vpshufbitqmb intrinsics. Change result to a vXi1 vector.
We'll do the scalar<->vXi1 conversions with bitcasts in IR.

Fixes PR40258

llvm-svn: 351029
2019-01-14 00:03:55 +00:00
Teresa Johnson 84cecfcb3d [LTO] Add option to enable LTOUnit splitting, and disable unless needed
Summary:
Adds a new -f[no]split-lto-unit flag that is disabled by default to
control module splitting during ThinLTO. It is automatically enabled
for -fsanitize=cfi and -fwhole-program-vtables.

The new EnableSplitLTOUnit codegen flag is passed down to llvm
via a new module flag of the same name.

Depends on D53890.

Reviewers: pcc

Subscribers: ormris, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D53891

llvm-svn: 350949
2019-01-11 18:32:07 +00:00
Brian Gesiak 5488ab4ddd [AST] Remove ASTContext from getThisType (NFC)
Summary:
https://reviews.llvm.org/D54862 removed the usages of `ASTContext&` from
within the `CXXMethodDecl::getThisType` method. Remove the parameter
altogether, as well as all usages of it. This does not result in any
functional change because the parameter was unused since
https://reviews.llvm.org/D54862.

Test Plan: check-clang

Reviewers: akyrtzi, mikael

Reviewed By: mikael

Subscribers: mehdi_amini, dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D56509

llvm-svn: 350914
2019-01-11 01:54:53 +00:00
Richard Trieu f8b8b39c60 Fix header issues.
Several headers would fail to compile if other headers were not previously
included.  The usual issue is that a class is forward declared, but the
full definition is needed.  The requirement for the definition is use of
isa/dyn_cast or calling functions of pointer-packed data types such as
DenseMap or PointerIntPair.  Add missing includes to these headers.

SVals.h required an out-of-line method definition in the .cpp file to avoid
circular inclusion of headers with BasicValueFactory.h

llvm-svn: 350913
2019-01-11 01:32:35 +00:00
Richard Smith 2f72a7521a In nothrow new-expressions, null-check the result if we're going to
apply sanitizers to it.

This avoids a sanitizer false positive that we are initializing a null
pointer.

llvm-svn: 350779
2019-01-10 00:03:29 +00:00
Shoaib Meenai 39287e759f [CodeGen] Clarify comment about COFF common symbol alignment
After a discussion on the commit thread, it seems the 32 byte alignment
limitation is an MSVC toolchain artifact, not an inherent COFF
restriction. Clarify the comment accordingly, since saying COFF in the
comment but using isKnownWindowsMSVCEnvironment in the conditional is
confusing. Also add a newline before the comment, which is consistent
with the local style.

Differential Revision: https://reviews.llvm.org/D56466

llvm-svn: 350754
2019-01-09 20:05:16 +00:00
Florian Hahn 603467a9a4 Revert r350648: "Fix clang for r350647: Missed a function rename"
The related commit r350647 breaks thread sanitizer on some macOS builders, e.g.
http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/

llvm-svn: 350718
2019-01-09 13:30:47 +00:00
Philip Pfaffe f341abba39 Fix clang for r350647: Missed a function rename
llvm-svn: 350648
2019-01-08 20:00:55 +00:00
Erich Keane 85c6224971 Limit COFF 'common' emission to <=32 alignment types.
As reported in PR33035, LLVM crashes if given a common object with an
alignment of greater than 32 bits. This is because the COFF file format
does not support these alignments, so emitting them is broken anyway.

This patch changes any global definitions greater than 32 bit alignment
to no longer be in 'common'.

https://bugs.llvm.org/show_bug.cgi?id=33035

Differential Revision: https://reviews.llvm.org/D56391

Change-Id: I48609289753b7f3b58c5e2bc1712756750fbd45a
llvm-svn: 350643
2019-01-08 18:44:22 +00:00
Paul Robinson b1ce7c8c01 Don't emit DW_AT_enum_class unless it's actually an 'enum class'.
Finishes off the functional part of PR36168.

Differential Revision: https://reviews.llvm.org/D56393

llvm-svn: 350636
2019-01-08 16:28:11 +00:00
Alexey Bataev 7bb3353f6a [OPENMP]Add call to __kmpc_push_target_tripcount() function.
Each we create the target regions with the teams distribute inner
region, we can better estimate number of the teams required to execute
the target region. Function __kmpc_push_target_tripcount() is used for
purpose, which accepts device_id and the number of the iterations,
performed by the associated loop.

llvm-svn: 350571
2019-01-07 21:30:43 +00:00
Craig Topper cd9e232a4d Recommit r350555 "[X86] Use funnel shift intrinsics for the VBMI2 vshld/vshrd builtins."
The MSVC limit hit in AutoUpgrade.cpp has been worked around for now.

llvm-svn: 350568
2019-01-07 21:00:41 +00:00
Craig Topper 33c9088783 Revert r350555 "[X86] Use funnel shift intrinsics for the VBMI2 vshld/vshrd builtins."
Had to revert the LLVM patch this depends on to fix a MSVC compiler limit in AutoUpgrade.cpp

llvm-svn: 350563
2019-01-07 19:39:25 +00:00
Craig Topper e34f2bb807 [X86] Use funnel shift intrinsics for the VBMI2 vshld/vshrd builtins.
Differential Revision: https://reviews.llvm.org/D56365

llvm-svn: 350555
2019-01-07 19:10:22 +00:00
Alexey Bataev 25d3de8a0a [OPENMP][NVPTX]Reduce number of barriers in reductions.
After the fix for the syncthreads we don't need to generate extra
barriers for the parallel reductions.

llvm-svn: 350530
2019-01-07 15:45:09 +00:00
Bruno Ricci 9b6dfac5ad [AST] Store some data of CXXNewExpr as trailing objects
Store the optional array size expression, optional initialization expression
and optional placement new arguments in a trailing array. Additionally store
the range for the parenthesized type-id in a trailing object if needed since
in the vast majority of cases the type is not parenthesized (not a single new
expression in the translation unit of SemaDecl.cpp has a parenthesized type-id).

This saves 2 pointers per CXXNewExpr in all cases, and 2 pointers + 8 bytes
per CXXNewExpr in the common case where the type is not parenthesized.

Differential Revision: https://reviews.llvm.org/D56134

Reviewed By: rjmccall

llvm-svn: 350527
2019-01-07 15:04:45 +00:00
Saleem Abdulrasool 175890e1eb CodeGen: fix autolink emission on ELF
The autolinking extension for ELF uses a slightly different format for
encoding the autolink information compared to COFF and MachO.  Account
for this in the CGM to ensure that we do not assert when emitting
assembly or an object file.

llvm-svn: 350476
2019-01-05 19:27:12 +00:00
Saleem Abdulrasool da32d7f147 CodeGen: switch iteration to range based for loop (NFC)
Change a loop to range based instead while working on cleaning up some
modules autolinking issues on Linux.  NFC.

llvm-svn: 350472
2019-01-05 18:39:32 +00:00
Peter Collingbourne 87f477b5e4 hwasan: Implement lazy thread initialization for the interceptor ABI.
The problem is similar to D55986 but for threads: a process with the
interceptor hwasan library loaded might have some threads started by
instrumented libraries and some by uninstrumented libraries, and we
need to be able to run instrumented code on the latter.

The solution is to perform per-thread initialization lazily. If a
function needs to access shadow memory or add itself to the per-thread
ring buffer its prologue checks to see whether the value in the
sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter
and reloads from the TLS slot. The runtime does the same thing if it
needs to access this data structure.

This change means that the code generator needs to know whether we
are targeting the interceptor runtime, since we don't want to pay
the cost of lazy initialization when targeting a platform with native
hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform}
has been introduced for selecting the runtime ABI to target. The
default ABI is set to interceptor since it's assumed that it will
be more common that users will be compiling application code than
platform code.

Because we can no longer assume that the TLS slot is initialized,
the pthread_create interceptor is no longer necessary, so it has
been removed.

Ideally, lazy initialization should only cost one instruction in the
hot path, but at present the call may cause us to spill arguments
to the stack, which means more instructions in the hot path (or
theoretically in the cold path if the spills are moved with shrink
wrapping). With an appropriately chosen calling convention for
the per-thread initialization function (TODO) the hot path should
always need just one instruction and the cold path should need two
instructions with no spilling required.

Differential Revision: https://reviews.llvm.org/D56038

llvm-svn: 350429
2019-01-04 19:27:04 +00:00
Teresa Johnson 6ed7913c98 [ThinLTO] Clang changes to utilize new pass to handle chains of aliases
Summary:
As with NameAnonGlobals, invoke the new CanonicalizeAliases via clang
when using the new PM.

Depends on D54507.

Reviewers: pcc, davidxl

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D55620

llvm-svn: 350424
2019-01-04 19:05:01 +00:00
Erik Pilkington 1e36882b52 [ObjCARC] Add an new attribute, objc_externally_retained
This attribute, called "objc_externally_retained", exposes clang's
notion of pseudo-__strong variables in ARC. Pseudo-strong variables
"borrow" their initializer, meaning that they don't retain/release
it, instead assuming that someone else is keeping their value alive.

If a function is annotated with this attribute, implicitly strong
parameters of that function aren't implicitly retained/released in
the function body, and are implicitly const. This is useful to expose
for performance reasons, most functions don't need the extra safety
of the retain/release, so programmers can opt out as needed.

This attribute can also apply to declarations of local variables,
with similar effect.

Differential revision: https://reviews.llvm.org/D55865

llvm-svn: 350422
2019-01-04 18:33:06 +00:00
Alexey Bataev 8e009036c9 [OPENMP][NVPTX]Use new functions from the runtime library.
Updated codegen to use the new functions from the runtime library.

llvm-svn: 350415
2019-01-04 17:25:09 +00:00
Aaron Ballman 9bdf515c74 Add two new pragmas for controlling software pipelining optimizations.
This patch adds #pragma clang loop pipeline and #pragma clang loop pipeline_initiation_interval for debugging or reducing compile time purposes. It is possible to disable SWP for concrete loops to save compilation time or to find bugs by not doing SWP to certain loops. It is possible to set value of initiation interval to concrete number to save compilation time by not doing extra pipeliner passes or to check created schedule for specific initiation interval.

Patch by Alexey Lapshin.

llvm-svn: 350414
2019-01-04 17:20:00 +00:00
Daniel Dunbar a39bab36c6 Adopt SwiftABIInfo for WebAssembly.
Summary:
 - This adopts SwiftABIInfo as the base class for WebAssemblyABIInfo, which is in keeping with what is done for other targets for which Swift is supported.

 - This is a minimal patch to unblock exploration of WASM support for Swift (https://bugs.swift.org/browse/SR-9307)

Reviewers: rjmccall, sunfish

Reviewed By: rjmccall

Subscribers: ahti, dschuff, sbc100, jgravelle-google, aheejin, cfe-commits

Differential Revision: https://reviews.llvm.org/D56188

llvm-svn: 350372
2019-01-03 23:24:50 +00:00
Alexey Bataev a3924b517e [OPENMP][NVPTX]Use __kmpc_barrier_simple_spmd(nullptr, 0) instead of
nvvm_barrier0.

Use runtime functions instead of the direct call to the nvvm intrinsics.
It allows to prevent some dangerous LLVM optimizations, that breaks the
code for the NVPTX target.

llvm-svn: 350328
2019-01-03 16:25:35 +00:00
Philip Pfaffe b39a97c8f6 [NewPM] Port Msan
Summary:
Keeping msan a function pass requires replacing the module level initialization:
That means, don't define a ctor function which calls __msan_init, instead just
declare the init function at the first access, and add that to the global ctors
list.

Changes:
- Pull the actual sanitizer and the wrapper pass apart.
- Add a newpm msan pass. The function pass inserts calls to runtime
  library functions, for which it inserts declarations as necessary.
- Update tests.

Caveats:
- There is one test that I dropped, because it specifically tested the
  definition of the ctor.

Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka

Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji

Differential Revision: https://reviews.llvm.org/D55647

llvm-svn: 350305
2019-01-03 13:42:44 +00:00
Patrick Lyster e13b1e3299 [OpenMP] Added support for explicit mapping of classes using 'this' pointer. Differential revision: https://reviews.llvm.org/D55982
llvm-svn: 350252
2019-01-02 19:28:48 +00:00
Pete Cooper de0a8d37a0 Only convert objc messages to alloc to objc_alloc if the receiver is a class.
r348687 converted [Foo alloc] to objc_alloc(Foo).  However the objc runtime method only takes a Class, not an arbitrary pointer.

This makes sure we are messaging a class before we convert these messages.

rdar://problem/46943703

llvm-svn: 350224
2019-01-02 17:25:30 +00:00
Akira Hatanaka c7c7574ea3 [CodeGen] Replace '@' characters in block descriptors' symbol names with
'\1'.

'@' can't be used in block descriptors' symbol names since it is
reserved on ELF platforms as a separator between symbol names and symbol
versions.

See the discussion here: https://reviews.llvm.org/D50783.

Differential Revision: https://reviews.llvm.org/D54539

llvm-svn: 350157
2018-12-29 17:28:30 +00:00
David Chisnall 386477a541 [objc-gnustep2] Fix a bug in category generation.
We were not emitting a protocol definition while generating the category
method list.  This was fine in most cases, because something else in the
library typically referenced any given protocol, but it caused linker
failures if the category was the only reference to a given protocol.

llvm-svn: 350130
2018-12-28 17:44:54 +00:00
David Chisnall ddd06821c4 [objc-gnustep] Fix a copy-and-paste error.
We were emitting the null class symbol in the wrong section, which meant
that programs that contained no Objective-C classes would fail to link.

llvm-svn: 350092
2018-12-27 14:44:36 +00:00
Artem Belevich 9953577cb2 [CUDA] Treat extern global variable shadows same as regular extern vars.
This fixes compiler crash when we attempted to compile this code:

extern __device__ int data;
__device__ int data = 1;

Differential Revision: https://reviews.llvm.org/D56033

llvm-svn: 349981
2018-12-22 01:11:09 +00:00
Pete Cooper e5b64ea2b8 Convert some ObjC retain/release msgSends to runtime calls.
It is faster to directly call the ObjC runtime for methods such as retain/release instead of sending a message to those functions.

Differential Revision: https://reviews.llvm.org/D55869

Reviewed By: rjmccall

llvm-svn: 349952
2018-12-21 21:00:32 +00:00
Bruno Ricci c5885cffc5 [AST] Store the callee and argument expressions of CallExpr in a trailing array.
Since CallExpr::setNumArgs has been removed, it is now possible to store the
callee expression and the argument expressions of CallExpr in a trailing array.
This saves one pointer per CallExpr, CXXOperatorCallExpr, CXXMemberCallExpr,
CUDAKernelCallExpr and UserDefinedLiteral.

Given that CallExpr is used as a base of the above classes we cannot use
llvm::TrailingObjects. Instead we store the offset in bytes from the this pointer
to the start of the trailing objects and manually do the casts + arithmetic.

Some notes:

1.) I did not try to fit the number of arguments in the bit-fields of Stmt.
    This leaves some space for future additions and avoid the discussion about
    whether x bits are sufficient to hold the number of arguments.

2.) It would be perfectly possible to recompute the offset to the trailing
    objects before accessing the trailing objects. However the trailing objects
    are frequently accessed and benchmarks show that it is slightly faster to
    just load the offset from the bit-fields. Additionally, because of 1),
    we have plenty of space in the bit-fields of Stmt.

Differential Revision: https://reviews.llvm.org/D55771

Reviewed By: rjmccall

llvm-svn: 349910
2018-12-21 15:20:32 +00:00
Bruno Ricci 5fc4db7579 [AST][NFC] Pass the AST context to one of the ctor of DeclRefExpr.
All of the other constructors already take a reference to the AST context.
This avoids calling Decl::getASTContext in most cases. Additionally move
the definition of the constructor from Expr.h to Expr.cpp since it is calling
DeclRefExpr::computeDependence. NFC.

llvm-svn: 349901
2018-12-21 14:10:18 +00:00
Volodymyr Sapsai 232d22f380 [CodeGen] Fix assertion on emitting cleanup for object with inlined inherited constructor and non-trivial destructor.
Fixes assertion
> Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file llvm/Support/Casting.h, line 255.

It was triggered by trying to cast `FunctionDecl` to `CXXMethodDecl` as
`CGF.CurCodeDecl` in `CallBaseDtor::Emit`. It was happening because
cleanups were emitted in `ScalarExprEmitter::VisitExprWithCleanups`
after destroying `InlinedInheritingConstructorScope`, so
`CodeGenFunction.CurCodeDecl` didn't correspond to expected cleanup decl.

Fix the assertion by emitting cleanups before leaving
`InlinedInheritingConstructorScope` and changing `CurCodeDecl`.

Test cases based on a patch by Shoaib Meenai.

Fixes PR36748.

rdar://problem/45805151

Reviewers: rsmith, rjmccall

Reviewed By: rjmccall

Subscribers: jkorous, dexonsmith, cfe-commits, smeenai, compnerd

Differential Revision: https://reviews.llvm.org/D55543

llvm-svn: 349848
2018-12-20 22:43:26 +00:00
Haibo Huang 303b2333e4 Declares __cpu_model as dso local
__builtin_cpu_supports and __builtin_cpu_is use information in __cpu_model to decide cpu features. Before this change, __cpu_model was not declared as dso local. The generated code looks up the address in GOT when reading __cpu_model. This makes it impossible to use these functions in ifunc, because at that time GOT entries have not been relocated. This change makes it dso local.

Differential Revision: https://reviews.llvm.org/D53850

llvm-svn: 349825
2018-12-20 21:33:59 +00:00
Michael Kruse 0535137e4a [CodeGen] Generate llvm.loop.parallel_accesses instead of llvm.mem.parallel_loop_access metadata.
Instead of generating llvm.mem.parallel_loop_access metadata, generate
llvm.access.group on instructions and llvm.loop.parallel_accesses on
loops. There is one access group per generated loop.

This is clang part of D52116/r349725.

Differential Revision: https://reviews.llvm.org/D52117

llvm-svn: 349823
2018-12-20 21:24:54 +00:00
Simon Pilgrim 4597379227 [X86] Auto upgrade XOP/AVX512 rotation intrinsics to generic funnel shift intrinsics (clang)
This emits FSHL/FSHR generic intrinsics for the XOP VPROT and AVX512 VPROL/VPROR rotation intrinsics.

LLVM counterpart: https://reviews.llvm.org/D55938

Differential Revision: https://reviews.llvm.org/D55937

llvm-svn: 349796
2018-12-20 19:01:13 +00:00
Pete Cooper 6c47f54df8 Use @llvm.objc.clang.arc.use intrinsic instead of clang.arc.use function.
Calls to this function are deleted in the ARC optimizer.  However when the ARC
optimizer was updated to use intrinsics instead of functions (r349534), the corresponding
clang change (r349535) to use intrinsics missed this one so it wasn't being deleted.

llvm-svn: 349782
2018-12-20 18:05:41 +00:00
Simon Pilgrim 313dc85ce0 [X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (clang)
This emits SADD_SAT/SSUB_SAT generic intrinsics for the SSE signed saturated math intrinsics.

LLVM counterpart: https://reviews.llvm.org/D55894

Differential Revision: https://reviews.llvm.org/D55890

llvm-svn: 349743
2018-12-20 11:53:45 +00:00
Simon Pilgrim a7b30b4a58 [X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (clang)
Sibling patch to D55855, this emits UADD_SAT/USUB_SAT generic intrinsics for the SSE saturated math intrinsics instead of expanding to a IR code sequence that could be difficult to reassemble.

Differential Revision: https://reviews.llvm.org/D55879

llvm-svn: 349631
2018-12-19 14:43:47 +00:00
Bill Wendling aa77513bb9 Emit ASM input in a constant context
Summary:
Some ASM input constraints (e.g., "i" and "n") require immediate values. At O0,
very few code transformations are performed. So if we cannot resolve to an
immediate when emitting the ASM input we shouldn't delay its processing.

Reviewers: rsmith, efriedma

Reviewed By: efriedma

Subscribers: rehana, efriedma, craig.topper, jyknight, cfe-commits

Differential Revision: https://reviews.llvm.org/D55616

llvm-svn: 349561
2018-12-18 22:54:03 +00:00
Kelvin Li ef57943e3f [OPENMP] parsing and sema support for 'close' map-type-modifier
A map clause with the close map-type-modifier is a hint to 
prefer that the variables are mapped using a copy into faster 
memory.

Patch by Ahsan Saghir (saghir)

Differential Revision: https://reviews.llvm.org/D55719

llvm-svn: 349551
2018-12-18 22:18:41 +00:00
Vedant Kumar 77dfca88b2 [CodeGen] Handle mixed-width ops in mixed-sign mul-with-overflow lowering
The special lowering for __builtin_mul_overflow introduced in r320902
fixed an ICE seen when passing mixed-sign operands to the builtin.

This patch extends the special lowering to cover mixed-width, mixed-sign
operands. In a few common scenarios, calls to muloti4 will no longer be
emitted.

This should address the latest comments in PR34920 and work around the
link failure seen in:

  https://bugzilla.redhat.com/show_bug.cgi?id=1657544

Testing:
- check-clang
- A/B output comparison with: https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081

Differential Revision: https://reviews.llvm.org/D55843

llvm-svn: 349542
2018-12-18 21:05:03 +00:00
Alexey Bataev 6a1b06bcd4 [OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes
buffer.

Seems to me, nvlink has a bug with the proper support of the weakly
linked symbols. It does not allow to define several shared memory buffer
with the different sizes even with the weak linkage. Instead we always
use 128 bytes buffer to prevent nvlink from the error message emission.

llvm-svn: 349540
2018-12-18 21:01:42 +00:00
Pete Cooper 2cd3596b1a Generate objc intrinsics instead of runtime calls as the ARC optimizer now works only on intrinsics
Differential Revision: https://reviews.llvm.org/D55802

Reviewers: rjmccall
llvm-svn: 349535
2018-12-18 20:33:00 +00:00
Alexey Bataev 29d47fcb30 [OPENMP][NVPTX]Added extra sync point to the inter-warp copy function.
The parallel reduction operation requires an extra synchronization point
in the inter-warp copy function to avoid divergence.

llvm-svn: 349525
2018-12-18 19:20:15 +00:00
Erich Keane 2a4eea3061 [NFC] Fix usage of Builder.insert(new Bitcast...)in CodeGenFunction
This is exactly a "CreateBitCast", so refactor this to get rid of a
'new'.

Note that this slightly changes the test, as the Builder is now
seemingly smart enough to fold one of the bitcasts into the annotation
call.

Change-Id: I1733fb1fdf91f5c9d88651067130b9a4e7b5ab67
llvm-svn: 349506
2018-12-18 16:22:21 +00:00
JF Bastien 14daa20be1 Automatic variable initialization
Summary:
Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request uninitialized on a per-variable basis, mainly to disable
initialization of large stack arrays when deemed too expensive.

This isn't meant to change the semantics of C and C++. Rather, it's meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there's no inadvertent information leak when:

  - The compiler re-uses stack slots, and a value is used uninitialized.
  - The compiler re-uses a register, and a value is used uninitialized.
  - Stack structs / arrays / unions with padding are copied.

This patch only addresses stack and register information leaks. There's many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let's keep this patch focused, and I'm happy to address related
issues elsewhere.

To keep the patch simple, only some `undef` is removed for now, see
`replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.

There are three options when it comes to automatic variable initialization:

  0. Uninitialized

    This is C and C++'s default. It's not changing. Depending on code
    generation, a programmer who runs into undefined behavior by using an
    uninialized automatic variable may observe any previous value (including
    program secrets), or any value which the compiler saw fit to materialize on
    the stack or in a register (this could be to synthesize an immediate, to
    refer to code or data locations, to generate cookies, etc).

  1. Pattern initialization

    This is the recommended initialization approach. Pattern initialization's
    goal is to initialize automatic variables with values which will likely
    transform logic bugs into crashes down the line, are easily recognizable in
    a crash dump, without being values which programmers can rely on for useful
    program semantics. At the same time, pattern initialization tries to
    generate code which will optimize well. You'll find the following details in
    `patternFor`:

    - Integers are initialized with repeated 0xAA bytes (infinite scream).
    - Vectors of integers are also initialized with infinite scream.
    - Pointers are initialized with infinite scream on 64-bit platforms because
      it's an unmappable pointer value on architectures I'm aware of. Pointers
      are initialize to 0x000000AA (small scream) on 32-bit platforms because
      32-bit platforms don't consistently offer unmappable pages. When they do
      it's usually the zero page. As people try this out, I expect that we'll
      want to allow different platforms to customize this, let's do so later.
    - Vectors of pointers are initialized the same way pointers are.
    - Floating point values and vectors are initialized with a negative quiet
      NaN with repeated 0xFF payload (e.g. 0xffffffff and 0xffffffffffffffff).
      NaNs are nice (here, anways) because they propagate on arithmetic, making
      it more likely that entire computations become NaN when a single
      uninitialized value sneaks in.
    - Arrays are initialized to their homogeneous elements' initialization
      value, repeated. Stack-based Variable-Length Arrays (VLAs) are
      runtime-initialized to the allocated size (no effort is made for negative
      size, but zero-sized VLAs are untouched even if technically undefined).
    - Structs are initialized to their heterogeneous element's initialization
      values. Zero-size structs are initialized as 0xAA since they're allocated
      a single byte.
    - Unions are initialized using the initialization for the largest member of
      the union.

    Expect the values used for pattern initialization to change over time, as we
    refine heuristics (both for performance and security). The goal is truly to
    avoid injecting semantics into undefined behavior, and we should be
    comfortable changing these values when there's a worthwhile point in doing
    so.

    Why so much infinite scream? Repeated byte patterns tend to be easy to
    synthesize on most architectures, and otherwise memset is usually very
    efficient. For values which aren't entirely repeated byte patterns, LLVM
    will often generate code which does memset + a few stores.

  2. Zero initialization

    Zero initialize all values. This has the unfortunate side-effect of
    providing semantics to otherwise undefined behavior, programs therefore
    might start to rely on this behavior, and that's sad. However, some
    programmers believe that pattern initialization is too expensive for them,
    and data might show that they're right. The only way to make these
    programmers wrong is to offer zero-initialization as an option, figure out
    where they are right, and optimize the compiler into submission. Until the
    compiler provides acceptable performance for all security-minded code, zero
    initialization is a useful (if blunt) tool.

I've been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.

Why is an out-of band initialization mecanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we're forcing the programmer to
provide semantics for something which doesn't actually have any (it's
uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
or whether it's just there to shut that warning up. It's also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.

Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It's a great tool, could get
even better, but it's simply incapable of catching all uses of uninitialized
values.

Why not just rely on memory sanitizer? Because it's not universally available,
has a 3x performance cost, and shouldn't be deployed in production. Again, it's
a great tool, it'll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won't find the ones that you encounter in production.

What's the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We've commmitted a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We've got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it's a good time to get the change reviewed, checked in, and
have others kick the tires. We'll continue reducing overheads as we try this out
on diverse codebases.

Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.". They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don't just trust Microsoft though, here's another relevant person asking for
this [2]. It's been proposed for GCC [3] and LLVM [4] before.

What are the caveats? A few!

  - Variables declared in unreachable code, and used later, aren't initialized.
    This goto, Duff's device, other objectionable uses of switch. This should
    instead be a hard-error in any serious codebase.
  - Volatile stack variables are still weird. That's pre-existing, it's really
    the language's fault and this patch keeps it weird. We should deprecate
    volatile [5].
  - As noted above, padding isn't fully handled yet.

I don't think these caveats make the patch untenable because they can be
addressed separately.

Should this be on by default? Maybe, in some circumstances. It's a conversation
we can have when we've tried it out sufficiently, and we're confident that we've
eliminated enough of the overheads that most codebases would want to opt-in.
Let's keep our precious undefined behavior until that point in time.

How do I use it:

  1. On the command-line:

    -ftrivial-auto-var-init=uninitialized (the default)
    -ftrivial-auto-var-init=pattern
    -ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang

  2. Using an attribute:

    int dont_initialize_me __attribute((uninitialized));

  [0]: https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf
  [1]: https://twitter.com/JosephBialek/status/1062774315098112001
  [2]: https://outflux.net/slides/2018/lss/danger.pdf
  [3]: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00615.html
  [4]: 776a0955ef
  [5]: http://wg21.link/p1152

I've also posted an RFC to cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html

<rdar://problem/39131435>

Reviewers: pcc, kcc, rsmith

Subscribers: JDevlieghere, jkorous, dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D54604

llvm-svn: 349442
2018-12-18 05:12:21 +00:00
Alex Lorenz 0a264f3928 [darwin] parse the SDK settings from SDKSettings.json if it exists and
pass in the -target-sdk-version to the compiler and backend

This commit adds support for reading the SDKSettings.json file in the Darwin
driver. This file is used by the driver to determine the SDK's version, and it
uses that information to pass it down to the compiler using the new
-target-sdk-version= option. This option is then used to set the appropriate
SDK Version module metadata introduced in r349119.

Note: I had to adjust the two ast tests as the SDKROOT environment variable
on macOS caused SDK version to be picked up for the compilation of source file
but not the AST.

rdar://45774000

Differential Revision: https://reviews.llvm.org/D55673

llvm-svn: 349380
2018-12-17 19:19:15 +00:00
Eric Fiselier 261875054e [Clang] Add __builtin_launder
Summary:
This patch adds `__builtin_launder`, which is required to implement `std::launder`. Additionally GCC provides `__builtin_launder`, so thing brings Clang in-line with GCC.

I'm not exactly sure what magic `__builtin_launder` requires, but  based on previous discussions this patch applies a `@llvm.invariant.group.barrier`. As noted in previous discussions, this may not be enough to correctly handle vtables.

Reviewers: rnk, majnemer, rsmith

Reviewed By: rsmith

Subscribers: kristina, Romain-Geissler-1A, erichkeane, amharc, jroelofs, cfe-commits, Prazek

Differential Revision: https://reviews.llvm.org/D40218

llvm-svn: 349195
2018-12-14 21:11:28 +00:00
Alexey Bataev ae51b96f99 [OPENMP][NVPTX]Improved interwarp copy function.
Inlined runtime with the current implementation of the interwarp copy
function leads to the undefined behavior because of the not quite
correct implementation of the barriers. Start using generic
__kmpc_barier function instead of the custom made barriers.

llvm-svn: 349192
2018-12-14 21:00:58 +00:00
Scott Linder de6beb02a5 Implement -frecord-command-line (-frecord-gcc-switches)
Implement options in clang to enable recording the driver command-line
in an ELF section.

Implement a new special named metadata, llvm.commandline, to support
frontends embedding their command-line options in IR/ASM/ELF.

This differs from the GCC implementation in some key ways:

* In GCC there is only one command-line possible per compilation-unit,
  in LLVM it mirrors llvm.ident and multiple are allowed.
* In GCC individual options are separated by NULL bytes, in LLVM entire
  command-lines are separated by NULL bytes. The advantage of the GCC
  approach is to clearly delineate options in the face of embedded
  spaces. The advantage of the LLVM approach is to support merging
  multiple command-lines unambiguously, while handling embedded spaces
  with escaping.

Differential Revision: https://reviews.llvm.org/D54487
Clang Differential Revision: https://reviews.llvm.org/D54489

llvm-svn: 349155
2018-12-14 15:38:15 +00:00
Craig Topper 1f2b181689 [Builltins][X86] Provide implementations of __lzcnt16, __lzcnt, __lzcnt64 for MS compatibility. Remove declarations from intrin.h and implementations from lzcntintrin.h
intrin.h had forward declarations for these and lzcntintrin.h had implementations that were only available with -mlzcnt or a -march that supported the lzcnt feature.

For MS compatibility we should always have these builtins available regardless of X86 being the target or the CPU support the lzcnt instruction. The backends should be able to gracefully fallback to something support even if its just shifts and bit ops.

Unfortunately, gcc also implements 2 of the 3 function names here on X86 when lzcnt feature is enabled.

This patch adds builtins for these for MSVC compatibility and drops the forward declarations from intrin.h. To keep the gcc compatibility the two intrinsics that collided have been turned into macros that use the X86 specific builtins with the lzcnt feature check. These macros are only defined when _MSC_VER is not defined. Without them being macros we can get a redefinition error because -ms-extensions doesn't seem to set _MSC_VER but does make the MS builtins available.

Should fix PR40014

Differential Revision: https://reviews.llvm.org/D55677

llvm-svn: 349098
2018-12-14 00:21:02 +00:00
Artem Belevich 7b05666a19 [CUDA] Make all host-side shadows of device-side variables undef.
The host-side code can't (and should not) access the values that may
only exist on the device side. E.g. address of a __device__ function
does not exist on the host side as we don't generate the code for it there.

Differential Revision: https://reviews.llvm.org/D55663

llvm-svn: 349087
2018-12-13 21:43:04 +00:00
Adrian Prantl 046d100b41 Reinstate DW_AT_comp_dir support after D55519.
The DIFile used by the CU is special and distinct from the main source
file. Its directory part specifies what becomes the DW_AT_comp_dir
(the compilation directory), even if the source file was specified
with an absolute path.

To support the .dwo workflow, a valid DW_AT_comp_dir is necessary even
if source files were specified with an absolute path.

llvm-svn: 349065
2018-12-13 17:53:29 +00:00
Mikael Nilsson 9d2872db74 [OpenCL] Add generic AS to 'this' pointer
Address spaces are cast into generic before invoking the constructor.

Added support for a trailing Qualifiers object in FunctionProtoType.

Note: This recommits the previously reverted patch, 
      but now it is commited together with a fix for lldb.

Differential Revision: https://reviews.llvm.org/D54862

llvm-svn: 349019
2018-12-13 10:15:27 +00:00
Reid Kleckner 43071080cd Remove unused Args parameter from EmitFunctionBody, NFC
llvm-svn: 349001
2018-12-13 01:33:20 +00:00
Reid Kleckner 25b56024aa Emit a proper diagnostic when attempting to forward inalloca arguments
The previous assertion was relatively easy to trigger, and likely will
be easy to trigger going forward. EmitDelegateCallArg is relatively
popular.

This cleanly diagnoses PR28299 while I work on a proper solution.

llvm-svn: 348991
2018-12-12 23:46:06 +00:00
Haibo Huang e177082972 Revert "Declares __cpu_model as dso local"
This reverts r348978

llvm-svn: 348982
2018-12-12 22:39:51 +00:00
Haibo Huang 6b22f59207 Declares __cpu_model as dso local
__builtin_cpu_supports and __builtin_cpu_is use information in __cpu_model to decide cpu features. Before this change, __cpu_model was not declared as dso local. The generated code looks up the address in GOT when reading __cpu_model. This makes it impossible to use these functions in ifunc, because at that time GOT entries have not been relocated. This change makes it dso local.

Differential Revision: https://reviews.llvm.org/D53850

llvm-svn: 348978
2018-12-12 22:04:12 +00:00
Erich Keane 8c94f07f54 Teach __builtin_unpredictable to work through implicit casts.
The __builtin_unpredictable implementation is confused by any implicit
casts, which happen in C++.  This patch strips those off so that
if/switch statements now work with it in C++.

Change-Id: I73c3bf4f1775cd906703880944f4fcdc29fffb0a
llvm-svn: 348969
2018-12-12 20:30:53 +00:00
Mikael Nilsson 90646732bf Revert "[OpenCL] Add generic AS to 'this' pointer"
Reverting because the patch broke lldb.

llvm-svn: 348931
2018-12-12 15:06:16 +00:00
Mikael Nilsson 78de84719b [OpenCL] Add generic AS to 'this' pointer
Address spaces are cast into generic before invoking the constructor.

Added support for a trailing Qualifiers object in FunctionProtoType.

Differential Revision: https://reviews.llvm.org/D54862

llvm-svn: 348927
2018-12-12 14:11:59 +00:00
Andrew Savonichev 87a7e436c0 [OpenCL] Fix for TBAA information of pointer after addresspacecast
Summary: When addresspacecast is generated resulting pointer should preserve TBAA information from original value.

Reviewers: rjmccall, yaxunl, Anastasia

Reviewed By: rjmccall

Subscribers: asavonic, kosarev, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D55262

llvm-svn: 348919
2018-12-12 09:51:23 +00:00
Fangrui Song 2000170e27 [CodeGen] Fix -DBUILD_SHARED_LIBS=on build after rC348907
llvm-svn: 348911
2018-12-12 06:07:33 +00:00
Adrian Prantl c2a44ecf39 Remove CGDebugInfo::getOrCreateFile() and use TheCU->getFile() directly.
llvm-svn: 348866
2018-12-11 16:58:46 +00:00
Adrian Prantl aa5bad449b Reuse code from CGDebugInfo::getOrCreateFile() when creating the file
for the DICompileUnit.

This addresses post-commit feedback for D55085. Without this patch, a
main source file with an absolute paths may appear in different
DIFiles, once with the absolute path and once with the common prefix
between the absolute path and the current working directory.

Differential Revision: https://reviews.llvm.org/D55519

llvm-svn: 348865
2018-12-11 16:58:43 +00:00
Richard Trieu 6368818fd5 Move CodeGenOptions from Frontend to Basic
Basic uses CodeGenOptions and should not depend on Frontend.

llvm-svn: 348827
2018-12-11 03:18:39 +00:00