llvm-project

Commit Graph

Author	SHA1	Message	Date
Vedant Kumar	18348ea9b9	[ubsan] Pass a set of checks to skip to EmitTypeCheck() (NFC) CodeGenFunction::EmitTypeCheck accepts a bool flag which controls whether or not null checks are emitted. Make this a bit more flexible by changing the bool to a SanitizerSet. Needed for an upcoming change which deals with a scenario in which we only want to emit null checks. llvm-svn: 295514	2017-02-17 23:22:55 +00:00
Vedant Kumar	29ba8d9bfe	Revert "Retry: [ubsan] Reduce null checking of C++ object pointers (PR27581)" This reverts commit r295401. It breaks the ubsan self-host. It inserts object size checks once per C++ method which fire when the structure is empty. llvm-svn: 295494	2017-02-17 20:59:40 +00:00
Richard Smith	bc491203c7	Add an explicit derived class of FunctionDecl to model deduction guides rather than just treating them as FunctionDecls with a funny name. No functionality change intended. llvm-svn: 295491	2017-02-17 20:05:37 +00:00
Jonas Hahnfeld	b07931f01d	[OpenMP] Fix cancellation point in task with no cancel With tasks, the cancel may happen in another task. This has a different region info which means that we can't find it here. Differential Revision: https://reviews.llvm.org/D30091 llvm-svn: 295474	2017-02-17 18:32:58 +00:00
Jonas Hahnfeld	20fce72f1b	[OpenMP] Remove barriers at cancel and cancellation point This resolves a deadlock with the cancel directive when there is no explicit cancellation point. In that case, the implicit barrier acts as cancellation point. After removing the barrier after cancel, the now unmatched barrier for the explicit cancellation point has to go as well. This has probably worked before rL255992: With the calls for the explicit barrier, it was sure that all threads passed a barrier before exiting. Reported by Simon Convent and Joachim Protze! Differential Revision: https://reviews.llvm.org/D30088 llvm-svn: 295473	2017-02-17 18:32:51 +00:00
Justin Bogner	e91e9dd7bb	Rename DiagnosticInfoWithDebugLoc to WithLocation to match LLVM Updates for llvm r295465. llvm-svn: 295466	2017-02-17 17:34:49 +00:00
Vedant Kumar	55875b9955	Retry: [ubsan] Reduce null checking of C++ object pointers (PR27581) This patch teaches ubsan to insert exactly one null check for the 'this' pointer per method/lambda. Previously, given a load of a member variable from an instance method ('this->x'), ubsan would insert a null check for 'this', and another null check for '&this->x', before allowing the load to occur. Similarly, given a call to a method from another method bound to the same instance ('this->foo()'), ubsan would insert a redundant null check for 'this'. There is also a redundant null check in the case where the object pointer is a reference ('Ref.foo()'). This patch teaches ubsan to remove the redundant null checks identified above. Testing: check-clang and check-ubsan. I also compiled X86FastISel.cpp with -fsanitize=null using patched/unpatched clangs based on r293572. Number of null checks emitted: unpatched, -O0: 21767; patched, -O0: 10758. Changes since the initial commit: don't rely on IRGen of C labels in the test. Differential Revision: https://reviews.llvm.org/D29530 llvm-svn: 295401	2017-02-17 02:03:51 +00:00
Vedant Kumar	4f94a94bea	Revert "[ubsan] Reduce null checking of C++ object pointers (PR27581)" This reverts commit r295391. It breaks this bot: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/1898 I need to not rely on labels in the IR test. llvm-svn: 295396	2017-02-17 01:42:36 +00:00
Vedant Kumar	3e5a9a6be8	[ubsan] Reduce null checking of C++ object pointers (PR27581) This patch teaches ubsan to insert exactly one null check for the 'this' pointer per method/lambda. Previously, given a load of a member variable from an instance method ('this->x'), ubsan would insert a null check for 'this', and another null check for '&this->x', before allowing the load to occur. Similarly, given a call to a method from another method bound to the same instance ('this->foo()'), ubsan would insert a redundant null check for 'this'. There is also a redundant null check in the case where the object pointer is a reference ('Ref.foo()'). This patch teaches ubsan to remove the redundant null checks identified above. Testing: check-clang and check-ubsan. I also compiled X86FastISel.cpp with -fsanitize=null using patched/unpatched clangs based on r293572. Number of null checks emitted: unpatched, -O0: 21767; patched, -O0: 10758. Differential Revision: https://reviews.llvm.org/D29530 llvm-svn: 295391	2017-02-17 01:05:42 +00:00
Arpith Chacko Jacob	fc711b1f47	[OpenMP] Teams reduction on the NVPTX device. This patch implements codegen for the reduction clause on any teams construct for elementary data types. It builds on parallel reductions on the GPU. Subsequently, the team master writes to a unique location in a global memory scratchpad. The last team to do so loads and reduces this array to calculate the final result. This patch emits two helper functions that are used by the OpenMP runtime on the GPU to perform reductions across teams. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29879 llvm-svn: 295335	2017-02-16 16:48:49 +00:00
Arpith Chacko Jacob	101e8fb1f3	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that use these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295333	2017-02-16 16:20:16 +00:00
Arpith Chacko Jacob	bd6344c0be	Revert r295319 while investigating buildbot failure. llvm-svn: 295323	2017-02-16 14:25:35 +00:00
Arpith Chacko Jacob	8e170fc857	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that use these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295319	2017-02-16 14:03:36 +00:00
Anastasia Stulova	58984e7087	[OpenCL] Correct ndrange_t implementation Removed ndrange_t as Clang builtin type and added as a struct type in the OpenCL header. Use type name to do the Sema checking in enqueue_kernel and modify IR generation accordingly. Review: D28058 Patch by Dmitry Borisenkov! llvm-svn: 295311	2017-02-16 12:27:47 +00:00
Hans Wennborg	cac8ce06dd	[dllimport] Check for dtor references in functions Destructor references are not modelled explicitly in the AST. This adds checks for destructor calls due to variable definitions and temporaries. If a dllimport function references a non-dllimport destructor, it must not be emitted available_externally, as the referenced destructor might live across the DLL boundary and isn't exported. llvm-svn: 295258	2017-02-15 23:28:10 +00:00
Hans Wennborg	6c3d625fd9	[dllimport] Look through typedefs and arrays in HasNonDllImportDtor The function is used to check whether a type is a class with non-dllimport destructor. It needs to look through typedefs and array types. llvm-svn: 295257	2017-02-15 23:28:07 +00:00
Simon Pilgrim	27cc054b1c	Fix spelling mistake - paramater -> parameter. NFCI. llvm-svn: 295183	2017-02-15 15:12:06 +00:00
Akira Hatanaka	f1b3fc7356	[CodeGen][ObjC] Use the type of the captured field of the enclosing block or lambda. This is a follow-up to r281682, which fixed a bug in computeBlockInfo where the captured VarDecl's type, rather than the captured field type of the enclosing lambda or block, was used to compute the layout of a block. This commit makes similar changes to enterBlockScope. This is necessary to correctly determine whether a block capture requires cleanup. rdar://problem/30388124 llvm-svn: 295034	2017-02-14 06:46:55 +00:00
Nick Lewycky	0752762180	When the new expr's array size is an ICE, emit it as a constant expression. This bypasses integer sanitization checks which are redundant on the expression since it's been checked by Sema. Fixes a clang codegen assertion on "void test() { new int[0+1]{0}; }" when building with -fsanitize=signed-integer-overflow. llvm-svn: 295006	2017-02-13 23:49:55 +00:00
Reid Kleckner	9de921470d	[CodeGen] Treat auto-generated __dso_handle symbol as HiddenVisibility Fixes https://bugs.llvm.org/show_bug.cgi?id=31932 Based on a patch by Roland McGrath Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D29843 llvm-svn: 294978	2017-02-13 18:49:21 +00:00
Davide Italiano	945de43dbe	[PM] Add support for instrumented PGO in the new pass manager (clang-side) Differential Revision: https://reviews.llvm.org/D29309 llvm-svn: 294961	2017-02-13 16:07:05 +00:00
Saleem Abdulrasool	40db4772bd	CodeGen: use # as the comment leader for ARC marker Use # as the comment leader for AArch64 auto-release elision marker. This is to keep it in sync with the value used in swift. When building libdispatch for Linux AArch64, the auto-release elision marker was emitted. However, ELF uses # as the comment leader while MachO accepts both ; and #. Use the common marker for it instead. llvm-svn: 294877	2017-02-11 23:03:13 +00:00
Saleem Abdulrasool	c30cec26ed	CodeGen: annotate ObjC ARC functions with ABI constraints Certain ARC runtime functions have an ABI contract of being forwarding. Annotate the functions with the appropriate `returned` attribute on the arguments. This hoists some of the runtime ABI contract information into the frontend rather than the backend transformations. The test adjustments are to mark the returned function parameter as such. The minor change to the IR output is due to the fact that the returned reference of the object causes it to extend the lifetime of the object by returning an autoreleased return value. The result is that the explicit objc_autorelease call is no longer formed, as autorelease elision is now possible on the return. llvm-svn: 294872	2017-02-11 21:34:18 +00:00
Saleem Abdulrasool	5b1f0edf2d	docs: update docs for objc_storeStrong behaviour objc_storeStrong does not return a value. llvm-svn: 294855	2017-02-11 17:24:09 +00:00
Saleem Abdulrasool	e60561c073	CodeGen: rename variables to adhere to naming convention Adjust style before making more intrusive changes. NFC. llvm-svn: 294854	2017-02-11 17:24:07 +00:00
Simon Pilgrim	463cb8ac30	Wdocumentation fixes llvm-svn: 294740	2017-02-10 12:14:01 +00:00
Eric Christopher	cdbfd0edb5	Update C style comments to C++ style. llvm-svn: 294680	2017-02-10 00:20:26 +00:00
David Blaikie	8677e04240	Fix the -Werror build by removing an unused default in a fully covered switch llvm-svn: 294676	2017-02-10 00:06:38 +00:00
Amjad Aboud	546bc1103b	[DebugInfo] Added support to Clang FE for generating debug info for preprocessor macros. Added "-fdebug-macro" flag (and "-fno-debug-macro" flag) to enable (and to disable) emitting macro debug info. Added CC1 "-debug-info-macro" flag that enables emitting macro debug info. Differential Revision: https://reviews.llvm.org/D16135 llvm-svn: 294637	2017-02-09 22:07:24 +00:00
Davide Italiano	05f25fa950	[CodeGen] Remove unneeded `private`. NFCI. llvm-svn: 294623	2017-02-09 21:19:51 +00:00
Reid Kleckner	04f9f91da6	[MS] Implement the __fastfail intrinsic as a builtin __fastfail terminates the process immediately with a special system call. It does not run any process shutdown code or exception recovery logic. Fixes PR31854 llvm-svn: 294606	2017-02-09 18:31:06 +00:00
Reid Kleckner	a858981c1d	[MS] Fix C++ destructor thunk line info for a declaration Sometimes the MS ABI needs to emit thunks for declarations that don't have bodies. Destructor thunks make calls to inlinable functions, so they need line info or LLVM will complain. Fixes PR31893 llvm-svn: 294465	2017-02-08 16:09:32 +00:00
Dylan McKay	e8232d73f5	[AVR] Add support for the 'interrupt' and 'naked' attributes Summary: This teaches clang how to parse and lower the 'interrupt' and 'naked' attributes. This allows interrupt signal handlers to be written. Reviewers: aaron.ballman Subscribers: malcolm.parsons, cfe-commits Differential Revision: https://reviews.llvm.org/D28451 llvm-svn: 294402	2017-02-08 05:09:26 +00:00
Warren Ristow	8d17b40500	Prevent ICE in dllexport class with _Atomic data member Guard against a null pointer dereference that caused Clang to crash when processing a class containing an _Atomic qualified data member, and that is tagged with 'dllexport'. Differential Revision: https://reviews.llvm.org/D29208 llvm-svn: 293911	2017-02-02 17:53:34 +00:00
Saleem Abdulrasool	8de4e87305	CodeGen: add a LLVM_FALLTHROUGH to a fallthrough (NFC) Drive by cleanup noticed while investigating an IR verifier assertion. llvm-svn: 293867	2017-02-02 05:45:43 +00:00
Dehao Chen	5a3f890e06	Change debug-info-for-profiling from a TargetOption to a function attribute. Summary: cfe change for https://reviews.llvm.org/D29203 Reviewers: echristo, dblaikie Reviewed By: dblaikie Subscribers: mehdi_amini, cfe-commits Differential Revision: https://reviews.llvm.org/D29205 llvm-svn: 293834	2017-02-01 22:45:21 +00:00
Alex Lorenz	86d3232daf	[CodeGen][ObjC] Avoid asserting on block pointer types in isPointerZeroInitializable rdar://30111891 llvm-svn: 293787	2017-02-01 17:37:28 +00:00
Hans Wennborg	27dcc6c0e2	clang-cl: Evaluate arguments left-to-right in constructor call with initializer list (PR31831) clang-cl would evaluate the arguments right-to-left (see PR), and for non-Windows targets I suppose we only got it because we were already emitting left-to-right in CodeGenFunction::EmitCallArgs. Differential Revision: https://reviews.llvm.org/D29350 llvm-svn: 293732	2017-02-01 02:21:07 +00:00
Nirav Dave	0c86ccf4b4	[X86] Teach Clang about -mfentry flag Replace mcount calls with calls to fentry. Reviewers: hfinkel, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28001 llvm-svn: 293649	2017-01-31 17:00:35 +00:00
Matt Arsenault	a274b209f5	AMDGPU: Add builtin for fmed3 intrinsic llvm-svn: 293600	2017-01-31 03:42:07 +00:00
Vedant Kumar	d3a601b06b	Re-apply "[ubsan] Sanity-check shift amounts before truncation" This re-applies r293343 (reverts commit r293475) with a fix for an assertion failure caused by a missing integer cast. I tested this patch by using the built compiler to compile X86FastISel.cpp.o with ubsan. Original commit message: Ubsan does not report UB shifts in some cases where the shift exponent needs to be truncated to match the type of the shift base. We perform a range check on the truncated shift amount, leading to false negatives. Fix the issue (PR27271) by performing the range check on the original shift amount. Differential Revision: https://reviews.llvm.org/D29234 llvm-svn: 293572	2017-01-30 23:38:54 +00:00
Benjamin Kramer	2664a866db	[IRGen] Make header standalone. llvm-svn: 293485	2017-01-30 15:39:18 +00:00
Alex Lorenz	94c26be581	Revert "r293343 - [ubsan] Sanity-check shift amounts before truncation (fixes PR27271)" After r293343 clang fails to compile itself with -fsanitize=undefined ( http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_build/). rdar://30259929 llvm-svn: 293475	2017-01-30 11:37:18 +00:00
David Blaikie	b11c87324e	Reapply "DebugInfo: Omit class definitions even in the presence of available_externally vtables" Accounts for a case that caused an assertion failure by attempting to query for the vtable linkage of a non-dynamic type. This reverts commit r292801. llvm-svn: 293462	2017-01-30 06:36:08 +00:00
David Blaikie	9ffe5a3525	Prototype of modules codegen First pass at generating weak definitions of inline functions from module files (& skipping (-O0) or emitting available_externally (optimizations) definitions where those modules are used). External functions defined in modules are emitted into the modular object file as well (this may turn an existing ODR violation (if that module were imported into multiple translations) into valid/linkable code). Internal symbols (static functions, for example) are not correctly supported yet. The symbol will be produced, internal, in the modular object - unreferenceable from the users. Reviewers: rsmith Differential Revision: https://reviews.llvm.org/D28845 llvm-svn: 293456	2017-01-30 05:00:26 +00:00
Arpith Chacko Jacob	cdda3daa7f	[OpenMP][NVPTX][CUDA] Adding support for printf for an NVPTX OpenMP device. Support for CUDA printf is exploited to support printf for an NVPTX OpenMP device. To reflect the support of both programming models, the file CGCUDABuiltin.cpp has been renamed to CGGPUBuiltin.cpp, and the call EmitCUDADevicePrintfCallExpr has been renamed to EmitGPUDevicePrintfCallExpr. Reviewers: jlebar Differential Revision: https://reviews.llvm.org/D17890 llvm-svn: 293444	2017-01-29 20:49:31 +00:00
Vedant Kumar	3db9974b2d	[ubsan] Sanity-check shift amounts before truncation (fixes PR27271) Ubsan does not report UB shifts in some cases where the shift exponent needs to be truncated to match the type of the shift base. We perform a range check on the truncated shift amount, leading to false negatives. Fix the issue (PR27271) by performing the range check on the original shift amount. Differential Revision: https://reviews.llvm.org/D29234 llvm-svn: 293343	2017-01-27 23:02:44 +00:00
Anastasia Stulova	af0a7bbbe2	[OpenCL] Add missing address spaces in IR generation of blocks Modify ObjC blocks impl wrt address spaces as follows: - keep default private address space for blocks generated as local variables (with captures); - add global address space for global block literals (no captures); - make the block invoke function and enqueue_kernel prototype with the generic AS block pointer parameter to accommodate both private and global AS cases from above; - add block handling into default AS because it's implemented as a special pointer type (BlockPointer) in the frontend and therefore it is used as a pointer everywhere. This is also needed to accommodate both private and global AS blocks for the two cases above. - removes ObjC RT specific symbols (NSConcreteStackBlock and NSConcreteGlobalBlock) in the OpenCL mode. Review: https://reviews.llvm.org/D28814 llvm-svn: 293286	2017-01-27 15:11:34 +00:00
Peter Collingbourne	b884716f6a	Re-apply r292662, "IRGen: Start using the WriteThinLTOBitcode pass." The internal build issue has been resolved. llvm-svn: 293231	2017-01-26 23:51:50 +00:00
Peter Collingbourne	f5d1712189	IRGen: When loading the main module in the distributed ThinLTO backend, look for the module containing the summary. Differential Revision: https://reviews.llvm.org/D29067 llvm-svn: 293209	2017-01-26 21:09:48 +00:00
Richard Smith	600b5261c4	PR0091R3: Implement parsing support for using templates as types. This change adds a new type node, DeducedTemplateSpecializationType, to represent a type template name that has been used as a type. This is modeled around AutoType, and shares a common base class for representing a deduced placeholder type. We allow deduced class template types in a few more places than the standard does: in conditions and for-range-declarators, and in new-type-ids. This is consistent with GCC and with discussion on the core reflector. This patch does not yet support deduced class template types being named in typename specifiers. llvm-svn: 293207	2017-01-26 20:40:47 +00:00
Stanislav Mekhanoshin	61da067393	Use TargetMachine adjustPassManager hook Differential Revision: https://reviews.llvm.org/D28340 llvm-svn: 293190	2017-01-26 16:49:21 +00:00
Arpith Chacko Jacob	cca61a3a74	[OpenMP] Codegen support for 'target teams' on the NVPTX device. This is a simple patch to teach OpenMP codegen to emit the construct in Generic mode. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29143 llvm-svn: 293183	2017-01-26 15:43:27 +00:00
Adam Nemet	7b796f825b	Support MIR opt-remarks with -fsave-optimization-record The handler that deals with IR passed/missed/analysis remarks is extended to also handle the corresponding MIR remarks. The more thorough testing is done via llc (rL293113, rL293121). Here we just make sure that the functionality is accessible through clang. llvm-svn: 293146	2017-01-26 04:07:11 +00:00
Akira Hatanaka	fdcd18b4c9	[CodeGen] Suppress emission of lifetime markers if a label has been seen in the current lexical scope. clang currently emits the lifetime.start marker of a variable when the variable comes into scope even though a variable's lifetime starts at the entry of the block with which it is associated, according to the C standard. This normally doesn't cause any problems, but in the rare case where a goto jumps backwards past the variable declaration to an earlier point in the block (see the test case added to lifetime2.c), it can cause mis-compilation. To prevent such mis-compiles, this commit conservatively disables emitting lifetime markers when a label has been seen in the current block. This problem was discussed on cfe-dev here: http://lists.llvm.org/pipermail/cfe-dev/2016-July/050066.html rdar://problem/30153946 Differential Revision: https://reviews.llvm.org/D27680 llvm-svn: 293106	2017-01-25 22:55:13 +00:00
Justin Lebar	b080b630b1	[CodeGen] [CUDA] Add the ability to set default attrs on functions in linked modules. Summary: Now when you ask clang to link in a bitcode module, you can tell it to set attributes on that module's functions to match what we would have set if we'd emitted those functions ourselves. This is particularly important for fast-math attributes in CUDA compilations. Each CUDA compilation links in libdevice, a bitcode library provided by nvidia as part of the CUDA distribution. Without this patch, if we have a user-function F that is compiled with -ffast-math that calls a function G from libdevice, F will have the unsafe-fp-math=true (etc.) attributes, but G will have no attributes. Since F calls G, the inliner will merge G's attributes into F's. It considers the lack of an unsafe-fp-math=true attribute on G to be tantamount to unsafe-fp-math=false, so it "merges" these by setting unsafe-fp-math=false on F. This then continues up the call graph, until every function that (transitively) calls something in libdevice gets unsafe-fp-math=false set, thus disabling fastmath in almost all CUDA code. Reviewers: echristo Subscribers: hfinkel, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D28538 llvm-svn: 293097	2017-01-25 21:29:48 +00:00
Arpith Chacko Jacob	2cd6eeabfd	[OpenMP] Support for the proc_bind-clause on 'target parallel' on the NVPTX device. This patch adds support for the proc_bind clause on the Spmd construct 'target parallel' on the NVPTX device. Since the parallel region is created upon kernel launch, this clause can be safely ignored on the NVPTX device at codegen time for level 0 parallelism. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29128 llvm-svn: 293069	2017-01-25 16:55:10 +00:00
Arpith Chacko Jacob	99a1e0eba5	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293005	2017-01-25 02:18:43 +00:00
Arpith Chacko Jacob	86f9e46365	Reverting commit because an NVPTX patch sneaked in. Break up into two patches. llvm-svn: 293003	2017-01-25 01:45:59 +00:00
Arpith Chacko Jacob	4dbf368e14	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293001	2017-01-25 01:38:33 +00:00
Arpith Chacko Jacob	e04da5dee2	[OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device. This patch adds support for the Spmd construct 'target parallel' on the NVPTX device. This involves ignoring the num_threads clause on the device since the number of threads in this combined construct is already set on the host through the call to __tgt_target_teams(). Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29083 llvm-svn: 292999	2017-01-25 01:18:34 +00:00
Arpith Chacko Jacob	33c849a007	[OpenMP] Support for the num_threads-clause on 'target parallel'. The num_threads-clause on the combined directive applies to the 'parallel' region of this construct. We modify the NumThreadsClause class to capture the clause expression within the 'target' region. The offload runtime call for 'target parallel' is changed to __tgt_target_teams() with 1 team and the number of threads set by this clause or a default if none. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29082 llvm-svn: 292997	2017-01-25 00:57:16 +00:00
Peter Collingbourne	65cb42c1ce	IRGen: Factor out function CodeGenAction::loadModule. NFCI. llvm-svn: 292972	2017-01-24 19:55:38 +00:00
Peter Collingbourne	47d2364a51	IRGen: Factor out function clang::FindThinLTOModule. NFCI. llvm-svn: 292970	2017-01-24 19:54:37 +00:00
Hans Wennborg	251c204e57	Re-commit "Don't inline dllimport functions referencing non-imported methods" This re-commits r292522 with the addition that it also handles calls through pointer to member functions without crashing. llvm-svn: 292856	2017-01-23 23:57:50 +00:00
David L. Jones	7a7dd031e9	Add LF_ prefix to LibFunc enums in TargetLibraryInfo. Summary: The LibFunc::Func enum holds enumerators named for libc functions. Unfortunately, there are real situations, including libc implementations, where function names are actually macros (musl uses "#define fopen64 fopen", for example; any other transitively visible macro would have similar effects). Strictly speaking, a conforming C++ Standard Library should provide any such macros as functions instead (via <cstdio>). However, there are some "library" functions which are not part of the standard, and thus not subject to this rule (fopen64, for example). So, in order to be both portable and consistent, the enum should not use the bare function names. The old enum naming used a namespace LibFunc and an enum Func, with bare enumerators. This patch changes LibFunc to be an enum with enumerators prefixed with "LF_". (Unfortunately, a scoped enum is not sufficient to override macros.) These changes are for clang. See https://reviews.llvm.org/D28476 for LLVM. Reviewers: rsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28477 llvm-svn: 292849	2017-01-23 23:16:58 +00:00
David Blaikie	8cf0c27404	Revert "DebugInfo: Omit class definitions even in the presence of available_externally vtables" Patch crashing on a bootstrapping sanitizer bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/679 Reverting while I investigate. This reverts commit r292768. llvm-svn: 292801	2017-01-23 16:57:14 +00:00
Martin Bohme	5057766d87	Revert "IRGen: Start using the WriteThinLTOBitcode pass." Summary: This reverts commit r292662. This change broke internal builds. Will provide a reproducer internally. Subscribers: pcc, mehdi_amini, cfe-commits, mgorny Differential Revision: https://reviews.llvm.org/D29025 llvm-svn: 292791	2017-01-23 14:33:42 +00:00
David Blaikie	b06bcde1ab	DebugInfo: Omit class definitions even in the presence of available_externally vtables To ensure optimization level doesn't pessimize the -fstandalone-debug vtable debug info optimization (where class definitions are only emitted where the vtable is emitted - reducing redundant debug info) ensure the debug info class definition is still omitted when an available_externally vtable definition is emitted for optimization purposes. llvm-svn: 292768	2017-01-23 02:24:03 +00:00
Peter Collingbourne	6f16ac1473	IRGen: Start using the WriteThinLTOBitcode pass. This is the final change necessary to support CFI with ThinLTO. Differential Revision: https://reviews.llvm.org/D28843 llvm-svn: 292662	2017-01-20 22:39:16 +00:00
Reid Kleckner	25019ca828	Revert "Don't inline dllimport functions referencing non-imported methods" This reverts commit r292522. It appears to be causing crashes in builds using dllimport. llvm-svn: 292643	2017-01-20 20:44:50 +00:00
Alexey Bataev	880d8605e3	[OPENMP] Fix for PR31643: Clang crashes when compiling code on Windows with SEH and openmp In some situations (during codegen for Windows SEH constructs) a CodeGenFunction instance may have CurFn equal to nullptr. OpenMP-related code does not expect such a situation during cleanup. llvm-svn: 292590	2017-01-20 08:57:28 +00:00
Richard Smith	5e29dd3fe0	P0426: Make the library implementation of constexpr char_traits a little easier by providing a memchr builtin that returns char* instead of void*. Also add a __has_feature flag to indicate the presence of constexpr forms of the relevant <string> functions. llvm-svn: 292555	2017-01-20 00:45:35 +00:00
Hans Wennborg	7c650777b0	Don't inline dllimport functions referencing non-imported methods This is another follow-up to r246338. I had assumed methods were already handled by the AST visitor, but turns out they weren't. llvm-svn: 292522	2017-01-19 21:33:13 +00:00
Dehao Chen	b3a70de753	Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection Summary: SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info are always emitted: * start line of all subprograms * linkage name of all subprograms * standalone subprograms (functions that have neither inlined nor been inlined) The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch): -gmlt(orig) -gmlt(patched) -g 433.milc 4.68% 5.40% 19.73% 444.namd 8.45% 8.93% 45.99% 447.dealII 97.43% 115.21% 374.89% 450.soplex 27.75% 31.88% 126.04% 453.povray 21.81% 26.16% 92.03% 470.lbm 0.60% 0.67% 1.96% 482.sphinx3 5.77% 6.47% 26.17% 400.perlbench 17.81% 19.43% 73.08% 401.bzip2 3.73% 3.92% 12.18% 403.gcc 31.75% 34.48% 122.75% 429.mcf 0.78% 0.88% 3.89% 445.gobmk 6.08% 7.92% 42.27% 456.hmmer 10.36% 11.25% 35.23% 458.sjeng 5.08% 5.42% 14.36% 462.libquantum 1.71% 1.96% 6.36% 464.h264ref 15.61% 16.56% 43.92% 471.omnetpp 11.93% 15.84% 60.09% 473.astar 3.11% 3.69% 14.18% 483.xalancbmk 56.29% 81.63% 353.22% geomean 15.60% 18.30% 57.81% Debug info size change for -gmlt binary with this patch: 433.milc 13.46% 444.namd 5.35% 447.dealII 18.21% 450.soplex 14.68% 453.povray 19.65% 470.lbm 6.03% 482.sphinx3 11.21% 400.perlbench 8.91% 401.bzip2 4.41% 403.gcc 8.56% 429.mcf 8.24% 445.gobmk 29.47% 456.hmmer 8.19% 458.sjeng 6.05% 462.libquantum 11.23% 464.h264ref 5.93% 471.omnetpp 31.89% 473.astar 16.20% 483.xalancbmk 44.62% geomean 16.83% Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo Reviewed By: dblaikie, echristo Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25435 llvm-svn: 292458	2017-01-19 00:44:21 +00:00
Peter Collingbourne	1e1475ace5	Move vtable type metadata emission behind a cc1-level flag. In ThinLTO mode, type metadata will require the module to be written as a multi-module bitcode file, which is currently incompatible with the Darwin linker. It is also useful to be able to enable or disable multi-module bitcode for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit, which is used by the driver to enable multi-module bitcode on all but Darwin+ThinLTO, and can also be used to enable/disable the feature manually. Differential Revision: https://reviews.llvm.org/D28877 llvm-svn: 292448	2017-01-18 23:55:27 +00:00
David Blaikie	75ed8ad69e	Remove now redundant code that ensured debug info for class definitions was emitted under certain circumstances Introduced in r181561 - it may've been subsumed by work done to allow emission of declarations for vtable types while still emitting some of their member functions correctly for those declarations. Whatever the reason, the tests pass without this code now. llvm-svn: 292439	2017-01-18 21:15:18 +00:00
Arpith Chacko Jacob	fe4890a68b	[OpenMP] Support for the if-clause on the combined directive 'target parallel'. The if-clause on the combined directive potentially applies to both the 'target' and the 'parallel' regions. Codegen'ing the if-clause on the combined directive requires additional support because the expression in the clause must be captured by the 'target' capture statement but not the 'parallel' capture statement. Note that this situation arises for other clauses such as num_threads. The OMPIfClause class inherits OMPClauseWithPreInit to support capturing of expressions in the clause. A member CaptureRegion is added to OMPClauseWithPreInit to indicate which captured statement (in this case 'target' but not 'parallel') captures these expressions. To ensure correct codegen of captured expressions in the presence of combined 'target' directives, OMPParallelScope was added to 'parallel' codegen. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28781 llvm-svn: 292437	2017-01-18 20:40:48 +00:00
Arpith Chacko Jacob	44a87c9f1b	[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device. This patch adds codegen for the 'target parallel' directive on the NVPTX device. We term offload OpenMP directives such as 'target parallel' and 'target teams distribute parallel for' as SPMD constructs. SPMD constructs, in contrast to Generic ones like the plain 'target', can never contain a serial region. SPMD constructs can be handled more efficiently on the GPU and do not require the Warp Loop of the Generic codegen scheme. This patch adds SPMD codegen support for 'target parallel' on the NVPTX device and can be reused for other SPMD constructs. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28755 llvm-svn: 292428	2017-01-18 19:35:00 +00:00
Arpith Chacko Jacob	19b911cb75	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292419	2017-01-18 18:18:53 +00:00
Arpith Chacko Jacob	42793e000a	Revert r292374 to debug Windows buildbot failure. llvm-svn: 292400	2017-01-18 15:36:05 +00:00
Arpith Chacko Jacob	68019578a3	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292374	2017-01-18 15:14:52 +00:00
Dan Gohman	839f215e19	[WebAssembly] Add minimal support for the new wasm object format triple. llvm-svn: 292269	2017-01-17 21:46:38 +00:00
Arpith Chacko Jacob	43a8b7bc8c	[OpenMP] Refactor code that calls codegen for target regions on the device. This patch refactors code that calls codegen for target regions. Currently the codebase only supports the 'target' directive. The patch pulls out common target processing code into a static function that can be called by codegen for any target directive. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28752 llvm-svn: 292134	2017-01-16 15:26:02 +00:00
Malcolm Parsons	c6e4583dbb	Remove unused lambda captures. NFC llvm-svn: 291939	2017-01-13 18:55:32 +00:00
Reid Kleckner	791bbf6f18	Use less byval on 32-bit Windows x86 for classes with bases This comes up in V8, which has a Handle template class that wraps a typed pointer, and is frequently passed by value. The pointer is stored in the base, HandleBase. This change allows us to pass the struct as a pointer instead of using byval. This avoids creating tons of temporary allocas that we copy from during call lowering. Eventually, it would be good to use FCAs here instead. llvm-svn: 291917	2017-01-13 17:18:19 +00:00
Dehao Chen	a1bd2d6585	Pass -fprofile-sample-use to lto backends. Summary: LTO backend will not invoke SampleProfileLoader pass even if -fprofile-sample-use is specified. This patch passes the flag down so that pass manager can add the SampleProfileLoader pass correctly. Reviewers: mehdi_amini, tejohnson Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28588 llvm-svn: 291870	2017-01-13 00:51:55 +00:00
Anna Zaks	e43b4fc0ae	[tsan] Do not report errors in __destroy_helper_block_ There is a synchronization point between the reference count of a block dropping to zero and its destruction, which TSan does not observe. Do not report errors in the compiler-emitted block destroy method and everything called from it. This is similar to https://reviews.llvm.org/D25857 Differential Revision: https://reviews.llvm.org/D28387 llvm-svn: 291868	2017-01-13 00:50:50 +00:00
Richard Smith	fbe2369f1a	Improve handling of instantiated thread_local variables in Itanium C++ ABI. * Do not initialize these variables when initializing the rest of the thread_locals in the TU; they have unordered initialization so they can be initialized by themselves. This fixes a rejects-valid bug: we would make the per-variable initializer function internal, but put it in a comdat keyed off the variable, resulting in link errors when the comdat is selected from a different TU (as the per TU TLS init function tries to call an init function that does not exist). * On Darwin, when we decide that we're not going to emit a thread wrapper function at all, demote its linkage to External. Fixes a verifier failure on explicit instantiation of a thread_local variable on Darwin. llvm-svn: 291865	2017-01-13 00:43:31 +00:00
Dehao Chen	37c79c236d	Revert r291774 which caused buildbot failure. llvm-svn: 291775	2017-01-12 16:56:18 +00:00
Dehao Chen	bd3689de91	Pass -fprofile-sample-use to lto backends. Summary: LTO backend will not invoke SampleProfileLoader pass even if -fprofile-sample-use is specified. This patch passes the flag down so that pass manager can add the SampleProfileLoader pass correctly. Reviewers: mehdi_amini, tejohnson Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28588 llvm-svn: 291774	2017-01-12 16:29:25 +00:00
Manman Ren	9803ee8e9a	Module: Do not add any link flags when an implementation TU of a module imports a header of that same module. This fixes a regression caused by r280409. rdar://problem/29930553 This is an updated version for r291628 (which was reverted in r291688). llvm-svn: 291689	2017-01-11 18:47:38 +00:00
Chad Rosier	c22abb3820	[ARM] Use generic bitreverse intrinsic, rather than ARM specific rbit. The backend already supports lowering this intrinsic to a rbit instruction. llvm-svn: 291582	2017-01-10 18:55:11 +00:00
Kelvin Li	da68118729	[OpenMP] Sema and parsing for 'target teams distribute simd' pragma This patch is to implement sema and parsing for 'target teams distribute simd' pragma. Differential Revision: https://reviews.llvm.org/D28252 llvm-svn: 291579	2017-01-10 18:08:18 +00:00
Matthias Braun	44bfe03da9	CGDecl: Skip static variable initializers in unreachable code This fixes http://llvm.org/PR31054 Differential Revision: https://reviews.llvm.org/D28505 llvm-svn: 291576	2017-01-10 17:43:01 +00:00
Chad Rosier	5a4a1be690	[AArch64] Use generic bitreverse intrinsic, rather than AArch64 specific. Differential Revision: https://reviews.llvm.org/D28400 llvm-svn: 291574	2017-01-10 17:20:28 +00:00
Arpith Chacko Jacob	bb36fe8dba	[OpenMP] Basic support for a parallel directive in a target region on an NVPTX device Summary: This patch introduces support for the execution of parallel constructs in a target region on the NVPTX device. Parallel regions must be in the lexical scope of the target directive. The master thread in the master warp signals parallel work for worker threads in worker warps on encountering a parallel region. Note: The patch does not yet support capture of arguments in a parallel region so the test cases are simple. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28145 llvm-svn: 291565	2017-01-10 15:42:51 +00:00
Benjamin Kramer	796c1d9b54	Use the correct ObjC EH personality This fixes ObjC exceptions on Win64 (which uses SEH), among others. Patch by Jonathan Schleifer! llvm-svn: 291408	2017-01-08 22:58:07 +00:00
Teresa Johnson	cffeb54fc9	[ThinLTO] Optionally ignore empty index file Summary: To simplify distributed build system integration, where actions may be scheduled before the Thin Link which determines the list of objects selected by the linker, the gold plugin currently will emit 0-sized index files for objects not selected by the link, to enable checking for expected output files by the build system. If the build system then schedules a backend action for these bitcode files, we want to be able to fall back to normal compilation instead of failing. Fallback is enabled under an option in LLVM (D28410), in which case a nullptr is returned from llvm::getModuleSummaryIndexForFile. Clang can just proceed with non-ThinLTO compilation in that case. I am investigating whether this can be addressed in our build system, but that is a longer term fix and so this enables a workaround in the meantime. Reviewers: mehdi_amini Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28362 llvm-svn: 291303	2017-01-06 23:37:33 +00:00
Mehdi Amini	7f873070c4	Add a cc1 option to force disabling lifetime-markers emission from clang Summary: This is intended as a debugging/development flag only. Differential Revision: https://reviews.llvm.org/D28385 llvm-svn: 291300	2017-01-06 23:18:09 +00:00

10556 Commits