llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	4cb24ef90a	[clang] Remove Address::deprecated() from CGClass.cpp	2022-02-23 13:31:56 -08:00
Arthur Eubanks	6eec483584	[clang] Remove getPointerElementType() in EmitVTableTypeCheckedLoad()	2022-02-23 09:38:33 -08:00
Sri Hari Krishna Narayanan	5aa24558cf	OMPIRBuilder for Interop directive Implements the OMPIRBuilder portion for the Interop directive. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105876	2022-01-27 14:53:18 -05:00
Kazu Hirata	d1b127b5b7	[clang] Remove unused forward declarations (NFC)	2022-01-08 11:56:40 -08:00
Nikita Popov	e8b98a5216	[CodeGen] Emit elementtype attributes for indirect inline asm constraints This implements the clang side of D116531. The elementtype attribute is added for all indirect constraints (*) and tests are updated accordingly. Differential Revision: https://reviews.llvm.org/D116666	2022-01-06 09:29:22 +01:00
Kazu Hirata	d677a7cb05	[clang] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-02 10:20:23 -08:00
Nikita Popov	1f07a4a569	[CodeGen] Avoid more pointer element type accesses	2021-12-27 12:00:22 +01:00
Alok Kumar Sharma	5eb271880c	[clang][OpenMP][DebugInfo] Debug support for variables in shared clause of OpenMP task construct Currently variables appearing inside shared clause of OpenMP task construct are not visible inside lldb debugger. After the current patch, lldb is able to show the variable ``` * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34 7 else { 8 #pragma omp task shared(svar) firstprivate(n) 9 { -> 10 printf("Task svar = %d\n", svar); 11 printf("Task n = %d\n", n); 12 svar = fib(n - 1); 13 } (lldb) p svar (int) $0 = 9 ``` Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D115510	2021-12-22 20:04:21 +05:30
Nikita Popov	55d7a12b86	[CodeGen] Avoid pointee type access during global var declaration All callers pass in a GlobalVariable, so we can conveniently fetch the type from there.	2021-12-21 11:48:37 +01:00
Nikita Popov	a0cf066eac	[CodeGen] Store element type in ParamValue ParamValue is basically a union between an Address and a Value*. To be able to reconstruct the Address, we now need to store the pointer element type.	2021-12-16 15:31:55 +01:00
Nikita Popov	58c8c53263	[CodeGen] Avoid more pointer element type accesses	2021-12-16 15:26:21 +01:00
Nikita Popov	9fa15e0073	[CodeGen] Remove an unused MakeAddrLValue() overload (NFC) This is unused and we should prefer the overloads accepting Address.	2021-12-16 11:49:20 +01:00
Nikita Popov	d930c3155c	[CodeGen] Pass element type to EmitCheckedInBoundsGEP() Same as for other GEP creation methods.	2021-12-15 14:03:33 +01:00
Sindhu Chittireddy	4706a297fb	Avoid setting tbaa on the store of return type of call to inline assembler. In 32bit mode, attaching TBAA metadata to the store following the call to inline assembler results in describing the wrong type by making a fake lvalue(i.e., whatever the inline assembler happens to leave in EAX:EDX.) Even if inline assembler somehow describes the correct type, setting TBAA information on return type of call to inline assembler is likely not correct, since TBAA rules need not apply to inline assembler. Differential Revision: https://reviews.llvm.org/D115320	2021-12-14 17:40:33 -08:00
Kazu Hirata	d0ac215dd5	[clang] Use isa instead of dyn_cast (NFC)	2021-11-14 09:32:40 -08:00
Jon Chesterfield	27177b82d4	[OpenMP] Lower printf to __llvm_omp_vprintf Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char, void arguments as cuda vprintf and also passes the size of the void* alloca which will be needed by a non-stub implementation of `__llvm_omp_vprintf` for amdgpu. This removes the amdgpu link error on any printf in a target region in favour of silently compiling code that doesn't print anything to stdout. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D112680	2021-11-10 15:30:56 +00:00
Jorge Gorbe Moya	770ddf599d	Fix unused variable warning in release build	2021-11-09 19:48:42 -08:00
hsmahesha	3b9a85d10a	[CFE][Codegen] Make sure to maintain the contiguity of all the static allocas at the start of the entry block, which in turn would aid better code transformation/optimization. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D110257	2021-11-10 08:45:21 +05:30
Jon Chesterfield	0fa45d6d80	Revert "[OpenMP] Lower printf to __llvm_omp_vprintf" This reverts commit `db81d8f6c4`.	2021-11-08 20:28:57 +00:00
Jon Chesterfield	db81d8f6c4	[OpenMP] Lower printf to __llvm_omp_vprintf Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char, void arguments as cuda vprintf and also passes the size of the void* alloca which will be needed by a non-stub implementation of `__llvm_omp_vprintf` for amdgpu. This removes the amdgpu link error on any printf in a target region in favour of silently compiling code that doesn't print anything to stdout. The exact set of changes to check-openmp probably needs revision before commit Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D112680	2021-11-08 18:38:00 +00:00
Mike Rice	6f9c25167d	[OpenMP] Initial parsing/sema for the 'omp loop' construct Adds basic parsing/sema/serialization support for the #pragma omp loop directive. Differential Revision: https://reviews.llvm.org/D112499	2021-10-28 08:26:43 -07:00
hsmahesha	db9c2d7751	[CFE][Codegen] Remove CodeGenFunction::InitTempAlloca() Sequel patch to https://reviews.llvm.org/D111316 Finally, remove the definition of CodeGenFunction::InitTempAlloca(). Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D111324	2021-10-12 10:04:15 +05:30
Giorgis Georgakoudis	ac90dfc43a	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `1d66649adf`. Revert to fix AMG GPU issue.	2021-09-21 13:20:39 -07:00
Giorgis Georgakoudis	1d66649adf	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D102107	2021-09-21 10:50:04 -07:00
alokmishra.besu	000875c127	OpenMP 5.0 metadirective This patch supports OpenMP 5.0 metadirective features. It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind. A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h Currently this function return the index of the when clause with the highest score from the ones applicable in the Context. But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D91944	2021-09-18 13:40:44 -05:00
Nico Weber	31cca21565	Revert "OpenMP 5.0 metadirective" This reverts commit `c7d7b98e52`. Breaks tests on macOS, see comment on https://reviews.llvm.org/D91944	2021-09-18 09:10:37 -04:00
alokmishra.besu	347f3c186d	OpenMP 5.0 metadirective This patch supports OpenMP 5.0 metadirective features. It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind. A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h Currently this function return the index of the when clause with the highest score from the ones applicable in the Context. But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D91944	2021-09-17 16:30:06 -05:00
cchen	7efb825382	Revert "OpenMP 5.0 metadirective" This reverts commit `c7d7b98e52`.	2021-09-17 16:14:16 -05:00
cchen	c7d7b98e52	OpenMP 5.0 metadirective This patch supports OpenMP 5.0 metadirective features. It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind. A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h Currently this function return the index of the when clause with the highest score from the ones applicable in the Context. But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D91944	2021-09-17 16:03:13 -05:00
Michael Kruse	650bbc5620	[OpenMP][OpenMPIRBuilder] Implement loop unrolling. Recommit of `707ce34b06`. Don't introduce a dependency to the LLVMPasses component, instead register the required passes individually. Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are: * `unrollLoopFull` * `unrollLoopPartial` * `unrollLoopHeuristic` `unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility. With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism. Reviewed By: jdoerfert, kiranchandramohan Differential Revision: https://reviews.llvm.org/D107764	2021-09-04 19:18:58 -05:00
PeixinQiao	a42380ce83	[OMPIRBuilder] Add ordered directive to OMPBuilder Add support for ordered directive in the OpenMPIRBuilder. This patch also modifies clang to use the ordered directive when the option -fopenmp-enable-irbuilder is enabled. Also fix one ICE when parsing one canonical for loop with the relational operator LE or GE in openmp region by replacing unary increment operation of the expression of the variable "Expr A" minus the variable "Expr B" (++(Expr A - Expr B)) with binary addition operation of the expression of the variable "Expr A" minus the variable "Expr B" and the expression with constant value "1" (Expr A - Expr B + "1"). Reviewed By: Meinersbur, kiranchandramohan Differential Revision: https://reviews.llvm.org/D107430	2021-09-03 09:37:58 +08:00
Roman Lebedev	50634deaa5	Revert "[OpenMP][OpenMPIRBuilder] Implement loop unrolling." Breaks build with -DBUILD_SHARED_LIBS=ON ``` CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle): "LLVMFrontendOpenMP" of type SHARED_LIBRARY depends on "LLVMPasses" (weak) "LLVMipo" of type SHARED_LIBRARY depends on "LLVMFrontendOpenMP" (weak) "LLVMCoroutines" of type SHARED_LIBRARY depends on "LLVMipo" (weak) "LLVMPasses" of type SHARED_LIBRARY depends on "LLVMCoroutines" (weak) depends on "LLVMipo" (weak) At least one of these targets is not a STATIC_LIBRARY. Cyclic dependencies are allowed only among static libraries. CMake Generate step failed. Build files cannot be regenerated correctly. ``` This reverts commit `707ce34b06`.	2021-09-02 12:42:23 +03:00
Michael Kruse	707ce34b06	[OpenMP][OpenMPIRBuilder] Implement loop unrolling. Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are: * `unrollLoopFull` * `unrollLoopPartial` * `unrollLoopHeuristic` `unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility. With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism. Reviewed By: jdoerfert, kiranchandramohan Differential Revision: https://reviews.llvm.org/D107764	2021-09-02 02:37:25 -05:00
Andrei Elovikov	1724a16437	[NFC][clang] Move IR-independent parts of target MV support to X86TargetParser.cpp ...that is located under llvm/lib/Support/. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D108423	2021-08-30 09:48:48 -07:00
Simon Pilgrim	7f48bd3bed	CGBuiltin.cpp - pass SVETypeFlags by const reference. NFC. Don't pass the struct by value.	2021-08-22 12:13:17 +01:00
Alexander Potapenko	b0391dfc73	[clang][Codegen] Introduce the disable_sanitizer_instrumentation attribute The purpose of __attribute__((disable_sanitizer_instrumentation)) is to prevent all kinds of sanitizer instrumentation applied to a certain function, Objective-C method, or global variable. The no_sanitize(...) attribute drops instrumentation checks, but may still insert code preventing false positive reports. In some cases though (e.g. when building Linux kernel with -fsanitize=kernel-memory or -fsanitize=thread) the users may want to avoid any kind of instrumentation. Differential Revision: https://reviews.llvm.org/D108029	2021-08-20 14:01:06 +02:00
Melanie Blower	bc5b5ea037	[clang][patch][FPEnv] Make initialization of C++ globals strictfp aware @kpn pointed out that the global variable initialization functions didn't have the "strictfp" metadata set correctly, and @rjmccall said that there was buggy code in SetFPModel and StartFunction, this patch is to solve those problems. When Sema creates a FunctionDecl, it sets the FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings (i.e. a combination of command line options and #pragma float_control settings) correspond to ConstrainedFP mode. That bit is used when CodeGen starts codegen for a llvm function, and it translates into the "strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571 Reviewed By: Aaron Ballman Differential Revision: https://reviews.llvm.org/D102343	2021-07-29 12:02:37 -04:00
Giorgis Georgakoudis	fb0cf01795	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `e9c7291cb2`. Fix failing tests	2021-07-19 07:54:26 -07:00
Jamie Schmeiser	73840f9f81	thread_local support for AIX Summary: The AIX linker will produce errors on unresolved weak symbols. Change the generated code to not check for the initialization function but just call it and ensure that it always exists. Also, the AIX atexit routine has a different name (and signature) so call it correctly. Update the lit tests to test on AIX appropriately. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: hubert.reinterpretcast (Hubert Tong) Differential Revision: https://reviews.llvm.org/D104420	2021-07-19 10:03:22 -04:00
Giorgis Georgakoudis	e9c7291cb2	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102107	2021-07-16 23:27:44 -07:00
Graham Hunter	c6a91ee6aa	[Clang][OpenMP] Monotonic does not apply to SIMD The codegen for simd constructs was affected by the presence (or absence) of the 'monotonic' schedule modifier for worksharing loops. The modifier is only intended to apply to the scheduling of chunks for a thread, not iterations of a loop inside a chunk. In addition, the monotonic modifier was applied to worksharing loops by default if no schedule clause was present; the referenced part of the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the user specified a schedule clause with a static kind but no modifier. Without a user-specified schedule clause we should default to nonmonotonic scheduling. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D103793	2021-06-22 10:24:11 +01:00
Michael Kruse	a22236120f	[OpenMP] Implement '#pragma omp unroll'. Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations). Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99459	2021-06-10 14:30:17 -05:00
Hsiangkai Wang	2b13ff6979	[Clang][CodeGen] Set the size of llvm.lifetime to unknown for scalable types. If the memory object is scalable type, we do not know the exact size of it at compile time. Set the size of lifetime marker to unknown if the object is scalable one. Differential Revision: https://reviews.llvm.org/D102822	2021-06-07 23:30:13 +08:00
Ten Tzen	797ad70152	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1 This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change. The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: * First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs. The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process. Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph preciously. 
However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option. One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State triumphs others. If some exits flow to unreachable, propagation on those paths terminate, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed: Warning: jump bypasses variable with a non-trivial destructor In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject a eha_scope_begin() invoke in the side entry block to ensure a correct State. Implementation: Part-1: Clang implementation described below. Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). 
_scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's acceptable as the amount of code directly under _try is very small. Part-2 (will be in Part-2 patch): LLVM implementation described below. For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also rely on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address. The handler for catch(...) for -EHa must handle HW exception. So it is 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW exceptions can be passed through. 
Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D80344/new/	2021-05-17 22:42:17 -07:00
Florian Hahn	6c31295493	[clang] Refactor mustprogress handling, add it to all loops in c++11+. Currently Clang does not add mustprogress to infinite loops with a known constant condition, matching C11 behavior. The forward progress guarantee in C++11 and later should allow us to add mustprogress to any loop (http://eel.is/c++draft/intro.progress#1). This allows us to simplify the code dealing with adding mustprogress a bit. Reviewed By: aaron.ballman, lebedev.ri Differential Revision: https://reviews.llvm.org/D96418	2021-04-30 14:13:47 +01:00
Joshua Haberman	8344675908	Implemented [[clang::musttail]] attribute for guaranteed tail calls. This is a Clang-only change and depends on the existing "musttail" support already implemented in LLVM. The [[clang::musttail]] attribute goes on a return statement, not a function definition. There are several constraints that the user must follow when using [[clang::musttail]], and these constraints are verified by Sema. Tail calls are supported on regular function calls, calls through a function pointer, member function calls, and even pointer to member. Future work would be to throw a warning if a users tries to pass a pointer or reference to a local variable through a musttail call. Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D99517	2021-04-15 17:12:21 -07:00
cchen	e0c2125d1d	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
yifeng.dongyifeng	3a6a80b641	[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters. The first one is the real parameters of the coroutine function, the other one just for copying parameters to the coroutine frame. Considering the following c++ code: ``` struct coro { ... }; coro foo(struct test & t) { ... co_await suspend_always(); ... co_await suspend_always(); ... co_await suspend_always(); } int main(int argc, char *argv[]) { auto c = foo(...); c.handle.resume(); ... } ``` Function foo is the standard coroutine function, and it has only one parameter named t (ignoring this at first), when we use the llvm code to compile this function, we can get the following ir: ``` !2921 = distinct !DISubprogram(name: "foo", linkageName: "_ZN6Object3fooE4test", scope: !2211, file: !45, li\ ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped \| DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\ nition \| DISPFlagOptimized, unit: !44, declaration: !2328, retainedNodes: !2922) !2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45, line: 48, type: !838) ... !2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags: DIFlagArtificial) ``` We can find there are two `the same` DIVariable named t in the same dwarf scope for foo.resume. 
And when we try to use llvm-dwarfdump to dump the dwarf info of this elf, we get the following output: ``` 0x00006684: DW_TAG_subprogram DW_AT_low_pc (0x00000000004013a0) DW_AT_high_pc (0x00000000004013a8) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_object_pointer (0x0000669c) DW_AT_GNU_all_call_sites (true) DW_AT_specification (0x00005b5c "_ZN6Object3fooE4test") 0x000066a5: DW_TAG_formal_parameter DW_AT_name ("t") DW_AT_decl_file ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp") DW_AT_decl_line (48) DW_AT_type (0x00004146 "test") 0x000066ba: DW_TAG_variable DW_AT_name ("t") DW_AT_type (0x00004146 "test") DW_AT_artificial (true) ``` The elf also has two 't' in the same scope. But unluckily, it might let the debugger confused. And failed to print parameters for O0 or above. This patch will make coroutine parameters and move parameters use the same DIVar and try to fix the problems that I mentioned before. Test Plan: check-clang Reviewed By: aprantl, jmorse Differential Revision: https://reviews.llvm.org/D97533	2021-04-12 11:10:47 +08:00
Xiangling Liao	d508561798	[AIX] Support init priority attribute Differential Revision: https://reviews.llvm.org/D99291	2021-04-08 15:40:09 -04:00
Xun Li	c7a39c833a	[Coroutine][Clang] Force emit lifetime intrinsics for Coroutines tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available. Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame). The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame. In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap. Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime. To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point. The following is a common code pattern called "Symmetric Transfer" in coroutine: ``` auto tmp = await_suspend(); __builtin_coro_resume(tmp.address()); return; ``` In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine. During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards. However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. 
Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape. To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived. I made some attempt in D98638, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines. Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893 Differential Revision: https://reviews.llvm.org/D99227	2021-03-25 13:46:20 -07:00

1540 Commits