llvm-project

Commit Graph

Author	SHA1	Message	Date
Chuanqi Xu	1cedc51ff5	[Coroutines] Don't merge readnone calls in presplit coroutines Another alternative to fix the thread identification problem in coroutines. We plan to fix this problem by unifying memory effecting attributes. See https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. But it may be a long-term project. And it is a pity that coroutines haven't been able to resume in different threads for years. So this one is a temporary fix. It may cause unnecessary performance regressions for coroutines. But correctness is more important. And this one is planned to be reverted after we are actually able to unify the memory effecting attributes. Reviewed By: jdoerfert, rjmccall Differential Revision: https://reviews.llvm.org/D135550	2022-10-17 10:22:43 +08:00
Nikita Popov	6e504d637d	[ValueTracking] Handle constant exprs in isKnownNonZero() Handle constant expressions by falling through to the general operator-based code. In particular, this adds support for bitcast and GEP expressions.	2022-10-04 11:58:07 +02:00
Simon Pilgrim	5849fcb635	Revert rG1b7089fe67b924bdd5ecef786a34bdba7a88778f "[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions" Revert rGef89409a59f3b79ae143b33b7d8e6ee6285aa42f "Fix 'unused-lambda-capture' gcc warning. NFCI." Revert rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 "[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds" Revert ScalarizationOverheadBuilder sequence from D134605 - when accumulating extraction costs by Type (instead of specific Value), we are not distinguishing enough when they are coming from the same source or not, and we always just count the cost once. This needs addressing before we can use getScalarizationOverhead properly.	2022-09-30 11:22:48 +01:00
Simon Pilgrim	1b7089fe67	[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions Instead of accumulating all extraction costs separately and then adjusting for repeated subvector extractions, this patch collects all the extractions and then converts to calls to getScalarizationOverhead to improve the accuracy of the costs. I'm not entirely satisfied with the getExtractWithExtendCost handling yet - this still just adds all the getExtractWithExtendCost costs together - it really needs to be replaced with a "getScalarizationOverheadWithExtend", but that will require further refactoring first. This replaces my initial attempt in D124769. Differential Revision: https://reviews.llvm.org/D134605	2022-09-27 14:49:07 +01:00
Simon Pilgrim	2315ae35cf	[Coroutines] Regenerate coro-retcon-resume-values.ll	2022-09-25 19:58:18 +01:00
Adrian Vogelsgesang	5af06ba7dc	[Coro][Debuginfo] Add debug info to `__NoopCoro_ResumeDestroy` function With this commit, we now attach a `DISubprogram` to the LLVM-generated `_NoopCoro_ResumeDestroy` function. Thereby, lldb can show a `std::coroutine_handle` to a `std::noop_coroutine` as ``` continuation = coro frame = 0x555555560d98 { resume = 0x0000555555555c50 (a.out`__NoopCoro_ResumeDestroy) destroy = 0x0000555555555c50 (a.out`__NoopCoro_ResumeDestroy) } ``` instead of ``` continuation = coro frame = 0x555555560d98 { resume = 0x0000555555555c50 (a.out`___lldb_unnamed_symbol211) destroy = 0x0000555555555c50 (a.out`___lldb_unnamed_symbol211) } ``` I renamed the function from `NoopCoro.ResumeDestroy` to `_NoopCoro_ResumeDestroy` because: * the leading `_` makes sure this is a reserved name and should not clash with any user-provided names * the `.` was replaced by a `_`, so the name is now a valid identifier in C, which allows me to type its name in the debugger Differential Revision: https://reviews.llvm.org/D132580	2022-08-26 05:49:52 -07:00
Chuanqi Xu	17631ac676	[Coroutines] Store the index for final suspend point if there is unwind coro end Closing https://github.com/llvm/llvm-project/issues/57339 The root cause for this issue is a premature optimization to eliminate the index for the final suspend point since we feel like we can judge if a coroutine is suspended at the final suspend by if resume_fn_addr is null. However this is not true if the coroutine exits via an exception in promise.unhandled_exception(). According to [dcl.fct.def.coroutine]p14: > If the evaluation of the expression promise.unhandled_exception() > exits via an exception, the coroutine is considered suspended at the > final suspend point. But from the perspective of the implementation, we can't set the coro index to the final suspend point directly since it breaks the states. To fix the issue, we block the optimization if we find there is any unwind coro end, which indicates that it is possible that the coroutine exits via an exception from promise.unhandled_exception(). Test Plan: folly	2022-08-26 14:05:46 +08:00
Ting Wang	d2d77e050b	[PowerPC][Coroutines] Add tail-call check with call information for coroutines Fixes #56679. Reviewed By: ChuanqiXu, shchenz Differential Revision: https://reviews.llvm.org/D131953	2022-08-21 22:20:40 -04:00
Chuanqi Xu	e190b7cc90	[Coroutines] Maintain the position of final suspend Closing https://github.com/llvm/llvm-project/issues/56329 The problem happens when we try to simplify the suspend points. We might break the assumption that the final suspend lives in the last slot of Shape.CoroSuspends. This patch tries to maintain the assumption and fixes the problem.	2022-08-12 13:05:08 +08:00
Arnold Schwaighofer	6ef223c041	[coro async] Mark async suspend function and its resume function pointer intrinsic as nomerge Coroutine splitting is not possible if the one-to-one mapping between the two is lost. Every suspend point must have a matching continuation function pointer. rdar://98404664 Differential Revision: https://reviews.llvm.org/D131684	2022-08-11 11:43:30 -07:00
Chuanqi Xu	230d6f93aa	[Coroutines] Remove lifetime intrinsics for spilled allocas in coroutine frames Closing https://github.com/llvm/llvm-project/issues/56919 It is meaningless to preserve the lifetime markers for the spilled allocas in the coroutine frames and it would block some optimizations too.	2022-08-05 14:50:43 +08:00
Augie Fackler	12c0bf8ba9	tests: add attributes that would normally come from inferattrs As my goal is to remove at least _some_ functions from the static list in MemoryBuiltins.cpp, these tests either need to run inferattrs or statically declare these attributes to keep passing. A couple of tests had alternate cases which are no longer meaningful, e.g. `malloc-load-removal.ll`. Differential Revision: https://reviews.llvm.org/D123087	2022-07-25 17:29:00 -04:00
Sanjay Patel	bfb9b8e075	[Passes] add a tail-call-elim pass near the end of the opt pipeline We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later. In the motivating case from issue #47852, the missing 'tail' on memset leads to sub-optimal codegen. I experimented with removing the early instance of tail-call-elim instead of just adding another pass, but that appears to be slightly worse for compile-time: +0.15% vs. +0.08% time. "tailcall" shows adding the pass; "tailcall2" shows moving the pass to later, then adding the original early pass back (so 1596886802 is functionally equivalent to 180b0439dc ): https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright Note that there was an effort to split the tail call functionality into 2 passes - that could help reduce compile-time if we find that this change costs more in compile-time than expected based on the preliminary testing: D60031 Differential Revision: https://reviews.llvm.org/D130374	2022-07-25 15:25:47 -04:00
Arnold Schwaighofer	58e6ee0e1f	llvm.swift.async.context.addr cannot be modeled as NoMem because we don't want it to be cse'd across async suspends An async suspend models the split between two partial async functions. `llvm.swift.async.context.addr` will have a different value in the two partial functions so it is not correct to generally CSE the instruction. rdar://97336162 Differential Revision: https://reviews.llvm.org/D130201	2022-07-22 11:50:58 -07:00
Nicolai Hähnle	1ddc51d89d	Inliner: don't mark call sites as 'nounwind' if that would be redundant When F calls G calls H, G is nounwind, and G is inlined into F, then the inlined call-site to H should be effectively nounwind so as not to lose information during inlining. If H itself is nounwind (which often happens when H is an intrinsic), we no longer mark the callsite explicitly as nounwind. Previously, there were cases where the inlined call-site of H differs from a pre-existing call-site of H in F only in the explicitly added nounwind attribute, thus preventing common subexpression elimination. v2: - just check CI->doesNotThrow v3 (resubmit after revert at `3443788087`): - update Clang tests Differential Revision: https://reviews.llvm.org/D129860	2022-07-20 14:17:23 +02:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit `57224ff4a6`. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Arnold Schwaighofer	bc4870f09e	[coro async] Add missing llvm.coro.id.async intrinsic to declaresCoroCleanupIntrinsics rdar://97214593 Differential Revision: https://reviews.llvm.org/D130038	2022-07-19 07:25:04 -07:00
Arnold Schwaighofer	28ebd13d63	[coro async] Fix code to run coro.async.end cleanup like the legacy pass did The code executed for the Switch ABI does not change. rdar://97074714 Differential Revision: https://reviews.llvm.org/D129865	2022-07-18 10:41:29 -07:00
Nicolai Hähnle	3443788087	Revert "Inliner: don't mark call sites as 'nounwind' if that would be redundant" This reverts commit `9905c37981`. Looks like there are Clang changes that are affected in trivial ways. Will look into it.	2022-07-18 17:43:35 +02:00
Nicolai Hähnle	9905c37981	Inliner: don't mark call sites as 'nounwind' if that would be redundant When F calls G calls H, G is nounwind, and G is inlined into F, then the inlined call-site to H should be effectively nounwind so as not to lose information during inlining. If H itself is nounwind (which often happens when H is an intrinsic), we no longer mark the callsite explicitly as nounwind. Previously, there were cases where the inlined call-site of H differs from a pre-existing call-site of H in F only in the explicitly added nounwind attribute, thus preventing common subexpression elimination. v2: - just check CI->doesNotThrow Differential Revision: https://reviews.llvm.org/D129860	2022-07-18 17:28:52 +02:00
Kristina Bessonova	44736c1d49	[CloneFunction][DebugInfo] Avoid cloning DILexicalBlocks of inlined subprograms If DISubprogram was not cloned (e.g. we are cloning a function that has other functions inlined into it, and subprograms of the inlined functions are not supposed to be cloned), it doesn't make sense to clone its DILexicalBlocks as well. Otherwise we'll get duplicated DILexicalBlocks that may confuse debug info emission in AsmPrinter. I believe it also makes no sense cloning any DILocalVariables or maybe other local entities, if their parent subprogram was not cloned, because they will be dangling and will not participate in further emission. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D127102	2022-07-18 13:14:52 +02:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Yuanfang Chen	fcb7d76d65	[coroutine] add nomerge function attribute to `llvm.coro.save` It is illegal to merge two `llvm.coro.save` calls unless their `llvm.coro.suspend` users are also merged. Marks it "nomerge" for the moment. This reverts D129025. Alternative to D129025, which affects other token type users like WinEH. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D129530	2022-07-12 10:39:38 -07:00
Chuanqi Xu	e3b4452e07	[Debug] [Coroutines] Get rid of DW_ATE_address Closing https://github.com/llvm/llvm-project/issues/55916 This patch tries to get rid of DW_ATE_address and enhance the test coverage. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D127625	2022-07-07 10:47:09 +08:00
Chuanqi Xu	7137ebc4ce	[Debug] [Coroutine] Adjust the scope and name for coroutine frame Previously the scope of the debug type of __coro_frame was limited to the current function. It looked good at first sight, but it prevents us from printing the type in split functions and other functions. Also the debug type is different for different coroutine functions. So it makes sense to rename the debug type to make it related to the function name. After this patch, we could access the coroutine frame type in a function by `function_name.coro_frame_ty`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D127623	2022-07-07 10:35:32 +08:00
Chuanqi Xu	7a567c60f2	[Coroutines] Add REQUIRES clause to skip unsupported targets	2022-06-30 11:37:41 +08:00
Chuanqi Xu	0b5ead6590	[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled C++20 coroutines couldn't be compiled to WebAssembly because an optimization named symmetric transfer requires support for musttail calls, which WebAssembly doesn't support yet. This patch tries to fix the problem by adding a supportsTailCalls method to TargetTransformImpl to skip the symmetric transfer when the tail-call feature is not supported. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D128794	2022-06-30 11:15:40 +08:00
Yuanfang Chen	e2e9e708e5	[Coroutine] Remove the '!func_sanitize' metadata for split functions There is no proper RTTI for these split functions. So just delete the metadata. Fixes https://github.com/llvm/llvm-project/issues/49689. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D116130	2022-06-27 12:09:13 -07:00
Chuanqi Xu	24e53b01d5	Revert "[Coroutines] Only do symmetric transfer if optimization is on" This reverts commit `7782e080e8`. According to the discussion of WG21, symmetric transfer is a desired feature.	2022-06-27 10:54:56 +08:00
Chuanqi Xu	7782e080e8	[Coroutines] Only do symmetric transfer if optimization is on Symmetric transfer is not a part of the C++ standard, so vendors are not forced to implement it anyway. Given that symmetric transfer is currently an optimization, it makes more sense to enable it only if optimization is enabled. It is also helpful for the compilation speed in O0.	2022-06-20 16:20:36 +08:00
Chuanqi Xu	735e6c40b5	[Coroutines] Convert coroutine.presplit to enum attr This is required by @nikic in https://reviews.llvm.org/D127383 to decrease the cost to check whether a function is a coroutine and this fixes a FIXME too. Reviewed By: rjmccall, ezhulenev Differential Revision: https://reviews.llvm.org/D127471	2022-06-14 14:23:46 +08:00
Chuanqi Xu	733d7cf964	[Debug] [Coroutines] Add deref operator for non complex expression Background: When we construct coroutine frame, we would insert a dbg.declare intrinsic for it: ``` %hdl = call void @llvm.coro.begin() ; would return coroutine handle call void @llvm.dbg.declare(metadata ptr %hdl, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) ``` And in the split coroutine, it looks like: ``` define void @coro_func.resume(ptr %hdl) { entry.resume: call void @llvm.dbg.declare(metadata ptr %hdl, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) } ``` And we would salvage the debug info by inserting a new alloca here: ``` define void @coro_func.resume(ptr %hdl) { entry.resume: %frame.debug = alloca ptr call void @llvm.dbg.declare(metadata ptr %frame.debug, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) store ptr %hdl, %frame.debug } ``` But now the problem is that the `dbg.declare` refers to the address of that alloca instead of the actual coroutine handle. There is code to solve the problem, but it applies only to complex expressions. I feel it is OK to relax the condition to make it work for `__coro_frame`. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D126277	2022-06-08 10:53:51 +08:00
Arnold Schwaighofer	5c902af572	[coro async] Add code to support dynamic alignment of over-aligned types in async frames Async context frames are allocated with a maximum alignment. If a type requests an alignment bigger than that, dynamically align the address in the frame. Differential Revision: https://reviews.llvm.org/D126715	2022-06-03 07:06:14 -07:00
Chuanqi Xu	02d6845234	[NFC] [Coroutines] Remove EnableReuseStorageInFrame option The EnableReuseStorageInFrame option is designed for testing only. But it is better to use *_PASS_WITH_PARAMS macro to keep consistent with other passes.	2022-05-10 17:28:43 +08:00
Chuanqi Xu	405bf90235	[NFC] [Pipelines] Hoist CoroCleanup as Module Pass This is similar to previous patch https://reviews.llvm.org/D123925. It could also reduce the time we call declaresCoroCleanupIntrinsics. And it is helpful for further changes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D124362	2022-05-05 15:15:09 +08:00
Chuanqi Xu	7d40f562e7	[Pipelines] Hoist CoroCleanup to avoid blocking optimizations CoroCleanup is designed to lower all the remaining coroutine intrinsics. It is only required to run after CoroSplit. However, the position of CoroCleanup now is far too late. The downside here is that the unlowered coroutine intrinsics might block other optimizations too. So it should be a pure win to hoist the position of CoroCleanup. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D124360	2022-05-05 15:13:27 +08:00
Simon Pilgrim	e04ca7c4f1	[Coroutines] Regenerate coro-retcon-resume-values.ll	2022-05-01 13:21:55 +01:00
Chuanqi Xu	7eaa84eac3	[NFC] Code cleanups for coroutine after we removed legacy passes	2022-04-21 15:32:46 +08:00
Chuanqi Xu	483efc9ad0	[Pipelines] Remove Legacy Passes in Coroutines The legacy passes are deprecated now and would be removed in near future. This patch tries to remove legacy passes in coroutines. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123918	2022-04-21 10:59:11 +08:00
Chuanqi Xu	f9bee35689	[Pipelines] Hoist CoroEarly as a module pass This change could reduce the time we call `declaresCoroEarlyIntrinsics`. And it is helpful for future changes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123925	2022-04-19 11:04:24 +08:00
Nikita Popov	792f80e166	[CoroSplit] Use freeze instead of bitcast for dummy instructions Not all types that can appear in arguments can be bitcasts -- in particular, bitcasts do not support struct types.	2022-04-01 13:07:25 +02:00
Nikita Popov	68d27587e4	[CoroSplit] Handle argument being the frame pointer (PR54523) If the frame pointer is an argument of the original pointer (which happens with opaque pointers), then we currently first replace the argument with undef, which will prevent later replacement of the old frame pointer with the new one. Fix this by replacing arguments with some dummy instructions first, and then replacing those with undef later. This gives us a chance to replace the frame pointer before it becomes undef. Fixes https://github.com/llvm/llvm-project/issues/54523. Differential Revision: https://reviews.llvm.org/D122375	2022-04-01 12:37:29 +02:00
Arthur Eubanks	9bd66b312c	[PassManager][Coroutine] Run passes under -O0 conditionally and run GlobalDCE CoroSplit lowers various coroutine intrinsics. It's a CGSCC pass and CGSCC passes don't run on unreachable functions. Normally GlobalDCE will come along and delete unreachable functions, but we don't run GlobalDCE under -O0, so an unreachable function with coroutine intrinsics may never have CoroSplit run on it. This patch adds GlobalDCE when coroutines intrinsics are present. It also now runs all coroutine passes conditional when coroutine intrinsics are present. This should also solve the -O0 regression reported in D105877 due to LazyCallGraph construction. Fixes https://github.com/llvm/llvm-project/issues/54117 Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D122275	2022-03-23 11:03:26 -07:00
Nikita Popov	ce6ca00a92	[CoroSplit] Avoid self-replacement With opaque pointers, the bitcast might be a no-op, and this can end up trying to replace a value with itself, which is illegal.	2022-03-14 13:53:31 +01:00
Michael Gottesman	0b647fc529	[debug-info] Debug salvage llvm.dbg.addr in original function that point into the coroutine frame when splitting coros. We are already doing this in the split functions while we clone. This just handles the original function. I also updated the coroutine split test to validate that we are always referring to the msg in the context object instead of in a shadow copy. rdar://83957028 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D121324	2022-03-09 14:02:09 -08:00
Nikita Popov	1bd33691cb	[CoroElide] Remove fallback for frame layout determination Only determine the frame layout based on dereferenceable and align attributes, and remove the type-based fallback, which is incompatible with opaque pointers. The dereferenceable attribute is required, while the align attribute uses default alignment of 1 (commonly, align 1 attributes do not get placed, relying on default alignment). The CoroSplit pass producing the resume function adds the necessary attributes in `7daed35911/llvm/lib/Transforms/Coroutines/CoroSplit.cpp (L840)`, and their presence is checked in coro-debug.ll at least. Differential Revision: https://reviews.llvm.org/D120988	2022-03-07 11:23:02 +01:00
Nikita Popov	9bca4ea364	[Coroutines] Allow FramePtr to be an Argument With opaque pointers, after splitRetconCoroutine() the FramePtr may be an Argument rather than an Instruction. With typed pointers, this currently doesn't happen because the FramePtr would be a bitcast instruction. Fix this by making FramePtr a Value and adding a helper for the "after FramePtr" insertion point, which would be the start of the function in the Argument case. Differential Revision: https://reviews.llvm.org/D120994	2022-03-07 10:58:56 +01:00
Nikita Popov	a266af7211	[InstCombine] Canonicalize SPF to min/max intrinsics Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max. Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization. Differential Revision: https://reviews.llvm.org/D98152	2022-02-24 09:01:20 +01:00
Michael Gottesman	13681ad654	[move-function] Make test more general by removing unneeded line. Otherwise this can be sensitive in the face of changes in register names. I also gardened the test case a little to make it look a little nicer. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D120276	2022-02-21 14:40:23 -08:00
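
Note on 5c902af572 (dynamic alignment of over-aligned types in async frames): the commit message says only that an address is aligned dynamically when a type asks for more than the frame's maximum alignment. The sketch below illustrates the usual round-up arithmetic behind that idea. The helper names `alignUp` and `paddedSize` are hypothetical and are not LLVM APIs, and the padding rule assumes the slot starts at an offset that is already a multiple of the frame's maximum alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): round Addr up to the next multiple
// of Align. Align must be a nonzero power of two.
inline std::uintptr_t alignUp(std::uintptr_t Addr, std::uintptr_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "Align must be a power of two");
  return (Addr + Align - 1) & ~(Align - 1);
}

// Hypothetical helper (not an LLVM API): number of bytes to reserve for a
// value of Size bytes that wants alignment Align, inside a frame that only
// guarantees MaxFrameAlign. Assumes both alignments are powers of two and
// the slot starts at a multiple of MaxFrameAlign, so the worst-case gap
// before the next Align boundary is Align - MaxFrameAlign bytes.
inline std::size_t paddedSize(std::size_t Size, std::size_t Align,
                              std::size_t MaxFrameAlign) {
  return Align > MaxFrameAlign ? Size + (Align - MaxFrameAlign) : Size;
}
```

With these assumptions, a 32-byte value that wants 64-byte alignment in a frame aligned to 16 bytes would reserve `paddedSize(32, 64, 16)` bytes, and the value's address would be computed at run time as `alignUp(slotAddress, 64)`.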


263 Commits