I think we can use the same logic here as for nonnull.
strlen(X): X must be noundef, i.e. a valid pointer.
For libcalls with a size argument, we add noundef only if the size is known and greater than 0, so the pointers must be noundef (valid ones).
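A sketch of the resulting annotation (illustrative only):
```
; previously: declare i64 @strlen(i8*)
; strlen must be passed a valid pointer, so its argument is noundef:
declare i64 @strlen(i8* noundef)
```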
Reviewed By: jdoerfert, aqjune
Differential Revision: https://reviews.llvm.org/D95122
MemorySSA currently treats lifetime.end intrinsics as not aliasing
anything. This breaks MemorySSA-based MemCpyOpt, because we'll happily
move a read of a pointer below a lifetime.end intrinsic, as no clobber
is reported.
I think the MemorySSA modelling here isn't correct: lifetime.end(p)
has approximately the same effect as doing a memcpy(p, undef), and
should be treated as a clobber.
This patch removes the special handling of lifetime.end, leaving
alias analysis to handle it appropriately.
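A minimal sketch of the kind of reordering this prevents (hypothetical test):
```
declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

define i8 @test() {
  %a = alloca i8
  store i8 1, i8* %a
  ; this load must stay above the lifetime.end: with lifetime.end
  ; treated as a clobber (like memcpy(%a, undef)), it can no longer
  ; be moved below it
  %v = load i8, i8* %a
  call void @llvm.lifetime.end.p0i8(i64 1, i8* %a)
  ret i8 %v
}
```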
Differential Revision: https://reviews.llvm.org/D95763
With the addition of the `willreturn` attribute, functions that may
not return (e.g. due to an infinite loop) are well defined if they are
not marked as `willreturn`.
This patch updates `wouldInstructionBeTriviallyDead` to not consider
calls that may not return as dead.
This patch still provides an escape hatch for intrinsics, which are
still unconditionally assumed to be willreturn. It will be removed once
all intrinsic definitions have been reviewed and updated.
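For illustration, a sketch of a call that is no longer considered trivially dead (names hypothetical):
```
; readnone and nounwind, but not willreturn: may loop forever
declare void @maybe_loops() readnone nounwind

define void @test() {
  ; previously removable as trivially dead; now kept, because the
  ; call is not guaranteed to return
  call void @maybe_loops()
  ret void
}
```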
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94106
D91936 placed the tracking for the assumptions into BasicAA.
However, when recursing over phis, we may use fresh AAQI instances.
In this case, AssumptionBasedResults from an inner AAQI can result
in the removal of an element from the outer AAQI.
To avoid this, move the tracking into AAQI. This generally makes
more sense, as the NoAlias assumptions themselves are also stored
in AAQI.
The test case only produces an assertion failure with D90094
reapplied. I think the issue exists independently of that change
as well, but I wasn't able to come up with a reproducer.
Explicitly opt out llvm/test/Transforms/Attributor.
Verified by flipping the default value of allow-unused-prefixes and
observing that none of the failures were under llvm/test/Transforms.
Differential Revision: https://reviews.llvm.org/D92404
When MemCpyOpt performs call slot optimization it will concatenate the `alias.scope` metadata between the function call and the memcpy. However, scoped AA relies on the domains in metadata to be maintained in a caller-callee relationship. Naive concatenation breaks this assumption leading to bad AA results.
The fix is to take the intersection of domains then union the scopes within those domains.
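A sketch of the metadata shape involved (hypothetical numbering):
```
!0 = !{!0}      ; domain present on both the call and the memcpy
!1 = !{!1}      ; domain present only on the call
!2 = !{!2, !0}  ; call scope in domain !0
!3 = !{!3, !0}  ; memcpy scope in domain !0
!4 = !{!4, !1}  ; call scope in domain !1
; intersection of domains = {!0}; the combined !alias.scope list is
; the union of the scopes within it:
!5 = !{!2, !3}
```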
The original bug came from a case of bad codegen in Rust, which used this bad aliasing to perform additional memcpy optimizations. As shown in the added test case, `%src` got forwarded past its lifetime, leading to a dereference of garbage data.
Testing
ninja check-llvm
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D91576
This is a straightforward port of MemCpyOpt to MemorySSA following
the approach of D26739. MemDep queries are replaced with MSSA queries
without changing the overall structure of the pass. Some care has
to be taken to account for differences between these APIs
(MemDep also returns reads, MSSA doesn't).
Differential Revision: https://reviews.llvm.org/D89207
CapturesBefore tracker has an overly restrictive dominates check when
the `BeforeHere` and the capture point are in different basic blocks.
All we need to check is that there is no path from the capture point
to `BeforeHere` (which is less strict than the dominates check).
See added testcase in one of the users of CapturesBefore.
Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90688
When performing a call slot optimization to a GEP destination, the
transform currently usually fails, because the GEP is directly before
the memcpy and as such does not dominate the call. We should move it
above the call if that satisfies the domination requirement.
I think that a constant-index GEP is the only useful thing to move
here, as otherwise isDereferenceablePointer couldn't look through
it anyway. As such I'm not trying to generalize this further.
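For illustration, a sketch of the intended placement (names hypothetical):
```
define void @caller([16 x i8]* %out) {
  %tmp = alloca [16 x i8]
  %src = bitcast [16 x i8]* %tmp to i8*
  ; the constant-index GEP originally sat right before the memcpy;
  ; hoisting it here makes %dst dominate the call, so @compute can
  ; be rewritten to write into %dst directly
  %dst = getelementptr [16 x i8], [16 x i8]* %out, i64 0, i64 0
  call void @compute(i8* %src)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 16, i1 false)
  ret void
}
declare void @compute(i8*)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
```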
Differential Revision: https://reviews.llvm.org/D89623
D70365 allows us to make attributes default. This is a follow up to
actually make nosync, nofree and willreturn default. The approach we
chose, for now, is to opt-in to default attributes to avoid introducing
problems to target specific intrinsics. Intrinsics with default
attributes can be created using `DefaultAttrsIntrinsic` class.
After investigation by @asbirlea, the issue that caused the
revert appears to be an issue in the original source, rather
than a problem with the compiler.
This patch enables MemorySSA DSE again.
This reverts commit 915310bf14.
This adds an -enable-memcpyopt-memoryssa option that currently does
nothing apart from requiring MSSA as a dependency. The tests are
split to run both with the option disabled and enabled. I went with
this rather than the separate directory DSE uses, as I found it
convenient to have a direct side-by-side comparison of differences.
Differential Revision: https://reviews.llvm.org/D89206
moveUp() moves instructions, so we should move the corresponding
memory accesses as well. We should also move the store instruction
itself: Even though we'll end up removing it later, this gives us
a correct MemoryDef to replace.
The implementation is somewhat more complicated than it should be,
because we also handle the case where P does not have a memory
access due to a degenerate AA pipeline. Hopefully, the need for this
will go away in the future, when the rest of the pass is based on
MSSA.
Differential Revision: https://reviews.llvm.org/D88778
If the memcpy operands are the same (which is allowed since D86815)
then the memcpy is effectively a no-op and the partially overlapping
memset is not dead.
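A sketch of the pattern (hypothetical test):
```
define void @test(i8* %p) {
  ; all 16 bytes are zeroed...
  call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 16, i1 false)
  ; ...and the self-copy below is a no-op, so the memset must not be
  ; shortened to skip the first 8 bytes
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %p, i8* %p, i64 8, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
```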
Differential Revision: https://reviews.llvm.org/D89192
MemCpyOpt can shorten a memset if it is later partially overwritten
by a memcpy. It checks that the destination is not read in between,
but we also need to make sure that the destination cannot be observed
via unwinding.
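A sketch of the problematic case (names hypothetical):
```
define void @test(i8* %d, i8* %s) {
  call void @llvm.memset.p0i8.i64(i8* %d, i8 0, i64 16, i1 false)
  ; if this call unwinds, the caller may observe %d, which must be
  ; fully zeroed at that point: the memset cannot be shortened
  call void @may_throw()
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %d, i8* %s, i64 8, i1 false)
  ret void
}
declare void @may_throw()
declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
```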
Differential Revision: https://reviews.llvm.org/D89190
MemCpyOpt can hoist stores while merging load+store pairs into a
memcpy. This hoisting can currently result in stores being executed
that weren't guaranteed to execute in the original program.
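A sketch of a store that must not be hoisted (names hypothetical):
```
define void @test(i32* %p, i32* %q) {
  %v = load i32, i32* %p
  call void @may_throw()
  ; if @may_throw unwinds, this store never executes in the original
  ; program, so hoisting it above the call would be a miscompile
  store i32 %v, i32* %q
  ret void
}
declare void @may_throw()
```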
Differential Revision: https://reviews.llvm.org/D89154
The call slot optimization has some home-grown code for checking
whether the destination is dereferenceable. Replace this with the
generic isDereferenceableAndAlignedPointer() helper.
I'm not checking alignment here, because that is currently handled
separately and may be an enforced alignment for allocas. The clean
way of integrating that part would probably be to accept a callback
in isDereferenceableAndAlignedPointer() for the actual isAligned check,
which would then have a chance to use an enforced alignment instead.
This allows the destination to be a GEP (among other things), though
the two open TODOs may prevent it from working in practice.
Differential Revision: https://reviews.llvm.org/D88805
When performing call slot optimization for a non-local destination,
we need to check whether there may be throwing calls between the
call and the copy. Otherwise, the early write to the destination
may be observable by the caller.
This was already done for call slot optimization of load/store,
but not for memcpys. For the sake of clarity, I'm moving this check
into the common optimization function, even if that does need an
additional instruction scan for the load/store case.
As efriedma pointed out, this check is not sufficient due to
potential accesses from another thread. This case is left as a TODO.
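A sketch of the case being checked (names hypothetical):
```
define void @test(i8* %dst) {
  %tmp = alloca [16 x i8]
  %src = bitcast [16 x i8]* %tmp to i8*
  call void @compute(i8* %src)
  ; if this call unwinds, the caller must not observe an early write
  ; to %dst, so the call slot optimization has to be skipped
  call void @may_throw()
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 16, i1 false)
  ret void
}
declare void @compute(i8*)
declare void @may_throw()
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
```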
Differential Revision: https://reviews.llvm.org/D88799
Apparently querying dereferenceability of array allocations is
being intentionally penalized (https://reviews.llvm.org/D41398),
so avoid using them in tests.
The case of a destination read between call and memcpy was not
covered anywhere (but is handled correctly).
However, a potentially throwing call between the call and the
memcpy appears to be miscompiled.
If we allow non-integral pointers to become memset and memcpy, we lose the ability to reason about pointer propagation. This patch is modeled on changes we've carried downstream for a long time; it seemed worth being equally conservative for other users. There is room to refine the semantics and handling here if anyone is motivated.
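A sketch of the kind of transform now blocked (datalayout and names hypothetical):
```
; "ni:1" declares pointers in address space 1 to be non-integral
target datalayout = "ni:1"

define void @test(i8 addrspace(1)** %p) {
  store i8 addrspace(1)* null, i8 addrspace(1)** %p
  %p1 = getelementptr i8 addrspace(1)*, i8 addrspace(1)** %p, i64 1
  ; previously these adjacent pointer stores could be merged into a
  ; memset; now they are left alone, preserving pointer propagation
  store i8 addrspace(1)* null, i8 addrspace(1)** %p1
  ret void
}
```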
There appears to be a mis-compile with MemorySSA-backed DSE in
combination with llvm.lifetime.end. It currently appears like
DSE is doing the right thing and the llvm.lifetime.end markers
are incorrect. The reverted patch uncovers the mis-compile.
This patch temporarily switches back to the legacy DSE
implementation, while we investigate.
This reverts commit 9d172c8e9c.
This switches to using DSE + MemorySSA by default again, after
fixing the issues reported after the first commit.
Notable fixes fc82006331, a0017c2bc2.
This reverts commit 3a59628f3c.
The tests have been updated and I plan to move them from the MSSA
directory up.
Some end-to-end tests needed small adjustments. One difference to the
legacy DSE is that legacy DSE also deletes trivially dead instructions
that are unrelated to memory operations. Because MemorySSA-backed DSE
just walks the MemorySSA, we only visit/check memory instructions. But
removing unrelated dead instructions is not really DSE's job and other
passes will clean up.
One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll,
but I think this comes down to legacy DSE not correctly handling
instructions that may throw in that case. To cover this with
MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat its
return value as belonging to the same underlying object as the passed
pointer.
There are some minor cases MemorySSA-backed DSE currently misses, e.g. related
to atomic operations, but I think those can be implemented after the switch.
This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html
For MultiSource/SPEC2000/SPEC2006, the number of eliminated stores
goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers
and details in the thread on llvm-dev.
Impact on CTMark:
```
Legacy Pass Manager
                        exec instrs   size-text
O3                      + 0.60%       - 0.27%
ReleaseThinLTO          + 1.00%       - 0.42%
ReleaseLTO-g            + 0.77%       - 0.33%
RelThinLTO (link only)  + 0.87%       - 0.42%
RelLTO-g (link only)    + 0.78%       - 0.33%
```
http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
```
New Pass Manager
                        exec instrs   size-text
O3                      + 0.95%       - 0.25%
ReleaseThinLTO          + 1.34%       - 0.41%
ReleaseLTO-g            + 1.71%       - 0.35%
RelThinLTO (link only)  + 0.96%       - 0.41%
RelLTO-g (link only)    + 2.21%       - 0.35%
```
http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
Reviewed By: asbirlea, xbolva00, nikic
Differential Revision: https://reviews.llvm.org/D87163
This patch updates MemCpyOpt to preserve MemorySSA. It uses the
MemoryDef at the insertion point of the builder and inserts the new def
after that def.
In some cases, we just modify a memory instruction. In that case, get
the defining access, then remove the memory access and add a new one.
If the defining access is in a different block, insert a new def at the
beginning of the current block, otherwise after the defining access.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D86651
Summary: To match NewPM name. Also the new name is clearer and more consistent.
Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84542
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some weird effects on certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
Summary:
DataLayout::getTypeStoreSize() returns TypeSize.
For cases where the type cannot be a scalable vector (e.g., GlobalVariable),
explicitly call TypeSize::getFixedSize().
For cases where the scalable property doesn't matter (e.g., checking
for a zero-sized type), use TypeSize::isNonZero().
Reviewers: sdesmalen, efriedma, apazos, reames
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76454
Summary:
Return None if the GEP index type is a scalable vector. Sizes of
scalable vectors are multiplied by a runtime constant.
Avoid transforming:
%a = bitcast i8* %p to <vscale x 16 x i8>*
%tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp0
%tmp1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 1
store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp1
into:
%a = bitcast i8* %p to <vscale x 16 x i8>*
%tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
%1 = bitcast <vscale x 16 x i8>* %tmp0 to i8*
call void @llvm.memset.p0i8.i64(i8* align 16 %1, i8 0, i64 32, i1 false)
Reviewers: sdesmalen, efriedma, apazos, reames
Reviewed By: sdesmalen
Subscribers: tschuett, hiraditya, rkruppe, arphaman, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76464
Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388, where an ABI alignment was incorrectly assigned to a memset when no explicit alignment was given.
Reviewers: gchatelet, lenary, nikic
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74083
This reapplies: 8ff85ed905
Original commit message:
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.
Test updates: Outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument at the moment) and that were largely used for canonicalization
in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This reverts commit c9ddb02659.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.
Test updates: Outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument at the moment) and that were largely used for canonicalization
in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
Summary:
When MemCpyOpt is handling aggregate type values, if an instruction (let's call it P) between the targeting load (L) and store (S) clobbers the source pointer of L, it will try to hoist S before P. This process will also hoist S's data dependency instructions.
However, the current implementation has a bug: if one of S's dependency instructions is also a user of P, MemCpyOpt will not prevent it from being hoisted above P, causing a use-before-define error. For example, in the newly added test file (i.e. `aggregate-type-crash.ll`), it will try to hoist both `store %my_struct %1, %my_struct* %3` and its dependent, `%3 = bitcast i8* %2 to %my_struct*`, above `%2 = call i8* @my_malloc(%my_struct* %0)`, creating the following BB:
```
entry:
%1 = bitcast i8* %4 to %my_struct*
%2 = bitcast %my_struct* %1 to i8*
%3 = bitcast %my_struct* %0 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %2, i8* align 4 %3, i64 8, i1 false)
%4 = call i8* @my_malloc(%my_struct* %0)
ret void
```
Where there is a use-before-define error between `%1` and `%4`.
Update: The compiler for the Pony Programming Language [also encountered the same bug](https://github.com/ponylang/ponyc/issues/3140).
Patch by Min-Yih Hsu (myhsu)
Reviewers: eugenis, pcc, dblaikie, dneilson, t.p.northover, lattner
Reviewed By: eugenis
Subscribers: lenary, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66060
llvm-svn: 375403
This patch improves the handling of pointer offsets in GEP expressions
where one argument is the base pointer. isPointerOffset() is used by
memcpyopt, where the current code turns consecutive stores covering 32
bytes into one store and two memset intrinsic calls. With this patch, we
convert the stores into a single memset intrinsic.
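A sketch of the pattern (hypothetical, abbreviated to 4 of the 32 bytes):
```
define void @zero(i8* %p) {
  store i8 0, i8* %p
  %p1 = getelementptr i8, i8* %p, i64 1
  store i8 0, i8* %p1
  %p2 = getelementptr i8, i8* %p, i64 2
  store i8 0, i8* %p2
  %p3 = getelementptr i8, i8* %p, i64 3
  store i8 0, i8* %p3
  ; ...continuing over the full range: once the GEP offsets from the
  ; common base are recognized, the stores become one llvm.memset
  ret void
}
```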
Differential Revision: https://reviews.llvm.org/D67989
llvm-svn: 374454
Summary:
In D62801, the new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site (a more precise definition is in LangRef).
In this patch, willreturn is annotated for LLVM intrinsics.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64904
llvm-svn: 367184
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
Bitcast and certain Ptr2Int/Int2Ptr instructions will not alter the
value of their operand and can therefore be looked through when we
determine non-nullness.
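A sketch of the value chains that can now be looked through (assuming 64-bit pointers):
```
define i1 @test(i32* nonnull %p) {
  %q = bitcast i32* %p to i8*
  %i = ptrtoint i8* %q to i64
  %r = inttoptr i64 %i to i8*
  ; %r carries the same value as %p, so it is also known non-null and
  ; the comparison folds to false
  %c = icmp eq i8* %r, null
  ret i1 %c
}
```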
Differential Revision: https://reviews.llvm.org/D54956
llvm-svn: 352293
Summary:
BasicAA has special logic for unescaped allocas, which normally applies
equally well to dynamic and static allocas. However, llvm.stackrestore
has the power to end the lifetime of dynamic allocas, without referring
to them directly.
stackrestore is already marked with the most conservative memory
modification attributes, but because the alloca is not escaped, the
normal logic produces incorrect results. I think BasicAA needs a special
case here to teach it about the relationship between dynamic allocas and
stackrestore.
Fixes PR40118
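A sketch of the interaction (hypothetical test):
```
declare i8* @llvm.stacksave()
declare void @llvm.stackrestore(i8*)

define void @test(i64 %n) {
  %ss = call i8* @llvm.stacksave()
  %a = alloca i8, i64 %n   ; dynamic alloca, never escapes
  store i8 1, i8* %a
  ; ends the lifetime of %a without referring to it directly, so it
  ; must be modeled as clobbering %a despite the alloca being unescaped
  call void @llvm.stackrestore(i8* %ss)
  ret void
}
```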
Reviewers: gbiv, efriedma, george.burgess.iv
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D55969
llvm-svn: 349945
Currently memcpyopt optimizes cases like
memset(a, byte, N);
memcpy(b, a, M);
to
memset(a, byte, N);
memset(b, byte, M);
if M <= N. Often this allows further simplifications down the line,
which drop the first memset entirely.
This patch extends this optimization for the case where M > N, but we
know that the bytes a[N..M] are undef due to alloca/lifetime.start.
This situation arises relatively often for Rust code, because Rust does
not initialize trailing structure padding and loves to insert redundant
memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.
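In IR, a sketch of the extended case (M = 16, N = 8; names hypothetical):
```
define void @test(i8* %b) {
  %a = alloca [16 x i8]
  %ap = bitcast [16 x i8]* %a to i8*
  call void @llvm.memset.p0i8.i64(i8* %ap, i8 0, i64 8, i1 false)
  ; bytes a[8..16] were never written since the alloca, so they are
  ; undef and the memcpy may still become: memset(%b, 0, 16)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %ap, i64 16, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
```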
The previous version of this patch did not perform dependency checking
properly: While the dependency is checked at the position of the memset,
the used size must be that of the memcpy. Previously the size of the
memset was used, which missed modification in the region
MemSetSize..CopySize, resulting in miscompiles. The added tests cover
variations of this issue.
Differential Revision: https://reviews.llvm.org/D55120
llvm-svn: 349078
Currently memcpyopt optimizes cases like
memset(a, byte, N);
memcpy(b, a, M);
to
memset(a, byte, N);
memset(b, byte, M);
if M <= N. Often this allows further simplifications down the line,
which drop the first memset entirely.
This patch extends this optimization for the case where M > N, but we
know that the bytes a[N..M] are undef due to alloca/lifetime.start.
This situation arises relatively often for Rust code, because Rust does
not initialize trailing structure padding and loves to insert redundant
memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.
For the implementation, I'm reusing a bit of code for a similar existing
optimization (direct memcpy of undef). I've also added memset support to
MemDepAnalysis GetLocation -- Instead, getPointerDependencyFrom could be
used, but it seems to make more sense to add this to GetLocation and thus
make the computation cachable.
Differential Revision: https://reviews.llvm.org/D55120
llvm-svn: 348645
Summary:
This code was in CGDecl.cpp and really belongs in LLVM's isBytewiseValue. Teach isBytewiseValue the tricks clang's isRepeatedBytePattern had, including merging undef properly and recursing on more types.
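A sketch of what isBytewiseValue recognizes (hypothetical example):
```
define void @test(i32* %p) {
  ; 0x01010101: every byte is 0x01, so isBytewiseValue reports 0x01
  ; and stores like this one can feed memset formation
  store i32 16843009, i32* %p
  ret void
}
```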
clang part of this patch: D51752
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51751
llvm-svn: 342709
Summary:
Skip basic blocks not reachable from the entry node
in MemCpyOptPass::iterateOnFunction.
Code that is unreachable may have properties that do not exist for
reachable code (an instruction in a basic block can, for example, be
dominated by a later instruction in the same basic block, e.g. if there
is a single-block loop).
MemCpyOptPass::processStore is only safe to use for reachable basic
blocks, since it may iterate past the basic block beginning when used
on unreachable blocks. By simply not optimizing unreachable basic
blocks we can avoid asserts such as
"Assertion `!NodePtr->isKnownSentinel()' failed."
in MemCpyOptPass::processStore.
The problem was detected by fuzz tests.
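A sketch of the kind of unreachable code involved (hypothetical; the verifier does not check dominance in unreachable blocks):
```
@g = global i8 0

define void @test() {
entry:
  ret void

dead:                      ; unreachable single-block loop
  ; %x is used before it is defined, which is only possible because
  ; the block trivially dominates itself and is never executed
  store i8 %x, i8* @g
  %x = load i8, i8* @g
  br label %dead
}
```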
Reviewers: eli.friedman, dneilson, efriedma
Reviewed By: efriedma
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D45889
llvm-svn: 330635
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
MemCpyOpt pass to cease using:
1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific
alignments through the new API.
2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new
API that allows setting source and destination alignments independently.
We also add a few tests to fill gaps in the testing of this pass.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 328097
This reverts r321138. It seems there are still underlying issues with
memdep. PR35519 seems to still be present if debug info is enabled. We
end up losing a memcpy. Somehow during store to memset merging, we
insert the memset after the memcpy or fail to update the memdep analysis
to account for the newly inserted memset of a pair.
Reduced test case:
#include <assert.h>
#include <stdio.h>
#include <string>
#include <utility>
#include <vector>
void do_push_back(
std::vector<std::pair<std::string, std::vector<std::string>>>* crls) {
crls->push_back(std::make_pair(std::string(), std::vector<std::string>()));
}
int __attribute__((optnone)) main() {
// Put some data in the vector and then remove it so we take the push_back
// fast path.
std::vector<std::pair<std::string, std::vector<std::string>>> crl_set;
crl_set.push_back({"asdf", {}});
crl_set.pop_back();
printf("first word in vector storage: %p\n", *(void**)crl_set.data());
// Do the push_back which may fail to initialize the data.
do_push_back(&crl_set);
auto* first = &crl_set.back().first;
printf("first word in vector storage (should be zero): %p\n",
*(void**)crl_set.data());
assert(first->empty());
puts("ok");
}
Compile with libc++, enable optimizations, and enable debug info:
$ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib
This program will assert with this change.
llvm-svn: 321510
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
This is r319482 and r319483, along with fixes for PR35519: fix the
optimization that merges stores into memsets to preserve cached memdep
info, and fix memdep's non-local caching strategy to not assume that larger
queries are always more conservative than smaller ones.
Fixes PR28958 and PR35519.
Differential Revision: https://reviews.llvm.org/D40802
llvm-svn: 321138
This caused PR35519.
> [memcpyopt] Teach memcpyopt to optimize across basic blocks
>
> This teaches memcpyopt to make a non-local memdep query when a local query
> indicates that the dependency is non-local. This notably allows it to
> eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
>
> Fixes PR28958.
>
> Differential Revision: https://reviews.llvm.org/D38374
>
> [memcpyopt] Commit file missed in r319482.
>
> This change was meant to be included with r319482 but was accidentally
> omitted.
llvm-svn: 319873
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
Fixes PR28958.
Differential Revision: https://reviews.llvm.org/D38374
llvm-svn: 319482
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.
llvm-svn: 286126
Summary:
This fixes pr29105. The reason is that lifetime markers create new
pointers aliasing the original ones, but before this patch such aliases
were not checked in performMemCpyToMemSetOptzn.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23846
llvm-svn: 279769
Summary:
We teach alias analysis that invariant.start is readonly.
This helps with GVN and memcpy optimizations that currently treat
invariant.start as a clobber.
We need to treat this as readonly, so that DSE does not incorrectly
remove stores prior to the invariant.start.
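A sketch of the DSE hazard (hypothetical test):
```
declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)

define void @test(i8* %p) {
  ; must not be removed by DSE: invariant.start below reads %p
  store i8 1, i8* %p
  %i = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %p)
  ret void
}
```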
Reviewers: sanjoy, reames, majnemer, dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23214
llvm-svn: 278138
The need for all these Lookup* functions is just because of calls to
getAnalysis inside helper methods (i.e. not at the top level of
runOnFunction). They should be straightforward to clean up when
the old PM is gone.
llvm-svn: 272615
An exception could prevent a store from occurring but MemCpyOpt's
callslot optimization would fire anyway, causing the store to occur.
This fixes PR27849.
llvm-svn: 270892
Summary: This change fixes a bug in isProfitableToUseMemset() where MaxIntSize should be in bytes, not bits.
Reviewers: arsenm, joker.eph, mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20176
llvm-svn: 269433
Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block
that really implements a memcpy operation, backends can benefit from seeing
this.
llvm-svn: 269125
Summary: This now tries to reorder instructions in order to help create the optimizable pattern.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer
Differential Revision: http://reviews.llvm.org/D16523
llvm-svn: 263503
Summary: As per title. This will allow the optimizer to pick up on it.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15923
llvm-svn: 256969
Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that aliases the load, as long as no operation aliases the store.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15903
llvm-svn: 256923
Summary:
Most of the toolchain is able to optimize scalar and memcpy-like operations efficiently, while it isn't that good with aggregates. In order to improve support for aggregates, we try to change aggregate manipulation into either scalar or memcpy-like operations whenever possible, without losing information.
This is one such opportunity.
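A sketch of the kind of aggregate manipulation targeted (types hypothetical):
```
%pair = type { i64, i64 }

define void @copy(%pair* %d, %pair* %s) {
  ; a first-class aggregate load/store pair like this can be turned
  ; into a memcpy-like operation without losing information
  %v = load %pair, %pair* %s
  store %pair %v, %pair* %d
  ret void
}
```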
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15894
llvm-svn: 256868
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang, which actually has many differences in alignment, as IRBuilder can now generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
llvm-svn: 253511
Pass MemCpyOpt doesn't check if a store instruction is nontemporal.
As a consequence, adjacent nontemporal stores are always merged into a
memset call.
Example:
;;;
define void @foo(<4 x float>* nocapture %p) {
entry:
store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0
%p1 = getelementptr inbounds <4 x float>, <4 x float>* %p, i64 1
store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0
ret void
}
!0 = !{i32 1}
;;;
In this example, the two nontemporal stores are combined to a memset of zero
which does not preserve the nontemporal hint. Later on the backend (tested on a
x86-64 corei7) expands that memset call into a sequence of two normal 16-byte
aligned vector stores.
opt -memcpyopt example.ll -S -o - | llc -mcpu=corei7 -o -
Before:
xorps %xmm0, %xmm0
movaps %xmm0, 16(%rdi)
movaps %xmm0, (%rdi)
With this patch, we no longer merge nontemporal stores into calls to memset.
In this example, llc correctly expands the two stores into two movntps:
xorps %xmm0, %xmm0
movntps %xmm0, 16(%rdi)
movntps %xmm0, (%rdi)
In theory, we could extend the usage of !nontemporal metadata to memcpy/memset
calls. However a change like that would only have the effect of forcing the
backend to expand !nontemporal memsets back to sequences of store instructions.
A memset library call would not have exactly the same semantics as a builtin
!nontemporal memset call. So, SelectionDAG will have to conservatively expand
it back to a sequence of !nontemporal stores (effectively undoing the merging).
Differential Revision: http://reviews.llvm.org/D13519
llvm-svn: 249820