llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	2bc57d85eb	Don't override __attribute__((no_stack_protector)) by inlining (PR52886) Since `26c6a3e736`, LLVM's inliner will "upgrade" the caller's stack protector attribute based on the callee. This led to surprising results with Clang's no_stack_protector attribute added in `4fbf84c173` (D46300). Consider the following code compiled with clang -fstack-protector-strong -Os (https://godbolt.org/z/7s3rW7a1q). extern void h(int* p); inline __attribute__((always_inline)) int g() { return 0; } int __attribute__((__no_stack_protector__)) f() { int a[1]; h(a); return g(); } LLVM will inline g() into f(), and f() would get a stack protector, against the user's explicit wishes, potentially breaking the program e.g. if h() changes the value of the stack cookie. That's a miscompile. More recently, `bc044a88ee` (D91816) addressed this problem by preventing inlining when the stack protector is disabled in the caller and enabled in the callee or vice versa. However, the problem remained if the callee is marked always_inline as in the example above. This affected users, see e.g. http://crbug.com/1274129 and http://llvm.org/pr52886. One way to fix this would be to prevent inlining also in the always_inline case. Despite the name, always_inline does not guarantee inlining, so this would be legal but potentially surprising to users. However, I think the better fix is to not enable the stack protector in a caller based on the callee. The motivation for the old behaviour is unclear, it seems counter-intuitive, and causes real problems as we've seen. This commit implements that fix, which means in the example above, g() gets inlined into f() (also without always_inline), and f() is emitted without stack protector. I think that matches most developers' expectations, and that's also what GCC does. Another effect of this change is that a no_stack_protector function can now be inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856): extern void h(int* p); inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() { return 0; } int f() { int a[1]; h(a); return g(); } I think that's fine. Such code would be unusual since no_stack_protector is normally applied to a program entry point which sets up the stack canary. And even if such code exists, inlining doesn't change the semantics: there is still no stack cookie setup/check around entry/exit of the g() code region, but there may be in the surrounding context, as there was before inlining. This also matches GCC. See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722 Differential revision: https://reviews.llvm.org/D116589	2022-01-13 12:04:49 +01:00
Sanjay Patel	6bd127b079	[InstSimplify] use knownbits to fold more udiv/urem We could use knownbits on both operands for even more folds (and there are already tests in place for that), but this is enough to recover the example from: https://github.com/llvm/llvm-project/issues/51934 (the tests are derived from the code in that example) I am assuming no noticeable compile-time impact from this because udiv/urem are rare opcodes. Differential Revision: https://reviews.llvm.org/D116616	2022-01-12 14:59:43 -05:00
Rosie Sumpter	552eb372cb	[LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter This is required to query the legality more precisely in the LoopVectorizer. This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets. Differential Revision: https://reviews.llvm.org/D115329	2022-01-12 13:34:12 +00:00
Mircea Trofin	248d55af3e	[NFC][MLGO] Use LazyCallGraph::Node to track functions. This avoids the InlineAdvisor carrying the responsibility of deleting Function objects. We use LazyCallGraph::Node objects instead, which are stable in memory for the duration of the Module-wide performance of CGSCC passes started under the same ModuleToPostOrderCGSCCPassAdaptor (which is the case here) Differential Revision: https://reviews.llvm.org/D116964	2022-01-11 19:23:47 -08:00
Mircea Trofin	1f5dceb1d0	[MLGO] Add support for multiple training traces per module This happens in e.g. regalloc, where we trace decisions per function, but wouldn't want to spew N log files (i.e. one per function). So we output a key-value association, where the key is an ID for the sub-module object, and the value is the tensorflow::SequenceExample. The current relation with protobuf is tenuous, so we're avoiding a custom message type in favor of using the `Struct` message, but that requires the values be wire-able strings, hence base64 encoding. We plan on resolving the protobuf situation shortly, and improve the encoding of such logs, but this is sufficient for now for setting up regalloc training. Differential Revision: https://reviews.llvm.org/D116985	2022-01-11 16:13:31 -08:00
Mircea Trofin	a81b0c978f	[NFC][MLGO] Remove the word "inliner" in a generic error message.	2022-01-11 12:39:16 -08:00
Arthur Eubanks	bf52210e25	[NFC][LazyCallGraph] Remove check in removeDeadFunction() if graph is empty If we're in removeDeadFunction(), we should have already constructed the call graph. Differential Revision: https://reviews.llvm.org/D115676	2022-01-11 10:17:13 -08:00
Florian Hahn	f0ef1ea6dd	[IRBuilder] Introduce folder using inst-simplify, use for Or fold. Alternative to D116817. This introduces a new value-based folding interface for Or (FoldOr), which takes 2 values and returns an existing Value or a constant if the Or can be simplified. Otherwise nullptr is returned. This replaces the more restrictive CreateOr which takes 2 constants. This is then used to implement a folder that uses InstructionSimplify. The logic to simplify `Or` instructions is moved there. Subsequent patches are going to transition other CreateXXX to the more general FoldXXX interface. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D116935	2022-01-11 17:30:48 +00:00
Philip Reames	8f553da492	[instsimplify] Add a comment and test for a highly confusing case	2022-01-11 09:24:10 -08:00
Philip Reames	e838949bee	[GlobalsModRef] Apply indirect-global rule to all globals initialized from noalias calls Extend the existing malloc-family specific optimization to all noalias calls. This allows us to handle allocation wrappers, and removes a dependency on a lib-func check in favor of generic attribute usage. Differential Revision: https://reviews.llvm.org/D116980	2022-01-11 08:44:31 -08:00
Florian Hahn	8a469e2050	[InstSimplify] Fold inbounds GEP to poison if base is undef. D92270 updated constant expression folding to fold inbounds GEP to poison if the base is undef. Apply the same logic to SimplifyGEPInst. The justification is that we can choose an out-of-bounds pointer as base pointer. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D117015	2022-01-11 16:11:22 +00:00
Roman Lebedev	5ceb070bbb	[SCEV] `getSequentialMinMaxExpr()`: look into `umin` when deduplicating operands We could just merge all umin into umin_seq, but that is likely a pessimization, so don't do that, but pretend that we did for the purpose of deduplication.	2022-01-11 18:51:57 +03:00
Roman Lebedev	5e16650792	[SCEV] `getSequentialMinMaxExpr()`: keep only the first instance of an operand Having the same operand more than once doesn't change the outcome here, neither reduction-wise nor poison-wise. We must keep the first instance specifically though.	2022-01-11 16:51:53 +03:00
Roman Lebedev	76a0abbc13	[SCEV] Reenable umin_seq support and fix the `computeSCEVAtScope()` This reverts commit `f62f47f5e1`.	2022-01-11 16:03:35 +03:00
Nikita Popov	3946095b88	[MemoryBuiltins] Remove unused isOpNewLikeFn() (NFC) This function is no longer used since `2cafbcb560`.	2022-01-11 12:27:23 +01:00
Nikita Popov	b56f6f1913	[MemoryBuiltins] Remove unused isStrdupLikeFn() function (NFC) This function is no longer used after `dcbc91f40c`.	2022-01-11 12:26:20 +01:00
Philip Reames	f62f47f5e1	Partial revert of `82fb4f4` Two crashes have been reported. This change disables the new logic while leaving the new node in tree. Hopefully, that's enough to allow investigation without breakage while avoiding massive churn.	2022-01-10 18:18:34 -08:00
Philip Reames	5265ac72c6	[MemoryBuiltin] Add an API for checking if an unused allocation can be removed [NFC] Not all allocation functions are removable if unused. An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++. An example of a removable one - at least according to historical practice - would be malloc.	2022-01-10 15:43:39 -08:00
Roman Lebedev	82fb4f4b22	[SCEV] Sequential/in-order `UMin` expression As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692, SCEV is forbidden from reasoning about 'backedge taken count' if the branch condition is a poison-safe logical operation, which is conservatively correct, but is severely limiting. Instead, we should have a way to express those poison blocking properties in SCEV expressions. The proposed semantics is: ``` Sequential/in-order min/max SCEV expressions are non-commutative variants of commutative min/max SCEV expressions. If none of their operands are poison, then they are functionally equivalent, otherwise, if the operand that represents the saturation point* of given expression, comes before the first poison operand, then the whole expression is not poison, but is said saturation point. ``` * saturation point - the maximal/minimal possible integer value for the given type The lowering is straight-forward: ``` compare each operand to the saturation point, perform sequential in-order logical-or (poison-safe!) ordered reduction over those checks, and if reduction returned true then return saturation point else return the naive min/max reduction over the operands ``` https://alive2.llvm.org/ce/z/Q7jxvH (2 ops) https://alive2.llvm.org/ce/z/QCRrhk (3 ops) Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97 That allows us to handle the patterns in question. Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D116766	2022-01-10 20:51:26 +03:00
Philip Reames	1d127315e7	Minor style tweaks following `fb93659`	2022-01-10 09:32:29 -08:00
Bryce Wilson	7febd60a90	[instcombine] Add align return attributes for operator new(..., align_val) (Split from original patch to separate non-NFC part and add coverage. I typoed when adding the new test, so this change includes the typo fix to let libfunc recognize the signature. Didn't figure it was worth another separate commit.) Differential Revision: https://reviews.llvm.org/D116851 (part 2 of 2)	2022-01-10 09:15:20 -08:00
Bryce Wilson	fb936595fa	[MemoryBuiltins] Add field for alignment argument [NFC] There are a few places where the alignment argument for AlignedAllocLike functions was previously hardcoded. This patch adds a getAllocAlignment function and a change to the MemoryBuiltin table to allow alignment arguments to be found generically. This will shortly allow alignment inference on operator new's with align_val params and an extension to Attributor's HeapToStack. The former will follow shortly - I split Bryce's patch for purpose of having the large change be NFC. The latter will be reviewed separately. Differential Revision: https://reviews.llvm.org/D116851 (part 1 of 2)	2022-01-10 09:15:20 -08:00
Simon Pilgrim	fd1094f318	[ConstantFolding] Clean up Intrinsics::abs undef handling Match cttz/ctlz handling by assuming C1 == 0 if C1 != 1 - I've added an assertion as well. Fixes static analyzer nullptr dereference warnings.	2022-01-10 17:04:03 +00:00
Nikita Popov	92d55e7336	[MemoryBuiltins] Remove isNoAliasFn() in favor of isNoAliasCall() We currently have two similar implementations of this concept: isNoAliasCall() only checks for the noalias return attribute. isNoAliasFn() also checks for allocation functions. We should switch to only checking the attribute. SLC is responsible for inferring the noalias return attribute for non-new allocation functions (with a missing case fixed in `348bc76e35`). For new, clang is responsible for setting the attribute, if -fno-assume-sane-operator-new is not passed. Differential Revision: https://reviews.llvm.org/D116800	2022-01-10 09:18:15 +01:00
Simon Pilgrim	be7dbd674c	[DivergenceAnalysis] Simplify inRegion test based on whether the RegionLoop pointer is null or not More closely matches the documentation Requested by @nikic	2022-01-08 14:30:10 +00:00
Simon Pilgrim	b3f193a980	[DivergenceAnalysis] Fix static analyzer warning about dereference of nullptr We're testing that the RegionLoop pointer is null in the first part of the check, so we need to check that it's non-null before dereferencing it in a later part of the check.	2022-01-08 13:57:33 +00:00
Kazu Hirata	b932bdf59f	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-07 17:45:09 -08:00
Philip Reames	f38873537b	[MemoryBuiltin] Cleanup stale todo comments [NFC] strdup/strndup are already partially implemented, move remaining comment to relevant place. Remaining named routines are copy routines and mostly handled via intrinsics already - they do not allocate new memory.	2022-01-07 13:57:20 -08:00
Roman Lebedev	32300375f5	[NFCI] `ScalarEvolution::getRangeRef()`: collapse `SCEVMinMaxExpr` handling	2022-01-08 00:23:08 +03:00
Arthur Eubanks	d51e3474e0	[LazyCallGraph] Ignore empty RefSCCs rather than shift RefSCCs when removing dead functions This is in preparation for D115545 which attempts to delete discardable functions if they are unused. With that change, shifting RefSCCs becomes noticeable in compile time. This change makes the LCG update negligible again. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D116776	2022-01-07 09:42:23 -08:00
Philip Reames	6b0ff0969d	Extract utility function for checking initial value of allocation [NFC, try 2] This is a recurring pattern, we can consolidate three copies into one. The main motivation is to reduce usages of isMallocLike. The original commit (which was quickly reverted) didn't account for the fact that the allocation function could be an invoke; test coverage for that case is added in this commit.	2022-01-07 08:44:08 -08:00
Roman Lebedev	a5a6960d1c	[NFCI][IR] MinMaxIntrinsic: add some more helper methods, and use them	2022-01-07 13:02:11 +03:00
Philip Reames	c6a0c1585a	Revert "Extract utility function for checking initial value of allocation [NFC]" This reverts commit `9ce30fe86f`. Appears to be causing a problem on a buildbot, revert while investigating. https://green.lab.llvm.org/green//job/clang-stage1-RA/26818/consoleFull#-1502953973d489585b-5106-414a-ac11-3ff90657619c	2022-01-06 19:05:51 -08:00
Philip Reames	9ce30fe86f	Extract utility function for checking initial value of allocation [NFC] This is a recurring pattern, we can consolidate three copies into one. The main motivation is to reduce usages of isMallocLike.	2022-01-06 18:02:14 -08:00
Philip Reames	5d1cfd4348	Remove unused LookThroughBitCast param in isXAllocLike functions [NFC] This parameter took the non-default value exactly twice, and neither had semantic effect.	2022-01-06 18:02:13 -08:00
Philip Reames	7052670e96	Move getMallocAllocatedType and getMallocArraySize to GlobalOpt [NFC] These are implementation details of the global-opt transform and not easily reusable, so remove them from the analysis header.	2022-01-06 18:02:13 -08:00
Philip Reames	67a3331e4f	Inline extractMallocCall to sole use and delete [NFC]	2022-01-06 18:02:13 -08:00
Philip Reames	4b0fc924a9	Delete unused extractCallocCall routine [NFC]	2022-01-06 18:02:13 -08:00
Philip Reames	cffd268316	Demote getMallocType to implementation routine in MemoryBuiltins [NFC]	2022-01-06 18:02:13 -08:00
Daniil Suchkov	524abc68f2	Introduce NewPM .dot printers for DomTree This patch adds a couple of NewPM function passes (dot-dom and dot-dom-only) that dump DomTree into .dot files. Reviewed-By: aeubanks Differential Revision: https://reviews.llvm.org/D116629	2022-01-05 23:25:40 +00:00
Nico Weber	085f078307	Revert "Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`."" This reverts commit `859ebca744`. The change contained many unrelated changes and e.g. restored unit test failures for the old lld port.	2022-01-05 13:10:25 -05:00
David Salinas	859ebca744	Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`." This reverts commit `640beb38e7`. That commit caused performance degradation in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort). Reverting until we have a better solution to s_cselect_b64 codegen cleanup Change-Id: Ibf8e397df94001f248fba609f072088a46abae08 Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D115960 Change-Id: Id169459ce4dfffa857d5645a0af50b0063ce1105	2022-01-05 17:57:32 +00:00
Philip Reames	c16fd6a376	Rename doesNotReadMemory to onlyWritesMemory globally [NFC] The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consistent with onlyReadsMemory which we use for the corresponding readonly case as well.	2022-01-05 08:52:55 -08:00
Nikita Popov	3dc1907d06	[ConstantFold] Use ConstantFoldLoadFromUniformValue() in more places In particular, this also preserves undef when loading from padding, rather than converting it to zero through a different codepath. This is the remaining part of D115924.	2022-01-05 12:47:50 +01:00
Nikita Popov	99c6b12b92	[ConstantFolding] Unify handling of load from uniform value There are a number of places that specially handle loads from a uniform value where all the bits are the same (zero, one, undef, poison), because we a) don't care about the load offset in that case b) it bypasses casts that might not be legal generally but do work with uniform values. We had multiple implementations of this, with a different set of supported values each time. This replaces two usages with a more complete helper. Other usages will be replaced separately, because they have larger impact. This is part of D115924.	2022-01-05 12:30:46 +01:00
Mircea Trofin	a120fdd337	[NFC][MLGO]Add RTTI support for MLModelRunner and simplify runner setup	2022-01-04 19:46:14 -08:00
Sanjay Patel	1e50d06466	[Analysis] fix swapped operands to computeConstantRange This was noted in post-commit review for D116322 / `0edf99950e`. I am not seeing how to expose the bug in a test though because we don't pass an assumption cache into this analysis from there.	2022-01-04 13:13:50 -05:00
Philip Reames	b061d86c69	[SCEV] Compute exit count from overflow check expressed w/ x.with.overflow intrinsics This ports the logic we generate in instcombine for a single use x.with.overflow check for use in SCEV's analysis. The result is that we can prove trip counts for many checks, and (through existing logic) often discharge them. Motivation comes from compiling a simple example with -ftrapv. Differential Revision: https://reviews.llvm.org/D116499	2022-01-04 09:44:23 -08:00
Florian Hahn	d8276208be	[LAA] Remove overeager assertion for aggregate types. `0a00d64` turned an early exit here into an assertion, but the assertion can be triggered, as PR52920 shows. The later code is agnostic to the accessed type, so just drop the assert. The patch also adds tests for LAA directly and loop-load-elimination to show the behavior is sane.	2022-01-04 15:20:35 +00:00
Nikita Popov	71b2c4a3cf	[ConstantFolding] Remove unused ConstantFoldLoadThroughGEPConstantExpr() This API is no longer used since `bbeaf2aac6`.	2022-01-04 12:37:12 +01:00
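
The sequential `umin` semantics and lowering described in the message of `82fb4f4b22` above can be illustrated with a small model. This is only a sketch in Python, not LLVM code: `POISON`, `logical_or`, `umin`, and `umin_seq` are hypothetical names invented for the illustration, poison is modeled as a sentinel object, and the saturation point for an unsigned minimum is assumed to be 0.

```python
# Toy model of the umin_seq lowering; all names here are illustrative only.
POISON = object()  # sentinel standing in for an LLVM poison value


def logical_or(a, b):
    """Poison-safe 'or': b is only observed when a is false."""
    if a is POISON:
        return POISON
    if a:
        return True
    return b


def umin(*ops):
    """Naive (commutative) min: any poison operand poisons the result."""
    if any(op is POISON for op in ops):
        return POISON
    return min(ops)


def umin_seq(*ops, saturation=0):
    """Sequential min: a saturating operand hides poison that comes after it."""
    hit = False
    # The last operand does not need to be checked against the saturation point.
    for op in ops[:-1]:
        check = POISON if op is POISON else (op == saturation)
        hit = logical_or(hit, check)
    if hit is POISON:
        return POISON
    if hit:
        return saturation
    return umin(*ops)
```

In this model `umin_seq(0, POISON)` evaluates to 0 while `umin_seq(POISON, 0)` stays poison, which mirrors the non-commutativity the commit message points out; with no poison operands the result agrees with the plain `umin`.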


11174 Commits