Commit Graph

885 Commits

Author SHA1 Message Date
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Guillaume Chatelet dc9c2eac98 [NFC][Alignment] Simplify code 2022-06-10 15:25:28 +00:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Sanjay Patel ebbc37391f [InstCombine] allow variable shift amount in bswap + shift fold
When shifting by a byte-multiple:
bswap (shl X, Y) --> lshr (bswap X), Y
bswap (lshr X, Y) --> shl (bswap X), Y

This was limited to constants as a first step in D122010 / 60820e53ec ,
but issue #55327 shows a source example (and there's a test based on that here)
where a variable shift amount is used in this pattern.
2022-05-18 14:38:16 -04:00
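
A minimal IR sketch of the extended fold, assuming the variable shift amount is provably a byte multiple (the function name and the way the multiple-of-8 fact is established here are illustrative, not taken from the patch's tests):

  define i32 @src(i32 %x, i32 %y) {
    %amt = shl i32 %y, 3                      ; shift amount is a multiple of 8
    %sh  = shl i32 %x, %amt
    %r   = call i32 @llvm.bswap.i32(i32 %sh)
    ret i32 %r
  }
  ; may be rewritten as:
  ;   %bs = call i32 @llvm.bswap.i32(i32 %x)
  ;   %r  = lshr i32 %bs, %amt
  declare i32 @llvm.bswap.i32(i32)
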
Nikita Popov d9ad6a2c8b [InstCombine] Fix unused variable warning (NFC) 2022-05-13 12:43:21 +02:00
Serge Pavlov eb28da89a6 [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed at runtime. This happened because constrained
intrinsics usually have a side effect, which is used to model the
interaction with the floating-point environment. In some cases the side
effect is actually absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it is actually absent.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-07 19:04:11 +07:00
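
A hedged sketch of the kind of call this allows InstCombine to erase: a constrained intrinsic whose result is unused and whose exception behavior is "fpexcept.ignore", so no observable side effect remains (whether this exact case is removed depends on the checks in the patch):

  define double @f(double %a, double %b) strictfp {
    ; dead result, exceptions ignored: the call can now be removed
    %dead = call double @llvm.experimental.constrained.fadd.f64(double %a, double %b, metadata !"round.dynamic", metadata !"fpexcept.ignore") strictfp
    ret double %a
  }
  declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
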
Serge Pavlov e1554ac63a Revert "[InstCombine] Remove side effect of replaced constrained intrinsics"
This reverts commit 83914ee96f.
The change caused discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220502/1034841.html
2022-05-06 01:09:16 +07:00
Serge Pavlov 83914ee96f [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed at runtime. This happened because constrained
intrinsics usually have a side effect, which is used to model the
interaction with the floating-point environment. In some cases that is
the correct behavior, but often the side effect is actually absent or
can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it is actually absent.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-05 12:02:42 +07:00
Sanjay Patel 14f257620c [InstCombine] add type constraint to intrinsic+shuffle fold
This check is in the related fold for binops,
but it was missed when the code was adapted
for intrinsics in 432c199e84. The new test
would crash when trying to create a new
intrinsic with mismatched types.
2022-05-04 13:07:26 -04:00
Sanjay Patel 7e6d318c50 [InstCombine] move shuffle after funnel shift with same-shuffled operands
This extends 432c199e84 and 9c4770eaab with an intrinsic
cited directly in issue #46238

Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.
2022-05-04 13:07:26 -04:00
Sanjay Patel 15042f44a2 [InstCombine] propagate FMF when reordering intrinsics and shuffles
This was missed when extending the fold to allow fma with
9c4770eaab
2022-05-04 12:10:38 -04:00
Sanjay Patel 9c4770eaab [InstCombine] move shuffle after fma with same-shuffled operands
https://alive2.llvm.org/ce/z/sD-JVv

This extends 432c199e84 with a 3 arg intrinsic to demonstrate
that the code works with the extra operand.

Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.
2022-05-04 11:50:38 -04:00
Sanjay Patel 432c199e84 [InstCombine] move shuffle after min/max with same-shuffled operands
This is an intrinsic version of the existing fold for binops.
As a first step, I only allowed min/max, but the code is set
up to make adding more intrinsics easy (with more or less than
2 arguments).

This (and possible follow-ups) are discussed in issue #46238.
2022-05-03 16:23:11 -04:00
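
A rough IR sketch of the intrinsic form of the fold, using smax as the min/max example (vector width and shuffle mask are illustrative):

  define <4 x i32> @src(<4 x i32> %x, <4 x i32> %y) {
    %sx = shufflevector <4 x i32> %x, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
    %sy = shufflevector <4 x i32> %y, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
    %r  = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %sx, <4 x i32> %sy)
    ret <4 x i32> %r
  }
  ; may be rewritten so the shuffle happens after the intrinsic:
  ;   %m = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %x, <4 x i32> %y)
  ;   %r = shufflevector <4 x i32> %m, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)
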
Nikita Popov 1881711fbb [InstCombine] Remove memset of undef value
This removes memset with undef char. We already do this for stores
of undef value.

This comes with the caveat that this optimization is not, strictly
speaking, legal for undef values, because we might be overwriting
a poison value. However, our entire load/store model currently still
operates on undef values, so we need to support undef here as well
for internal consistency.

Once https://github.com/llvm/llvm-project/issues/52930 is resolved,
these and related folds can be limited to poison -- I've added
FIXMEs to that effect.

Differential Revision: https://reviews.llvm.org/D124173
2022-04-29 14:51:18 +02:00
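
A minimal sketch of the pattern being removed (size and alignment are illustrative):

  define void @f(ptr %p) {
    ; the stored byte is undef, so the whole memset can be dropped
    call void @llvm.memset.p0.i64(ptr %p, i8 undef, i64 32, i1 false)
    ret void
  }
  declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)
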
Nikita Popov 46c2b41d02 [InstCombine] Remove dead code (NFC)
This was a leftover condition without code.
2022-04-21 15:53:53 +02:00
serge-sans-paille aa15ea47e2 [builtin_object_size] Basic support for posix_memalign
It actually implements support for seeing through loads, using alias analysis to
refine the result.

This is rather limited, but I didn't want to rely on more than the analyses
available at that point (to be gentle with compilation time), and it does seem
to catch common scenarios, as showcased by the included tests.

Differential Revision: https://reviews.llvm.org/D122431
2022-04-08 09:31:11 +02:00
Augie Fackler f3c702fbd1 InstCombineCalls: fix annotateAnyAllocCallSite to report changes
Spotted during review of D123052.

Differential Revision: https://reviews.llvm.org/D123232
2022-04-07 13:49:09 -04:00
Augie Fackler f120be6c86 InstCombineCalls: when adding an align attribute, never reduce it
Sometimes we can infer an align from an allocalign, but the function
already promised it would be more aligned than the allocalign and there's
an existing align that we shouldn't reduce. Make sure we handle that
correctly.

Differential Revision: https://reviews.llvm.org/D121642
2022-04-07 12:38:44 -04:00
Augie Fackler ca051a46fb InstCombineCalls: infer return alignment from allocalign attributes
This exposes a couple of lingering bugs, which will be fixed in
the next two commits.

Differential Revision: https://reviews.llvm.org/D123052
2022-04-07 12:38:44 -04:00
Nikita Popov 682ef39b1a [InstCombine] Remove call to getPointerElementType()
This was erroneously re-introduced as part of
bb0b23174e.
2022-03-29 16:52:29 +02:00
Johannes Doerfert bb0b23174e [InstCombineCalls] Optimize call of bitcast even w/ parameter attributes
Before, we gave up if a call through a bitcast had parameter attributes.
Interestingly, we allowed attributes for the return value already. We
now handle both the same way, namely, we drop the ones that are
incompatible with the new type and keep the rest. This cannot cause
"more UB" than initially present.

Differential Revision: https://reviews.llvm.org/D119967
2022-03-28 20:57:52 -05:00
chenglin.bi 52f323d0f1 [InstCombine] Fold abs of known negative operand when source is sub
When the abs source comes from (x - y), check whether an "x > y" dominating
condition exists.

Fixes #54132

Differential Revision: https://reviews.llvm.org/D122013
2022-03-23 15:21:33 -04:00
Philip Reames 7abefc4222 [instcombine] Fold away memset/memmove from otherwise unused alloca
The motivation for this is that while both memcpyopt and dse will catch this case, both are limited by MSSA's walk back threshold when finding clobbers.  As such, if you have a memcpy of an otherwise dead alloca placed towards the end of a long basic block with lots of other memory instructions, it would be missed.  This is a bit undesirable for such an "obviously" useless bit of code.

As noted in comments, we should probably generalize instcombine's escape analysis peephole (see visitAllocInst) to allow read xor write.  Doing that would subsume this code in a more general way, but is also a more involved change.  For the moment, I went with the easiest fix.
2022-03-22 13:48:48 -07:00
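
A small sketch of the shape of code this targets, assuming the alloca has no other uses (names and sizes are illustrative):

  define void @f() {
    %a = alloca [64 x i8]
    ; %a is never read, so this memset is dead and can be folded away
    call void @llvm.memset.p0.i64(ptr %a, i8 0, i64 64, i1 false)
    ret void
  }
  declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)
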
Sanjay Patel 60820e53ec [InstCombine] try to canonicalize logical shift after bswap
When shifting by a byte-multiple:
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C

This is an IR implementation of a transform suggested in D120648.
The "swaps cancel" test models the motivating optimization from
that proposal.

Alive2 checks (as noted in the other review, we could use
knownbits to handle shift-by-variable-amount, but that can be an
enhancement patch):
https://alive2.llvm.org/ce/z/pXUaRf
https://alive2.llvm.org/ce/z/ZnaMLf

Differential Revision: https://reviews.llvm.org/D122010
2022-03-22 09:10:55 -04:00
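
A sketch of the "swaps cancel" motivation: once the outer bswap is moved past the shift, the two bswaps meet and cancel (constants are illustrative):

  define i32 @src(i32 %x) {
    %b1 = call i32 @llvm.bswap.i32(i32 %x)
    %sh = shl i32 %b1, 8
    %b2 = call i32 @llvm.bswap.i32(i32 %sh)
    ret i32 %b2
  }
  ; bswap (shl (bswap x), 8) --> lshr (bswap (bswap x)), 8 --> lshr x, 8,
  ; so the whole sequence may simplify to:
  ;   %r = lshr i32 %x, 8
  declare i32 @llvm.bswap.i32(i32)
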
Nikita Popov c1b9667148 [InstCombine] Support opaque pointers in callee bitcast fold
To make this actually trigger, we also need to check whether the
function types differ, which is a hidden cast under opaque pointers.
The transform is somewhat less relevant there because it is
primarily about pointer bitcasts, but it can also happen with other
bit- or pointer-castable types.

Byval handling is easier with opaque pointers because there is no
need to adjust the byval type; we only need to make sure that it's
still a pointer.
2022-03-03 11:07:39 +01:00
Nikita Popov 6c8adc5054 [InstCombine] Remove unnecessary byval check in callee cast fold
The logic for handling this was fixed in
8d7f118ab2, but the check for byval
on the callee was retained. This resulted in a weird situation
where the transform would work depending on whether the byval
was only on the call or on both the call and the function.
2022-03-03 10:55:14 +01:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output lines:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Nikita Popov 9353ed6a53 [InstCombine] Don't call matchSAddSubSat() for SPF (NFC)
Only call it for intrinsic min/max. The moved implementation is
unchanged apart from the one-use check: It is now hardcoded to
one-use, without the two-use special case for SPF.
2022-02-28 10:41:56 +01:00
Simon Pilgrim be1ffda0a5 [InstCombine] visitCallInst - pull out repeated bswap scalar type bitwidth. NFC. 2022-02-18 17:33:11 +00:00
Sanjay Patel 58df2da054 [InstCombine] push constant operand down/outside in sequence of min/max intrinsics
A generalization like this was suggested in D119754.
This is the inverse direction of D119851,
and we get all of the folds there plus the one that was missed.

There is precedence for this kind of transform in instcombine
with "or" instructions (but strangely only with that one opcode AFAICT).

Similar justification as in the other patch:
The line between instcombine and reassociate for these kinds of folds
is blurry. This doesn't appear to have much cost and gives us the
expected wins from repeated folds as seen in the last set of test diffs.

Differential Revision: https://reviews.llvm.org/D119955
2022-02-17 10:36:37 -05:00
Sanjay Patel 6357ccf57f [InstCombine] reassociate min/max intrinsics with constant operands
Integer min/max operations are associative:
  max (max X, C0), C1 --> max X, (max C0, C1) --> max X, NewC

https://alive2.llvm.org/ce/z/wW5HVM

This would avoid a regression when we canonicalize to min/max intrinsics
(see D98152 ).

Differential Revision: https://reviews.llvm.org/D119754
2022-02-15 08:31:23 -05:00
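
A minimal IR sketch of the reassociation, using smax with illustrative constants:

  define i32 @src(i32 %x) {
    %m = call i32 @llvm.smax.i32(i32 %x, i32 10)
    %r = call i32 @llvm.smax.i32(i32 %m, i32 42)
    ret i32 %r
  }
  ; smax (smax X, 10), 42 --> smax X, (smax 10, 42) --> smax X, 42:
  ;   %r = call i32 @llvm.smax.i32(i32 %x, i32 42)
  declare i32 @llvm.smax.i32(i32, i32)
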
Roman Lebedev cd9e6a9c10 [NFC][InstCombine] `visitCallInst()`: make comment more understandable 2022-02-05 02:15:07 +03:00
Anna Thomas 4fc52db116 [InstCombine] Remove weaker fence adjacent to a stronger fence
We have an instCombine rule to remove identical consecutive fences.
We can extend this to remove weaker fences when they are adjacent to a
stronger fence.

As stated in the LangRef, a fence with a stronger ordering also implies
ordering weaker than itself: "A fence which has seq_cst ordering, in addition to
having both acquire and release semantics specified above, participates in the
global program order of other seq_cst operations and/or fences."

Reviewed-By: reames

Differential Revision: https://reviews.llvm.org/D118607
2022-02-01 11:05:34 -08:00
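
A minimal sketch of the adjacent-fence pattern this targets (the orderings are illustrative):

  define void @f() {
    fence acquire   ; implied by the following seq_cst fence, so it can be removed
    fence seq_cst
    ret void
  }
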
Nikita Popov 8d992862a0 [InstCombine] Remove some pointer element type accesses
One of these is guarded against opaque pointers, and the others
were accessing the call function type in a rather convoluted way.
2022-01-27 10:15:35 +01:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Sanjay Patel 2e26633af0 [IR] document and update ctlz/cttz intrinsics to optionally return poison rather than undef
The behavior in Analysis (knownbits) implements poison semantics already,
and we expect the transforms (for example, in instcombine) derived from
those semantics, so this patch changes the LangRef and remaining code to
be consistent. This is one more step in removing "undef" from LLVM.

Without this, I think https://github.com/llvm/llvm-project/issues/53330
has a legitimate complaint because that report wants to allow subsequent
code to mask off bits, and that is allowed with undef values. The clang
builtins are not actually documented anywhere AFAICT, but we might want
to add that to remove more uncertainty.

Differential Revision: https://reviews.llvm.org/D117912
2022-01-23 11:22:48 -05:00
Caroline Concatto ad43217a04 [InstCombine] Fold for masked gather when loading the same value each time.
This patch checks, for a masked gather, whether the first operand is a splat
of a pointer and the mask is all ones; in that case the masked gather reloads
the same value for each lane. The patch replaces this pattern with a scalar
load of the value, splatted into a vector.

Differential Revision: https://reviews.llvm.org/D115726
2022-01-21 14:19:51 +00:00
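
A rough IR sketch, assuming opaque-pointer syntax and an illustrative vector width; the exact mangled intrinsic name and operands may differ from the patch's tests:

  define <4 x i32> @src(ptr %p) {
    %p0    = insertelement <4 x ptr> poison, ptr %p, i64 0
    %splat = shufflevector <4 x ptr> %p0, <4 x ptr> poison, <4 x i32> zeroinitializer
    %g     = call <4 x i32> @llvm.masked.gather.v4i32.v4p0(<4 x ptr> %splat, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i32> poison)
    ret <4 x i32> %g
  }
  ; every lane loads the same address, so this may become a scalar load plus splat:
  ;   %v  = load i32, ptr %p, align 4
  ;   %v0 = insertelement <4 x i32> poison, i32 %v, i64 0
  ;   %g  = shufflevector <4 x i32> %v0, <4 x i32> poison, <4 x i32> zeroinitializer
  declare <4 x i32> @llvm.masked.gather.v4i32.v4p0(<4 x ptr>, i32, <4 x i1>, <4 x i32>)
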
Paweł Bylica 1d7604fdce [InstCombine] Simplify bswap -> shift
Simplify bswap(x) to shl(x) or lshr(x) if x has exactly one
"active byte", i.e. all active bits are contained within a single
byte of x.

https://alive2.llvm.org/ce/z/nvbbU5
https://alive2.llvm.org/ce/z/KiiL3J

Reviewed By: spatel, craig.topper, lebedev.ri

Differential Revision: https://reviews.llvm.org/D117680
2022-01-21 01:25:30 +01:00
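
A minimal sketch of the single-active-byte case (the way the active byte is established here, via an 'and', is illustrative):

  define i32 @src(i32 %x) {
    %lo = and i32 %x, 255                     ; only the low byte can be nonzero
    %r  = call i32 @llvm.bswap.i32(i32 %lo)
    ret i32 %r
  }
  ; the single active byte just moves to the top, so this may become:
  ;   %r = shl i32 %lo, 24
  declare i32 @llvm.bswap.i32(i32)
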
Nikita Popov c63a3175c2 [AttrBuilder] Remove ctor accepting AttributeList and Index
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should extract the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
2022-01-15 22:39:31 +01:00
Caroline Concatto 8e5a5b619d [InstCombine] Fold for masked scatters to a uniform address
When the masked scatter intrinsic does a uniform store from a source vector
to a single destination address and the mask is all ones, this patch replaces
the masked scatter with an extract of the last lane of the source vector and
a scalar store to the destination address.
This patch also folds the case where the scattered value is a splat: as long
as the mask cannot be all zeros, it folds to a scalar store of the value
through the destination pointer.

Differential Revision: https://reviews.llvm.org/D115724
2022-01-14 09:44:34 +00:00
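
A rough IR sketch of the uniform-address, all-ones-mask case (opaque-pointer syntax and vector width are illustrative):

  define void @src(<4 x i32> %v, ptr %p) {
    %p0    = insertelement <4 x ptr> poison, ptr %p, i64 0
    %splat = shufflevector <4 x ptr> %p0, <4 x ptr> poison, <4 x i32> zeroinitializer
    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> %v, <4 x ptr> %splat, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
    ret void
  }
  ; every lane stores to the same address, so only the last lane is visible:
  ;   %last = extractelement <4 x i32> %v, i64 3
  ;   store i32 %last, ptr %p, align 4
  declare void @llvm.masked.scatter.v4i32.v4p0(<4 x i32>, <4 x ptr>, i32, <4 x i1>)
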
Philip Reames 5265ac72c6 [MemoryBuiltin] Add an API for checking if an unused allocation can be removed [NFC]
Not all allocation functions are removable if unused.  An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++.  An example of a removable one - at least according to historical practice - would be malloc.
2022-01-10 15:43:39 -08:00
Bryce Wilson fb936595fa [MemoryBuiltins] Add field for alignment argument [NFC]
There are a few places where the alignment argument for AlignedAllocLike functions was previously hardcoded. This patch adds a getAllocAlignment function and a change to the MemoryBuiltin table to allow alignment arguments to be found generically.

This will shortly allow alignment inference on operator new's with align_val params and an extension to Attributor's HeapToStack.  The former will follow shortly - I split Bryce's patch for the purpose of having the large change be NFC.  The latter will be reviewed separately.

Differential Revision: https://reviews.llvm.org/D116851 (part 1 of 2)
2022-01-10 09:15:20 -08:00
Philip Reames f4c54683d6 [instcombine] Infer alignment for aligned_alloc with potentially zero size
This change removes a previous restriction where we had to prove the allocation performed by aligned_alloc was non-zero in size before using the align parameter to annotate the result.  I believe this was conservatism around the C11 specification of this routine which allowed UB when size was not a multiple of alignment, but if so, it was a partial one at best.  (ex: align 32, size 16  was equally UB, but not restricted)  The spec has since been clarified to require nullptr return, not UB.

A nullptr - the documented return for this function on failure for all cases after the UB mentioned above was removed - is trivially aligned for any power of two.  This isn't totally new behavior even for this transform; we'd previously annotate potentially failing allocs (e.g. huge sizes), meaning we were putting align on potentially null pointers anyway.  This change simply does the same for all failure modes.
2022-01-10 08:48:49 -08:00
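
A hedged sketch of the annotation this now permits even when the size may be zero (function names and constants are illustrative; whether aligned_alloc is recognized depends on target library info):

  define ptr @f(i64 %n) {
    ; %n may be zero or not a multiple of 32; on failure the documented
    ; return is null, which is trivially aligned for any power of two
    %p = call ptr @aligned_alloc(i64 32, i64 %n)
    ret ptr %p
  }
  declare ptr @aligned_alloc(i64, i64)
  ; the call may now be annotated as:
  ;   %p = call align 32 ptr @aligned_alloc(i64 32, i64 %n)
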
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using an std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies an extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Philip Reames 2cafbcb560 [instcombine] Key deref vs deref_or_null annotation of allocation sites off nonnull attribute
Goal is to remove use of isOpNewLike.  I looked at a couple approaches to this, and this turned out to be the cheapest one.  Just letting deref_or_null be generated causes a bunch of test diffs, and I couldn't convince myself there wasn't a real regression somewhere.  A generic instcombine to convert deref_or_null + nonnull to deref is annoyingly complicated since you have to mix facts from the callsite and declaration while manipulating only existing call site attributes.  It just wasn't worth the code complexity.

Note that the change in new-delete-itanium.ll is a real regression.  If you have a callsite which overrides the builtin status of a nobuiltin declaration, *and* you don't put the appropriate attributes on that callsite, you may lose the deref fact.  I decided this didn't matter; if anyone disagrees, you can add this case to the generic non-null inference.
2022-01-08 10:33:54 -08:00
Philip Reames dcbc91f40c [instcombine] Delete duplicate object size logic
InstCombine appears to duplicate the allocation size logic used inside getObjectSize when figuring out which attributes are safe to place on the callsite. We can use the existing utility function instead.

The test change is correct. With aligned_alloc, a zero alignment is required to return nullptr. As such, deref_or_null is a correct attribute to use.

Differential Revision: https://reviews.llvm.org/D116816
2022-01-07 10:32:26 -08:00
Nick Desaulniers 95ba0e4563 [SimplifyLibCalls] propagate tail flags on CallInsts
I noticed we weren't propagating tail flags on calls when
FortifiedLibCallSimplifier.optimizeCall() was replacing calls to runtime
checked calls to the non-checked routines (when safe to do so). Make
sure to check this before replacing the original calls!

Also, avoid any libcall transforms when notail/musttail is present.

PR46734
Fixes: https://github.com/llvm/llvm-project/issues/46079

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107872
2021-12-13 11:18:30 -08:00