If both the character and string are known, but the length
potentially isn't, we can optimize the memchr() call to a select
of either the known position of the character or null.
Split off from https://reviews.llvm.org/D122836.
Handle the simple constant char case before the bitmask optimization.
This will allow extending the code to handle a non-constant size
argument in a followup change.
Split out from https://reviews.llvm.org/D122836.
If the memchr() size is 1, then we can convert the call into a
single-byte comparison. This works even if both the string and the
character are unknown.
Split off from https://reviews.llvm.org/D122836.
We should always be calculating a byte-wise difference here.
Previously this calculated the pointer difference while taking
the pointer element type into account, which is incorrect.
I noticed we weren't propagating tail flags on calls when
FortifiedLibCallSimplifier.optimizeCall() was replacing calls to runtime
checked calls to the non-checked routines (when safe to do so). Make
sure to check this before replacing the original calls!
Also, avoid any libcall transforms when notail/musttail is present.
PR46734
Fixes: https://github.com/llvm/llvm-project/issues/46079
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107872
In TargetLibraryInfoImpl::isValidProtoForLibFunc we no longer
need the IsSizeTTy lambda function and the SizeTTy object. Instead
we just follow the regular structure of checking for integer types
given an exepected number of bits.
We could try harder to screen out libcalls by
function signature (and that would be a much larger
change than for sprintf alone), but that might make
the transition to type-less pointers more difficult.
https://llvm.org/PR51200
This fixes a bug at LibCallSimplifier::optimizeMemChr which does the following transformation:
```
// memchr("\r\n", C, 2) != nullptr -> (1 << C & ((1 << '\r') | (1 << '\n')))
// != 0
// after bounds check.
```
As written above, a bounds check on C (whether it is less than integer bitwidth) is done before doing `1 << C` otherwise 1 << C will overflow.
If the bounds check is false, the result of (1 << C & ...) must not be used at all, otherwise the result of shift (which is poison) will contaminate the whole results.
A correct way to encode this is `select i1 (bounds check), (1 << C & ...), false` because select does not allow the unused operand to contaminate the result.
However, this optimization was introducing `and (bounds check), (1 << C & ...)` which cannot do that.
The bug was found from compilation of this C++ code: https://reviews.llvm.org/rG2fd3037ac615#1007197
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104901
- When emitting libcalls, do not only pass the calling convention from the
function prototype but also the attributes.
- Do not pass attributes from e.g. libc memcpy to llvm.memcpy.
Review: Reid Kleckner, Eli Friedman, Arthur Eubanks
Differential Revision: https://reviews.llvm.org/D103992
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
When rewriting
powf(2.0, itofp(x)) -> ldexpf(1.0, x)
exp2(sitofp(x)) -> ldexp(1.0, sext(x))
exp2(uitofp(x)) -> ldexp(1.0, zext(x))
the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with 16 bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.
The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.
Differential Revision: https://reviews.llvm.org/D99438
Before this change LLVM cannot simplify printf in following cases:
printf("%s", "") --> noop
printf("%s", str"\n") --> puts(str)
From the other hand GCC can perform such transformations for many years:
https://godbolt.org/z/7nnqbedfe
Differential Revision: https://reviews.llvm.org/D100724
Fix for PR49984
This was discovered during Attributor testing.
Memset was always created with alignment of 1
and in case when strncpy alignment was changed
it triggered an assertion in the AttrBuilder.
Memset will now be created with appropriate alignment.
Differential Revision: https://reviews.llvm.org/D100875
D24453 enabled libcalls simplication for ARM PCS. This may cause
caller/callee calling conventions mismatch in some situations such as
LTO. This patch makes instcombine aware that the compatible calling
conventions differences are benign (not emitting undef idom).
Differential Revision: https://reviews.llvm.org/D99773
See: https://bugs.llvm.org/show_bug.cgi?id=47613
There was an extra sqrt call because shrinking emitted a new powf and at the same time optimizePow replaces the previous pow with sqrt and as the result we have two instructions that will be in worklist of InstCombie despite the fact that %powf is not used by anyone (it is alive because of errno).
As the result we have two instructions:
%powf = call fast float @powf(float %x, float 5.000000e-01)
%sqrt = call fast double @sqrt(double %dx)
%powf will be converted to %sqrtf on a later iteration.
As a quick fix for that I moved shrinking to the end of optimizePow so that pow is replaced with sqrt at first that allows not to emit a new shrunk powf.
Differential Revision: https://reviews.llvm.org/D98235
I think we can use here same logic as for nonnull.
strlen(X) - X must be noundef => valid pointer.
for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones)
Reviewed By: jdoerfert, aqjune
Differential Revision: https://reviews.llvm.org/D95122
If the call result is unused, we should let it get DCEd rather
than replacing it. Also, don't try to replace an existing sincos
with another one (unless it's as part of combining sin and cos).
This avoids an infinite combine loop if the calls are not DCEd
as expected, which can happen with D94106 and lack of willreturn
annotation in hand-crafted IR.
In https://llvm.org/PR48810 , we are crashing while trying to
propagate attributes from mempcpy (returns void*) to memcpy
(returns nothing - void).
We can avoid the crash by removing known incompatible
attributes for the void return type.
I'm not sure if this goes far enough (should we just drop all
attributes since this isn't the same function?). We also need
to audit other transforms in LibCallSimplifier to make sure
there are no other cases that have the same problem.
Differential Revision: https://reviews.llvm.org/D95088
A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow`
call need not.
This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.
The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
- to add `ninf`, retaining the intended library call,
- to use the intrinsic, retaining the use of `select`, or
- to expect the replacement to not occur.
The following is tested:
- The pow intrinsic folds to a `select` instruction to
handle -//infinity//.
- The pow library call folds, with `ninf`, to `sqrt` without the
`select` instruction associated with handling -//infinity//.
- The pow library call does not fold to `sqrt` without `ninf`.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87877
The current code for handling pow(x, y) where y is an integer plus 0.5
is not explicitly guarded against attempting to transform the case where
abs(y) is exactly 0.5.
The latter case is meant to be handled by `replacePowWithSqrt`. Indeed,
if the pow(x, integer+0.5) case proceeds past a certain point, it will
hit an assertion by attempting to form pow(x, 0) using `getPow`.
This patch adds an explicit check to prevent attempting the
pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during
the review of D87877. This has the effect of retaining the shrinking of
`pow` to `powf` when the `sqrt` libcall cannot be formed.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D88066
Currently we bail out early for strlen calls with a GEP operand, if none
of the GEP specific optimizations fire. But there could be later
optimizations that still apply, which we currently miss out on.
An example is that we do not apply the following optimization
strlen(x) == 0 --> *x == 0
Unless I am missing something, there seems to be no reason for bailing
out early there.
Fixes PR47149.
Reviewed By: lebedev.ri, xbolva00
Differential Revision: https://reviews.llvm.org/D85886
This reverts commit 6dbf0cfcf7.
That commit caused failed assertions, e.g. like this:
$ cat sprintf-strcpy.c
char *ptr; void func(void) { ptr += sprintf(ptr, "%s", ""); }
$ clang -c sprintf-strcpy.c -O2 -target x86_64-linux-gnu
clang: ../lib/IR/Value.cpp:473: void llvm::Value::doRAUW(llvm::Value*,
llvm::Value::ReplaceMetadataUses): Assertion `New->getType() ==
getType() && "replaceAllUses of value with new value of different
type!"' failed.
Transformation creates big strings for big C values, so bail out for C > 128.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86004
Currently, a transformation like pow(2.0, x) -> exp2(x) copies the pow
attribute list verbatim and applies it to exp2. This works out fine
when the attribute list is empty, but when it isn't clang may error due
due to the mismatch.
The source function and destination don't necessarily have anything
to do with one another, attribute-wise. So it makes sense to remove
the attribute lists (this is similar to what IPO does in this
situation).
This was discovered after implementing the `noundef` param attribute.
Differential Revision: https://reviews.llvm.org/D82820
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.
Differential Revision: https://reviews.llvm.org/D75670
We already check hasNoNaNs and that x is finite and strictly positive.
That only leaves the following special cases (taken from the Linux man
page for pow):
If x is +1, the result is 1.0 (even if y is a NaN).
If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity.
If the absolute value of x is greater than 1, and y is negative infinity, the result is +0.
If the absolute value of x is less than 1, and y is positive infinity, the result is +0.
If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity.
The first case is handled elsewhere, and this transformation preserves
all the others, so there is no need to limit it to hasNoInfs.
Differential Revision: https://reviews.llvm.org/D79409
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
Differential Revision: https://reviews.llvm.org/D79413
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.
Differential Revision: https://reviews.llvm.org/D79407
optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.
Based on a patch by Tim Renouf.
Differential Revision: https://reviews.llvm.org/D68231
This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO.
C11 7.21.5.2 The fflush function
> If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above.
i.e. fopen'ed FILE* is inherently captured.
POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking
> These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the ( FILE *) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions.
After a thread fopen'ed a FILE*, when it is calling foobar() which is now replaced by foobar_unlocked(),
if another thread is concurrently calling fflush(0), the behavior is undefined.
C11 7.22.4.4 The exit function
> Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed.
The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called.
See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html
for how the replacement makes libc interceptors difficult to implement.
dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers
without synchronization. f->wpos or rpos could get outside of the buffer, thread A could do
f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently.
This can produce exploitable conditions depending on libc internals.
Revert the SimplifyLibcalls part change because the cons obviously
overweigh the pros. Even when the replacement is feasible, the benefit
is indemonstrable, more so in an application instead of an artificial
glibc benchmark. Theoretically the replacement could be beneficial when
calling getc_unlocked/putc_unlocked in a loop, but then it is better
using a blocked IO operation and the user is likely aware of that.
The function attribute inference is still useful and thus kept.
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D75933
This changes the SimplifyLibCalls utility to accept an IRBuilderBase,
which allows us to pass through the IRBuilder used by InstCombine.
This will ensure that new instructions get added to the worklist.
The annotated test-case drops from 4 to 2 InstCombine iterations thanks
to this.
To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard,
which is basically the same as the existing InsertPointGuard and
FastMathFlagsGuard, but for operand bundles. Also add a
setDefaultOperandBundles() method so these can be set outside the
constructor.
Differential Revision: https://reviews.llvm.org/D74792
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.
Differential Revision: https://reviews.llvm.org/D74079
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
As discussed in PR44330:
https://bugs.llvm.org/show_bug.cgi?id=44330
...the transform from pow(X, -0.5) libcall/intrinsic to
reciprocal square root can result in small deviations from
the expected result due to differences in the pow()
implementation and/or the extra rounding step from the division.
This patch proposes to allow that difference with either the
'approximate functions' or 'reassociate' FMF:
http://llvm.org/docs/LangRef.html#fast-math-flags
In practice, this likely means that the code is compiled with
all of 'fast' (-ffast-math), but I have preserved the existing
specializations for -0.0/-INF that enable generating safe code
if those special values are allowed simultaneously with
allowing approximation/reassociation.
The question about whether a similar restriction is needed for
the non-reciprocal case -- pow(X, 0.5) -- is deferred. That
transform is allowed without FMF currently, and this patch does
not change that behavior.
Differential Revision: https://reviews.llvm.org/D71706
Summary:
This is a resubmit of D71473.
This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: aaron.ballman, courbet
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71547
Summary:
This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71473
Constructor invocations such as `APFloat(APFloat::IEEEdouble(), 0.0)`
may seem like they accept a FP (floating point) value, but the overload
they reach is actually the `integerPart` one, not a `float` or `double`
overload (which only exists when `fltSemantics` isn't passed).
This may lead to possible loss of data, by the conversion from `float`
or `double` to `integerPart`.
To prevent future mistakes, a new constructor overload, which accepts
any FP value and marked with `delete`, to prevent its usage.
Fixes PR34095.
Differential Revision: https://reviews.llvm.org/D70425
Summary:
In case of a need to distinguish different query sites for gradual commit or
debugging of PGSO. NFC.
Reviewers: davidxl
Subscribers: hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70510
Summary:
This fixes PR43081, where the transformation of `strchr(p, 0) -> p +
strlen(p)` can cause a segfault, if `-fno-builtin-strlen` is used. In
that case, `emitStrLen` returns nullptr, which CreateGEP is not designed
to handle. Also add the minimized code from the PR as a test case.
Reviewers: xbolva00, spatel, jdoerfert, efriedma
Reviewed By: efriedma
Subscribers: lebedev.ri, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70143
Add own version of the mathematical constants from the upcoming C++20 `std::numbers`.
Differential revision: https://reviews.llvm.org/D68257
llvm-svn: 374207
bcopy is still widely used mainly for network apps. Sadly, LLVM has no optimizations for bcopy, but there are some for memmove.
Since bcopy == memmove, it is profitable to transform bcopy to memmove and use current optimizations for memmove for free here.
llvm-svn: 373537
Expand the simplification of special cases of `log()` to include `log2()`
and `log10()` as well as intrinsics and more types.
Differential revision: https://reviews.llvm.org/D67199
llvm-svn: 373261
Summary:
Motivation:
- If we can fold it to strdup, we should (strndup does more things than strdup).
- Annotation mechanism. (Works for strdup well).
strdup and strndup are part of C 20 (currently posix fns), so we should optimize them.
Reviewers: efriedma, jdoerfert
Reviewed By: jdoerfert
Subscribers: lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67679
llvm-svn: 372636