Commit Graph

505 Commits

Author SHA1 Message Date
David Green 6f81903e89 [LV][SLP] Mark fptosi_sat as vectorizable
This adds fptosi_sat and fptoui_sat to the list of trivially
vectorizable functions, mainly so that the loop vectorizer can vectorize
the instruction. Marking them as trivially vectorizable also allows them
to be SLP vectorized, and Scalarized.

The signature of a fptosi_sat requires two type overrides
(@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only
take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd
to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first
operand of the intrinsic as a overloaded (but not scalar) operand.

Differential Revision: https://reviews.llvm.org/D124358
2022-05-03 09:32:34 +01:00
Nikita Popov 597946a4dd [ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr
ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)"
to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by
itself, correct, but does came with two issues:

1. It unnecessarily broadens provenance by introducing an inttoptr.
   We generally prefer not to introduce inttoptr during optimization.
2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0,
   which further folds to null. In that case provenance becomes
   incorrect. This has been observed as a real-world miscompile with
   rustc.

We should probably address that incorrect inttoptr 0 fold at some
point, but in either case we should also drop this inttoptr-introducing
fold. Instead, replace it with a fold rooted at
ptrtoint(getelementptr), which seems to cover the original
motivation for this fold (test2 in the changed file).

Differential Revision: https://reviews.llvm.org/D124677
2022-05-02 10:24:46 +02:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Nikita Popov 806450805d [ConstFold] Don't fold calls with mismatching function type
With opaque pointers, this is no longer ensured through pointer
type identity.
2022-03-11 14:09:23 +01:00
serge-sans-paille 71c3a5519d Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor:
before: 1065940348
after:  1065307662

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-03-01 18:01:54 +01:00
Serge Pavlov 6982c38cb1 [ConstantFolding] Fix folding of constrained compare intrinsics
The change fixes treatment of constrained compare intrinsics if
compared values are of vector type.

Differential revision: https://reviews.llvm.org/D110322
2022-02-27 10:19:19 +07:00
David Sherwood dc0657277f Fix warning introduced by 47eff645d8 2022-02-22 09:37:16 +00:00
David Sherwood 47eff645d8 [InstCombine] Bail out of load-store forwarding for scalable vector types
This patch fixes an invalid TypeSize->uint64_t implicit conversion in
FoldReinterpretLoadFromConst. If the size of the constant is scalable
we bail out of the optimisation for now.

Tests added here:

  Transforms/InstCombine/load-store-forward.ll

Differential Revision: https://reviews.llvm.org/D120240
2022-02-22 09:26:04 +00:00
Sanjay Patel 7cc0a29b3f [Analysis] propagate poison through add/sub saturate intrinsics
A more general enhancement needs to add tests and make sure
that intrinsics that return structs are correct. There are also
target-specific intrinsics, and I'm not sure what behavior is
expected for those.
2022-02-15 10:45:32 -05:00
Sanjay Patel 00218c188b [Analysis] propagate poison through integer min/max intrinsics
A more general enhancement needs to add tests and make sure
that intrinsics that return structs are correct. There are also
target-specific intrinsics, and I'm not sure what behavior is
expected for those.
2022-02-15 10:45:32 -05:00
Serge Pavlov d2f132f0b7 [ConstantFolding] Fold constrained compare intrinsics
The change implements constant folding of ‘llvm.experimental.constrained.fcmp’
and ‘llvm.experimental.constrained.fcmps’ intrinsics.

Differential Revision: https://reviews.llvm.org/D110322
2022-02-03 16:45:56 +07:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Sanjay Patel 2e26633af0 [IR] document and update ctlz/cttz intrinsics to optionally return poison rather than undef
The behavior in Analysis (knownbits) implements poison semantics already,
and we expect the transforms (for example, in instcombine) derived from
those semantics, so this patch changes the LangRef and remaining code to
be consistent. This is one more step in removing "undef" from LLVM.

Without this, I think https://github.com/llvm/llvm-project/issues/53330
has a legitimate complaint because that report wants to allow subsequent
code to mask off bits, and that is allowed with undef values. The clang
builtins are not actually documented anywhere AFAICT, but we might want
to add that to remove more uncertainty.

Differential Revision: https://reviews.llvm.org/D117912
2022-01-23 11:22:48 -05:00
Nikita Popov b4900296e4 [ConstantFold] Allow all float types in reinterpret load folding
Rather than hardcoding just half, float and double, allow all
floating point types.
2022-01-21 09:26:51 +01:00
Nikita Popov 6a19cb837c [ConstantFold] Support pointers in reinterpret load folding
Peculiarly, the necessary code to handle pointers (including the
check for non-integral address spaces) is already in place,
because we were already allowing vectors of pointers here, just
not plain pointers.
2022-01-21 09:13:37 +01:00
Nikita Popov 05cd9a0596 [ConstantFold] Simplify type check in reinterpret load folding (NFC)
Keep a list of allowed types, but then always construct the map
type the same way. We need an integer with the same width as the
original type.
2022-01-21 09:06:35 +01:00
Nikita Popov f104cc38f4 [ConstantFold] Don't fold load from non-byte-sized vector
Following up on 1470f94d71 (r63981173):

The result here (probably) depends on endianness. Don't bother
trying to handle this exotic case, just bail out.
2022-01-17 17:01:47 +01:00
Nikita Popov 20d9c51dc0 [ConstantFold] Check for uniform value before reinterpret load
The reinterpret load code will convert undef values into zero.
Check the uniform value case before it to produce a better result
for all-undef initializers.

However, the uniform value handling will return the uniform value
even if the access is out of bounds, while the reinterpret load
code will return undef. Add an explicit check to retain the
previous result in this case.
2022-01-14 10:18:02 +01:00
Nikita Popov aba7c3c033 [ConstantFold] Check uniform value in ConstantFoldLoadFromConst()
This case is automatically handled if ConstantFoldLoadFromConstPtr()
is used. Make sure that ConstantFoldLoadFromConst() also handles it.
2022-01-13 14:40:19 +01:00
Simon Pilgrim fd1094f318 [ConstantFolding] Clean up Intrinsics::abs undef handling
Match cttz/ctlz handling by assuming C1 == 0 if C1 != 1 - I've added an assertion as well.

Fixes static analyzer nullptr dereference warnings.
2022-01-10 17:04:03 +00:00
Roman Lebedev a5a6960d1c
[NFCI][IR] MinMaxIntrinsic: add some more helper methods, and use them 2022-01-07 13:02:11 +03:00
Nikita Popov 3dc1907d06 [ConstantFold] Use ConstantFoldLoadFromUniformValue() in more places
In particular, this also preserves undef when loading from padding,
rather than converting it to zero through a different codepath.

This is the remaining part of D115924.
2022-01-05 12:47:50 +01:00
Nikita Popov 99c6b12b92 [ConstantFolding] Unify handling of load from uniform value
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because we a) don't care about the load offset in that
case b) it bypasses casts that might not be legal generally but
do work with uniform values.

We had multiple implementations of this, with a different set of
supported values each time. This replaces two usages with a more
complete helper. Other usages will be replaced separately, because
they have larger impact.

This is part of D115924.
2022-01-05 12:30:46 +01:00
Nikita Popov 71b2c4a3cf [ConstantFolding] Remove unused ConstantFoldLoadThroughGEPConstantExpr()
This API is no longer used since bbeaf2aac6.
2022-01-04 12:37:12 +01:00
Nikita Popov 5afbfe33e7 [ConstantFold] Make icmp of gep fold offset based
We can fold an equality or unsigned icmp between base+offset1 and
base+offset2 with inbounds offsets by comparing the offsets directly.

This replaces a pair of specialized folds that tried to reason
based on the GEP structure instead. One of those folds was plain
wrong (because it does not account for negative offsets), while
the other is unnecessarily complicated and limited (e.g. it will
fail with bitcasts involved).

The disadvantage of this change is that it requires data layout,
so the fold is no longer performed by datalayout-independent
constant folding. I don't think this is a loss in practice, but
it does regress the ConstantExprFold.ll test, which checks folding
without running any passes.

Differential Revision: https://reviews.llvm.org/D116332
2022-01-03 09:41:37 +01:00
Serge Pavlov 77b923d0db [ConstantFolding] Do not remove side effect from constrained functions
According to the discussion in https://reviews.llvm.org/D110322 the code
that removes side effect from replaced function call is deleted.

Differential Revision: https://reviews.llvm.org/D115870
2021-12-22 13:45:49 +07:00
Nikita Popov 2926d6d335 [ConstantFold][GlobalOpt] Don't create x86_mmx null value
This fixes the assertion failure reported at
https://reviews.llvm.org/D114889#3198921 with a straightforward
check, until the cleaner fix in D115924 can be reapplied.
2021-12-21 09:11:41 +01:00
Kazu Hirata 500c4b68dc [llvm] Construct SmallVector with iterator ranges (NFC) 2021-12-20 23:43:24 -08:00
Nikita Popov aeb36ae0f4 Revert "[ConstantFolding] Unify handling of load from uniform value"
This reverts commit 9fd4f80e33.

This breaks SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c
in test-suite. Either the test is incorrect, or clang is generating
incorrect union initialization code. I've submitted
https://reviews.llvm.org/D115994 to fix the test, assuming my
interpretation is correct. Reverting this in the meantime as it
may take some time to resolve.
2021-12-18 20:46:52 +01:00
Nikita Popov 9fd4f80e33 [ConstantFolding] Unify handling of load from uniform value
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because we a) don't care about the load offset in that
case and b) it bypasses casts that might not be legal generally
but do work with uniform values.

We had multiple implementations of this, with a different set of
supported values each time, as well as incomplete type checks in
some cases. In particular, this fixes the assertion reported in
https://reviews.llvm.org/D114889#3198921, as well as a similar
assertion that could be triggered via constant folding.

Differential Revision: https://reviews.llvm.org/D115924
2021-12-17 17:05:06 +01:00
Nikita Popov 65bec04295 [ConstantFold] Handle same type in ConstantFoldLoadThroughBitcast
Usually the case where the types are the same ends up being handled
fine because it's legal to do a trivial bitcast to the same type.
However, this is not true for aggregate types. Short-circuit the
whole code if the types match exactly to account for this.
2021-12-10 16:39:50 +01:00
David Green ab0c5cea0b [ARM] Use v2i1 for MVE and CDE intrinsics
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.

AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.

Differential Revision: https://reviews.llvm.org/D114455
2021-12-03 15:27:58 +00:00
Peter Waller 98f08752f7 [InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware
The newly added test previously caused the compiler to fail an
assertion. It looks like a strightforward TypeSize upgrade.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D112142
2021-10-28 12:15:15 +00:00
Nikita Popov 710596a1e1 [ConstantFolding] Accept offset in ConstantFoldLoadFromConstPtr (NFCI)
As this API is now internally offset-based, we can accept a starting
offset and remove the need to create a temporary bitcast+gep
sequence to perform an offset load. The API now mirrors the
ConstantFoldLoadFromConst() API.
2021-10-23 17:59:39 +02:00
Nikita Popov c5b5b7f621 [ConstantFolding] Remove ConstantFoldLoadThroughGEPIndices() API (NFC)
The last user of this API went away in
4f5e9a2bb2.
2021-10-23 16:59:29 +02:00
Nikita Popov 3a10fe2d89 [Loads] Use more powerful constant folding API
This follows up on D111023 by exporting the generic "load value
from constant at given offset as given type" and using it in the
store to load forwarding code. We now need to make sure that the
load size is smaller than the store size, previously this was
implicitly ensured by ConstantFoldLoadThroughBitcast().

Differential Revision: https://reviews.llvm.org/D112260
2021-10-22 18:33:03 +02:00
Simon Pilgrim c288241795 [ConstantFolding] ConstantFoldScalarCall2 - early-out if getLibFunc fails. NFC. 2021-10-16 11:20:19 +01:00
Simon Pilgrim c18cf10a04 [ConstantFolding] Use getValueAPF const ref value where possible. NFC.
Don't copy the value if we can avoid it.
2021-10-16 11:20:19 +01:00
Simon Pilgrim 76ca0d67ab [ConstantFolding] ConstantFoldScalarCall1 - early-out if getLibFunc fails. NFC. 2021-10-16 11:20:18 +01:00
Nikita Popov c117d77e93 [ConstantFold] Refactor load folding
This refactors load folding to happen in two cleanly separated
steps: ConstantFoldLoadFromConstPtr() takes a pointer to load from
and decomposes it into a constant initializer base and an offset.
Then ConstantFoldLoadFromConst() loads from that initializer at
the given offset. This makes the core logic independent of having
actual GEP expressions (and those GEP expressions having certain
structure) and will allow exposing ConstantFoldLoadFromConst() as
an independent API in the future.

This is mostly only a refactoring, but it does make the folding
logic slightly more powerful.

Differential Revision: https://reviews.llvm.org/D111023
2021-10-05 18:07:57 +02:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Kazu Hirata d34cd75d89 [Analysis, CodeGen] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-03 08:22:20 -07:00
Alex Richardson 9049a1c61e [ConstantFolding] Fold ptrtoint(gep i8 null, x) -> x
I was looking at some missed optimizations in CHERI-enabled targets and
noticed that we weren't removing vtable indirection for calls via known
pointers-to-members. The underlying reason for this is that we represent
pointers-to-function-members as {i8 addrspace(200)*, i64} and generate the
constant offsets using (gep i8 null, <index>). We use a constant GEP here
since inttoptr should be avoided for CHERI capabilities. The pointer-to-member
call uses ptrtoint to extract the index, and due to this missing fold we can't
infer the actual value loaded from the vtable.
This is the initial constant folding change for this pattern, I will add
an InstCombine fold as a follow-up.

We could fold all inbounds GEP to null (and therefore the ptrtoint to
zero) since zero is the only valid offset for an inbounds GEP. If the
offset is not zero, that GEP is poison and therefore returning 0 is valid
(https://alive2.llvm.org/ce/z/Gzb5iH). However, Clang currently generates
inbounds GEPs on NULL for hand-written offsetof() expressions, so this
could lead to miscompilations.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110245
2021-09-28 17:57:36 +01:00
Nikita Popov dd0226561e [IR] Add helper to convert offset to GEP indices
We implement logic to convert a byte offset into a sequence of GEP
indices for that offset in a number of places. This patch adds a
DataLayout::getGEPIndicesForOffset() method, which implements the
core logic. I've updated SROA, ConstantFolding and InstCombine to
use it, and there's a few more places where it looks relevant.

Differential Revision: https://reviews.llvm.org/D110043
2021-09-20 20:18:16 +02:00
Nikita Popov 58db5f6e95 [ConstFold] Support opaque pointers in constexpr GEPs
Support opaque pointers in SymbolicallyEvaluateGEP() by using the
value type of a GlobalValue base or falling back to i8 if there
isn't one. We don't unconditionally generate i8 GEPs here because
that would lose inrange attribues, and because some optimizations
on globals currently rely on GEP types (e.g. the globals SROA
mentioned in the comment).

Differential Revision: https://reviews.llvm.org/D109297
2021-09-07 20:50:29 +02:00
Roman Lebedev 3f1f08f0ed
Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore i take it upon myself
to revert the tree back to last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00
Arthur Eubanks 3f4d00bc3b [NFC] More get/removeAttribute() cleanup 2021-08-17 21:05:41 -07:00
Serge Pavlov 4c4093e6e3 Introduce intrinsic llvm.isnan
This is recommit of the patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in call of getFastMathFlags (base type should be FPMathOperator
but not Instruction). The original commit message is duplicated below:

    Clang has builtin function '__builtin_isnan', which implements C
    library function 'isnan'. This function now is implemented entirely in
    clang codegen, which expands the function into set of IR operations.
    There are three mechanisms by which the expansion can be made.

    * The most common mechanism is using an unordered comparison made by
      instruction 'fcmp uno'. This simple solution is target-independent
      and works well in most cases. It however is not suitable if floating
      point exceptions are tracked. Corresponding IEEE 754 operation and C
      function must never raise FP exception, even if the argument is a
      signaling NaN. Compare instructions usually does not have such
      property, they raise 'invalid' exception in such case. So this
      mechanism is unsuitable when exception behavior is strict. In
      particular it could result in unexpected trapping if argument is SNaN.

    * Another solution was implemented in https://reviews.llvm.org/D95948.
      It is used in the cases when raising FP exceptions by 'isnan' is not
      allowed. This solution implements 'isnan' using integer operations.
      It solves the problem of exceptions, but offers one solution for all
      targets, however some can do the check in more efficient way.

    * Solution implemented by https://reviews.llvm.org/D96568 introduced a
      hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
      specific code into IR. Now only SystemZ implements this hook and it
      generates a call to target specific intrinsic function.

    Although these mechanisms allow to implement 'isnan' with enough
    efficiency, expanding 'isnan' in clang has drawbacks:

    * The operation 'isnan' is hidden behind generic integer operations or
      target-specific intrinsics. It complicates analysis and can prevent
      some optimizations.

    * IR can be created by tools other than clang, in this case treatment
      of 'isnan' has to be duplicated in that tool.

    Another issue with the current implementation of 'isnan' comes from the
    use of options '-ffast-math' or '-fno-honor-nans'. If such option is
    specified, 'fcmp uno' may be optimized to 'false'. It is valid
    optimization in general, but it results in 'isnan' always returning
    'false'. For example, in some libc++ implementations the following code
    returns 'false':

        std::isnan(std::numeric_limits<float>::quiet_NaN())

    The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
    operands are never NaNs. This assumption however should not be applied
    to the functions that check FP number properties, including 'isnan'. If
    such function returns expected result instead of actually making
    checks, it becomes useless in many cases. The option '-ffast-math' is
    often used for performance critical code, as it can speed up execution
    by the expense of manual treatment of corner cases. If 'isnan' returns
    assumed result, a user cannot use it in the manual treatment of NaNs
    and has to invent replacements, like making the check using integer
    operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
    which also expresses the opinion, that limitations imposed by
    '-ffast-math' should be applied only to 'math' functions but not to
    'tests'.

    To overcome these drawbacks, this change introduces a new IR intrinsic
    function 'llvm.isnan', which realizes the check as specified by IEEE-754
    and C standards in target-agnostic way. During IR transformations it
    does not undergo undesirable optimizations. It reaches instruction
    selection, where is lowered in target-dependent way. The lowering can
    vary depending on options like '-ffast-math' or '-ffp-model' so the
    resulting code satisfies requested semantics.

    Differential Revision: https://reviews.llvm.org/D104854
2021-08-06 14:32:27 +07:00
Serge Pavlov 0c28a7c990 Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebcc.
Several errors were reported mainly test-suite execution time. Reverted
for investigation.
2021-08-04 17:18:15 +07:00
Serge Pavlov 16ff91ebcc Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.

* The most common mechanism is using an unordered comparison made by
  instruction 'fcmp uno'. This simple solution is target-independent
  and works well in most cases. It however is not suitable if floating
  point exceptions are tracked. Corresponding IEEE 754 operation and C
  function must never raise FP exception, even if the argument is a
  signaling NaN. Compare instructions usually does not have such
  property, they raise 'invalid' exception in such case. So this
  mechanism is unsuitable when exception behavior is strict. In
  particular it could result in unexpected trapping if argument is SNaN.

* Another solution was implemented in https://reviews.llvm.org/D95948.
  It is used in the cases when raising FP exceptions by 'isnan' is not
  allowed. This solution implements 'isnan' using integer operations.
  It solves the problem of exceptions, but offers one solution for all
  targets, however some can do the check in more efficient way.

* Solution implemented by https://reviews.llvm.org/D96568 introduced a
  hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
  specific code into IR. Now only SystemZ implements this hook and it
  generates a call to target specific intrinsic function.

Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:

* The operation 'isnan' is hidden behind generic integer operations or
  target-specific intrinsics. It complicates analysis and can prevent
  some optimizations.

* IR can be created by tools other than clang, in this case treatment
  of 'isnan' has to be duplicated in that tool.

Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':

    std::isnan(std::numeric_limits<float>::quiet_NaN())

The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.

To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.

Differential Revision: https://reviews.llvm.org/D104854
2021-08-04 15:27:49 +07:00