According to IR LangRef, the FMF flag:
contract
Allow floating-point contraction (e.g. fusing a multiply followed by an
addition into a fused multiply-and-add).
reassoc
Allow reassociation transformations for floating-point instructions.
This may dramatically change results in floating-point.
My understanding is that these two flags shouldn't imply each other,
as we might have a SDNode that can be reassociated with others, but
not contractble.
eg: We may want following fmul/fad/fsub to freely reassoc, but don't
want fma being generated here.
%F = fmul reassoc double %A, %B ; <double> [#uses=1]
%G = fmul reassoc double %C, %D ; <double> [#uses=1]
%H = fadd reassoc double %F, %G ; <double> [#uses=1]
%I = fsub reassoc double %H, %E ; <double> [#uses=1]
Before https://reviews.llvm.org/D45710, `reassoc` flag actually
did not imply isContratable either.
The current implementation also only check the flag in fadd node,
ignoring fmul node, this patch update that as well.
Reviewed By: spatel, qiucf
Differential Revision: https://reviews.llvm.org/D104247
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for such definitions.
Adding explicit dso_local makes these tests align with the clang -fpic behavior
and allow the removal of the TargetMachine::shouldAssumeDSOLocal special case.
Rewrite preemption.ll to dsolocal-static.ll and dsolocal-pic.ll, and add
"PIC Level" metadata.
From LangRef, FMF contract should not enable reassociating to form
arbitrary contractions. So it should not help rearrange nodes like
(fma (fmul x, c1), c2, y) into (fma x, c1*c2, y).
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D89527
DAG combiner folds (fma a 1.0 b) into (fadd a b) but the flag isn't
propagated into new fadd. This patch fixes that.
Some code in visitFMA is redundant and such support for vector constants
is missing. Need follow-up patch to clean.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87037
On PowerPC, FNMSUB (both VSX and non-VSX version) means -(a*b-c). But
the backend used to generate these instructions regardless whether nsz
flag exists or not. If a*b-c==0, such transformation changes sign of
zero.
This patch introduces PPC specific FNMSUB ISD opcode, which may help
improving combined FMA code sequence.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D76585
A lot of tests under PowerPC are using fast flag, while fast is just
alias of 7 fast-math flags. This change makes test points clearer.
mc-instrlat.ll and sms-iterator.ll keeps unchanged since they are not
testing fast-math behavior. (one for machine combiner crash, one for
machine pipeliner bug)
Reviewed By: steven.zhang, spatel
Differential Revision: https://reviews.llvm.org/D78989
PowerPC hits an assertion due to somewhat the same reason as https://reviews.llvm.org/D70975.
Though there are already some hack, it still failed with some case, when the operand 0 is NOT
a const fp, it is another fma that with const fp. And that const fp is negated which result in multi-uses.
A better fix is to check the uses of the negated const fp. If there are already use of its negated
value, we will have benefit as no extra Node is added.
Differential revision: https://reviews.llvm.org/D75501
This is a positive combination as long as the NEG is NOT free,
as we are reducing the number of NEG from two to one.
Differential Revision: https://reviews.llvm.org/D72312