The neutral value is -0.0, not 0.0. This doesn't matter for "fast"
reductions due to nsz, but does matter for reassoc-only and seq
reductions.
Change tests to mostly use -0.0 where the neutral value was intended,
and add some additional test coverage in some places. Also update
LangRef to use the right value.
D75689 turns the faddp pattern into a shuffle with vector add.
Match this new pattern in target-specific DAG combine, rather than ISel,
because legalization (for v2f32) turns it into a bit of a mess.
- extended to cover f16, f32, f64 and i64
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:
llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
and adds functionality to auto-upgrade existing LLVM IR and bitcode.
Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D60261
llvm-svn: 363035
The scalar start/accumulator value of the fadd- and fmul reduction
should match the result type of the reduction, as well as the vector
element-type of the input vector. Although this was not explicitly
specified in the LangRef, it was taken for granted in code implementing
the reductions. The patch also fixes the LangRef by adding this
constraint.
Reviewed By: aemerson, nikic
Differential Revision: https://reviews.llvm.org/D60260
llvm-svn: 361133