DAGCombiner::visitStore can clear the upper bits of constants
used by stores. This leads prevents them from being recognized as
sign extended negative values making them more expensive to
materialize.
This patch uses the hasAllNBitUsers method from D107658 to make
a negative constant if none of the users care about the upper bits.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D108052
For positive constants we try shifting left to remove leading zeros
and fill the bottom bits with 1s. We then materialize that constant
shift it right.
This patch adds a new strategy to try filling the bottom bits with
zeros instead. This catches some additional cases.
The default legalization strategy is PromoteFloat which keeps
half in single precision format through multiple floating point
operations. Conversion to/from float is done at loads, stores,
bitcasts, and other places that care about the exact size being 16
bits.
This patches switches to the alternative method softPromoteHalf.
This aims to keep the type in 16-bit format between every operation.
So we promote to float and immediately round for any arithmetic
operation. This should be closer to the IR semantics since we
are rounding after each operation and not accumulating extra
precision across multiple operations. X86 is the only other
target that enables this today. See https://reviews.llvm.org/D73749
I had to update getRegisterTypeForCallingConv to force f16 to
use f32 when the F extension is enabled. This way we can still
pass it in the lower bits of an FPR for ilp32f and lp64f ABIs.
The softPromoteHalf would otherwise always give i16 as the
argument type.
Reviewed By: asb, frasercrmck
Differential Revision: https://reviews.llvm.org/D99148
Without Zfh the half type isn't legal, but it could still be
used as an argument/return in IR. Clang will not generate this today.
Previously we promoted the half value to float for arguments and
returns if the F extension is enabled but Zfh isn't. Then depending on
which ABI is enabled we would pass it in either an FPR or a GPR in
float format.
If the F extension isn't enabled, it would get passed in the lower
16 bits of a GPR in half format.
With this patch the value will always in half format and will be
in the lower bits of a GPR or FPR. This should be consistent
with where the bits are located when Zfh is enabled.
I've based this implementation off of how this is done on ARM.
I've manually nan-boxed the value to 32 bits using integer ops.
It looks like flw, fsw, fmv.s, fmv.w.x, fmf.x.w won't
canonicalize nans so should leave the value alone. I think those
are the instructions that could get used on this value.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D98670