Commit Graph

25 Commits

Author SHA1 Message Date
Ulrich Weigand 0f0a8b7784 [SystemZ] Add support for new cpu architecture - arch13
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Assembler/disassembler support for new instructions.
- CodeGen for new instructions, including new LLVM intrinsics.
- Scheduler description for the new processor.
- Detection of arch13 as host processor.

Note: No currently available Z system supports the arch13
architecture.  Once new systems become available, the
official system name will be added as supported -march name.

llvm-svn: 365932
2019-07-12 18:13:16 +00:00
Jonas Paulsson 19871f848b [CodeMetrics] Don't let extends of i1 be free.
getUserCost() currently returns TCC_Free for any extend of a compare (i1)
result. It seems this is only true in a limited number of cases where for
example two compares are chained. Even in those types of cases it seems
unlikely that they are generally free, while they may be in some cases.

This patch therefore removes this special handling of cast of i1. No tests
are failing because of this.

If some target want the old behavior, it could override getUserCost().

Review: Hal Finkel, Chandler Carruth, Evgeny Astigeevich, Simon Pilgrim,
        Ulrich Weigand
https://reviews.llvm.org/D54742/new/

llvm-svn: 360970
2019-05-17 01:26:35 +00:00
Jonas Paulsson 8ae0f88b13 [SystemZ::TTI] Return zero cost for ICmp that becomes Load And Test.
A loaded value with multiple users compared with 0 will become a load and
test single instruction. The load is not folded in this case (multiple
users), but the compare instruction is eliminated.

This patch returns 0 cost for the icmp in these cases.

Review: Ulrich Weigand
https://reviews.llvm.org/D55111

llvm-svn: 348141
2018-12-03 14:30:18 +00:00
Jonas Paulsson b1d014883c [SystemZ::TTI] i8/i16 operands extension costs revisited
Three minor changes to these extra costs:

* For ICmp instructions, instead of adding 2 all the time for extending each
  operand, this is only done if that operand is neither a load or an
  immediate.

* The operands extension costs for divides removed, because we now use a high
  cost already for the divide (20).

* The costs for lhsr/ashr extra costs removed as this did not seem useful.

Review: Ulrich Weigand
https://reviews.llvm.org/D55053

llvm-svn: 347961
2018-11-30 07:09:34 +00:00
Jonas Paulsson 06acb3a236 [SystemZ::TTI] Improve cost for compare of i64 with extended i32 load
CGF/CLGF compares an i64 register with a sign/zero extended loaded i32 value
in memory.

This patch makes such a load considered foldable and so gets a 0 cost.

Review: Ulrich Weigand
https://reviews.llvm.org/D54944

llvm-svn: 347735
2018-11-28 08:58:27 +00:00
Jonas Paulsson d6b7aca911 [SystemZ::TTI] Improve costs for i16 add, sub and mul against memory.
AH, SH and MH costs are already covered in the cases where LHS is 32 bits and
RHS is 16 bits of memory sign-extended to i32.

As these instructions are also used when LHS is i16, this patch recognizes
that the loads will get folded then as well.

Review: Ulrich Weigand
https://reviews.llvm.org/D54940

llvm-svn: 347734
2018-11-28 08:31:50 +00:00
Jonas Paulsson 011a503f25 [SystemZ::TTI] Improved cost values for comparison against memory.
Single instructions exist for i8 and i16 comparisons of memory against a
small immediate.

This patch makes sure that if the load in these cases has a single user (the
ICmp), it gets a 0 cost (folded), and also that the ICmp gets a cost of 1.

Review: Ulrich Weigand
https://reviews.llvm.org/D54897

llvm-svn: 347733
2018-11-28 08:08:05 +00:00
Jonas Paulsson 5da8e432b9 [SystemZ::TTI] Return zero cost for scalar load/store connected with a bswap.
Since byte-swapping loads and stores are supported, a 'load -> bswap' or
'bswap -> store' sequence should have the cost of one.

Review: Ulrich Weigand
https://reviews.llvm.org/D54870

llvm-svn: 347732
2018-11-28 07:52:34 +00:00
Jonas Paulsson 96782c2c0b [SystemZTTIImpl] Give correct cost values for vector bswap intrinsics.
Implement getIntrinsicInstrCost() and return costs reflecting that bswap can
be done with a vperm per vector register.

Review: Ulrich Weigand
https://reviews.llvm.org/D54789

llvm-svn: 347445
2018-11-22 07:17:29 +00:00
Jonas Paulsson 5cea85dd59 [SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversions
Improve getCastInstrCost() by respecting the different types of Src and Dst
for vector integer <-> fp conversions.

This means that extracting from integer becomes more expensive (by the
extraction penalty), and the extraction from fp becomes cheaper (no longer
has a false extraction penalty).

Review: Ulrich Weigand
https://reviews.llvm.org/D54423

llvm-svn: 346663
2018-11-12 15:32:27 +00:00
Jonas Paulsson cced2a2775 [SystemZ::TTI] Improve cost handling of uint/sint to fp conversions.
Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load.

Since the load already does the extension, there is no extra cost (previously
returned 2).

Review: Ulrich Weigand
https://reviews.llvm.org/D54028

llvm-svn: 346009
2018-11-02 17:53:31 +00:00
Jonas Paulsson 6749c24f40 [SystemZ::TTI] Recognize the higher cost of scalar i1 -> fp conversion
Scalar i1 to fp conversions are done with a branch sequence, so it should
have a higher cost.

Review: Ulrich Weigand
https://reviews.llvm.org/D53924

llvm-svn: 345818
2018-11-01 09:05:32 +00:00
Jonas Paulsson f15a53bc81 [SystemZ::TTI] Accurate costs for i1->double vector conversions
This factors out a new method getBoolVecToIntConversionCost() containing the
code for vector sext/zext of i1, in order to reuse it for i1 to double vector
conversions.

Review: Ulrich Weigand
https://reviews.llvm.org/D53923

llvm-svn: 345817
2018-11-01 09:01:51 +00:00
Jonas Paulsson af8e036c29 [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.
Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.

Review: Ulrich Weigand
https://reviews.llvm.org/D53791

llvm-svn: 345596
2018-10-30 13:41:03 +00:00
Jonas Paulsson b7caa809e1 [SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.
The SystemZ backend can do arithmetic of memory by loading and then extending
one of the operands. Similarly, a load + truncate can be folded into an
operand.

This patch improves the SystemZ TTI cost function to recognize this.

Review: Ulrich Weigand
https://reviews.llvm.org/D52692

llvm-svn: 345327
2018-10-25 22:28:25 +00:00
Jonas Paulsson 4645711a8d [SystemZ] Improve handling and cost estimates of vector integer div/rem
Enable the DAG optimization that converts vector div/rem with constants into
multiply+shifts sequences by expanding them early. This is needed since
ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be
available to BuildSDIV after legalization.

Better cost values for these instructions based on how they will be
implemented (a constant divisor is cheaper).

Review: Ulrich Weigand
https://reviews.llvm.org/D53196

llvm-svn: 345321
2018-10-25 21:47:22 +00:00
Jonas Paulsson bf66f38705 [SystemZ] Temporarily disable high VFs with integer div/rem.
Until mischeduler is clever enough to avoid spilling in a vectorized loop
with many (scalar) DLRs it is better to avoid high vectorization factors (8
and above).

llvm-svn: 344129
2018-10-10 09:30:29 +00:00
Jonas Paulsson 2c8b33770c [SystemZ] Take better care when computing needed vector registers in TTI.
A new function getNumVectorRegs() is better to use for the number of needed
vector registers instead of getNumberOfParts(). This is to make sure that the
number of vector registers (and typically operations) required for a vector
type is accurate.

getNumberOfParts() which was previously used works by splitting the vector
type until it is legal gives incorrect results for types with a non
power of two number of elements (rare).

A new static function getScalarSizeInBits() that also checks for a pointer
type and returns 64U for it since otherwise it gets a value of 0). Used in a
few places where Ty may be pointer.

Review: Ulrich Weigand
llvm-svn: 344115
2018-10-10 07:36:27 +00:00
Jonas Paulsson 77df2f2f38 [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM
After recent improvements which makes better use of LOC instead of IPM, the
TTI cost functions also needs to be updated to reflect this.

This involves sext, zext and xor of i1.

The tests were updated so that for z13 the new costs are expected, while the
old costs are still checked for on zEC12.

Review: Ulrich Weigand
https://reviews.llvm.org/D51339

llvm-svn: 342207
2018-09-14 06:46:55 +00:00
Ulrich Weigand 33435c4c9c [SystemZ] Add support for IBM z14 processor (2/3)
This adds support for the new 32-bit vector float instructions of z14.
This includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions, including new LLVM intrinsics.
- Scheduler description support for the instructions.
- Update to the vector cost function calculations.

In general, CodeGen support for the new v4f32 instructions closely
matches support for the existing v2f64 instructions.

llvm-svn: 308195
2017-07-17 17:42:48 +00:00
Jonas Paulsson 8722ade770 [SystemZ] Modelling of costs of divisions with a constant power of 2.
Such divisions will eventually be implemented with shifts which should
be reflected in the cost function.

Review: Ulrich Weigand
llvm-svn: 303254
2017-05-17 12:46:26 +00:00
Renato Golin ab85113a93 [SystemZ] Fix target specific tests
llvm-svn: 300078
2017-04-12 17:14:46 +00:00
Jonas Paulsson 9b4875434e [SystemZ] Updated test fp-cast.ll
This did not get included in the previous commit for SystemZ cost functions.

llvm-svn: 300053
2017-04-12 12:11:41 +00:00
Jonas Paulsson fccc7d66c3 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Jonas Paulsson facc4c5c7b [BasicTTIImpl] Bugfix in getIntrinsicInstrCost()
Don't call getScalarizationOverhead(RetTy, true, false) if RetTy is void type.

Review: Hal Finkel
https://reviews.llvm.org/D31024

llvm-svn: 297954
2017-03-16 14:05:34 +00:00