Commit Graph

14 Commits

Author SHA1 Message Date
Amy Kwan ba627a32e1 [PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests
This patch includes the following updates to the load/store refactoring effort introduced in D93370:
 - Update various VSX patterns that use to "force" an XForm, to instead just XForm.
   This allows the ability for the patterns to compute the most optimal addressing
   mode (and to produce a DForm instruction when possible)
- Update pattern and test case for the LXVD2X/STXVD2X intrinsics
- Update LIT test cases that use to use the XForm instruction to use the DForm instruction

Differential Revision: https://reviews.llvm.org/D95115
2021-07-16 09:28:48 -05:00
Qiu Chaofan b820339752 [PowerPC] Support f128 under VSX
This patch is the last one in backend to support fp128 type in
pre-POWER9 subtargets with VSX, removing temporary option and updating
remaining tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92374
2021-04-20 15:49:52 +08:00
Esme-Yi 29eb3dcfe6 [PowerPC] Materialize i64 constants by enumerated patterns.
Summary: Some constants can be handled with less instructions than our current results. And it seems our original approach is not very easy to extend. Therefore this patch proposes to materialize all 64-bit constants by enumerated patterns.
I traversed almost all constants to verified the functionality of these pattens. A traversed comparison of the number of instructions used by the original method and the new method has also been completed, where no degradation was caused by this patch. This patch also passed Bootstrap test and SPEC test.
Improvements of this patch are shown in llvm/test/CodeGen/PowerPC/constants-i64.ll

Reviewed By: steven.zhang, stefanp

Differential Revision: https://reviews.llvm.org/D92089
2020-12-21 05:21:07 +00:00
QingShan Zhang 2b84784a25 [NFC][Test] Add test coverage for IEEE Long Double on Power8 2020-11-16 03:45:51 +00:00
Qiu Chaofan d14e51806b [PowerPC] Skip IEEE 128-bit FP type in FastISel
Vector types, quadword integers and f128 currently cannot be handled in
FastISel. We did not skip f128 type in lowering arguments, which causes
a crash. This patch will fix it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D90206
2020-11-03 11:17:11 +08:00
QingShan Zhang 8b6674e64f [NFC][Test] Update the test with update_llc_test_checks.py 2020-10-09 02:26:03 +00:00
Jay Foad 62fd7f767c [MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics
tryLatency compares two sched candidates. For the top zone it prefers
the one with lesser depth, but only if that depth is greater than the
total latency of the instructions we've already scheduled -- otherwise
its latency would be hidden and there would be no stall.

Unfortunately it only tests the depth of one of the candidates. This can
lead to situations where the TopDepthReduce heuristic does not kick in,
but a lower priority heuristic chooses the other candidate, whose depth
*is* greater than the already scheduled latency, which causes a stall.

The fix is to apply the heuristic if the depth of *either* candidate is
greater than the already scheduled latency.

All this also applies to the BotHeightReduce heuristic in the bottom
zone.

Differential Revision: https://reviews.llvm.org/D72392
2020-07-17 11:02:13 +01:00
Lei Huang 90b1a710ae [PowerPC] Enable default support of quad precision operations
Summary: Remove option guarding support of quad precision operations.

Reviewers: nemanjai, #powerpc, steven.zhang

Reviewed By: nemanjai, #powerpc, steven.zhang

Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83437
2020-07-10 13:27:48 -05:00
Jinsong Ji 7aafdef627 [MachineScheduler] checkResourceLimit boundary condition update
When we call checkResourceLimit in bumpCycle or bumpNode, and we
know the resource count has just reached the limit (the equations
 are equal). We should return true to mark that we are resource
limited for next schedule, or else we might continue to schedule
in favor of latency for 1 more schedule and create a schedule that
 actually overbook the resource.

When we call checkResourceLimit to estimate the resource limite before
scheduling, we don't need to return true even if the equations are
equal, as it shouldn't limit the schedule for it .

Differential Revision: https://reviews.llvm.org/D62345

llvm-svn: 362805
2019-06-07 14:54:47 +00:00
QingShan Zhang f24ec7bdd0 [Power9] Enable the Out-of-Order scheduling model for P9 hw
When switched to the MI scheduler for P9, the hardware is modeled as out of order.
However, inside the MI Scheduler algorithm, we still use the in-order scheduling model
as the MicroOpBufferSize isn't set. The MI scheduler take it as the hw cannot buffer
the op. So, only when all the available instructions issued, the pending instruction
could be scheduled. That is not true for our P9 hw in fact.

This patch is trying to enable the Out-of-Order scheduling model. The buffer size 44 is
picked from the P9 hw spec, and the perf test indicate that, its value won't hurt the cpu2017.

With this patch, there are 3 specs improved over 3% and 1 spec deg over 3%. The detail is as follows:

x264_r: +6.95%
cactuBSSN_r: +6.94%
lbm_r: +4.11%
xz_r: -3.85%

And the GEOMEAN for all the C/C++ spec in spec2017 is about 0.18% improved. 

Reviewer: Nemanjai
Differential Revision: https://reviews.llvm.org/D55810

llvm-svn: 350285
2019-01-03 05:04:18 +00:00
Stefan Pintilie f384606799 [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9
This patch will address using the xscpsgndp instruction to copy floating point
scalar registers instead of the xxlor (specifically XXLORf) instruction that is
currently used. Additionally, this patch of utilizing xscpsgndp will apply to
P9, while pre-P9 will still use xxlor.

Patch by amyk

Differential Revision: https://reviews.llvm.org/D50004

llvm-svn: 340643
2018-08-24 20:00:24 +00:00
Stefan Pintilie 94259ba13a [PowerPC] [NFC] Update __float128 tests
Add the two options -ppc-vsr-nums-as-vr and -ppc-asm-full-reg-names to
the __float128 tests. Then modify the tests as required.

llvm-svn: 336940
2018-07-12 20:18:57 +00:00
Lei Huang 66e22c21c3 [Power9] Optimize codgen for conversions of int to float128
Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

llvm-svn: 336316
2018-07-05 07:46:01 +00:00
Lei Huang a26f3be454 [Power9] Implement float128 parameter passing and return values
This patch enable parameter passing and return by value for float128 types.
Passing aggregate/union which contain float128 members will be submitted in
subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

llvm-svn: 336306
2018-07-05 04:10:15 +00:00