llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	454256ef4f	[AMDGPU] Correct the known bits calculation for MUL_I24. I'm not entirely sure, but based on how ComputeNumSignBits handles ISD::MUL, I believe this code was miscounting the number of sign bits. As an example of an incorrect result let's say that countMinSignBits returned 1 for the left hand side and 24 for the right hand side. LHSValBits would be 23 and RHSValBits would be 0 and the sum would be 23. This would cause the code to set 9 high bits as zero/one. Now suppose the real value for the left side is 0x800000 and the right hand side is 0xffffff. The product is 0x00800000 which has 8 sign bits not 9. The number of valid bits for the left and right operands is now the number of non-sign bits + 1. If the sum of the valid bits of the left and right sides exceeds 32, then the result may overflow and we can't say anything about the sign of the result. If the sum is 32 or less then it won't overflow and we know the result has at least 1 sign bit. For the previous example, the code will now calculate the left side valid bits as 24 and the right side as 1. The sum will be 25 and the sign bits will be 32 - 25 + 1 which is 8, the correct value. Differential Revision: https://reviews.llvm.org/D116469	2022-01-14 08:54:54 -08:00
Craig Topper	fc7af123dd	[AMDGPU] Pre-commit test for D116469. NFC The multiply in this test is miscompiled to 0. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D117280	2022-01-14 08:54:38 -08:00
Jay Foad	e4e95f14f1	[LiveIntervals] Repair live intervals that gain subranges In repairIntervalsInRange, if the new instructions refer to subregs but the old instructions did not, make sure any existing live interval for the superreg is updated to have subranges. Also skip repairing any range that we have recalculated from scratch, partly for efficiency but also to avoid some cases that repairOldRegInRange can't handle. The existing test/CodeGen/AMDGPU/twoaddr-regsequence.mir provides some test coverage for this change: when TwoAddressInstructionPass converts REG_SEQUENCE into subreg copies, the live intervals will now get subranges and MachineVerifier will verify that the subranges are correct. Unfortunately MachineVerifier does not complain if the subranges are not present, so the test also passed before this patch. This patch also fixes ~800 of the ~1500 failures in the whole CodeGen lit test suite when -early-live-intervals is forced on. Differential Revision: https://reviews.llvm.org/D110328	2021-09-24 11:58:08 +01:00
Joe Nash	3ce1b9631a	[AMDGPU] Switch PostRA sched to MachineSched Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D109536 Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde	2021-09-14 15:11:27 -04:00
Simon Pilgrim	6085593c12	[AMDGPU] simplifyI24 - replace GetDemandedBits with SimplifyMultipleUseDemandedBits GetDemandedBits mostly just calls SimplifyMultipleUseDemandedBits now, but it does a very blunt constant simplification that SimplifyMultipleUseDemandedBits avoids. If we need to demand bits from constants we should handle this through ShrinkDemandedConstant/targetShrinkDemandedConstant. @arsenm confirmed that the sign extended immediates are better for code size. Differential Revision: https://reviews.llvm.org/D74857	2020-02-20 12:03:08 +00:00
Jay Foad	4f17b1784e	Fix for AMDGPU MUL_I24 known bits calculation Summary: At present, the code calculating known bits of AMDGPU MUL_I24 confuses the concepts of "non-negative number" and "positive number". In some situations, it results in incorrect code. I have a case where the optimizer replaces the result of calculating MUL_I24(-5, 0) with -8. Reviewers: foad, arsenm Reviewed By: arsenm Subscribers: foad, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Patch by Eugene Kuznetsov. Differential Revision: https://reviews.llvm.org/D70367	2019-12-16 10:25:57 +00:00
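The "non-negative" versus "positive" distinction described in 4f17b1784e can be checked with plain integer arithmetic. The sketch below is not the LLVM code. `sext24`, `wrap_i32`, and `mul_i24` are hypothetical helpers, and they assume that MUL_I24 multiplies the sign-extended low 24 bits of each operand and keeps the low 32 bits of the product as a signed value. Under that assumption, a negative operand times a non-negative operand is not always negative, because the non-negative operand may be zero.

```python
def sext24(x: int) -> int:
    """Interpret the low 24 bits of x as a signed 24-bit integer."""
    x &= 0xFFFFFF
    return x - (1 << 24) if x & 0x800000 else x


def wrap_i32(x: int) -> int:
    """Keep the low 32 bits of x and interpret them as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def mul_i24(a: int, b: int) -> int:
    """Assumed MUL_I24 semantics: signed 24-bit operands, 32-bit result."""
    return wrap_i32(sext24(a) * sext24(b))


# Zero is non-negative but not positive, so the product is 0, not a negative value.
print(mul_i24(-5, 0))  # 0
# A strictly positive operand does give a negative product.
print(mul_i24(-5, 3))  # -15
```

Marking the high bits of the result as ones is therefore only safe when the other operand is known to be strictly positive, which matches the commit's report that `MUL_I24(-5, 0)` was folded to -8.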

6 Commits
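The worked example in 454256ef4f can be reproduced numerically. The sketch below is an illustration of the arithmetic in that commit message, not the LLVM implementation. `num_sign_bits`, `old_estimate`, and `new_estimate` are hypothetical names, and the old formula (24 minus the sign-bit count per operand) is inferred from the numbers quoted in the message.

```python
def sext24(x: int) -> int:
    """Interpret the low 24 bits of x as a signed 24-bit integer."""
    x &= 0xFFFFFF
    return x - (1 << 24) if x & 0x800000 else x


def num_sign_bits(value: int, width: int) -> int:
    """Count the leading bits that equal the sign bit, including the sign bit."""
    value &= (1 << width) - 1
    sign = (value >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):
        if (value >> bit) & 1 != sign:
            break
        count += 1
    return count


def old_estimate(lhs_sign_bits: int, rhs_sign_bits: int) -> int:
    """Old rule as described in the message (assumed form): sum the non-sign bits."""
    return 32 - ((24 - lhs_sign_bits) + (24 - rhs_sign_bits))


def new_estimate(lhs_sign_bits: int, rhs_sign_bits: int):
    """New rule: valid bits are non-sign bits + 1; None means nothing is known."""
    total = (24 - lhs_sign_bits + 1) + (24 - rhs_sign_bits + 1)
    if total > 32:
        return None  # the product may overflow 32 bits
    return 32 - total + 1


lhs, rhs = 0x800000, 0xFFFFFF
lhs_sb, rhs_sb = num_sign_bits(lhs, 24), num_sign_bits(rhs, 24)  # 1 and 24
product = (sext24(lhs) * sext24(rhs)) & 0xFFFFFFFF               # 0x00800000

print(old_estimate(lhs_sb, rhs_sb))  # 9, one more than the product really has
print(new_estimate(lhs_sb, rhs_sb))  # 8
print(num_sign_bits(product, 32))    # 8
```

The old estimate claims 9 known high bits while the actual product has only 8 sign bits. The new estimate matches the actual count for this input.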