Commit Graph

36663 Commits

Author SHA1 Message Date
Kazushi (Jam) Marukawa dd0159bd81 [VE] Add vand, vor, and vxor intrinsic instructions
Add vand, vor, and vxor intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92454
2020-12-02 22:52:54 +09:00
Jay Foad d28624a209 [AMDGPU] Stop adding an implicit def of vcc_hi for wave32
This doesn't seem to be needed for anything.

Differential Revision: https://reviews.llvm.org/D92400
2020-12-02 10:11:42 +00:00
Qiu Chaofan ffa2dce590 [PowerPC] Fix FLT_ROUNDS_ on little endian
In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register
and then GPR, and then truncated into word.

For subtargets without direct move support, it will store and then load.
The load address needs adjustment (+4) only on big-endian targets. This
patch fixes it on using generic opcodes on little-endian and subtargets
with direct-move.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D91845
2020-12-02 17:16:32 +08:00
Max Kazantsev 16bee4d368 [Test] One CodeGen test showing missing opportunity on move elimination 2020-12-02 13:16:34 +07:00
QingShan Zhang 47f784ace6 [PowerPC] Promote the i1 to i64 for SINT_TO_FP/FP_TO_SINT
i1 is the native type for PowerPC if crbits is enabled. However, we need
to promote the i1 to i64 as we didn't have the pattern for i1.

Reviewed By: Qiu Chao Fang

Differential Revision: https://reviews.llvm.org/D92067
2020-12-02 05:37:45 +00:00
Kazushi (Jam) Marukawa c1762bcf0a [VE] Add vcmp, vmax, and vmin intrinsic instructions
Add vcmp, vmax, and vmin intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92387
2020-12-02 11:16:52 +09:00
Jessica Paquette b6b0a80eb9 Fix typo in testcase runline that got there because I have very bad hands
llvm/test/CodeGen/AArch64/GlobalISel/speculative-hardening-brcond.mir had a
slash in its runline.
2020-12-01 16:57:46 -08:00
Jessica Paquette c82f002cea [AArch64][GlobalISel] Don't write to WZR in non-flag-setting G_BRCOND case
We are avoiding writing to WZR just about everywhere else.

Also update the code to use MachineIRBuilder for the sake of consistency.

We also didn't have a GlobalISel testcase for this path, so add a simple one
now.

Differential Revision: https://reviews.llvm.org/D90626
2020-12-01 16:45:37 -08:00
Jessica Paquette 6c3fa97d8a [AArch64][GlobalISel] Select Bcc when it's better than TB(N)Z
Instead of falling back to selecting TB(N)Z when we fail to select an
optimized compare against 0, select Bcc instead.

Also simplify selectCompareBranch a little while we're here, because the logic
was kind of hard to follow.

At -O0, this is a 0.1% geomean code size improvement for CTMark.

A simple example of where this can kick in is here:
https://godbolt.org/z/4rra6P

In the example above, GlobalISel currently produces a subs, cset, and tbnz.
SelectionDAG, on the other hand, just emits a compare and b.le.

Differential Revision: https://reviews.llvm.org/D92358
2020-12-01 15:45:14 -08:00
David Blaikie 615f63e149 Revert "[FastISel] Flush local value map on ever instruction" and dependent patches
This reverts commit cf1c774d6a.

This change caused several regressions in the gdb test suite - at least
a sample of which was due to line zero instructions making breakpoints
un-lined. I think they're worth investigating/understanding more (&
possibly addressing) before moving forward with this change.

Revert "[FastISel] NFC: Clean up unnecessary bookkeeping"
This reverts commit 3fd39d3694.

Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option"
This reverts commit a474657e30.

Revert "Remove static function unused after cf1c774."
This reverts commit dc35368ccf.

Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction""
This reverts commit 53a14a47ee.
2020-12-01 14:26:23 -08:00
Rahman Lavaee e0bf234930 Let .llvm_bb_addr_map section use the same unique id as its associated .text section.
Currently, `llvm_bb_addr_map` sections are generated per section names because we use
the `LinkedToSymbol` argument of getELFSection. This will cause the address map tables of functions
grouped into the same section when `-function-sections=true -unique-section-names=false` which is not
the intended behaviour. This patch lets the unique id of every `.text` section propagate to the associated
`.llvm_bb_addr_map` section.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92113
2020-12-01 09:21:00 -08:00
David Green eedf0ed63e [ARM] Mark select and selectcc of MVE vector operations as expand.
We already expand select and select_cc in codegenprepare, but they can
still be generated under some situations. Explicitly mark them as expand
to ensure they are not produced, leading to a failure to select the
nodes.

Differential Revision: https://reviews.llvm.org/D92373
2020-12-01 15:05:55 +00:00
Simon Pilgrim 1b209ff9e3 [DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111)
Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.
2020-12-01 14:25:29 +00:00
Kazushi (Jam) Marukawa 10b164d2f7 [VE] Add vmul and vdiv intrinsic instructions
Add vmul and vdiv intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92377
2020-12-01 23:03:49 +09:00
Simon Pilgrim 00f4269cef [X86] Add PR48223 usubsat test case 2020-12-01 13:57:08 +00:00
Hans Wennborg 2ca4785ac7 Remove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now 2020-12-01 13:50:49 +01:00
Simon Pilgrim 6dbd0d36a1 [DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111)
Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds.

The SSE42 test diffs are relatively benign - its avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away.

Differential Revision: https://reviews.llvm.org/D91876
2020-12-01 11:56:26 +00:00
Kazushi (Jam) Marukawa c3fe6ea22e [VE] Add vadd and vsub intrinsic instructions
Add vadd and vsub intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92332
2020-12-01 19:57:22 +09:00
David Green 09d82fa95f [AArch64] Update pass pipeline test. NFC 2020-12-01 10:40:04 +00:00
David Green 7923d71b4a [ARM] PREDICATE_CAST demanded bits
The PREDICATE_CAST node is used to model moves between MVE predicate
registers and gpr's, and eventually become a VMSR p0, rn. When moving to
a predicate only the bottom 16 bits of the sources register are
demanded. This adds a simple fold for that, allowing it to potentially
remove instructions like uxth.

Differential Revision: https://reviews.llvm.org/D92213
2020-12-01 10:32:24 +00:00
Amara Emerson 87ff156414 [AArch64][GlobalISel] Fix crash during legalization of a vector G_SELECT with scalar mask.
The lowering of vector selects needs to first splat the scalar mask into a vector
first.

This was causing a crash when building oggenc in the test suite.

Differential Revision: https://reviews.llvm.org/D91655
2020-11-30 16:37:49 -08:00
Paul Robinson a474657e30 [FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option
This option is not used for anything after #dc35368 (D91734).
2020-11-30 10:55:49 -08:00
Harald van Dijk cdac34bd47
[X86] Zero-extend pointers to i64 for x86_64
For LP64 mode, this has no effect as pointers are already 64 bits.
For ILP32 mode (x32), this extension is specified by the ABI.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D91338
2020-11-30 18:51:23 +00:00
Francesco Petrogalli f6150aa41a [SelectionDAGBuilder] Update signature of `getRegsAndSizes()`.
The mapping between registers and relative size has been updated to
use TypeSize to account for the size of scalable EVTs.

The patch is a NFCI, if not for the fact that with this change the
function `getUnderlyingArgRegs` does not raise a warning for implicit
conversion of `TypeSize` to `unsigned` when generating machine code
from the test added to the patch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92096
2020-11-30 17:38:51 +00:00
Kazushi (Jam) Marukawa 6834b3d6d5 [VE] Optimize prologue/epilogue instructions about GOT
Optimize prologue/epilogue instructions if a given function use GOT but
do not call other functions by eliminating FP.  Previously, we had wrong
implementations taken from other architectures.  Update regression tests
also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92313
2020-12-01 02:22:31 +09:00
Kazushi (Jam) Marukawa 6fe610535f [VE] Clean check routines of branch types
Previously, these check routines accepted non-generatble instructions.
This time, I clean them and add assert for those non-generatable
instructions.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92254
2020-12-01 02:19:37 +09:00
Craig Topper bfc4f29f46 [RISCV] Combine (GORCI (GORCI x, C2), C1) -> (GORCI x, C1|C2).
Unlike GREVI, GORCI stages can't be undone, but they are
redundant if done more than once.

Differential Revision: https://reviews.llvm.org/D92295
2020-11-30 08:42:46 -08:00
Craig Topper 76d1026b59 [RISCV] Custom legalize bswap/bitreverse to GREVI with Zbp extension to enable them to combine with other GREVI instructions
This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines.

I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW.

Differential Revision: https://reviews.llvm.org/D92253
2020-11-30 08:30:40 -08:00
Simon Pilgrim 8fcc8c3148 [X86] Add vbmi2 test coverage for vector rotations
We should be using the funnel shift instructions for vXi16 types.
2020-11-30 16:01:01 +00:00
Dmitri Gribenko 234a5297aa Add 'asserts' requiremnt to test/CodeGen/ARM/cortex-a57-misched-mla.mir
'-debug-only=machine-scheduler' only works when asserts are enabled.
2020-11-30 15:19:27 +01:00
Hans Wennborg 25d54abca5 Try harder to get rid off cortex-a57-misched-mla.s 2020-11-30 14:30:50 +01:00
Kazushi (Jam) Marukawa 686988a50f [VE] Optimize prologue/epilogue instructions
Optimize eliminate FP mechanism.  This time optimize a function which has
no call but fixed stack objects.  LLVM eliminates FP on such functions now.
Also, optimize GOT/PLT registers save/restore instructions if a given
function doesn't uses them.  In addition, remove generating mechanism of
`.cfi` instructions since those are taken from other architectures and not
inspected yet.  Update regression tests, also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92251
2020-11-30 22:22:33 +09:00
Hans Wennborg 273641fedc Try to fix bots after 112b3cb by removing cortex-a57-misched-mla.s 2020-11-30 14:15:56 +01:00
Kazushi (Jam) Marukawa 44a679eaa4 [VE] Change the behaviour of truncate
Change the way to truncate i64 to i32 in I64 registers.  VE assumed
sext values previously.  Change it to zext values this time to make
it match to the LLVM behaviour.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92226
2020-11-30 22:12:45 +09:00
Simon Pilgrim 83d79ca5bf [X86][AVX512] Only lower to VPALIGNR if we have BWI (PR48322) 2020-11-30 10:51:24 +00:00
Evgeny Leviant 129523588f Fix test case 2020-11-30 12:35:28 +03:00
David Green d5387c044d [ARM] Constant predicate tests. NFC 2020-11-30 09:18:25 +00:00
Evgeny Leviant 112b3cb6ba [TableGen][SchedModels] Fix read/write variant substitution
Patch fixes multiple issues related to expansion of variant sched reads and
writes.

Differential revision: https://reviews.llvm.org/D90844
2020-11-30 11:55:55 +03:00
Juneyoung Lee 53040a968d [ConstantFold] Fold more operations to poison
This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
2020-11-29 21:19:48 +09:00
Nikita Popov e987fbdd85 [BasicAA] Generalize recursive phi alias analysis
For recursive phis, we skip the recursive operands and check that
the remaining operands are NoAlias with an unknown size. Currently,
this is limited to inbounds GEPs with positive offsets, to
guarantee that the recursion only ever increases the pointer.

Make this more general by only requiring that the underlying object
of the phi operand is the phi itself, i.e. it it based on itself in
some way. To compensate, we need to use a beforeOrAfterPointer()
location size, as we no longer have the guarantee that the pointer
is strictly increasing.

This allows us to handle some additional cases like negative geps,
geps with dynamic offsets or geps that aren't inbounds.

Differential Revision: https://reviews.llvm.org/D91914
2020-11-29 10:25:23 +01:00
Harald van Dijk 78a30c830b
[X86] Add -verify-machineinstrs to pic.ll
This ensures that failures show up in regular builds, rather than only
when expensive checks are enabled.

Differential Revision: https://reviews.llvm.org/D91339
2020-11-28 17:54:44 +00:00
Craig Topper 6ee22ca6ce [RISCV] Add tests for existing (rotr (bswap X), (i32 16))->grevi pattern for RV32. Extend same pattern to rotl and GREVIW.
Not sure why bswap was treated specially. This also applies to bitreverse
or generic grevi. We can improve this in future patches.
For now I just wanted to get the consistency and the test coverage
as I plan to make some other changes around bswap.
2020-11-27 18:09:01 -08:00
Kazushi (Jam) Marukawa 3bd78b7cc0 [VE] Optimize emitSPAdjustment function
Optimize emitSPAdjustment function to generate as small as possible
instructions to adjust SP.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92174
2020-11-28 08:06:31 +09:00
Craig Topper 29807a023c [RISCV] Remove stale FIXMEs from a couple test cases. NFC 2020-11-27 12:07:27 -08:00
Craig Topper fa0f01a3c0 [RISCV][LegalizeTypes] Teach type legalizer that it can promote UMIN/UMAX using SExtPromotedInteger if that's better for the target.
If Sext is cheaper than Zext for a target, we can use that to promote the operands of UMIN/UMAX. Using sext just makes numbers with the sign bit set even larger when treated as an unsigned number and it has no effect on number without the sign bit set. So the relative order doesn't change. This is similar to what we already do for promoting SETCC.

This is helpful on RISCV where i32 arguments are sign extended on RV64 and many instructions are able to produce results with 33 sign bits.

Differential Revision: https://reviews.llvm.org/D92128
2020-11-27 11:37:25 -08:00
Krzysztof Parzyszek b7bde0e4f3 [Hexagon] Improve check for HVX types
Allow non-simple types, like <17 x i32> to be treated as HVX vector
types.
2020-11-27 13:33:10 -06:00
Simon Pilgrim 2ad2e91016 [X86] Add AVX2/AVX512 test coverage in sat-add.ll
Shows the failure to combine to uaddsat
2020-11-27 16:11:02 +00:00
Simon Pilgrim c4628460b7 [Hexagon] Add HVX support for ISD::SMAX/SMIN/UMAX/UMIN instead of custom dag patterns
Followup to D92112 now that I've learnt about HVX type splitting.

This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876.

Differential Revision: https://reviews.llvm.org/D92169
2020-11-27 15:46:11 +00:00
Simon Pilgrim 969918e177 [DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal
If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion.

Allows us to move the x86-specific lowering to the generic expansion code.

Differential Revision: https://reviews.llvm.org/D92183
2020-11-27 11:18:58 +00:00
Simon Pilgrim 4b9c2bbdb6 [X86] Regenerate extract-store.ll tests
Rename prefix from X32 to X86 as we typically use X32 for gnux32 triples
2020-11-27 11:18:57 +00:00