Commit Graph

32064 Commits

Author SHA1 Message Date
Craig Topper 1bf4bbc492 [LegalizeTypes][RISCV][WebAssembly] Expand ABS in PromoteIntRes_ABS if it will expand to sra+xor+sub later.
If we promote the ABS and then Expand in LegalizeDAG, then both the
sra and the xor will have their inputs sign extended. This generates
extra code on RISCV which lacks an i8 or i16 sign extend instructon.
If we expand during type legalization, then only the sra will get its
input sign extended. RISCV is able to combine this with the sra by
doing a shift left followed by an sra.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D121664
2022-03-15 08:27:39 -07:00
Craig Topper ad94dfb9a0 [DAGCombiner][RISCV] Adjust (aext (and (trunc x), cst)) -> (and x, cst) to sext cst based on target preference
RISCV strong prefers i32 values be sign extended to i64. This combine
was always zero extending the constant using APInt methods.

This adjusts the code so that it calls getNode using ISD::ANY_EXTEND instead.
getNode will call TLI.isSExtCheaperThanZExt to decide how to handle
the constant.

Tests were copied from D121598 where I noticed that we were creating
constants that were hard to materialize.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D121650
2022-03-15 08:26:47 -07:00
Simon Pilgrim 7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Fangrui Song 252bc2b9f5 [MachineLICM] Simplify code and avoid adding nullptr values to ParentMap. NFC 2022-03-15 01:24:01 -07:00
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Amara Emerson 8cbf18cb04 [GlobalISel] Fix store merging incorrectly merging volatile stores.
The existing volatile checks only handle aliasing hazards between stores,
but that isn't enough since by that point volatile stores may have already
been added to the current candidate group.
2022-03-14 13:48:51 -07:00
Kazu Hirata 9286786e87 [CodeGen] Remove an unused variable introduced in D121128 2022-03-14 11:41:04 -07:00
Mircea Trofin 294eca35a0 [regalloc] Remove -consider-local-interval-cost
Discussed extensively on D98232. The functionality introduced in D35816
never worked correctly. In D98232, it was fixed, but, as it was
introducing a large compile-time regression, and the value of the
original patch was called into doubt, we disabled it by default
everywhere. A year later, it appears that caused no grief, so it seems
safe to remove the disabled code.

This should be accompanied by re-opening bug 26810.

Differential Revision: https://reviews.llvm.org/D121128
2022-03-14 10:49:16 -07:00
Sanjay Patel c2592c374e [SDAG] simplify bitwise logic with repeated operand
We do not have general reassociation here (and probably
do not need it), but I noticed these were missing in
patches/tests motivated by D111530, so we can at
least handle the simplest patterns.

The VE test diff looks correct, but we miss that
pattern in IR currently:
https://alive2.llvm.org/ce/z/u66_PM
2022-03-13 11:12:30 -04:00
Wenlei He 4f320ca4ba [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie
`DIE::getUnitDie` looks up parent DIE until compile unit or type unit is found. However for skeleton CU with debug fission, we would have DW_TAG_skeleton_unit instead of DW_TAG_compile_unit as top level DIE.

This change fixes the look up so we can get DW_TAG_skeleton_unit as UnitDie for skeleton CU.

Differential Revision: https://reviews.llvm.org/D120610
2022-03-12 13:27:42 -08:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Yuanfang Chen d538ad53c3 [JMCInstrument] infer proper path style based on debug info
By default, the path style is decided by the host. This patch makes JMC
uses the path style used by the SP directory. This makes JMC output
host-independent.

Fixes: https://github.com/llvm/llvm-project/issues/54219

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D121236
2022-03-10 10:50:44 -08:00
Lorenzo Albano 28cfa764c2 [VP] Strided loads/stores
This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114884
2022-03-10 18:46:54 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Xiang1 Zhang c31014322c TLS loads opimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
2022-03-10 09:29:06 +08:00
Stanislav Mekhanoshin 0be6fd44f3 [SDAG] Use MMO flags in MemSDNode folding
SDNodes with different target flags may now be folded together
rightfully resulting in the assertion in the refineAlignment.
Folding nodes with different target flags may result in the
wrong load instructions produced at least on the AMDGPU.

Fixes: SWDEV-326805

Differential Revision: https://reviews.llvm.org/D121335
2022-03-09 14:25:22 -08:00
Sanjay Patel 341623653d [SDAG] match rotate pattern with extra 'or' operation
This is another fold generalized from D111530.
We can find a common source for a rotate operation hidden inside an 'or':
https://alive2.llvm.org/ce/z/9pV8hn

Deciding when this is profitable vs. a funnel-shift is tricky, but this
does not show any regressions: if a target has a rotate but it does not
have a funnel-shift, then try to form the rotate here. That is why we
don't have x86 test diffs for the scalar tests that are duplicated from
AArch64 ( 74a65e3834 ) - shld/shrd are available. That also makes it
difficult to show vector diffs - the only case where I found a diff was
on x86 AVX512 or XOP with i64 elements.

There's an additional check for a legal type to avoid a problem seen
with x86-32 where we form a 64-bit rotate but then it gets split
inefficiently. We might avoid that by adding more rotate folds, but
I didn't check to see what is missing on that path.

This gets most of the motivating patterns for AArch64 / ARM that are in
D111530.

We still need a couple of enhancements to setcc pattern matching with
rotate/funnel-shift to get the rest.

Differential Revision: https://reviews.llvm.org/D120933
2022-03-09 13:19:00 -05:00
Thomas Preud'homme 67c14d5c69 [MachinePipeliner] Fix isPseduo typo. 2022-03-09 15:26:39 +00:00
Tom Stellard fb616c9b31 SafeStack: Re-enable SafeStack coloring optimization
This was disabled in 2acea2786b as a
work-around for Issue #31491.  I've reduced the test case from that bug
and confirmed that it is now fixed.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D120866
2022-03-08 15:10:41 -08:00
Craig Topper 29511ec7da [LegalizeTypes][VP] Add widening and splitting support for VP_FMA.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120854
2022-03-08 09:59:59 -08:00
Craig Topper c392b9924e [LegalizeTypes][VP] Add splitting and widening support for VP_FNEG.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120785
2022-03-08 09:59:34 -08:00
Fraser Cormack 17310f3d19 [SelectionDAG][NFC] Address a few clang-tidy warnings
Fix a couple of else-after-return warnings and some unnecessary
parentheses.
2022-03-08 16:22:26 +00:00
Yuanfang Chen eddd94c27d Reland "[clang][debug] port clang-cl /JMC flag to ELF"
This relands commit 7313474319.

It failed on Windows/Mac because `-fjmc` is only checked for ELF targets.
Check the flag unconditionally instead and issue a warning for non-ELF targets.
2022-03-07 21:55:41 -08:00
Yuanfang Chen f46fa4de4a Revert "[clang][debug] port clang-cl /JMC flag to ELF"
This reverts commit 7313474319.

Break bots:
http://45.33.8.238/win/54551/step_7.txt
http://45.33.8.238/macm1/29590/step_7.txt
2022-03-07 12:40:43 -08:00
Craig Topper 8e132c5c1d [LegalizeTypes][ARM][X86] Change ExpandIntRes_ABS to use sra+xor+sub.
Previously we used sra+add+xor if ADDCARRY is supported. This changes
to sra+xor+sub is SUBCARRY is available.

This is consistent with the recent change to the default expansion
in LegalizeDAG.

Differential Revision: https://reviews.llvm.org/D121039
2022-03-07 11:28:32 -08:00
Yuanfang Chen 7313474319 [clang][debug] port clang-cl /JMC flag to ELF
The motivation is to enable the MSVC-style JMC instrumentation usable by a ELF-based
debugger. Since there is no prior experience implementing JMC feature for ELF-based
debugger, it might be better to just reuse existing MSVC-style JMC instrumentation.
For debuggers that support both ELF&COFF (like lldb), the JMC implementation might
be shared between ELF&COFF. If this is found to inadequate, it is pretty low-cost
switching to alternatives.

Implementation:
- The '-fjmc' is already a driver and cc1 flag. Wire it up for ELF in the driver.
- Refactor the JMC instrumentation pass a little bit.
- The ELF handling is different from MSVC in two places:
  * the flag section name is ".just.my.code" instead of ".msvcjmc"
  * the way default function is provided: MSVC uses /alternatename; ELF uses weak function.

Based on D118428.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D119910
2022-03-07 10:16:24 -08:00
David Green 4388f4f776 [DAG] Don't convert undef to 0 when creating buildvector
When inserting undef into buildvectors created from shuffles of
buildvectors, we convert elements to the largest needed type. This had
the effect of converting undef into 0, which isn't needed as the
buildvector implicitly truncates and trunc(zext(undef)) == undef.

Differential Revision: https://reviews.llvm.org/D121002
2022-03-06 18:35:34 +00:00
Sanjay Patel f4b53972ce [SDAG] fold bitwise logic with shifted operands
This extends acb96ffd14 to 'and' and 'xor' opcodes.

Copying from that message:

LOGIC (LOGIC (SH X0, Y), Z), (SH X1, Y) --> LOGIC (SH (LOGIC X0, X1), Y), Z

https://alive2.llvm.org/ce/z/QmR9rR

This is a reassociation + factoring fold. The common shift operation is moved
after a bitwise logic op on 2 input operands.
We get simpler cases of these patterns in IR, but I suspect we would miss all
of these exact tests in IR too. We also handle the simpler form of this plus
several other folds in DAGCombiner::hoistLogicOpWithSameOpcodeHands().
2022-03-05 11:14:45 -05:00
Jeremy Morse 0e96d95d13 [DebugInfo][InstrRef] Accept register-reads after isel in any block
When lowering LLVM-IR to instruction referencing stuff, if a value is
defined by a COPY, we try and follow the register definitions back to where
the value was defined, and build an instruction reference to that
instruction. In a few scenarios (such as arguments), this isn't possible.
I added some assertions to catch cases that weren't explicitly whitelisted.

Over the course of a few months, several more scenarios have cropped up,
the lastest is the llvm.read_register intrinsic, which lets LLVM-IR read an
arbitary register at any point. In the face of this, there's little point
in validating whether debug-info reads a register in an expected scenario.
Thus: this patch just deletes those assertions, and adds a regression test
to check that something is done with the llvm.read_register intrinsic.

Fixes #54190

Differential Revision: https://reviews.llvm.org/D121001
2022-03-04 17:01:12 +00:00
Paul Walker 42b4a6227e [DAGCombine] Prevent illegal ISD::SPLAT_VECTOR operations post legalisation.
When triggered during operation legalisation the affected combine
generates a splat_vector that when custom lowered for SVE fixed
length code generation, results in the original precombine sequence
and thus we enter a legalisation/combine hang.

NOTE: The patch contains no tests because I observed this issue
only when combined with other work that might never become public.
The current way AArch64 lowers ISD::SPLAT_VECTOR meant a specific
test was not possible so I'm hoping the DAGCombiner fix can be seen
as obvious. The AArch64ISelLowering change is requirted to maintain
existing code quality.

Differential Revision: https://reviews.llvm.org/D120735
2022-03-04 11:54:03 +00:00
Maksim Panchenko 7e570308f2 [NFC] Fix typos
Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D120859
2022-03-03 13:26:39 -08:00
Vasileios Porpodas 6f9640d6a3 [RegAlloc] Add a complexity limit in growRegion() to cap compilation time.
growRegion() does not scale in code with BBs with a very large number of edges.
In such code growRegion() becomes a compile-time bottleneck, consuming 60% of
the total compilation time.
This patch adds a limit to the complexity of growRegion() by incrementing a counter
in each iteration. We bail out once the limit is reached.

Differential Revision: https://reviews.llvm.org/D120752
2022-03-03 11:31:07 -08:00
Paul Robinson 7b85f0f32f [PS4] isPS4 and isPS4CPU are not meaningfully different 2022-03-03 11:36:59 -05:00
Sanjay Patel e9302bf7ef [SDAG] try harder to remove a rotate from X == 0
https://alive2.llvm.org/ce/z/mJP7XP

This can be viewed as expanding the compare into and/or-of-compares:
https://alive2.llvm.org/ce/z/bkZYWE
followed by reduction of each compare.

This could be extended in several ways:
1. There's a (X & Y) == -1 sibling.
2. We can recurse through more than 1 'or'.
3. The fold could be generalized beyond rotates - any operation that
   only changes the order of bits (bswap, bitreverse).

This is a transform noted in D111530.
2022-03-03 09:25:46 -05:00
Sanjay Patel c33dbc2a2d [SDAG] refactor foldSetCCWithRotate; NFC
There are more potential optimizations to make here,
so rearrange to make it easier to append those.
2022-03-02 16:42:05 -05:00
Craig Topper ab7a7cc1dd Revert "[LegalizeTypes][VP] Add splitting and widening support for VP_FNEG."
This reverts commit ac93f95861.

Committed by accident.
2022-03-02 10:00:22 -08:00
Craig Topper 324c0a7206 [SelectionDAG][RISCV] Emit a canonical sign bit test from ExpandIntRes_ABS.
Instead of emitting 0 > Hi, emit Hi < 0. If Hi needs to be expanded again
this will allow the special case for sign bit tests in ExpandIntOp_SETCC
to trigger.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120761
2022-03-02 09:47:26 -08:00
Craig Topper ac93f95861 [LegalizeTypes][VP] Add splitting and widening support for VP_FNEG.
Differential Revision: https://reviews.llvm.org/D120785
2022-03-02 09:47:05 -08:00
Daniel McIntosh d636b76eca [CodeGen] Use AdjustStackOffset for Callee Saved Registers in PEI::calculateFrameObjectOffsets
Also, changes how the CSR loop is indexed, which should avoid bugs like the one fixed by rG4a57bb5a3b74bdad9b0518009a7d7ac7ca2ac650

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D120668
2022-03-02 11:41:12 -05:00
Nikita Popov 6fde043951 [MachineSink] Disable if there are any irreducible cycles
This is an alternative to D120330, which disables MachineSink for
functions with irreducible cycles entirely. This avoids both the
correctness problem, and ensures we don't perform non-profitable
sinks into cycles. At the same time, it may also disable
profitable sinks in the same function. This can be made more
precise by using MachineCycleInfo in the future.

Fixes https://github.com/llvm/llvm-project/issues/53990.

Differential Revision: https://reviews.llvm.org/D120800
2022-03-02 16:57:29 +01:00
Simon Pilgrim 5cce97d61e [DAG] isSplatValue - improve ISD::VECTOR_SHUFFLE splat detection
Currently we only check for splat shuffles, this extends it to see if the source operand is a splat across the demanded elts based upon the shuffle mask
2022-03-02 15:32:24 +00:00
spupyrev bcdc047731 speeding up ext-tsp for huge instances
Differential Revision: https://reviews.llvm.org/D120780
2022-03-02 07:17:48 -08:00
Simon Pilgrim df0a2b4f30 [DAG] SelectionDAG::isSplatValue - add initial BITCAST handling
This patch adds support for recognising vector splats by peeking through bitcasts to vectors with smaller element types - if all the offset subelements are splats then the bitcasted vector is a splat as well.

We don't have great coverage for isSplatValue so I've made this pretty specific to the use case I'm trying to fix - regressions in some vXi64 vector shift by splat cases that 32-bit x86 doesn't recognise because the shift amount buildvector has been type legalised to v2Xi32.

We can add further support (floats, bitcast from larger element types, undef elements) when we have actual test coverage.

Differential Revision: https://reviews.llvm.org/D120553
2022-03-02 11:25:51 +00:00
Xiang1 Zhang 65588a0776 Revert "TLS loads opimization (hoist)"
Revert for more reviews

This reverts commit 30e612ebdf.
2022-03-02 14:10:11 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Xiang1 Zhang 30e612ebdf TLS loads opimization (hoist)
Reviewed By: Wang Pheobe, Topper Craig

Differential Revision: https://reviews.llvm.org/D120000
2022-03-02 10:37:24 +08:00
Craig Topper 8787726609 [LegalizeTypes] Remove incomplete StrictFP support from SplitVecRes_UnaryOp. NFC
There is no handling of Chain operands in this function so it can't
work. There's a separate splitting function for all strict fp nodes.
2022-03-01 15:43:57 -08:00
Zequan Wu 5c9e20d7d0 [PDB] Add char8_t type
Differential Revision: https://reviews.llvm.org/D120690
2022-03-01 13:39:51 -08:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00