Commit Graph

30883 Commits

Author SHA1 Message Date
Matt Arsenault fae05692a3 CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
2021-06-30 16:54:13 -04:00
Jon Roelofs a642872476 [GISel] Support llvm.memcpy.inline
Differential revision: https://reviews.llvm.org/D105072
2021-06-30 12:39:05 -07:00
Jeremy Morse 4955544162 [LiveDebugValues][InstrRef][1/2] Recover more clobbered variable locations
In various circumstances, when we clobber a register there may be
alternative locations that the value is live in. The classic example would
be a value loaded from the stack, and then clobbered: the value is still
available on the stack. InstrRefBasedLDV was coping with this at block
starts where it's forced to pick a location, however it wasn't searching
for alternative locations when values were clobbered.

This patch notifies the "Transfer Tracker" object when clobbers occur, and
it's able to find alternatives and issue DBG_VALUEs for that location. See:
the added test.

Differential Revision: https://reviews.llvm.org/D88405
2021-06-30 16:56:25 +01:00
Bradley Smith 002911503f [TargetLowering][AArch64][SVE] Take into account accessed type when clamping address
When clamping the index for a memory access to a stacked vector we must
take into account the entire type being accessed, not just assume that
we are accessing only a single element.

Differential Revision: https://reviews.llvm.org/D105016
2021-06-30 13:30:18 +01:00
David Blaikie 632e15e766 Conditionalize function only used in an assert to address -Wunused-function 2021-06-29 16:39:59 -07:00
Matt Arsenault 990278d026 CodeGen: Store LLT instead of uint64_t in MachineMemOperand
GlobalISel is relying on regular MachineMemOperands to track all of
the memory properties of accesses. Just the raw byte size is
insufficent to disambiguate all situations. For example, if we need to
split an unaligned extending load, we need to know the number of bits
in the original source value and can't infer it from the result
type. This is also a problem for extending vector loads.

This does decrease the maximum representable size from the full
uint64_t bytes to a maximum of 16-bits. No in tree testcases hit this,
other than places using UINT64_MAX for unknown sizes. This may be an
issue for G_MEMCPY and co., although they can just use unknown size
for large static sizes. This also has potential for backend abuse by
relying on the type when it really shouldn't be relevant after
selection.

This does not include the necessary MIR printer/parser changes to
represent this.
2021-06-29 17:38:51 -04:00
Matt Arsenault 49fa6abf74 Revert "GlobalISel: Use MMO helper for getting the size in bits"
This reverts commit dc98adfb44.

This should still be done, but this is currently causing some commit
ordering issues.
2021-06-29 17:38:51 -04:00
Craig Topper 9132299836 [LegalizeTypes][VE] Don't Expand BITREVERSE/BSWAP during type legalization promotion if they will be promoted for NVT in op legalization.
We were trying to expand these if they were going to be expanded
in op legalization so that we generated the minimum number of
operations. We failed to take into account that NVT could be
promoted to another legal type in op legalization.

Hoping this fixes the issue on the VE target reported as a follow
up to D96681. The check line changes were taken from before
1e46b6f401 so this patch does
appear to improve some cases that had previously regressed.
2021-06-29 11:00:11 -07:00
Jeremy Morse e63b18bc84 Catch an extremely obvious memory leak, thanks asan
https://lab.llvm.org/buildbot/#/builders/5/builds/9208

(dbg-phis-merging-in-ldv.mir and dbg-phis-with-loops.mir in the asan
 check stage)
2021-06-29 15:47:17 +01:00
Jeremy Morse 010108bb2c [DebugInstrRef][3/3] Follow DBG_PHI instructions through LiveDebugValues
This patch reads machine value numbers from DBG_PHI instructions (marking
where SSA PHIs used to be), and matches them up with DBG_INSTR_REF
instructions that refer to them. Essentially they are two separate parts of
a DBG_VALUE: the place to read the value (register and program position),
and where the variable is assigned that value.

Sometimes these DBG_PHIs can be duplicated, usually by tail duplication.
This corresponds to the SSA structure of the program being destroyed, and
the original PHI being split. When this happens: run LLVMs standard
SSAUpdater utility, to work out what values should appear in which blocks.
The majority of this patch is boilerplate to make use of SSAUpdater.

If there are any additional PHIs on the path between multiple DBG_PHIs and
their using DBG_INSTR_REF, their existance is validated, just in case a
value gets clobbered along the way (see dbg-phis-with-loops.mir for
several examples).

Differential Revision: https://reviews.llvm.org/D86814
2021-06-29 14:45:13 +01:00
Michael Liao e818eface8 [MIRParser] Add machine metadata.
- Add standalone metadata parsing support so that machine metadata nodes
  could be populated before and accessed during MIR is parsed.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D103282
2021-06-28 22:29:36 -04:00
Scott Linder 8cd35ad854 [DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:

* DIExpression can currently be parsed from IR or read from bitcode
  as `distinct`, but this property is silently dropped when printing
  to IR. This causes accepted IR to fail to round-trip. As DIExpression
  appears inline at each use in the canonical form of IR, it cannot
  actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
  restricted to only appearing in contexts where there is no syntax for
  `distinct`, but for consistency it is treated equivalently to
  DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
  along with adding general support for the inverse restriction I went
  ahead and described this in Metadata.def and updated the parser to be
  general. Future nodes which have this restriction can share this
  support.

The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

    !named = !{!0}
    !0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

    !named = !{!DIExpression()}

This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

    !named = !{!0}
    ; error: 'distinct' not allowed for !DIExpression()
    !0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.

Reviewed By: StephenTozer, t-tye

Differential Revision: https://reviews.llvm.org/D104827
2021-06-28 21:20:04 +00:00
Melanie Blower 931e95687d [llvm][clang][fpenv] Create new intrinsic llvm.arith.fence to control FP optimization at expression level
This intrinsic blocks floating point transformations by the optimizer.

Author: Pengfei

Reviewed By: LuoYuanke, Andy Kaylor, Craig Topper, kpn

Differential Revision: https://reviews.llvm.org/D99675
2021-06-28 12:26:52 -04:00
Sander de Smalen 0e09d18c6a Reland [GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize.
This patch relands https://reviews.llvm.org/D104454, but fixes some failing
builds on Mac OS which apparently has a different definition for size_t,
that caused 'ambiguous operator overload' for the implicit conversion
of TypeSize to a scalar value.

This reverts commit b732e6c9a8.
2021-06-28 15:24:27 +01:00
Ahsan Saghir 31ef15e044 Teach peephole optimizer to not emit sub-register defs
Peephole optimizer should not be introducing sub-reg definitions
as they are illegal in machine SSA phase. This patch modifies
the optimizer to not emit sub-register definitions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D103408
2021-06-28 09:24:07 -05:00
Brendon Cahoon f9f5d41545 [AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX
Adds legalizer, register bank select, and instruction
select support for G_SBFX and G_UBFX. These opcodes generate
scalar or vector ALU bitfield extract instructions for
AMDGPU. The instructions allow both constant or register
values for the offset and width operands.

The 32-bit scalar version is expanded to a sequence that
combines the offset and width into a single register.

There are no 64-bit vgpr bitfield extract instructions, so the
operations are expanded to a sequence of instructions that
implement the operation. If the width is a constant,
then the 32-bit bitfield extract instructions are used.

Moved the AArch64 specific code for creating G_SBFX to
CombinerHelper.cpp so that it can be used by other targets.
Only bitfield extracts with constant offset and width values
are handled currently.

Differential Revision: https://reviews.llvm.org/D100149
2021-06-28 09:06:44 -04:00
David Blaikie 1b112c80a6 PR37255: DebugInfo: LTO with -g inlined into -gmlt combined with Split DWARF without CU cross-references
A combination of features ^ that lead to a mismatch of expectations
about how a subprogram definition DIE would be produced with/without a
declaration when taking full -g debug info and inlining it into a -gmlt
CU - specifically when using Split DWARF that doesn't support cross-CU
references, so we have to put the -g debug info into the -gmlt CU, which
gets confusing about which mode is respected.

This patch comes down on respecting the CU the debug info is emitted
into, rather than preserving the full debug info when it's emitted into
the gmlt CU.
2021-06-27 14:40:38 -07:00
David Green 2887f14639 [ISel] Port AArch64 SABD and UABD to DAGCombine
This ports the AArch64 SABD and USBD over to DAG Combine, where they can be
used by more backends (notably MVE in a follow-up patch). The matching code
has changed very little, just to handle legal operations and types
differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing
a ubds/abdu which is zexted to the original type.

Differential Revision: https://reviews.llvm.org/D91937
2021-06-26 19:34:16 +01:00
David Green b8c8bb0769 [DAG] Fold neg(splat(neg(x)) -> splat(x)
This add as a fold of sub(0, splat(sub(0, x))) -> splat(x). This can
come up in the lowering of right shifts under AArch64, where we generate
a shift left of a negated number.

Differential Revision: https://reviews.llvm.org/D103755
2021-06-25 19:53:29 +01:00
Hendrik Greving e15e1417b9 [ModuloSchedule] Pass loop block explicitly to kernel rewriter.
This change is NFC upstream. We pass in the loop's block to the kernel
rewriter explicitly, instead of assuming it's the loop's top block. This
change is made for downstream targets where this assumption doesn't hold.

Differential Revision: https://reviews.llvm.org/D104811
2021-06-25 09:51:22 -07:00
Sander de Smalen b732e6c9a8 Revert "[GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize."
This patch seems to be causing build errors, reverting it for now.

This reverts commit aeab9d9570.
2021-06-25 17:37:16 +01:00
Sander de Smalen aeab9d9570 [GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize.
To reflect that the size may be scalable, a TypeSize is returned
instead of an unsigned. In places where the result is used,
it currently relies on an implicit cast of TypeSize -> uint64_t,
which asserts that the type is not scalable.

This patch is NFC for fixed-width vectors.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104454
2021-06-25 17:06:50 +01:00
Sander de Smalen c9acd2f32e [GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount.
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104453
2021-06-25 15:54:00 +01:00
Sander de Smalen 968980ef08 [GlobalISel] NFC: Change LLT::scalarOrVector to take ElementCount.
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104452
2021-06-25 11:26:16 +01:00
Martin Storsjö 42f74e8249 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Craig Topper 03f9e04bc3 [TargetLowering][ARM] Don't alter opaque constants in TargetLowering::ShrinkDemandedConstant.
We don't constant fold based on demanded bits elsewhere in
SimplifyDemandedBits, so I don't think we should shrink them either.

The affected ARM test changes because a constant become non-opaque
and eventually enabled some constant folding. This no longer happens.
I checked and InstCombine is able to simplify this test. I'm not sure exactly
what it was trying to test.

Reviewed By: lebedev.ri, dmgreen

Differential Revision: https://reviews.llvm.org/D104832
2021-06-24 10:09:36 -07:00
Sander de Smalen d5e14ba88c [GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will need
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451
2021-06-24 11:26:12 +01:00
Stephen Tozer c72705678c Partial Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This is a partial reapply of the original commit and the followup commit
that were previously reverted; this reapply also includes a small fix
for a potential source of non-determinism, but also has a small change
to turn off variadic debug value salvaging, to ensure that any future
revert/reapply steps to disable and renable this feature do not risk
causing conflicts.

Differential Revision: https://reviews.llvm.org/D91722

This reverts commit 386b66b2fc.
2021-06-24 09:46:38 +01:00
Carl Ritson 6b0f98d442 [ValueTypes] Define MVTs for v3i64/v3f64 to complement v6i32/v6f32
Having type symmetry with these is somewhat necessary when implementing support for 192-bit values.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D104621
2021-06-24 12:41:22 +09:00
modimo a7b62699c8 [NFC] [DwarfEHPrepare] Add additional stats for EH
Stats added:

1. NumCleanupLandingPadsUnreachable: how many cleanup landing pads were optimized as unreachable
1. NumCleanupLandingPadsRemaining: how many cleanup landing pads remain
1. NumNoUnwind: Number of functions with nounwind attribute
1. NumUnwind: Number of functions with unwind attribute

DwarfEHPrepare is always run a single time as part of `TargetPassConfig::addISelPasses()` which makes it an ideal place near the end of the pipeline to record this information.

Example output from clang built with exceptions cumulative during thinLTO backend (NumCleanupLandingPadsUnreachable was not incremented):

	"dwarfehprepare.NumCleanupLandingPadsRemaining": 123660,
	"dwarfehprepare.NumNoUnwind": 323836,
	"dwarfehprepare.NumUnwind": 472893,

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104161
2021-06-23 17:09:30 -07:00
Craig Topper 91319534ba [CGP][RISCV] Teach CodeGenPrepare::optimizeSwitchInst to honor isSExtCheaperThanZExt.
This optimization pre-promotes the input and constants for a
switch instruction to a legal type so that all the generated compares
share the same extend. Since RISCV prefers sext for i32 to i64
extends, we should honor that to use sext.w instead of a pair
of shifts.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D104612
2021-06-23 15:38:11 -07:00
Xun Li f09ec01f1f [SjLj] Insert UnregisterFn before musttail call
When inserting UnregisterFn, if there is a musttail call, we must insert before the call so that we don't break the musttail call contract.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104807
2021-06-23 15:33:55 -07:00
Xun Li f8c84da23b Revert "[SjLj] Insert UnregisterFn before musttail call"
This reverts commit f36703ada3.
Test failure: https://lab.llvm.org/buildbot#builders/104/builds/3450
2021-06-23 15:31:35 -07:00
Xun Li f36703ada3 [SjLj] Insert UnregisterFn before musttail call
When inserting UnregisterFn, if there is a musttail call, we must insert before the call so that we don't break the musttail call contract.

Differential Revision: https://reviews.llvm.org/D104807
2021-06-23 14:29:46 -07:00
Jinsong Ji c125af82a5 [DAGCombine] Check reassoc flags in aggressive fsub fusion
The is from discussion in https://reviews.llvm.org/D104247#inline-993387

The contract and reassoc flags shouldn't imply each other .

All the aggressive fsub fusion reassociate operations,
we should guard them with reassoc flag check.

Reviewed By: mcberg2017

Differential Revision: https://reviews.llvm.org/D104723
2021-06-23 13:59:40 +00:00
Jon Roelofs 493d6928fe [Remarks] Make memsize remarks report as an analysis, not a missed opportunity.
Differential revision: https://reviews.llvm.org/D104078
2021-06-22 18:22:47 -07:00
zhijian bd240b3d77 [AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table.
Summary:

generate eh_info when vector registers are saved according to the traceback table.

struct eh_info_t {

unsigned version;       /* EH info version 0 */
#if defined(64BIT)

char _pad[4];           /* padding */
#endif

unsigned long lsda;     /* Pointer to Language Specific Data Area */
unsigned long personality; /* Pointer to the personality routine */
};

the value of lsda and personality is zero when the number of vector registers saved is large zero and there is not personality of the function

Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/D103651
2021-06-22 13:01:31 -04:00
Fangrui Song f53d791520 Improve the diagnostic of DiagnosticInfoResourceLimit (and warn-stack-size in particular)
Before: `warning: stack size limit exceeded (888) in main`
After: `warning: stack frame size (888) exceeds limit (100) in function 'main'` (the -Wframe-larger-than limit will be mentioned)

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D104667
2021-06-22 09:55:20 -07:00
Sander de Smalen bd7f7e2eba [GlobalISel] Add scalable property to LLT types.
This patch aims to add the scalable property to LLT. The rest of the
patch-series changes the interfaces to take/return ElementCount and
TypeSize, which both have the ability to represent the scalable property.

The changes are mostly mechanical and aim to be non-functional changes
for fixed-width vectors.

For scalable vectors some unit tests have been added, but no effort has
been put into making any of the GlobalISel algorithms work with scalable
vectors yet. That will be left as future work.

The work is split into a series of 5 patches to make reviews easier.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D104450
2021-06-22 08:43:34 +01:00
Eli Friedman 74909e4b6e Rename MachineMemOperand::getOrdering -> getSuccessOrdering.
Since this method can apply to cmpxchg operations, make sure it's clear
what value we're actually retrieving.  This will help ensure we don't
accidentally ignore the failure ordering of cmpxchg in the future.

We could potentially introduce a getOrdering() method on AtomicSDNode
that asserts the operation isn't cmpxchg, but not sure that's
worthwhile.

Differential Revision: https://reviews.llvm.org/D103338
2021-06-21 16:49:27 -07:00
Nick Desaulniers 8ace121305 [IR] convert warn-stack-size from module flag to fn attr
Otherwise, this causes issues when building with LTO for object files
that use different values.

Link: https://github.com/ClangBuiltLinux/linux/issues/1395

Reviewed By: dblaikie, MaskRay

Differential Revision: https://reviews.llvm.org/D104342
2021-06-21 15:09:25 -07:00
Jinsong Ji 3996311ee1 [DAGCombine] reassoc flag shouldn't enable contract
According to IR LangRef, the FMF flag:

contract
Allow floating-point contraction (e.g. fusing a multiply followed by an
addition into a fused multiply-and-add).

reassoc
Allow reassociation transformations for floating-point instructions.
This may dramatically change results in floating-point.

My understanding is that these two flags shouldn't imply each other,
as we might have a SDNode that can be reassociated with others, but
not contractble.

eg: We may want following fmul/fad/fsub to freely reassoc, but don't
want fma being generated here.

   %F = fmul reassoc double %A, %B         ; <double> [#uses=1]
   %G = fmul reassoc double %C, %D         ; <double> [#uses=1]
   %H = fadd reassoc double %F, %G         ; <double> [#uses=1]
   %I = fsub reassoc double %H, %E         ; <double> [#uses=1]

Before https://reviews.llvm.org/D45710, `reassoc` flag actually
did not imply isContratable either.

The current implementation also only check the flag in fadd node,
ignoring fmul node, this patch update that as well.

Reviewed By: spatel, qiucf

Differential Revision: https://reviews.llvm.org/D104247
2021-06-21 21:15:43 +00:00
Hendrik Greving 96994427f2 RegisterCoalescer: Fix iterating through use operands.
Fixes a minor bug when trying to iterate through use operands when
updating debug use operands.

Extends a test to include above.

Differential Revision: https://reviews.llvm.org/D104576
2021-06-21 09:17:54 -07:00
Craig Topper 3a8c7060cc [TypePromotion] Prune Intrinsic includes. NFC
TypePromotion is meant to be a generic pass and doesn't reference
any ARM intrinsics so it shouldn't include IntrinsicsARM.h.
The other Intrinsic related headers appear to be unneeded as well.
2021-06-20 13:04:02 -07:00
Fangrui Song 59d90fe817 Simplify some typedef struct 2021-06-19 11:36:44 -07:00
Michael Liao b9c05aff20 [MIRPrinter] Add machine metadata support.
- Distinct metadata needs generating in the codegen to attach correct
  AAInfo on the loads/stores after lowering, merging, and other relevant
  transformations.
- This patch adds 'MachhineModuleSlotTracker' to help assign slot
  numbers to these newly generated unnamed metadata nodes.
- To help 'MachhineModuleSlotTracker' track machine metadata, the
  original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage',
  which provides basic interfaces to create/retrive metadata slots. In
  addition, once LLVM IR is processsed, additional hooks are also
  introduced to help collect machine metadata and assign them slot
  numbers.
- Finally, if there is any such machine metadata, 'MIRPrinter' outputs
  an additional 'machineMetadataNodes' field containing all the
  definition of those nodes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D103205
2021-06-19 12:48:08 -04:00
Hongtao Yu bd52495518 [CSSPGO] Undoing the concept of dangling pseudo probe
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.

I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104477
2021-06-18 15:14:11 -07:00
Simon Pilgrim 7353beda4a [DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector knownbits. NFCI.
As noticed on D104472 we can use APInt::insertBits which will avoid a lot of temporary APInt creations
2021-06-18 14:59:01 +01:00
Jon Roelofs a2ab765029 [GISel] Eliminate redundant bitmasking
This was a GISel vs SDAG regression that showed up at -Os on arm64 in:
SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.test

https://llvm.godbolt.org/z/aecjodsjG

Differential revision: https://reviews.llvm.org/D103334
2021-06-17 12:53:00 -07:00
David Green fda8b4714e [InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass
The Interleave Access pass will convert shuffle(binop(load, load)) to
binop(shuffle(load), shuffle(load)), in order to create more
interleaving load patterns (VLD2/3/4) that might have been messed up by
instcombine. As shown in D104247 we were missing copying IR flags to the
new instruction though, which should just be kept the same as the
original instruction.

Differential Revision: https://reviews.llvm.org/D104255
2021-06-17 09:53:33 +01:00
Bjorn Pettersson 4c7f820b2b Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.

The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.

One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.

Differential Revision: https://reviews.llvm.org/D99439
2021-06-17 09:38:28 +02:00
Sushma Unnibhavi 2193347e72 [M68k][GloballSel] Adding initial GlobalISel infrastructure
Wiring up GlobalISel for the M68k backend

Differential Revision: https://reviews.llvm.org/D101819
2021-06-16 10:48:38 -06:00
David Spickett e4ecd83fe9 [llvm][AArch64] Handle arrays of struct properly (from IR)
This only applies to FastIsel. GlobalIsel seems to sidestep
the issue.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46996

One of the things we do in llvm is decide if a type needs
consecutive registers. Previously, we just checked if it
was an array or not.
(plus an SVE specific check that is not changing here)

This causes some confusion when you arbitrary IR like:
```
%T1 = type { double, i1 };
define [ 1 x %T1 ] @foo() {
entry:
  ret [ 1 x %T1 ] zeroinitializer
}
```

We see it is an array so we call CC_AArch64_Custom_Block
which bails out when it sees the i1, a type we don't want
to put into a block.

This leaves the location of the double in some kind of
intermediate state and leads to odd codegen. Which then crashes
the backend because it doesn't know how to implement
what it's been asked for.

You get this:
```
  renamable $d0 = FMOVD0
  $w0 = COPY killed renamable $d0
```

Rather than this:
```
  $d0 = FMOVD0
  $w0 = COPY $wzr
```

The backend knows how to copy 64 bit to 64 bit registers,
but not 64 to 32. It can certainly be taught how but the real
issue seems to be us even trying to assign a register block
in the first place.

This change makes the logic of
AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters
a bit more in depth. If we find an array, also check that all the
nested aggregates in that array have a single member type.

Then CC_AArch64_Custom_Block's assumption of a type that looks
like [ N x type ] will be valid and we get the expected codegen.

New tests have been added to exercise these situations. Note that
some of the output is not ABI compliant. The aim of this change is
to simply handle these situations and not to make our processing
of arbitrary IR ABI compliant.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D104123
2021-06-16 13:56:01 +00:00
Dylan Fleming dab05335a6 [SVE] Fix PromoteIntRes_TRUNCATE not to call getVectorNumElements
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D104115
2021-06-16 13:09:43 +01:00
Rong Xu 82a0bb1afc [SampleFDO] Place the discriminator flag variable into the used list.
We create flag variable "__llvm_fs_discriminator__" in the binary
to indicate that FSAFDO hierarchical discriminators are used.

This variable might be GC'ed by the linker since it is not explicitly
reference. I initially added the var to the use list in pass
MIRFSDiscriminator but it did not work. It turned out the used global
list is collected in lowering (before MIR pass) and then emitted in
the end of pass pipeline.

Here I add the variable to the use list in IR level's AddDiscriminators
pass. The machine level code is still keep in the case IR's
AddDiscriminators is not invoked. If this is the case, this just use
-Wl,--export-dynamic-symbol=__llvm_fs_discriminator__
to force the emit.

Differential Revision: https://reviews.llvm.org/D103988
2021-06-15 21:51:04 -07:00
Rong Xu 95f9026c17 Revert "[SampleFDO] Using common linkage for the discriminator flag variable"
This reverts commit 434fed5aff.

Post commit review suggested to use another implmenentation.
Detailed can be found in the review.
2021-06-15 21:22:23 -07:00
Rong Xu 434fed5aff [SampleFDO] Using common linkage for the discriminator flag variable
We create flag variable "__llvm_fs_discriminator__" in the binary
to indicate that FSAFDO hierarchical discriminators are used.

This variable might be GC'ed by the linker since it is not explicitly
reference. I initially added the var to the use list in pass
MIRFSDiscriminator but it did not work. It turned out the used global
list is collected in lowering (before MIR pass) and then emitted in
the end of pass pipeline.

In this patch, we use a "common" linkage for this variable so that
it will be GC'ed by the linker.

Differential Revision: https://reviews.llvm.org/D103988
2021-06-15 14:51:27 -07:00
Roman Lebedev 585e65d330
[TLI] SimplifyDemandedVectorElts(): handle SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(?, 0))
Iff we have `SCALAR_TO_VECTOR` (and we demand it's only defined 0'th element),
and said scalar was produced by `EXTRACT_VECTOR_ELT` from the 0'th element
of some vector, then we can just continue traversal into said source vector.

This comes up in X86 vector uniform shift lowering.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D104250
2021-06-14 23:52:53 +03:00
Saleem Abdulrasool 5b5833b9e0 SelectionDAG: repair the Windows build
6e5628354e regressed the Windows build as
the return type no longer matched in both branches for the return value
type deduction.  This uses a bit more compiler magic to deal with that.
2021-06-14 08:25:36 -07:00
zhijian 7ed515d168 [AIX][XCOFF] emit vector info of traceback table.
Summary:

emit vector info of traceback table.

Reviewers: Jason Liu,Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-06-14 11:15:22 -04:00
Roman Lebedev 0f94c3c80d
[NFC][DAGCombine] Extract getFirstIndexOf() lambda back into a function
Not all supported compilers like such lambdas, at least one buildbot is unhappy.
2021-06-14 16:25:59 +03:00
Roman Lebedev 6e5628354e
[DAGCombine] reduceBuildVecToShuffle(): sort input vectors by decreasing size
The sorting, obviously, must be stable, else we will have random assembly fluctuations.

Apparently there was no test coverage that would benefit from that,
so i've added one test.

The sorting consists of two parts - just sort the input vectors,
and recompute the shuffle mask -> input vector mapping.
I don't believe we need to do anything else.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D104187
2021-06-14 16:18:37 +03:00
Jeroen Dobbelaere bb8ce25e88 Intrinsic::getName: require a Module argument
Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary.

This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288
(as mentioned by @fhahn), after committing D91250.

Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99173
2021-06-14 14:52:29 +02:00
RamNalamothu 167e7afcd5 Implement DW_CFA_LLVM_* for Heterogeneous Debugging
Add support in MC/MIR for writing/parsing, and DebugInfo.

This is part of the Extensions for Heterogeneous Debugging defined at
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html

Specifically the CFI instructions implemented here are defined at
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D76877
2021-06-14 08:51:50 +05:30
Simon Pilgrim 2c4ee1e112 RegUsageInfoPropagate.cpp - remove unused <string> and <map> includes. NFCI. 2021-06-13 15:19:24 +01:00
Florian Hahn 5cd66420cc
Revert "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB"
This reverts commit 1b748faf2b because it
breaks building the llvm-test-suite with -verify-machineinstrs on X86:
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/9585/

Running llc -verify-machineinstr on X86 crashes on the IR below:

    target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

    %struct.widget = type { i32, i32, i32, i32, i32*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [16 x [16 x i16]], [6 x [32 x i32]], [16 x [16 x i32]], [4 x [12 x [4 x [4 x i32]]]], [16 x i32], i8**, i32*, i32***, i32**, i32, i32, i32, i32, %struct.baz*, %struct.wobble.1*, i32, i32, i32, i32, i32, i32, %struct.quux.2*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32***, i32***, i32****, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [3 x [2 x i32]], i32, i32, i64, i64, %struct.zot.3, %struct.zot.3, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.baz = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.snork*, %struct.wombat.0*, %struct.wobble*, i32, i32*, i32*, i32*, i32, i32*, i32*, i32*, i32 (%struct.widget*, %struct.eggs*)*, i32, i32, i32, i32 }
    %struct.snork = type { %struct.spam*, %struct.zot, i32 (%struct.wombat*, %struct.widget*, %struct.snork*)* }
    %struct.spam = type { i32, i32, i32, i32, i8*, i32 }
    %struct.zot = type { i32, i32, i32, i32, i32, i8*, i32* }
    %struct.wombat = type { i32, i32, i32, i32, i32, i32, i32, i32, void (i32, i32, i32*, i32*)*, void (%struct.wombat*, %struct.widget*, %struct.zot*)* }
    %struct.wombat.0 = type { [4 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] }
    %struct.quux = type { i16, i8 }
    %struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] }
    %struct.eggs = type { [1000 x i8], [1000 x i8], [1000 x i8], i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.wobble.1 = type { i32, [2 x i32], i32, i32, %struct.wobble.1*, %struct.wobble.1*, i32, [2 x [4 x [4 x [2 x i32]]]], i32, i64, i64, i32, i32, [4 x i8], [4 x i8], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.quux.2 = type { i32, i32, i32, i32, i32, %struct.quux.2* }
    %struct.zot.3 = type { i64, i16, i16, i16 }

    define void @blam(%struct.widget* %arg, i32 %arg1) local_unnamed_addr {
    bb:
      %tmp = load i32, i32* undef, align 4
      %tmp2 = sdiv i32 %tmp, 6
      %tmp3 = sdiv i32 undef, 6
      %tmp4 = load i32, i32* undef, align 4
      %tmp5 = icmp eq i32 %tmp4, 4
      %tmp6 = select i1 %tmp5, i32 %tmp3, i32 %tmp2
      %tmp7 = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* undef, i64 0, i64 0, i64 0
      %tmp8 = zext i16 undef to i32
      %tmp9 = zext i16 undef to i32
      %tmp10 = load i16, i16* undef, align 2
      %tmp11 = zext i16 %tmp10 to i32
      %tmp12 = zext i16 undef to i32
      %tmp13 = zext i16 undef to i32
      %tmp14 = zext i16 undef to i32
      %tmp15 = load i16, i16* undef, align 2
      %tmp16 = zext i16 %tmp15 to i32
      %tmp17 = zext i16 undef to i32
      %tmp18 = sub nsw i32 %tmp8, %tmp9
      %tmp19 = shl nsw i32 undef, 1
      %tmp20 = add nsw i32 %tmp19, %tmp18
      %tmp21 = sub nsw i32 %tmp11, %tmp12
      %tmp22 = shl nsw i32 undef, 1
      %tmp23 = add nsw i32 %tmp22, %tmp21
      %tmp24 = sub nsw i32 %tmp13, %tmp14
      %tmp25 = shl nsw i32 undef, 1
      %tmp26 = add nsw i32 %tmp25, %tmp24
      %tmp27 = sub nsw i32 %tmp16, %tmp17
      %tmp28 = shl nsw i32 undef, 1
      %tmp29 = add nsw i32 %tmp28, %tmp27
      %tmp30 = sub nsw i32 %tmp20, %tmp29
      %tmp31 = sub nsw i32 %tmp23, %tmp26
      %tmp32 = shl nsw i32 %tmp30, 1
      %tmp33 = add nsw i32 %tmp32, %tmp31
      store i32 %tmp33, i32* undef, align 4
      %tmp34 = mul nsw i32 %tmp31, -2
      %tmp35 = add nsw i32 %tmp34, %tmp30
      store i32 %tmp35, i32* undef, align 4
      %tmp36 = select i1 %tmp5, i32 undef, i32 undef
      br label %bb37

    bb37:                                             ; preds = %bb
      %tmp38 = load i32, i32* undef, align 4
      %tmp39 = ashr i32 %tmp38, %tmp6
      %tmp40 = load i32, i32* undef, align 4
      %tmp41 = sdiv i32 %tmp39, %tmp40
      store i32 %tmp41, i32* undef, align 4
      ret void
    }
2021-06-12 11:41:38 +01:00
Arthur Eubanks c0c5a98b2c [NFC][OpaquePtr] Explicitly pass GEP source type in optimizeGatherScatterInst()
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103480
2021-06-11 11:49:59 -07:00
Matt Arsenault 9d7299b6f0 GlobalISel: Reduce indentation and remove dead path 2021-06-11 13:45:24 -04:00
Matt Arsenault 93f3c7cc3e CodeGen: Fix missing const 2021-06-11 13:45:24 -04:00
Tomas Matheson 773771ba38 [CodeGen][regalloc] Don't align stack slots if the stack can't be realigned
Register allocation may spill virtual registers to the stack, which can
increase alignment requirements of the stack frame. If the the function
did not require stack realignment before register allocation, the
registers required to do so may not be reserved/available. This results
in a stack frame that requires realignment but can not be realigned.

Instead, only increase the alignment of the stack if we are still able
to realign.

The register SpillAlignment will be ignored if we can't realign, and the
backend will be responsible for emitting the correct unaligned loads and
stores. This seems to be the assumed behaviour already, e.g.
ARMBaseInstrInfo::storeRegToStackSlot and X86InstrInfo::storeRegToStackSlot
are both `canRealignStack` aware.

Differential Revision: https://reviews.llvm.org/D103602
2021-06-11 16:49:12 +01:00
Simon Pilgrim 61cdaf66fe [ADT] Remove APInt/APSInt toString() std::string variants
<string> is currently the highest impact header in a clang+llvm build:

https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html

One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.

This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.

Differential Revision: https://reviews.llvm.org/D103888
2021-06-11 13:19:15 +01:00
Carl Ritson 2c2d2922a2 [ValueTypes] Define MVTs for v6i32, v6f32, v7i32, v7f32
For use in AMDGPU selection DAG.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103881
2021-06-11 08:58:16 +09:00
Carl Ritson cfbb92441f [SDAG] Fix pow2 assumption when splitting vectors
When reducing vector builds to shuffles it possible that
the DAG combiner may try to extract invalid subvectors.

This happens as the existing code assumes vectors will be power
of 2 sizes, which is already untrue, but becomes more noticable
with v6 and v7 types.
Specifically the existing code assumes that half PowerOf2Ceil of
a given vector index will fit twice into a given vector.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103880
2021-06-11 08:58:16 +09:00
Wolfgang Pieb 5a1589fc6d [static initializers] Emit global_ctors and global_dtors in reverse order when .ctors/.dtors are used.
Reviewed By: rnk, MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D103495
2021-06-10 16:44:47 -07:00
Nick Desaulniers fc018ebb60 [IR] make -warn-frame-size into a module attr
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.

-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.

Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.

Reviewed By: rsmith, qcolombet

Differential Revision: https://reviews.llvm.org/D103928
2021-06-10 16:15:27 -07:00
David Spickett 64de8763aa Revert "Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 31859f896c.

Causing SVE and RISCV-V test failures on bots.
2021-06-10 10:11:17 +00:00
Paulo Matos 31859f896c Implementation of global.get/set for reftypes in LLVM IR
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D95425
2021-06-10 10:07:45 +02:00
Jinsong Ji 4a89ed373c [AIX] Add traceback ssp canary bit support
We will need to set the ssp canary bit in traceback table to communicate
with unwinder about the canary.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D103202
2021-06-10 02:40:02 +00:00
Sanjay Patel dd763ac791 [SDAG] fix miscompile from merging stores of different sizes
As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.

This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.

This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c
D87112 / 7a06b166b1

We could enhance this code to track individual bytes and
allow merging multiple sizes.
2021-06-09 09:51:39 -04:00
Fraser Cormack 502edebd9d [ValueTypes][RISCV] Cap RVV fixed-length vectors by size
This patch changes RVV's policy for its supported list of fixed-length
vector types by capping by vector size rather than element count. Now
all 1024-byte vectors (of supported element types) are supported, rather
than all 256-element vectors.

This is a more natural fit for the architecture, and allows us to, for
example, improve the support for vector bitcasts.

This change necessitated the adding of some new simple types to avoid
"regressing" on the number of currently-supported vectors. We round out
the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and
`v512f16`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103884
2021-06-09 12:15:37 +01:00
Matt Arsenault 31a9659de5 GlobalISel: Avoid use of G_INSERT in insertParts
G_INSERT legalization is incomplete and doesn't work very
well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES
padding with undef values (although this can get pretty large).

For the case of load/store narrowing, this is still performing the
load/stores in irregularly sized pieces. It might be cleaner to split
this down into equal sized pieces, and rely on load/store merging to
optimize it.
2021-06-08 14:44:24 -04:00
Matt Arsenault 2927d40f04 GlobalISel: Hide virtual register creation in MIRBuilder 2021-06-08 14:44:24 -04:00
Nick Desaulniers 3787ee4571 reland [IR] make -stack-alignment= into a module attr
Relands commit 433c8d950c with fixes for
MIPS.

Similar to D102742, specifying the stack alignment via CodegenOpts means
that this flag gets dropped during LTO, unless the command line is
re-specified as a plugin opt. Instead, encode this information as a
module level attribute so that we don't have to expose this llvm
internal flag when linking the Linux kernel with LTO.

Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079

Link: https://github.com/ClangBuiltLinux/linux/issues/1377

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103048
2021-06-08 10:59:46 -07:00
Justin Bogner 4271e1d2c5 [GlobalISel] Handle non-multiples of the base type in narrowScalarAddSub
When narrowing G_ADD and G_SUB, handle types that aren't a multiple of
the type we're narrowing to. This allows us to handle types like s96
on 64 bit targets.

Note that the test here has a couple of dead instructions because of
the way the setup legalizes. I wasn't able to come up with a way to
write this test that avoids that easily.

Differential Revision: https://reviews.llvm.org/D97811
2021-06-08 10:13:38 -07:00
Justin Bogner 2a7e759734 [GlobalISel] Handle non-multiples of the base type in narrowScalarInsert
When narrowing G_INSERT, handle types that aren't a multiple of the
type we're narrowing to. This comes up if we're narrowing something
like an s96 to fit in 64 bit registers and also for non-byte multiple
packed types if they come up.

This implementation handles these cases by extending the extra bits to
the narrow size and truncating the result back to the destination
size.

Differential Revision: https://reviews.llvm.org/D97791
2021-06-08 10:13:38 -07:00
Simon Pilgrim 114e712c34 InstrEmitter.cpp - don't dereference a dyn_cast<>.
dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.
2021-06-08 17:59:04 +01:00
Nick Desaulniers a596b54d47 Revert "[IR] make -stack-alignment= into a module attr"
This reverts commit 433c8d950c.

Breaks the MIPS build.
2021-06-08 08:55:50 -07:00
Nick Desaulniers 433c8d950c [IR] make -stack-alignment= into a module attr
Similar to D102742, specifying the stack alignment via CodegenOpts means
that this flag gets dropped during LTO, unless the command line is
re-specified as a plugin opt. Instead, encode this information as a
module level attribute so that we don't have to expose this llvm
internal flag when linking the Linux kernel with LTO.

Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079

Link: https://github.com/ClangBuiltLinux/linux/issues/1377

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103048
2021-06-08 08:31:04 -07:00
Simon Pilgrim 61a2d6bfe4 [DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609)
shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y))

If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks uses undef instead of an out-of-bounds value.

Fixes PR50609
2021-06-08 15:49:41 +01:00
Hans Wennborg 386b66b2fc Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
> This reapplies c0f3dfb9, which was reverted following the discovery of
> crashes on linux kernel and chromium builds - these issues have since
> been fixed, allowing this patch to re-land.

This reverts commit 36ec97f76a.

The change caused non-determinism in the compiler, see comments on the code
review at https://reviews.llvm.org/D91722.

Reverting to unbreak people's builds until that can be addressed.

This also reverts the follow-up "[DebugInfo] Limit the number of values
that may be referenced by a dbg.value" in
a0bd6105d8.
2021-06-08 14:54:08 +02:00
Kerry McLaughlin 5db52751a5 [CostModel] Return an invalid cost for memory ops with unsupported types
Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector
type cannot be legalized by widening/splitting. When this is the method of legalization
found, getTypeLegalizationCost will return an Invalid cost.

The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call
getTypeLegalizationCost and will now also return an Invalid cost for unsupported types.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D102515
2021-06-08 12:07:36 +01:00
David Green b889c6ee99 [DAG] Allow isNullOrNullSplat to see truncated zeroes
This sets the AllowTruncation flag on isConstOrConstSplat in
isNullOrNullSplat, allowing it to see truncated constant zeroes on
architectures such as AArch64, where only a i32.i64 are legal. As a
truncation of 0 is always 0, this should always be valid, allowing some
extra folding to happen including some of the cases from D103755.

Differential Revision: https://reviews.llvm.org/D103756
2021-06-08 10:18:58 +01:00
Arthur Eubanks 47211fa889 Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
Needs to be discussed more.

This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46
This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1
This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23
This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd
2021-06-07 16:07:44 -07:00
Matt Arsenault dc98adfb44 GlobalISel: Use MMO helper for getting the size in bits 2021-06-07 14:26:48 -04:00
Matt Arsenault f6555b917b GlobalISel: Remove unnecessary .getReg(0)s 2021-06-07 14:26:48 -04:00
Guillaume Chatelet 1da2c7d25c [NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE
Differential Revision: https://reviews.llvm.org/D103251
2021-06-07 10:04:16 +00:00
Nikita Popov 1ffa6499ea [TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC)
Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.

Differential Revision: https://reviews.llvm.org/D103759
2021-06-06 16:29:50 +02:00
Nikita Popov 506875c879 [TargetLowering] Move methods out of line (NFC)
Move methods using IRBuilder out of line, so we can drop the
dependency on the header.
2021-06-06 16:02:10 +02:00
Nikita Popov 9914200393 [CodeGen] Add missing includes (NFC)
These currently rely on the IRBuilder.h include in TargetLowering.h.
Make them explicit.
2021-06-06 15:48:27 +02:00
Mirko Brkusanin 35ef4c940b [AMDGPU][GlobalISel] Legalize G_ABS
Legalize and select G_ABS so that we can use llvm.abs intrinsic

Differential Revision: https://reviews.llvm.org/D102391
2021-06-04 14:46:43 +02:00