Commit Graph

29326 Commits

Author SHA1 Message Date
Vitaly Buka adefa9ca2e [SafeStack,NFC] "const" cleanup 2020-06-14 23:05:42 -07:00
Vitaly Buka fb1e0f324f [SafeStack,NFC] Add BlockLifetimeInfo constructor 2020-06-14 23:05:42 -07:00
Vitaly Buka 645058036a [SafeStack,NFC] Use IntrinsicInst instead of Instruction 2020-06-14 23:05:41 -07:00
Vitaly Buka f8e411656e [SafeStack,NFC] Move ClColoring into SafeStack.cpp
This allows to reuse the code in other components.
2020-06-14 23:05:41 -07:00
Vitaly Buka 05590a9cb8 [SafeStack,NFC] Move unconditional code into constructor
Prepare to move ClColoring from SafeStackCode to SafeStackLayout.
This will allow to reuse the code in other components.
2020-06-14 23:05:41 -07:00
Chen Zheng bd7096b977 [PowerPC] fma chain break to expose more ILP
This patch tries to reassociate two patterns related to FMA to expose
more ILP on PowerPC.

// Pattern 1:
//   A =  FADD X,  Y          (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMA  X,  M21,  M22
//   B =  FMA  Y,  M31,  M32
//   C =  FADD A,  B

// Pattern 2:
//   A =  FMA  X,  M11,  M12  (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMUL M11,  M12
//   B =  FMA  X,  M21,  M22
//   D =  FMA  A,  M31,  M32
//   C =  FADD B,  D

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D80175
2020-06-15 00:00:04 -04:00
Qiu Chaofan f8ef7c99a0 [DAGCombiner] Require ninf for division estimation
Current implementation of division estimation isn't correct for some
cases like 1.0/0.0 (result is nan, not expected inf).

And this change exposes a potential infinite loop: we use
isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the
divisor is some constant. But it doesn't work after legalized on some
platforms. This patch restricts the method to act before LegalDAG.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D80542
2020-06-14 22:58:22 +08:00
Amanieu d'Antras 6973125cb7 Fix FastISel dropping srcloc metadata from InlineAsm
Summary:
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060

I've also added the Extra_IsConvergent flag which was missing from FastISel.

Reviewers: echristo

Reviewed By: echristo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80759
2020-06-13 16:52:37 +01:00
Roman Lebedev 17f7654152
[NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet.
This decreases the time consumed by the pass [during RawSpeed unity build]
by 25% (0.0586 s -> 0.04388 s).

While that isn't really impressive overall, that wasn't the goal here.
The memory results here are noticeable.
The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s)   # -3.47 %
temporary memory allocations: 4261772 (76974/s)      # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```

So we get rid of *a lot* of temporary allocations.

Using `SmallSet<8>` makes sense to me because at least here
for x86 BdVer2, the size of that set is *never* more than 3,
over all of llvm test-suite + RawSpeed.

The story might be different on other targets,
not sure if it will ever justify whole DenseSet,
but if it does SmallDenseSet might be a compromise.
2020-06-12 23:10:54 +03:00
Michael Liao e7b920e6fe [DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2))
Reviewers: arsenm

Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81708
2020-06-12 13:53:08 -04:00
Matt Arsenault 350ee7fb3f GlobalISel: Fix not erasing old instruction in sitofp/uitofp lowering 2020-06-12 10:33:23 -04:00
Simon Pilgrim 5509e2cc2e [DAG] foldAddSubOfSignBit - add support for non-uniform vector constants 2020-06-12 14:58:15 +01:00
diggerlin c6be3ea524 [NFC] clean up the AsmPrinter::emitLinkage for AIX part
SUMMARY:

Since we deal with aix emitLinkage in the PPCAIXAsmPrinter::emitLinkage() in the patch https://reviews.llvm.org/D75866. It do not go to AsmPrinter::emitLinkage() any more, we clean up some aix related code in the AsmPrinter::emitLinkage()

Reviewers:  Jason liu

Differential Revision: https://reviews.llvm.org/D81613
2020-06-11 13:33:51 -04:00
Petar Avramovic bd3d951b8b AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC
Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16
in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16.

Differential Revision: https://reviews.llvm.org/D81666
2020-06-11 18:19:27 +02:00
Dominik Montada f24e2e9eeb [GlobalISel] fix crash in IRTranslator, MachineIRBuilder when translating @llvm.dbg.value intrinsic and using -debug
Summary:
Fix crash when using -debug caused by the GlobalISel observer trying to print
an incomplete DBG_VALUE instruction. This was caused by the MachineIRBuilder
using buildInstr, which immediately inserts the instruction causing print,
instead of using BuildMI to first build up the instruction and using
insertInstr when finished.

Add RUN-line to existing debug-insts.ll test with -debug flag set to make sure
no crash is happening.

Also fixed a missing %s in the 2nd RUN-line of the same test.

Reviewers: t.p.northover, aditya_nandakumar, aemerson, dsanders, arsenm

Reviewed By: arsenm

Subscribers: wdng, arsenm, rovka, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76934
2020-06-11 10:47:49 +02:00
David Sherwood bd97342a0c [CodeGen] Let computeKnownBits do something sensible for scalable vectors
Until we have a real need for computing known bits for scalable
vectors I have simply changed the code to bail out for now and
pretend we know nothing. I've also fixed up some simple callers of
computeKnownBits too.

Differential Revision: https://reviews.llvm.org/D80437
2020-06-11 08:17:11 +01:00
Matt Arsenault 0671a4c508 RegAllocFast: Avoid unused method warning in release builds 2020-06-10 15:23:56 -04:00
Matt Arsenault 0f2af15c1b GlobalISel: Make default implementation of legalizeCustom unreachable
If the target explicitly requested custom legalization, it should be
required to implement this. Also move default legalizeIntrinsic
implementation into the header so it's next to the related
legalizeCustom.
2020-06-10 11:05:59 -04:00
Wang, Pengfei 6eb9eae010 [MS] Copy the symbols assigned to the former instruction when memory folding.
The memory folding raplaced the old instruction without copying the symbols assigned. Which will resulted in built fail due to the lost symbols.

Reviewed by craig.topper

Differential Revision: https://reviews.llvm.org/D78471
2020-06-10 15:38:32 +08:00
diggerlin edd819c757 [AIX] supporting the visibility attribute for aix assembly
SUMMARY:

in the aix assembly , it do not have .hidden and .protected directive.
in current llvm. if a function or a variable which has visibility attribute, it will generate something like the .hidden or .protected , it can not recognize by aix as.
in aix assembly, the visibility attribute are support in the pseudo-op like
.extern Name [ , Visibility ]
.globl Name [, Visibility ]
.weak Name [, Visibility ]

in this patch, we implement the visibility attribute for the global variable, function or extern function .

for example.

extern __attribute__ ((visibility ("hidden"))) int
  bar(int* ip);
__attribute__ ((visibility ("hidden"))) int b = 0;
__attribute__ ((visibility ("hidden"))) int
  foo(int* ip){
   return (*ip)++;
}
the visibility of .comm linkage do not support , we will have a separate patch for it.
we have the unsupported cases ("default" and "internal") , we will implement them in a a separate patch for it.

Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson

Differential Revision: https://reviews.llvm.org/D75866
2020-06-09 16:15:06 -04:00
Matt Arsenault 32823091c3 GlobalISel: Set instr/debugloc before any legalizer action
It was annoying enough that every custom lowering needed to set the
insert point, but this was made worse since now these all needed to be
updated to setInstrAndDebugLoc. Consolidate these so every
legalization action has the right insert position by default.

This should fix dropping debug info in every custom AMDGPU
legalization.
2020-06-09 15:37:02 -04:00
Matt Arsenault b94c9e3b55 GlobalISel: Improve MachineIRBuilder construction
The current relationship between LegalizerHelper and MachineIRBuilder
confuses me, because the LegalizerHelper modifies the MachineIRBuilder
which it does not own. Constructing a LegalizerHelper destroys the
insert point, since the constructor calls setMF, which clears all the
fields. Try to separate these functions, so it's possible to construct
a LegalizerHelper from an existing MachineIRBuilder without losing the
insert point/debug loc.
2020-06-09 15:05:04 -04:00
Matt Arsenault babbf4441b GlobalISel: Move some trivial MIRBuilder methods into the header
The construction APIs for MachineIRBuilder don't make much sense, and
it's been annoying to sort through it with these trivial functions
separate from the declaration.
2020-06-09 15:04:48 -04:00
Matt Arsenault bb6cb6bfe4 GlobalISel: Remove redundant check in verifier
This was already checked earlier for all instructions.
2020-06-09 15:04:27 -04:00
Matt Arsenault 6eeac6ae33 GlobalISel: Fix double printing new instructions in legalizer
New instructions were getting printed both in createdInstr, and in the
final printNewInstrs, so it made it look like the same instructions
were created twice. This overall made reading the debug output
harder. Stop printing the initial construction and only print new
instructions in the summary at the end. This avoids printing the less
useful case where instructions are sometimes initially created with no
operands.

I'm not sure this is the correct instance to remove; now the visible
ordering is different. Now you will typically see the one erased
instruction message before all the new instructions in order. I think
this is the more logical view of typical legalization changes,
although it's mechanically backwards from the normal
insert-new-erase-old pattern.
2020-06-09 15:02:31 -04:00
David Green 2fea3fe41c [MachineScheduler] Update available queue on the first mop of a new cycle
If a resource can be held for multiple cycles in the schedule model
then an instruction can be placed into the available queue, another
instruction can be scheduled, but the first will not be taken back out if
the two instructions hazard. To fix this make sure that we update the
available queue even on the first MOp of a cycle, pushing available
instructions back into the pending queue if they now conflict.

This happens with some downstream schedules we have around MVE
instruction scheduling where we use ResourceCycles=[2] to show the
instruction executing over two beats. Apparently the test changes here
are OK too.

Differential Revision: https://reviews.llvm.org/D76909
2020-06-09 19:13:53 +01:00
Sanjay Patel 702cf93356 [DAGCombiner] allow more folding of fadd + fmul into fma
If fmul and fadd are separated by an fma, we can fold them together
to save an instruction:
fadd (fma A, B, (fmul C, D)), N1 --> fma(A, B, fma(C, D, N1))

The fold implemented here is actually a specialization - we should
be able to peek through >1 fma to find this pattern. That's another
patch if we want to try that enhancement though.

This transform was guarded by the TLI hook enableAggressiveFMAFusion(),
so it was done for some in-tree targets like PowerPC, but not AArch64
or x86. The hook is protecting against forming a potentially more
expensive computation when fma takes longer to execute than a single
fadd. That hook may be needed for other transforms, but in this case,
we are replacing fmul+fadd with fma, and the fma should never take
longer than the 2 individual instructions.

'contract' FMF is all we need to allow this transform. That flag
corresponds to -ffp-contract=fast in Clang, so we are allowed to form
fma ops freely across expressions.

Differential Revision: https://reviews.llvm.org/D80801
2020-06-09 10:41:27 -04:00
Guillaume Chatelet 800e100588 Revert "[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess"
This reverts commit f21c52667e.
2020-06-09 10:43:59 +00:00
Simon Wallis 4dba59689d [ARM] prologue instructions emitted for naked function with >64 byte argument
Summary:

The naked function attribute is meant to suppress all function
prologue/epilogue instructions.

On ARM, some are still emitted if an argument greater than 64 bytes in size
(the threshold for using the byval attribute in IR) is passed partially
in registers.

Perform the check for Attribute::Naked and early exit in
SelectionDAGISel::LowerArguments().

Checking in ARMFrameLowering::determineCalleeSaves() is too late.

A test case is included.

Reviewers: llvm-commits, olista01, danielkiss

Reviewed By: danielkiss

Subscribers: kristof.beyls, hiraditya, danielkiss

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80715

Change-Id: Icedecf2a4ad31bc3c35ab0df7489a9d346e1f7cc
2020-06-09 11:33:03 +01:00
Guillaume Chatelet 3b6196c9b3 [Alignment][NFC] TargetLowering::allowsMisalignedMemoryAccesses
Summary:
Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMisalignedMemoryAccesses` without marking it override.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81374
2020-06-09 10:17:42 +00:00
Guillaume Chatelet f21c52667e [Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess
Summary:
Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMemoryAccess` without marking it override.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81379
2020-06-09 10:11:07 +00:00
Guillaume Chatelet e26ed6bdae Fix unused variable warning 2020-06-09 08:56:05 +00:00
Kang Zhang 1b6602275d [MachineVerifier] Add TiedOpsRewritten flag to fix verify two-address error
Summary:
Currently, MachineVerifier will attempt to verify that tied operands
satisfy register constraints as soon as the function is no longer in
SSA form. However, PHIElimination will take the function out of SSA
form while TwoAddressInstructionPass will actually rewrite tied operands
to match the constraints. PHIElimination runs first in the pipeline.
Therefore, whenever the MachineVerifier is run after PHIElimination,
it will encounter verification errors on any tied operands.

This patch adds a function property called TiedOpsRewritten that will be
set by TwoAddressInstructionPass and will control when the verifier checks
tied operands.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D80538
2020-06-09 07:39:42 +00:00
David Sherwood cc8872400c [CodeGen] Ensure callers of CreateStackTemporary use sensible alignments
In two instances of CreateStackTemporary we are sometimes promoting
alignments beyond the stack alignment. I have introduced a new function
called getReducedAlign that will return the alignment for the broken
down parts of illegal vector types. For example, on NEON a <32 x i8>
type is made up of two <16 x i8> types - in this case the sensible
alignment is 16 bytes, not 32.

In the legalization code wherever we create stack temporaries I have
started using the reduced alignments instead for illegal vector types.

I added a test to

  CodeGen/AArch64/build-one-lane.ll

that tries to insert an element into an illegal fixed vector type
that involves creating a temporary stack object.

Differential Revision: https://reviews.llvm.org/D80370
2020-06-09 08:10:17 +01:00
Yonghong Song 3eb465a329 [DebugInfo] Fix assertion for extern void type
Commit d77ae1552f ("[DebugInfo] Support to emit debugInfo
for extern variables") added support to emit debuginfo
for extern variables. Currently, only BPF target enables to
emit debuginfo for extern variables.

But if the extern variable has "void" type, the compilation will
fail.

  -bash-4.4$ cat t.c
  extern void bla;
  void *test() {
    void *x = &bla;
    return x;
  }
  -bash-4.4$ clang -target bpf -g -O2 -S t.c
  missing global variable type
  !1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1,
                                  isLocal: false, isDefinition: false)
  ...
  fatal error: error in backend: Broken module found, compilation aborted!
  PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace,
      preprocessed source, and associated run script.
  Stack dump:
  ...

The IR requires a DIGlobalVariable must have a valid type and the
"void" type does not generate any type, hence the above fatal error.

Note that if the extern variable is defined as "const void", the
compilation will succeed.

-bash-4.4$ cat t.c
extern const void bla;
const void *test() {
  const void *x = &bla;
  return x;
}
-bash-4.4$ clang -target bpf -g -O2 -S t.c
-bash-4.4$ cat t.ll
...
!1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1,
                                type: !6, isLocal: false, isDefinition: false)
!6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null)
...

Since currently, "const void extern_var" is supported by the
debug info, it is natural that "void extern_var" should also
be supported. This patch disabled assertion of "void extern_var"
in IR verifier and add proper guarding when emiting potential
null debug info type to dwarf types.

Differential Revision: https://reviews.llvm.org/D81131
2020-06-08 13:43:18 -07:00
Andrew Litteken bb677cacc8 [SuffixTree][MachOpt] Factoring out Suffix Tree and adding Unit Tests
This moves the SuffixTree test used in the Machine Outliner and moves it into Support for use in other outliners elsewhere in the compilation pipeline.

Differential Revision: https://reviews.llvm.org/D80586
2020-06-08 12:44:18 -07:00
Hendrik Greving f3d8a93970 [ModuloSchedule] Support instructions with > 1 destination when walking canonical use.
Fixes a minor bug that led to finding the wrong register if the definition had more
than one register destination.
2020-06-08 11:43:59 -07:00
Jan-Willem Maessen 3610d31e7a [NFC] Fix quadratic LexicalScopes::constructScopeNest
We sometimes have functions with large numbers of sibling basic
blocks (usually with an error path exit from each one). This was
triggering the qudratic behavior in this function - after visiting
each child llvm would re-scan the parent from the beginning again. We
modify the work stack to record the next index to be worked on
alongside the pointer. This avoids the need to linearly search for
the next unfinished child.

Differential Revision: https://reviews.llvm.org/D80029
2020-06-08 18:40:56 +01:00
Christopher Tetreault caa2fddce7 [SVE] Eliminate calls to default-false VectorType::get() from CodeGen
Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet

Reviewed By: spatel, gchatelet

Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80313
2020-06-08 10:26:10 -07:00
Guillaume Chatelet 54076610dc [Alignment][NFC] Deprecate dead code from CallingConvLower.h
Summary: This is a followup on D81196.

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81362
2020-06-08 14:49:39 +00:00
Matt Arsenault 5f7e38d8f4 GlobalISel: Use Register 2020-06-08 10:15:53 -04:00
Matt Arsenault f13ba22227 GlobalISel: Remove unused header 2020-06-08 10:15:53 -04:00
Matt Arsenault f41994f85b GlobalISel: Make it clearer that regbank/class are mutually exclusive 2020-06-08 10:15:53 -04:00
Matt Arsenault c1d771dc4b GlobalISel: Simplify debug printing 2020-06-08 10:15:53 -04:00
Guillaume Chatelet 94b0c32a0b [Alignment][NFC] Migrate HandleByVal to Align
Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::HandleByVal` without marking it `override`.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81365
2020-06-08 10:50:27 +00:00
Sander de Smalen ae09670ee4 [CodeGen][SVE] CopyToReg: Split scalable EVTs that are not powers of 2
Scalable vectors cannot use 'BUILD_VECTOR', so it is necessary to
properly split and widen scalable vectors when passing them
to CopyToReg/CopyFromReg.

This functionality is added to TargetLoweringBase::getVectorTypeBreakdown().

This patch only adds support for 'splitting' scalable vectors that
are a multiple of some legal type, e.g.

      <vscale x 6 x i64> -> 3 x <vscale x 2 x i64>

Reviewers: efriedma, c-rhodes

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80139
2020-06-08 10:39:18 +01:00
James Y Knight 748d92b4d3 Simplify MachineVerifier's block-successor verification.
There's two properties we want to verify:

1. That the successors returned by analyzeBranch are in the CFG
   successor list, and
2. That there are no extraneous successors are in the CFG successor
   list.

The previous implementation mostly accomplished this, but in a very
convoluted manner.

Differential Revision: https://reviews.llvm.org/D79793
2020-06-06 22:30:51 -04:00
James Y Knight 1978309db1 MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky propspect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall-through, or end as unreachable. Further work is
needed there.

Differential Revision: https://reviews.llvm.org/D79605
2020-06-06 22:30:51 -04:00
Simon Pilgrim f14d4c9c54 EHPersonalities.h - reduce Triple.h include to forward declaration. NFC.
Move implicit include dependencies down to source files.
2020-06-06 15:48:31 +01:00
Sanjay Patel 302cc8a121 [DAGCombiner] clean-up FMA+FMUL folds; NFC
D80801 suggests some readability improvements before mocing this block.
2020-06-06 10:32:54 -04:00
Nikita Popov cb5724c71e [CGP] Remove unnecessary MaybeAlign use (NFC)
Stores now always have an alignment.
2020-06-05 23:18:26 +02:00
Matt Arsenault eaa8af9322 GlobalISel: Add helper for constructing load from offset 2020-06-05 15:06:03 -04:00
Matt Arsenault 45e1a22a92 GlobalISel: Make known bits/alignment API more consistent
Just computing the alignment makes sense without caring about the
general known bits, such as for non-integral pointers. Separate the
two and start calling into the TargetLowering hooks for frame indexes.

Start calling the TargetLowering implementation for FrameIndexes,
which improves the AMDGPU matching for stack addressing modes. Also
introduce a new hook for returning known alignment of target
instructions. For AMDGPU, it would be useful to report the known
alignment implied by certain intrinsic calls.

Also stop using MaybeAlign.
2020-06-05 14:57:22 -04:00
Nikita Popov d370088611 [LiveDebugValues] Fix output stream (NFC)
This should dump to the provided Out, rather than dbgs(), though
they coincide in current usage.
2020-06-05 20:02:22 +02:00
Nikita Popov 6a53264926 [LiveDebugValues] Remove PendingInLocs (NFC)
PendingInLocs ends up having the same value as InLocs, just computed
a bit more indirectly. It is a leftover of a previous implementation
approach.

This patch drops PendingInLocs, as well as the Diff and Removed
calulations, which are no longer needed.

Differential Revision: https://reviews.llvm.org/D80868
2020-06-05 20:01:29 +02:00
Sander de Smalen 937cb7a8c7 Reland D80640: [CodeGen][SVE] Calculate correct type legalization for scalable vectors.
This reverts commit 9bcef270d7.
2020-06-05 18:09:31 +01:00
Sander de Smalen 9bcef270d7 Revert "[CodeGen][SVE] Calculate correct type legalization for scalable vectors."
Seems to break some buildbots, reverting the patch for now.

This reverts commit 164f4b9d26.
2020-06-05 16:03:52 +01:00
Sander de Smalen 164f4b9d26 [CodeGen][SVE] Calculate correct type legalization for scalable vectors.
This patch updates TargetLoweringBase::computeRegisterProperties and
TargetLoweringBase::getTypeConversion to support scalable vectors,
and make the right calls on how to legalise them. These changes are required
to legalise both MVTs and EVTs.

Reviewers: efriedma, david-arm, ctetreau

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80640
2020-06-05 15:20:34 +01:00
Denis Antrushin dae64d8f42 Fix build breakage caused by 66a1b83bf9 2020-06-05 15:53:09 +03:00
Denis Antrushin 66a1b83bf9 [TargetLowering][NFC] More efficient emitPatchpoint().
Current implementation of emitPatchpoint() is very inefficient:
for every FrameIndex operand if creates new MachineInstr with
that operand expanded and all other copied as is.
Since PATCHPOINT/STATEPOINT instructions may have *a lot* of
FrameIndex operands, we end up creating and erasing many
machine instructions. But we can do it in single pass, with only
one new machine instruction generated.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D81181
2020-06-05 14:57:29 +03:00
Kerry McLaughlin 89fc0166f5 [CodeGen][SVE] Legalisation of extends with scalable types
Summary:
This patch adds legalisation of extensions where the operand
of the extend is a legal scalable type but the result is not.

EXTRACT_SUBVECTOR is used to split the result, before
being replaced by target-specific [S|U]UNPK[HI|LO] operations.

For example:

```
zext <vscale x 16 x i8> %a to <vscale x 16 x i16>
```
should emit:

```
uunpklo z2.h, z0.b
uunpkhi z1.h, z0.b
```

Reviewers: sdesmalen, efriedma, david-arm

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79587
2020-06-05 12:08:42 +01:00
Philip Reames 4c735439fd [Statepoint] Migrate a few tests to gc-live bundle format and fix assert
The assert was missed in 0e7c7705, migrating the test revealed the problem.
2020-06-04 18:15:58 -07:00
Vedant Kumar 198762680e [LiveDebugValues] Cache LexicalScopes::getMachineBasicBlocks, NFCI
Summary:
Cache the results from getMachineBasicBlocks in LexicalScopes to speed
up UserValueScopes::dominates queries.  This replaces the caching done
in UserValueScopes. Compared to the old caching method, this reduces
memory traffic when a VarLoc is copied (e.g. when a VarLocMap grows),
and enables caching across basic blocks.

When compiling sqlite 3.5.7 (CTMark version), this patch reduces the
number of calls to getMachineBasicBlocks from 10,207 to 1,093. I also
measured a small compile-time reduction (~ 0.1% of total wall time, on
average, on my machine).

As a drive-by, I made the DebugLoc in UserValueScopes a const reference
to cut down on MetadataTracking traffic.

Reviewers: jmorse, Orlando, aprantl, nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80957
2020-06-04 16:58:45 -07:00
Matt Arsenault af867b7850 DAG: Change computeKnownBitsForFrameIndex to be usable by GISel
This wasn't getting much value from the DAG or depth arguments, since
it's only called on the frame index root nodes. FrameIndexes can also
only return a scalar value, so it also didn't need DemandedElts.
2020-06-04 10:50:26 -04:00
Matt Arsenault 931a68f26b RegAllocFast: Remove dead code 2020-06-04 09:38:31 -04:00
Sanjay Patel 652b3757c8 [x86] add test/code comment for chain value use (PR46195); NFC 2020-06-04 09:15:17 -04:00
Simon Pilgrim adf10dcf2e [DAG] scalarizeBinOpOfSplats - extract from the source of splat vector (PR46189)
D79003/rG9fa58d1bf2f8 exposed an issue with scalarizeBinOpOfSplats that we were extracting from the splatted vector result instead of the source, the splat index is only valid for the source vector not the result, which may contain undefs, including at the splat index.
2020-06-04 11:58:59 +01:00
Tim Northover 87e24c3200 Revert "[DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI"
This reverts commit 21dadd774f.

In at least PromoteIntBinOps, they wanted to know about users of *all* values
produced by the node not just the integer being promoted. For example not
replacing chain users if the operation was a load breaks the ordering of the
DAG.
2020-06-04 11:53:14 +01:00
Madhur Amilkanthwar b3cff3c720 Utility to dump .dot representation of SelectionDAG without firing viewer
Summary:
This patch adds support for dumping .dot
representation of SelectionDAG. It is inspired from the fact that,
a developer may want to just dump the graph at
a predictable path with a simple name to compare.
The exisitng utility (i.e. viewGraph) are overkill
for this motive hence this patch adds the requires support
while using the core routines from GraphWriter.

Example usage: DAG.dumpDotGraph("/tmp/graph.dot", "MyGraph")
will create /tmp/graph.dot file when DAG is an
object of SelectionDAG class.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D80711
2020-06-04 11:51:48 +05:30
Philip Reames ab6779bbd8 [Statepoint] Remove last of old ImmutableStatepoint code
To do so, I had to sink the old school inline operand handling into GCStatepointInst which is non ideal.  This code should be removed shortly and I was able to at least clean it up a bunch.
2020-06-03 20:31:17 -07:00
Philip Reames 91dd2f2536 [Statepoint] Delete more dead code from old wrappers
The verify() routine duplicates IR/Verifier.cpp checks, so while not technically dead it doesn't add any value either.
2020-06-03 20:10:30 -07:00
Matt Arsenault ed5017e153 GlobalISel: Start defining strict FP instructions
The AMDGPU lowering for unconstrained G_FDIV sometimes needs to
introduce a mode switch in the middle, so it's helpful to have
constrained instructions available to legalize this. Right now nothing
is preventing reordering of the mode switch with the other
instructions in the expansion.
2020-06-03 20:46:37 -04:00
Quentin Colombet ccb3c8e861 [RegisterCoalescer] Update empty subranges when rematerializing
When we rematerialize a value as part of the coalescing, we may
widen the register class of the destination register.
When this happens, updateRegDefUses may create additional subranges
to account for the wider register class.
The created subranges are empty and if they are not defined by
the rematerialized instruction we clean them up.
However, if they are defined by the rematerialized instruction but
unused, we failed to flag them as dead definition and would leave
them as empty live-range.
This is wrong because empty live-ranges don't interfere with anything,
thus if we don't fix them, we would fail to account that the
rematerialized instruction clobbers some lanes.

E.g., let us consider the following pseudo code:
def.lane_low64:reg128 = ldimm
newdef:reg32 = COPY def.lane_low64_low32

When rematerialization happens for newdef, we end up with:
newdef.lane_low64:reg128 = ldimm
 = use newdef.lane_low64_low32

Let's look at the live interval of newdef.
Before rematerialization, we would get:
newdef [defIdx, useIdx:0) 0@defIdx

Right after updateRegDefUses, newdef register class is widen to reg128
and the subrange definitions will be augmented to fill the subreg that
is used at the definition point, here lane_low64.
The resulting live interval would be:
newdef [newDefIdx, useIdx:0) 0@newDefIdx
 * lane_low64_high32 EMPTY
 * lane_low64_low32 [newDefIdx, useIdx:0)

Before this patch this would be the final status of the live interval.
Therefore we miss that lane_low64_high32 is actually live on the
definition point of newdef.

With this patch, after rematerializing, we check all the added subranges
and for the ones that are defined but empty, we flag them as dead def.
Thus, in that case, newdef would look like this:
newdef [newDefIdx, useIdx:0) 0@newDefIdx
 * lane_low64_high32 [newDefIdx, newDefIdxDead) ; <-- instead of EMPTY
 * lane_low64_low32 [newDefIdx, useIdx:0)

This fixes https://www.llvm.org/PR46154
2020-06-03 17:10:55 -07:00
Matt Arsenault 3866e0a563 GlobalISel: Fail expansion of G_DYN_STACKALLOC for StackGrowsUp 2020-06-03 19:56:07 -04:00
Philip Reames 382b3023cb [Statepoints][CGP] Minor parameter type cleanup 2020-06-03 16:00:38 -07:00
Matt Arsenault 66251f7e1d RegAllocFast: Record internal state based on register units
Record internal state based on register units. This is often more
efficient as there are typically fewer register units to update
compared to iterating over all the aliases of a register.

Original patch by Matthias Braun, but I've been rebasing and fixing it
for almost 2 years and fixed a few bugs causing intermediate failures
to make this patch independent of the changes in
https://reviews.llvm.org/D52010.
2020-06-03 16:51:46 -04:00
Victor Huang 3abe7aca45 [CodeGen] Enable tail call position check for speculatable functions
In the function "Analysis.cpp:isInTailCallPosition", it only checks whether
a call is in a tail call position if the call has side effects, access memory
or it is not safe to speculative execute. Therefore, a speculatable function
will not go through tail call position check and improperly tail called when
it is not in a tail-call position. This patch enables tail call position check
for speculatable functions.

Differential Revision: https://reviews.llvm.org/D80661
2020-06-03 10:37:45 -05:00
Kang Zhang 2cc77b2b8a [LiveVariables] Don't set undef reg PHI used as live for FromMBB
Summary:
In the patch D73152, it adds a new function LiveVariables::addNewBlock.
This new function will add the reg which PHI used to the MBB which reg
is from.
But the new function may cause LiveVariable Verification failed when the
Src reg in PHI is undef.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D80077
2020-06-03 15:25:30 +00:00
Henry Kao c57e41c000 [CodeGen][SVE] Replace deprecated calls in getCopyFromPartsVector()
Summary: Replaced getVectorNumElements() with getVectorElementCount(). Added operator overloads for class ElementCount. Fixes warning in several AArch64 unit tests.

Reviewers: sdesmalen, kmclaughlin, dancgr, efriedma, each, andwar, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80826
2020-06-03 11:20:02 -04:00
Simon Pilgrim ea80b40669 [DAG] SimplifyDemandedBits - peek through SHL if we only demand sign bits.
If we're only demanding the (shifted) sign bits of the shift source value, then we can use the value directly.

This handles SimplifyDemandedBits/SimplifyMultipleUseDemandedBits for both ISD::SHL and X86ISD::VSHLI.

Differential Revision: https://reviews.llvm.org/D80869
2020-06-03 16:11:54 +01:00
Simon Pilgrim c438b257f1 [DAG] GetDemandedBits - don't bother asserting for a non-null cast<> result. NFC.
cast<> will assert on failure anyhow.

This lets us fold the cast<> with the getAPIntValue() that uses it.
2020-06-03 12:43:07 +01:00
Simon Pilgrim 7a96c181d0 TargetFrameLowering.h - remove unnecessary includes. NFC.
Move TargetFrameLowering.h include to the top of the TargetFrameLoweringImpl.cpp includes (clang-format doesn't do this by default as the filenames don't match).
2020-06-03 11:12:42 +01:00
Kadir Cetinkaya c5468253aa
[llvm] Fix unused variable warnings 2020-06-03 11:49:01 +02:00
Djordje Todorovic dd1bc59b72 [CSInfo][MIPS][DwarfDebug] Add support for delay slots
This adds call site info support for call instructions with delay slot.
Search for instructions inside call delay slot, which load value
into parameter forwarding registers.
Return address of the call points to instruction after call delay slot,
which is not the one, immediately after the call instruction.

Patch by Nikola Tesic

Differential revision: https://reviews.llvm.org/D78107
2020-06-03 11:25:17 +02:00
Eric Christopher 153a24ab0f Undo initialization of TRI in CGP as this is unconditionally initialized
later.
2020-06-02 15:08:54 -07:00
Kadir Cetinkaya af86a10bad
[llvm] Fix unused variable warning 2020-06-02 22:46:24 +02:00
Eric Christopher 971459c3ef Fix up clang-tidy warnings around null and pointers. 2020-06-02 13:24:20 -07:00
Amy Kwan a3ada630d8 [DAGCombiner] Combine shifts into multiply-high
This patch implements a target independent DAG combine to produce multiply-high
instructions from shifts. This DAG combine will combine shifts for any type as
long as the MULH on the narrow type is legal.

For now, it is enabled on PowerPC as PowerPC is the only target that has an
implementation of the isMulhCheaperThanMulShift TLI hook introduced in
D78271.

Moreover, this DAG combine focuses on catching the pattern:
(shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>)
to produce mulhs when we have a sign-extend, and mulhu when we have
a zero-extend.

The patch performs the following checks:
- Operation is a right shift arithmetic (sra) or logical (srl)
- Input to the shift is a multiply
- Both operands to the shift are sext/zext nodes
- The extends into the multiply are both the same
- The narrow type is half the width of the wide type
- The shift amount is the width of the narrow type
- The respective mulh operation is legal

Differential Revision: https://reviews.llvm.org/D78272
2020-06-02 15:22:48 -05:00
Djordje Todorovic 4e8e5d60b4 [CSInfo][NFC] Interpret loaded parameter value separately
The collectCallSiteParameters() method searches for instructions
which load values into registers used for parameters passing.
Previously, interpretation of those values, loaded by one such
instruction, was implemented inside collectCallSiteParameters() method.

This patch moves the interpretation code from collectCallSiteParameters()
method into a separate static method named interpretValue. New method is
called from collectCallSiteParameters() to process each instruction from
targeted instruction scope.

The collectCallSiteParameters() searches for loaded parameter value
among instructions which precede the call instruction, inside the same
basic block. When needed, new method (interpretValue) could be used for
searching any instruction scope.

This is preparation for search of parameter value, loaded inside call
delay slot.

Patch by Nikola Tesic

Differential revision: https://reviews.llvm.org/D78106
2020-06-02 13:05:04 +02:00
Sriraman Tallam e0bca46b08 Options for Basic Block Sections, enabled in D68063 and D73674.
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.

+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.

Differential Revision: https://reviews.llvm.org/D68049
2020-06-02 00:23:32 -07:00
Denis Antrushin fa818ded24 [StatepointLowering] Handle UNDEF gc values.
Do not spill UNDEF GC values. Instead, replace corresponding
gc.relocate intrinsic with an (arbitrary, but recognizable) constant.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D80714
2020-06-02 10:18:33 +03:00
Richard Smith 4ccb6c36a9 Fix violations of [basic.class.scope]p2.
These cases all follow the same pattern:

struct A {
  friend class X;
  //...
  class X {};
};

But 'friend class X;' injects 'X' into the surrounding namespace scope,
rather than introducing a class member. So the second 'class X {}' is a
completely different type, which changes the meaning of the earlier name
'X' from '::X' to 'A::X'.

Additionally, the friend declaration is pointless -- members of a class
don't need to be befriended to be able to access private members.
2020-06-01 22:03:05 -07:00
Vedant Kumar 776708b00b [LiveDebugValues] Remove early-exit when testing regmasks, NFC
In transferRegisterDef, if the instruction has a regmask attached, we'll
check if any currently used register is clobbered by the regmask.

The early exit in this scan isn't necessary, costs a set lookup, and is
almost never taken [1]. Delete it.

[1]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136
2020-06-01 15:16:10 -07:00
Vedant Kumar 11c617c417 [LiveDebugValues] Add LocIndex::u32_{location,index}_t types for readability, NFC
This is per Adrian's suggestion in https://reviews.llvm.org/D80684.
2020-06-01 11:02:36 -07:00
Vedant Kumar 2ecaf93525 [LiveDebugValues] Speed up removeEntryValue, NFC
Summary:
Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.

This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):

```
          ---User Time---   --System Time--   --User+System--   ---Wall Time--- --- Name ---
  Before: 2.5402 ( 18.8%)   0.0050 (  0.4%)   2.5452 ( 17.3%)   2.5452 ( 17.3%) Live DEBUG_VALUE analysis
   After: 0.2364 (  2.1%)   0.0034 (  0.3%)   0.2399 (  2.0%)   0.2398 (  2.0%) Live DEBUG_VALUE analysis
```

The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.

Reviewers: nikic, aprantl, jmorse, djtodoro

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80684
2020-06-01 11:02:36 -07:00
Matt Arsenault 836c7dcf12 DAG: Fix getNode dropping flags if there's a glue output
The AMDGPU non-strict fdiv lowering needs to introduce an FP mode
switch in some cases, and has custom nodes to provide chain/glue for
the intermediate FP operations. We need to propagate nofpexcept here,
but getNode was dropping the flags.

Adding nofpexcept in the AMDGPU custom lowering is left to a future
patch.

Also fix a second case where flags were dropped, but in this case it
seems it just didn't handle this number of operands.

Test will be included in future AMDGPU patch.
2020-06-01 13:48:02 -04:00
hsmahesha 0ed2c04636 [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary:
While clustering mem ops, AMDGPU target needs to consider number of clustered bytes
to decide on max number of mem ops that can be clustered. This patch adds support to pass
number of clustered bytes to target mem ops clustering logic.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80545
2020-06-01 22:52:34 +05:30
Chen Zheng 2a24d350db [MachineCombine] add a hook for resource length limit 2020-05-31 23:21:04 -04:00
Matt Arsenault 95f65a7c6c AArch64/GlobalISel: Fix incorrect ptrmask usage for alignment
I inverted the mask when I ported to the new form of G_PTRMASK in
8bc03d2168.

I don't think this really broke anything, since G_VASTART isn't
handled for types with an alignment higher than the stack alignment.
2020-05-31 10:56:55 -04:00
Florian Hahn ec25a71eb7 [ScheduleDAG] Avoid unnecessary recomputation of topological order.
In some cases ScheduleDAGRRList has to add new nodes to resolve problems
with interfering physical registers. When new nodes are added, it
completely re-computes the topological order, which can take a long
time, but is unnecessary. We only add nodes one by one, and initially
they do not have any predecessors. So we can just insert them at the end
of the vector. Later we add predecessors, but the helper function
properly updates the topological order much more efficiently. With this
change, the compile time for the program below drops from 300s to 30s on
my machine.

    define i11129 @test1() {
      %L1 = load i11129, i11129* undef
      %B30 = ashr i11129 %L1, %L1
      store i11129 %B30, i11129* undef
      ret i11129 %L1
    }

This should be generally beneficial, as we can skip a large amount of
work. Theoretically there are some scenarios where we might not safe
much, e.g. when we add a dependency between the first and last node.
Then we would have to shift all nodes. But we still do not have to spend
the time re-computing the initial order.

Reviewers: MatzeB, atrick, efriedma, niravd, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D59722
2020-05-31 11:04:35 +01:00
Craig Topper a4dd45b7d0 [DAGCombiner] Move debug message and statistic update into CommitTargetLoweringOpt.
This code was repeated in two callers of CommitTargetLoweringOpt.
But CommitTargetLoweringOpt is also called from TargetLowering.
We should print a message for those calls to. So sink the
repeated code into CommitTargetLoweringOpt to catch those calls.
2020-05-30 19:47:07 -07:00
Simon Pilgrim e6aba43cda SafeStackColoring.h - reduce Instructions.h include to forward declaration. NFC.
SafeStackColoring.cpp - remove includes directly defined in SafeStackColoring.h header. NFC.
2020-05-30 14:38:02 +01:00
Simon Pilgrim 2b881f7911 CriticalAntiDepBreaker.cpp - remove includes directly defined in CriticalAntiDepBreaker.h header. NFC. 2020-05-30 14:32:36 +01:00
Simon Pilgrim e5bc07634d SafeStackLayout.cpp - remove includes directly defined in SafeStackLayout.h module header. NFC. 2020-05-30 14:30:19 +01:00
Simon Pilgrim 63824ad947 [TargetLowering] SimplifyDemandedBits - remove shift amount clamps from getValidShiftAmountConstant calls. NFC.
getValidShiftAmountConstant only returns a value if the shift amount is in range, so we don't need to check it again.
2020-05-30 14:04:55 +01:00
Simon Pilgrim 9d0bfcec83 [SelectionDAG] ComputeNumSignBits - use Valid Min/Max shift amount helpers directly. NFCI.
We are calling getValidShiftAmountConstant first followed by getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant if that failed. But both are used in the same way in ComputeNumSignBits and the Min/Max variants call getValidShiftAmountConstant internally anyhow.
2020-05-30 14:02:14 +01:00
Simon Pilgrim 81b50a7823 [SelectionDAG] Remove repeated getOperand() call. NFC. 2020-05-30 10:21:36 +01:00
Sourabh Singh Tomar 20c9bb44ec [DWARF5] Added support for emission of .debug_macro.dwo section
This patch adds support for emission of following DWARFv5 macro
forms in .debug_macro.dwo section:

- DW_MACRO_start_file
- DW_MACRO_end_file
- DW_MACRO_define_strx
- DW_MACRO_undef_strx

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D78866
2020-05-30 11:13:23 +05:30
Tobias Bosch 6a4714030e [DebugInfo][DAG] Don't reuse debug location on COPY if width changes.
Summary:
This caused incorrect debug information for parameters:
Previously, after a COPY of a parameter that changes the width,
we would emit a DBG_VALUE that continues to be associated to that
parameter, even though it now used a different width.
This made the LiveDebugValues pass assume the parameter value
got clobbered and it stopped tracking the parameter entry
value, leading to incorrect debug information.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39715

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80819
2020-05-29 13:24:33 -07:00
Zequan Wu 80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Sourabh Singh Tomar b47403c0a4 [DWARF5] Replace emission of strp with stx forms in debug_macro section
DW_MACRO_define_strx forms are supported now in llvm-dwarfdump and these
forms can be used in both debug_macro[.dwo] sections. An added advantage
for using strx forms over strp forms is that it uses indices
approach instead of a relocation to debug_str section.

This patch unify the emission for debug_macro section.

Reviewed by: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D78865
2020-05-30 00:24:09 +05:30
Xiangling Liao 26604d06b6 [AIX] Emit AvailableExternally Linkage on AIX
Since on AIX, our strategy is to not use -u to suppress any undefined
symbols, we need to emit .extern for the symbols with AvailableExternally
linkage.

Differential Revision: https://reviews.llvm.org/D80642
2020-05-29 13:12:59 -04:00
Guozhi Wei 40c08367e4 [DAGCombiner] Add command line options to guard store width reduction
optimizations

As discussed in the thread http://lists.llvm.org/pipermail/llvm-dev/2020-May/141838.html,
some bit field access width can be reduced by ReduceLoadOpStoreWidth, some
can't. If two accesses are very close, and the first access width is reduced,
the second is not. Then the wide load of second access will be stalled for long
time.

This patch add command line options to guard ReduceLoadOpStoreWidth and
ShrinkLoadReplaceStoreWithStore, so users can use them to disable these
store width reduction optimizations.

Differential Revision: https://reviews.llvm.org/D80745
2020-05-29 09:41:41 -07:00
Stanislav Mekhanoshin f6a6de288b GlobalISel: fix CombinerHelper::matchEqualDefs()
This matcher was always returning true for the different
results of a same instruction.

Differential Revision:
2020-05-29 09:30:02 -07:00
David Sherwood d8a78889f6 [CodeGen] Fix warning in visitShuffleVector
Make sure we only ask for the number of elements after we've
bailed out for scalable vectors.

Differential revision: https://reviews.llvm.org/D80632
2020-05-29 17:09:59 +01:00
Hendrik Greving d8f2814c91 [ModuloSchedule] Allow illegal phis to be moved across stages.
Fixes a trivial but impactful bug where we did not move illegal phis across stages. This
led to incorrect mappings in certain cases.
2020-05-29 07:01:27 -07:00
Sanjay Patel 21dadd774f [DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI 2020-05-29 09:31:52 -04:00
Florian Hahn d20a3d35e1 [DAGComb] Do not turn insert_elt into shuffle for single elt vectors.
Currently combineInsertEltToShuffle turns insert_vector_elt into a
vector_shuffle, even if the inserted element is a vector with a single
element. In this case, it should be unlikely that the additional shuffle
would be more efficient than a insert_vector_elt.

Additionally, this fixes a infinite cycle in DAGCombine, where
combineInsertEltToShuffle turns a insert_vector_elt into a shuffle,
which gets turned back into a insert_vector_elt/extract_vector_elt by
a custom AArch64 lowering (in visitVECTOR_SHUFFLE).

Such insert_vector_elt and extract_vector_elt combinations can be
lowered efficiently using mov on AArch64.

There are 2 test changes in arm64-neon-copy.ll: we now use one or two
mov instructions instead of a single zip1. The reason that we need a
second mov in ins1f2 is that we have to move the result to the result
register and is not really related to the DAGCombine fold I think.
But in any case, on most uarchs, mov should be cheaper than zip1. On a
Cortex-A75 for example, zip1 is twice as expensive as mov
(https://developer.arm.com/docs/101398/latest/arm-cortex-a75-software-optimization-guide-v20)

Reviewers: spatel, efriedma, dmgreen, RKSimon

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D80710
2020-05-29 13:21:13 +01:00
Simon Pilgrim b9826c1086 [CGP] Ensure address scaled offset is representable as int64_t
AddressingModeMatcher::matchScaledValue was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t

Fixes OSSFuzz#22723 which is a followup to rGc479052a74b2 (PR46004 / OSSFuzz#22357)
2020-05-29 12:25:43 +01:00
Paul Walker 92f3d29af0 [SelectionDAG] Update getNode asserts for EXTRACT/INSERT_SUBVECTOR.
Summary:
The description of EXTACT_SUBVECTOR and INSERT_SUBVECTOR has been
changed to accommodate scalable vectors (see ISDOpcodes.h). This
patch updates the asserts used to verify these requirements when
using SelectionDAG's getNode interface.

This patch introduces the MVT function getVectorMinNumElements
that can be used against fixed-length and scalable vectors when
only the known minimum vector length is required.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80709
2020-05-29 11:02:18 +00:00
David Sherwood 4265f1d23c [CodeGen] Fix warnings in getZeroExtendInReg
We should be using getVectorElementCount() to assert that two types
have the same numbers of elements. I encountered the warnings while
compiling this test:

  CodeGen/AArch64/sve-intrinsics-ld1.ll

Differential Revision: https://reviews.llvm.org/D80616
2020-05-29 11:51:07 +01:00
David Sherwood b147b88c84 [CodeGen] Add support for extracting elements of scalable vectors
I have tried to ensure that SelectionDAG and DAGCombiner do
sensible things for scalable vectors, and added support for a
limited number of simple folds. Codegen support for the vector
extract patterns have also been added to the AArch64 backend.

New vector extract tests have been added here:

  CodeGen/AArch64/sve-extract-element.ll

and I have also added new folds using inserts and extracts here:

  CodeGen/AArch64/sve-insert-element.ll

Differential Revision: https://reviews.llvm.org/D80208
2020-05-29 07:49:43 +01:00
Amara Emerson a0c90b5b2a [AArch64][GlobalISel] Enable extending loads combines post-legalization.
During legalization we can end up with extends of loads, which in the case of
zexts causes us to not hit tablegen imported patterns.

The caveat here is that we don't want anyext load forming, since some variants
are illegal. This change also prevents the combine from creating any illegal
loads.

Differential Revision: https://reviews.llvm.org/D80458
2020-05-28 22:48:20 -07:00
Matt Arsenault e13c84c3be GlobalISel: Work on improving stock set of legality predicates
I get confused by a lot of the predicate names here, since I would
assume they apply to vectors as well. Rename to reflect they only
apply to scalars.

Also add a few predicates AMDGPU uses that should be generally useful.
Also add any() to complement all. I've wanted to use this a few times
but then worked around it not being there.
2020-05-28 20:28:24 -04:00
Eric Christopher bce702e5f2 unsigned -> Register for readability. 2020-05-28 15:21:55 -07:00
Vedant Kumar d11155d273 [LiveDebugValues] Add cutoffs to avoid pathological behavior
Summary:
We received a report of LiveDebugValues consuming 25GB+ of RAM when
compiling code generated by Unity's IL2CPP scripting backend.

There's an initial 5GB spike due to repeatedly copying cached lists of
MachineBasicBlocks within the UserValueScopes members of VarLocs.

But the larger scaling issue arises due to the fact that prior to range
extension, there are 81K basic blocks and 156K DBG_VALUEs: given enough
memory, LiveDebugValues would insert 101 million MIs (I counted this by
incrementing a counter inside of VarLoc::BuildDbgValue).

It seems like LiveDebugValues would have to be rearchitected to support
this kind of input (we'd need some new represntation for DBG_VALUEs that
get inserted into ~every block via flushPendingLocs). OTOH, large globs
of auto-generated code are typically not debugged interactively.

So: add cutoffs to disable range extension when the input is too big. I
chose the cutoffs experimentally, erring on the conservative side. When
compiling a large collection of Apple software, range extension never
got disabled.

rdar://63418929

Reviewers: aprantl, friss, jmorse, Orlando

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80662
2020-05-28 13:53:40 -07:00
Vedant Kumar 4855534d10 [MachineVerifier] Verify that a DBG_VALUE has a debug location
Summary:
Verify that each DBG_VALUE has a debug location. This is required by
LiveDebugValues, and perhaps by other late passes.

There's an exception for tests: lots of tests use a two-operand form of
DBG_VALUE for convenience. There's no reason to prevent that.

This is an extension of D80665, but there's no dependency.

Reviewers: aprantl, jmorse, davide, chrisjackson

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80670
2020-05-28 13:53:40 -07:00
Vedant Kumar 0aa201eaf9 [MachineLICM] Assert that locations from debug insts are not lost
Summary:
Assert that MachineLICM does not move a debug instruction and then drop
its debug location. Later passes require each debug instruction to have
a location.

Testing: check-llvm, clang stage2 RelWithDebInfo build (x86_64)

Reviewers: aprantl, davide, chrisjackson, jmorse

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80665
2020-05-28 13:53:40 -07:00
Philip Reames 9d06547794 [Statepoints] Sink routines for grabbing projections to GCStatepointInst [NFC]
Mechanical movement, nothing more.
2020-05-28 13:51:59 -07:00
Philip Reames a0d2fd4a1f [Statepoint] Sink actual_args and gc_args to GCStatepointInst [NFC]
These are the two operand sets which are expected to survive more than another week or so.  Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code.

For those following along, this is likely to be the last (major) change in this sequence for about a week.  I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces).  Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.
2020-05-28 13:51:59 -07:00
Philip Reames 4d6cda9bda [Statepoint] Use iterate_range.empty [NFC] 2020-05-28 13:51:59 -07:00
Philip Reames 58beb76b7b [Statepoint] Convert a few more isStatepoint calls to idiomatic isa/cast
I'd apparently only grepped in the lib directories and missed a few used in the Statepoint header itself.  Beyond simple mechanical cleanup, changed the type of one routine to reflect the fact it also returns a statepoint.
2020-05-28 11:35:36 -07:00
Philip Reames 501aa47ab8 [Statepoint] Sink logic about actual callee into GCStatepointInst
Sinking logic around actual callee from Statepoint to GCStatepointInst.  While doing so, adjust naming to be consistent about refering to "actual" callee and follow precedent on naming from CallBase otherwise.

Use the result to simplify one consumer.  This is mostly just to ensure the new code is exercised, but is also a helpful cleanup on it's own.
2020-05-28 10:53:39 -07:00
Nikita Popov e0e5c64460 [SDAG] Don't require LazyBlockFrequencyInfo at optnone
While LazyBlockFrequencyInfo itself is lazy, the dominator tree
and loop info analyses it requires are not. Drop the dependency
on this pass in SelectionDAGIsel at O0.
This makes for a ~0.6% O0 compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80387
2020-05-28 18:48:33 +02:00
Alok Kumar Sharma d20bf5a725 [DebugInfo] Upgrade DISubrange to support Fortran dynamic arrays
This patch upgrades DISubrange to support fortran requirements.

Summary:
Below are the updates/addition of fields.
lowerBound - Now accepts signed integer or DIVariable or DIExpression,
earlier it accepted only signed integer.
upperBound - This field is now added and accepts signed interger or
DIVariable or DIExpression.
stride - This field is now added and accepts signed interger or
DIVariable or DIExpression.
This is required to describe bounds of array which are known at runtime.

Testing:
unit test cases added (hand-written)
check clang
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D80197
2020-05-28 13:46:41 +05:30
Philip Reames 98a87c65a3 [Statepoint] Reduce scope of usage of ImmutableStatepoint
Can't quite fully remove it yet as some more items need sunk the GCStatepointInst class from the wrapper, but we can at least reduce scope.
2020-05-27 18:57:42 -07:00
Philip Reames 87bea912c2 [Statepoint] Replace uses of isX functions with idiomatic isa<X>
Now that all of the statepoint related routines have classes with isa support, let's cleanup.

I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.
2020-05-27 18:32:28 -07:00
Matt Arsenault dda82986f9 DAG: Fix expansion of DYNAMIC_STACKALLOC for StackGrowsUp targets
Can't test this since I can't directly use the default expansion for
AMDGPU. It needs to scale the amount by the wave size, rather than use
the raw byte size value.
2020-05-27 18:45:40 -04:00
Juneyoung Lee 54b6457240 [TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR
Summary:
This patch adds CanonicalizeFreezeInLoops before LSR.
Relevant patch: https://reviews.llvm.org/D77523

Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00

Reviewed By: nikic

Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77524
2020-05-28 05:21:12 +09:00
Jessica Paquette c593bf5342 [GlobalISel] Don't combine instructions which are fed by memory instructions.
If we have a memory instruction (e.g. a load), we shouldn't combine it away in
some trivial combine.

It's possible that, say, a call lives between the instructions. This could
modify the value loaded, making the load instructions not safe to fold.

Differential Revision: https://reviews.llvm.org/D80053
2020-05-27 12:48:58 -07:00
jasonliu 8d9ff23185 [NFC][XCOFF][AIX] Return function entry point symbol with dedicate function
Use getFunctionEntryPointSymbol whenever possible to enclose the
implementation detail and reduce duplicate logic.

Differential Revision: https://reviews.llvm.org/D80402
2020-05-27 17:54:22 +00:00
Philip Reames 1af3705c7f Start migrating away from statepoint's inline length prefixed argument bundles
In the current statepoint design, we have four distinct groups of operands to the call: call args, gc transition args, deopt args, and gc args. This format prexisted the support in IR for operand bundles and was in fact one of the inspirations for the extension. However, we never went back and rearchitected statepoints to fully leverage bundles.

This change is the first in a small sequence to do so. All this does is extend the SelectionDAG lowering code to allow deopt and gc transition operands to be specified in either inline argument bundles or operand bundles.

Differential Revision: https://reviews.llvm.org/D8059
2020-05-27 09:16:10 -07:00
Ties Stuij 0508fb45df [CodeGen][BFloat] Add bfloat MVT type
Summary:
This patch adds BFloat MVT support. It also adds fixed and scalable vector MVT
types for BFloat.

This patch is part of a series that adds support for the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: aemerson, huntergr, craig.topper, fpetrogalli, sdesmalen, LukeGeeson, ostannard

Reviewed By: ostannard

Subscribers: LukeGeeson, pbarrio, dschuff, kristof.beyls, hiraditya, aheejin, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79706
2020-05-27 13:38:12 +01:00
Konstantin Schwarz f2fad3f703 [GlobalISel][InlineAsm] Add missing EarlyClobber flag to inline asm output operands
Summary:
Previously, we only added early-clobber flags to the 'group' immediate flag operand
of an inline asm operand.
However, we also have to add the EarlyClobber flag to the MachineOperand itself.

This fixes PR46028

Reviewers: arsenm, leonardchan

Reviewed By: arsenm, leonardchan

Subscribers: phosek, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80467
2020-05-27 12:04:18 +02:00
Matt Arsenault ef3e831226 GlobalISel: Basic legalization for G_PTRMASK 2020-05-26 21:20:30 -04:00
Vedant Kumar 6e39379bbb [DwarfExpression] Support entry values for indirect parameters
Summary:
A struct argument can be passed-by-value to a callee via a pointer to a
temporary stack copy. Add support for emitting an entry value DBG_VALUE
when an indirect parameter DBG_VALUE becomes unavailable. This is done
by omitting DW_OP_stack_value from the entry value expression, to make
the expression describe the location of an object.

rdar://63373691

Reviewers: djtodoro, aprantl, dstenb

Subscribers: hiraditya, lldb-commits, llvm-commits

Tags: #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D80345
2020-05-26 14:22:28 -07:00
Chris Jackson bd7ff5d94f [DebugInfo] Correct debuginfo for post-ra hoist and sink in Machine LICM
Reviewers: vsk, aprantl

Differential Revision: https://reviews.llvm.org/D79868
2020-05-26 21:07:10 +01:00
Simon Pilgrim 50db8402fc ResourcePriorityQueue.h - reduce unnecessary includes to forward declarations. NFC.
Move includes to ResourcePriorityQueue.cpp
2020-05-26 19:22:14 +01:00
Matt Arsenault 9786e7552d Revert "[AMDGPU] NFC target dependent requiresUniformRegister refactored out"
This reverts commit fb38b98338.

This will regress compile time.
2020-05-26 12:58:18 -04:00
alex-t fb38b98338 [AMDGPU] NFC target dependent requiresUniformRegister refactored out
Summary: Target specific method encapsulated into the Target Lowering Info.

Reviewers: rampitec, vpykhtin

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70085
2020-05-26 19:49:20 +03:00
Matt Arsenault 8bc03d2168 GlobalISel: Merge G_PTR_MASK with llvm.ptrmask intrinsic
Confusingly, these were unrelated and had different semantics. The
G_PTR_MASK instruction predates the llvm.ptrmask intrinsic, but has a
different format. G_PTR_MASK only allows clearing the low bits of a
pointer, and only a constant number of bits. The ptrmask intrinsic
allows an arbitrary mask. Replace G_PTR_MASK to match the intrinsic.

Only selects the cases that look like the old instruction. More work
is needed to select the general case. Also new legalization code is
still needed to deal with the case where the incoming mask size does
not match the pointer size, which has a specified behavior in the
langref.
2020-05-26 11:48:13 -04:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Sanjay Patel f368040c14 [DAGCombiner] try to move splat after binop with splat constant
binop (splat X), (splat C) --> splat (binop X, C)
binop (splat C), (splat X) --> splat (binop C, X)

We do this in IR, and there's a similar fold for the case with 2
non-constant operands just above the code diff in this patch.

This was discussed in D79718, and the extra shuffle in the test
(llvm/test/CodeGen/X86/vector-fshl-128.ll::sink_splatvar) where it
was noticed disappears because demanded elements analysis is no
longer blocked. The large majority of the test diffs seem to be
benign code scheduling changes, but I do see another type of win:
moving the splat later allows binop narrowing in some cases.

Regressions were avoided on x86 and ARM with the INSERT_VECTOR_ELT
restriction.

Differential Revision: https://reviews.llvm.org/D79886
2020-05-26 08:12:46 -04:00
hsmahesha 09f7dcb64e [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic
Summary:
Clean-up code around mem ops clustering logic. This patch cleans up code within
the function clusterNeighboringMemOps(). It is WIP, and this patch is a first cut.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80119
2020-05-26 15:49:21 +05:30
Fangrui Song 872c5fb143 [AsmPrinter] Don't generate .Lfoo$local for -fno-PIC and -fPIE
-fno-PIC and -fPIE code generally cannot be linked in -shared mode and there is no benefit accessing via local aliases.

Actually, a .Lfoo$local reference will be converted to a STT_SECTION (if no section relaxation) reference which will cause the section symbol (sizeof(Elf64_Sym)=24) to be generated.
2020-05-25 23:35:49 -07:00
Fangrui Song 9d55e4ee13 Make explicit -fno-semantic-interposition (in -fpic mode) infer dso_local
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.

This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).

Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
2020-05-25 20:48:18 -07:00
Simon Pilgrim 8f48814879 FunctionLoweringInfo.h - move APInt.h dependency to FunctionLoweringInfo.cpp. NFC. 2020-05-25 12:58:35 +01:00
Simon Pilgrim 9fa58d1bf2 [DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling
For the supported binops (basic arithmetic, logicals + shifts), if we fail to simplify the demanded vector elts, then call SimplifyMultipleUseDemandedBits and try to peek through ops to remove unnecessary dependencies.

This helps with PR40502.

Differential Revision: https://reviews.llvm.org/D79003
2020-05-25 12:41:22 +01:00
Orivej Desh 838d12207b [TargetLoweringObjectFileImpl] Use llvm::transform
Fixes a build issue with libc++ configured with _LIBCPP_RAW_ITERATORS (ADL not effective)

```
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1602:3: error: no matching function for call to 'transform'
  transform(HexString.begin(), HexString.end(), HexString.begin(), tolower);
  ^~~~~~~~~
```

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D80475
2020-05-24 20:59:24 -07:00
Sanjay Patel 7eed772a27 [PatternMatch] abbreviate vector inst matchers; NFC
Readability is not reduced with these opcodes/match lines,
so reduce odds of awkward wrapping from 80-col limit.
2020-05-24 09:19:47 -04:00
Simon Pilgrim 1603106725 [TargetLowering] Improve expandFunnelShift shift amount masking
For the 'inverse shift', we currently always perform a subtraction of the original (masked) shift amount.

But for the case where we are handling power-of-2 type widths, we can replace:

(sub bw-1, (and amt, bw-1) ) -> (and (xor amt, bw-1), bw-1) -> (and ~amt, bw-1)

This allows x86 shifts to fold away the and-mask.

Followup to D77301 + D80466.

http://volta.cs.utah.edu:8080/z/Nod0Gr

Differential Revision: https://reviews.llvm.org/D80489
2020-05-24 11:25:09 +01:00
Amy Kwan b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Fangrui Song de172ef61e [CFIInstrInserter] Delete unneeded checks 2020-05-23 14:13:31 -07:00
Nikita Popov 2833c46f75 [DwarfEHPrepare] Don't prune unreachable resumes at optnone
Disable pruning of unreachable resumes in the DwarfEHPrepare pass
at optnone. While I expect the pruning itself to be essentially free,
this does require a dominator tree calculation, that is not used for
anything else. Saving this DT construction makes for a 0.4% O0
compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80400
2020-05-23 20:58:01 +02:00
Simon Pilgrim fe0006c882 TargetLowering.h - remove unnecessary TargetMachine.h include. NFC
Replace with forward declaration and move dependency down to source files that actually need it.

Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.
2020-05-23 19:49:38 +01:00
Nikita Popov 0c6bba71e3 [TargetPassConfig] Don't add alias analysis at optnone
When performing codegen at optnone, don't add alias analysis to
the pipeline. We don't need it, but it causes an unnecessary
dominator tree calculation.

I've also moved the module verifier call to the top so that a bunch
of disabled-at-optnone passes group more nicely.

Differential Revision: https://reviews.llvm.org/D80378
2020-05-23 10:35:03 +02:00
Craig Topper 7392820f98 [Align] Remove operations on MaybeAlign that asserted that it had a defined value.
If the caller needs to reponsible for making sure the MaybeAlign
has a value, then we should just make the caller convert it to an Align
with operator*.

I explicitly deleted the relational comparison operators that
were being inherited from Optional. It's unclear what the meaning
of two MaybeAligns were one is defined and the other isn't
should be. So make the caller reponsible for defining the behavior.

I left the ==/!= operators from Optional. But now that exposed a
weird quirk that ==/!= between Align and MaybeAlign required the
MaybeAlign to be defined. But now we use the operator== from
Optional that takes an Optional and the Value.

Differential Revision: https://reviews.llvm.org/D80455
2020-05-22 21:54:28 -07:00
Fangrui Song 0840d725c4 [MC] Change MCCFIInstruction::createDefCfaOffset to cfiDefCfaOffset which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 17:07:11 -07:00
Fangrui Song 7e49dc6184 [MC] Change MCCFIInstruction::createDefCfa to cfiDefCfa which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 15:47:26 -07:00
Jean-Michel Gorius 65cd2c7a80 Revert "[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias"
This temporarily reverts commit 7019cea26d.

It seems that, for some targets, there are instructions with a lot of memory operands (probably more than would be expected). This causes a lot of buildbots to timeout and notify failed builds. While investigations are ongoing to find out why this happens, revert the changes.
2020-05-22 21:26:46 +02:00
Simon Pilgrim c479052a74 [CGP] Ensure address offset is representable as int64_t
AddressingModeMatcher::matchAddr was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t

Fixes PR46004 / OSSFuzz#22357
2020-05-22 17:00:22 +01:00
Simon Pilgrim 4ed909bb5b TargetLowering.h - remove unnecessary includes. NFC.
Replace with forward declarations and move SizeOpts.h down to TargetLoweringBase.cpp
2020-05-22 14:26:27 +01:00
Simon Pilgrim d4c0a082a4 [TargetLowering] Move TargetLoweringBase::isJumpTableRelative() implementation into TargetLoweringBase.cpp. NFC.
This will help with reducing header dependencies in TargetLowering.h in a future patch.
2020-05-22 14:26:27 +01:00
Simon Pilgrim b9def827b7 StatepointLowering.h - remove unused includes. NFC. 2020-05-22 10:49:11 +01:00
Simon Pilgrim 1041e8b886 MILexer.h/cpp - remove unused includes. NFC.
Remove duplicates in MILexer.cpp that are already included in MILexer.h.
2020-05-22 10:49:10 +01:00
Jessica Paquette 49a4f3f7d8 [AArch64][GlobalISel] Add a post-legalizer combiner with a very simple combine.
(This patch is by Jessica, I'm just committing it on her behalf because I need
a post-legalizer combiner for something else).

This supersedes D77250, which did equivalent work in the selector. This can be
done pre-legalization or post-legalization. Post-legalization is more likely to
hit, since G_IMPLICIT_DEFs tend to appear during legalization. There's no reason
to not do it pre-legalization though-- if it can be caught earlier, great.

(I also think that it might be worth reimplementing D78769 using a
target-specific post-legalization combine too after thinking about it for a
while.)

Differential Revision: https://reviews.llvm.org/D78852
2020-05-21 18:47:32 -07:00
Craig Topper f96a7706d9 [Target] Use Align in TargetLoweringObjectFile::getSectionForConstant.
Differential Revision: https://reviews.llvm.org/D80363
2020-05-21 15:23:29 -07:00
Arthur Eubanks fc937806ef Don't jump to landing pads in Control Flow Optimizer
Summary: Likely fixes https://bugs.llvm.org/show_bug.cgi?id=45858.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80047
2020-05-21 15:19:10 -07:00
Jean-Michel Gorius 7019cea26d [CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias
Summary:
To support all targets, the mayAlias member function needs to support instructions with multiple operands.

This revision also changes the order of the emitted instructions in some test cases.

Reviewers: efriedma, hfinkel, craig.topper, dmgreen

Reviewed By: efriedma

Subscribers: MatzeB, dmgreen, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80161
2020-05-21 23:02:54 +02:00
Hendrik Greving 8a6a2c4cb6 [ModuloSchedule] Add missing comma.
This is a test commit as per Chris to verify commit access.

Thanks!
2020-05-21 13:18:07 -07:00
Marcello Maggioni dbaed589ab [SelectionDAG] Add the option of disabling generic combines.
Summary:
For some targets generic combines don't really do much and they
consume a disproportionate amount of time.
There's not really a mechanism in SDISel to tactically disable
combines, but we can have a switch to disable all of them and
let the targets just implement what they specifically need.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79112
2020-05-21 20:11:29 +00:00
Thomas Raoux 20c0527af7 [ModuloSchedule] Trivial fix for instruction with more than one destination in modulo peeler.
When moving an instruction into a block where it was referenced by a phi when peeling,
refer to the phi's register number and assert that the instruction has it in its destinations.
This way, it also covers instructions with more than one destination.

Patch by Hendrik Greving!

Differential Revision: https://reviews.llvm.org/D80027
2020-05-21 08:14:42 -07:00
Sjoerd Meijer b0614509a0 [HardwareLoops] llvm.loop.decrement.reg definition
This is split off from D80316, slightly tightening the definition of overloaded
hardwareloop intrinsic llvm.loop.decrement.reg specifying that both operands
its result have the same type.
2020-05-21 10:48:16 +01:00
Denis Antrushin dedcefe09d [Statepoint] Constant fold FP deopt args.
We do not have any special handling for constant FP deopt arguments.
They are just spilled to stack or generated in register by MOVS
instruction. This is inefficient and, when we have too many such
constant arguments, may result in register allocation failure.
Instead, we can bitcast such constant FP operands to appropriately
sized integer and record as constant into statepoint and later, into
StackMap.

Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D80318
2020-05-21 11:02:54 +03:00
Craig Topper ae5ab2f40a [LegalizeDAG] Modify ExpandLegalINT_TO_FP to swap data for little/big endian instead of the pointers.
Will make it easier to pass the pointer info and alignment
correctly to the loads/stores.

While there also make the i32 stores independent and use a token
factor to join before the load.
2020-05-20 22:29:59 -07:00
Eli Friedman f26bdb539e Make Value::getPointerAlignment() return an Align, not a MaybeAlign.
If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.

Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer.  This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated.  I
updated clang's handling of C++ references, and added a release note for
this.

Differential Revision: https://reviews.llvm.org/D80072
2020-05-20 16:37:20 -07:00
Craig Topper 17bd86bc9b [LegalizeVectorTypes] Create correct memoperands in SplitVecRes_INSERT_SUBVECTOR.
Previously this code just used a default constructed
MachinePointerInfo. But we know the accesses are to a fixed stack
object or at least somewhere on the stack.

While there fix the alignment passed to the full vector load/stores.

I don't think this function is currently exercised in tree so I
don't know how to test it. I just noticed it when I removed
non-constant index support in this function.

Differential Revision: https://reviews.llvm.org/D80058
2020-05-20 15:06:36 -07:00
Arthur Eubanks 8a88755610 Reland [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 11:25:44 -07:00
Arthur Eubanks b8cbff51d3 Revert "[X86] Codegen for preallocated"
This reverts commit 810567dc69.

Some tests are unexpectedly passing
2020-05-20 10:04:55 -07:00
Arthur Eubanks 810567dc69 [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 09:20:38 -07:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Simon Pilgrim d9b9ce6c04 CommandFlags.h - remove unnecessary includes. NFC.
Replace with forward declarations and move necessary includes down to source files.

Exposes an implicit dependency on TargetMachine.h in llvm-opt-fuzzer.cpp
2020-05-20 09:58:37 +01:00
QingShan Zhang 2b59e9f1bd [DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression
We have the getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression.
However, during negating the expression, the cost might change as we are changing the DAG,
and then, hit the assertion if we negated the wrong expression as the cost is not trustful anymore.

This patch is target to remove the getNegatibleCost to avoid the out of sync with getNegatedExpression,
and check the cost during negating the expression. It also reduce the duplicated code between
getNegatibleCost and getNegatedExpression. And fix the crash for the test in D76638

Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D77319
2020-05-20 02:12:16 +00:00
Matt Arsenault 08ae945318 GlobalISel: Copy correct flags to select
This was looking for a compare condition, and copying the compare
flags. I don't think this was ever correct outside of certain min/max
patterns which aren't checked, but this probably predates select
instructions having fast math flags.
2020-05-19 18:31:24 -04:00
Matt Arsenault e6658079ac GlobalISel: Remove unused include 2020-05-19 17:56:55 -04:00
Matt Arsenault 4dad4914f7 CodeGen: Use Register 2020-05-19 17:56:55 -04:00
Craig Topper ccba60a784 [StackColoring] When remapping alloca's move the To alloca if the From alloca is before it.
If To is after From its possible that there's a use of From
between them.

Fixes issue reported here http://lists.llvm.org/pipermail/llvm-dev/2020-May/141421.html

Differential Revision: https://reviews.llvm.org/D80101
2020-05-19 10:37:27 -07:00
Matt Arsenault a7759d1785 GlobalISel: Fix IRTranslator for constantexpr selects
This was assuming a select is always an instruction, which is not
true.
2020-05-19 09:52:48 -04:00
Simon Pilgrim cdafe59f95 TargetLoweringObjectFile.h - remove unnecessary includes. NFCI.
Replace with forward declarations and move includes down to source files where required.

I also needed to move the TargetLoweringObjectFile::SectionForGlobal wrapper implementation down into TargetLoweringObjectFile.cpp
2020-05-19 09:28:13 +01:00
Eli Friedman 27b4e6931d [NFC] Replace MaybeAlign with Align in TargetTransformInfo. 2020-05-18 19:25:49 -07:00
Reid Kleckner 47cc6db928 Re-land [Debug][CodeView] Emit fully qualified names for globals
This reverts commit 525a591f0f.

Fixed an issue with pointers to members based on typedefs. In this case,
LLVM would emit a second UDT. I fixed it by not passing the class type
to getTypeIndex when the base type is not a function type. lowerType
only uses the class type for direct function types. This suggests if we
have a PMF with a function typedef, there may be an issue, but that can
be solved separately.
2020-05-18 17:31:00 -07:00
Matt Arsenault ae98939172 GlobalISel: Fold G_MUL x, 0, and G_*DIV 0, x 2020-05-18 18:08:26 -04:00
Amara Emerson 17842025ed [GlobalISel] Add support for using vector values in memset inlining. 2020-05-18 14:56:16 -07:00
Matt Arsenault 3e315697ac DAG: Use correct pointer size for llvm.ptrmask
This was ignoring the address space, and would assert on address
spaces with a different size from the default.
2020-05-18 16:46:11 -04:00
Craig Topper c9f63297e2 Fix several places that were calling verifyFunction or verifyModule without checking the return value.
verifyFunction/verifyModule don't assert or error internally. They
also don't print anything if you don't pass a raw_ostream to them.
So the caller needs to check the result and ideally pass a stream
to get the messages. Otherwise they're just really expensive no-ops.

I've filed PR45965 for another instance in SLPVectorizer
that causes a lit test failure.

Differential Revision: https://reviews.llvm.org/D80106
2020-05-18 13:28:46 -07:00
David Sherwood 364c595403 [SVE] Ignore scalable vectors in InterleavedLoadCombinePass
I have changed the pass so that we ignore shuffle vectors with
scalable vector types, and replaced VectorType with FixedVectorType
in the rest of the pass. I couldn't think of an easy way to test
this change, since for scalable vectors we shouldn't be using
shufflevectors for interleaving. This change fixes up some
type size assert warnings I found in the following test:

  CodeGen/AArch64/sve-intrinsics-int-arith-imm.ll

Differential Revision: https://reviews.llvm.org/D79700
2020-05-18 16:35:55 +01:00
Hans Wennborg 525a591f0f Revert 76c5f277f2 "Re-land [Debug][CodeView] Emit fully qualified names for globals"
> Before this patch, S_[L|G][THREAD32|DATA32] records were emitted with a simple name, not the fully qualified name (namespace + class scope).
>
> Differential Revision: https://reviews.llvm.org/D79447

This causes asserts in Chromium builds:

CodeViewDebug.cpp:2997: void llvm::CodeViewDebug::emitDebugInfoForUDTs(const std::vector<std::pair<std::string, const DIType *>> &):
Assertion `OriginalSize == UDTs.size()' failed.

I will follow up on the Phabricator issue.
2020-05-18 11:26:30 +02:00
OCHyams 709c52b955 [DebugInfo][DWARF] Emit a single location instead of a location list
for variables in nested scopes (including inlined functions) if there is a
single location which covers the entire scope and the scope is contained in a
single block.

Based on work by @jmorse.

Reviewed By: vsk, aprantl

Differential Revision: https://reviews.llvm.org/D79571
2020-05-18 09:43:32 +01:00
Mehdi Amini 8697d443ab Fix warning "defined but not used" for debug function (NFC) 2020-05-17 23:50:18 +00:00
Mehdi Amini ffc6e593d2 Replace dyn_cast with isa when the result isn't used (NFC)
Fix build warning: unused variable 'BB'
2020-05-17 23:15:17 +00:00
Nikita Popov 52e98f620c [Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)
Now that load/store alignment is required, we no longer need most
of them. Also switch the getLoadStoreAlignment() helper to return
Align instead of MaybeAlign.
2020-05-17 22:19:15 +02:00
David Blaikie a055e3856f DebugInfo: Reduce long-distance dependence on what will/won't emit a debug_addr section
This is a no-op/NFC at the moment & generally makes the code /somewhat/
cleaner/less reliant on assumptions about what will produce a debug_addr
section.

It's still a bit "spooky action at a distance" - the add ranges code
pre-emptively inserts addresses into the address pool it knows will
eventually be used by the range emission code (or low/high pc).

The 'ideal' would be either to actually compute the addresses needed for
range (& loc) emission earlier - which would mean decanonicalizing the
range/loc representation earlier to account for whether it was going to
use addrx encodings or not (which would be unfortunate, but could be
refactored to be relatively unobtrusive).

Alternatively, emitting the range/loc sections earlier would cause them
to request the needed addresses sooner - but then you endup having to
split finalizeModuleInfo because some things need to be handled there
before the ranges/locs are emitted, I think...
2020-05-17 12:45:56 -07:00
Craig Topper 796ae8cf82 [LegalizeDAG] Use MachinePointerInfo::getUnknownStack in place of MachinePointerInfo() in a couple places. NFC
We know the pointer somewhere on the stack, we just don't know
exactly where since the index may be variable.

Differential Revision: https://reviews.llvm.org/D80060
2020-05-16 15:48:16 -07:00
Eli Friedman 4f04db4b54 AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968.  Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044
2020-05-16 14:53:16 -07:00
Sanjay Patel 5be37cb124 [x86][CGP] try to hoist funnel shift above select-of-splats
This is basically the same patch as D63233, but converted to
funnel shifts rather than regular shifts. I did not see a
way to effectively share code for these 2 cases though.

This follows D79718 and D79827 to re-fix PR37426 because
that gets canonicalized to funnel shift intrinsics in IR.

I did draft an alternative patch as an enhancement to
"shouldSinkOperands()", but that was awkward because
we have to key the transform from the select, but then
look at both its users and its operands.
2020-05-16 10:44:47 -04:00
Simon Pilgrim 228913780b DIEHash.cpp - remove headers explicitly included in DIEHash.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:57 +01:00
Simon Pilgrim 25656332f1 AggressiveAntiDepBreaker.cpp - remove headers explicitly included in AggressiveAntiDepBreaker.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:56 +01:00
Craig Topper 13d44b2a0c [LegalizeDAG] Use getMemBasePlusOffset to simplify some code. Use other signature of getMemBasePlusOffset in another location. NFCI
The code was calculating an offset from a stack pointer SDValue.
This is exactly what getMemBasePlusOffset does. I also replaced
sizeof(int) with a hardcoded 4. We know the type we're operating
on is 4 bytes. But the size of int that the source code is being
compiled with isn't guaranteed to be 4 bytes.

While here replace another use of getMemBasePlusOffset that was
proceeded with a call to getConstant with the other signature
that call getConstant internally.
2020-05-16 01:02:08 -07:00
Craig Topper 45c7b3fd91 [LegalizeVectorTypes] Remove non-constnat INSERT_SUBVECTOR handling. NFC
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-15 23:56:13 -07:00
Ten Tzen e32f8e5d4a [Windows EH] Fix the order of Nested try-catches in $tryMap$ table
This bug is exposed by Test7 of ehthrow.cxx in MSVC EH suite where
a rethrow occurs in a try-catch inside a catch (i.e., a nested Catch
handlers). See the test code in
https://github.com/microsoft/compiler-tests/blob/master/eh/ehthrow.cxx#L346

When an object is rethrown in a Catch handler, the copy-ctor of this
object must be executed after the destructions of live objects, but
BEFORE the dtors of live objects in parent handlers.

Today Windows 64-bit runtime (__CxxFrameHandler3 & 4) expects nested Catch
handers
are stored in pre-order (outer first, inner next) in $tryMap$ table, so
that given a State, its Catch's beginning State can be properly
retrieved. The Catch beginning state (which is also the ending State) is
the State where rethrown object's copy-ctor must take place.

LLVM currently stores nested catch handlers in post-ordering because
it's the natural way to compute the highest State in Catch.
The fix is to simply store TryCatch handler in pre-order, but update
Catch's highest State after child Catches are all processed.

Differential Revision: https://reviews.llvm.org/D79474?id=263919
2020-05-15 22:03:43 -07:00
Diogo Sampaio 6c68f75ee4 Prevent register coalescing in functions whith setjmp
Summary:
In the the given example, a stack slot pointer is merged
between a setjmp and longjmp. This pointer is spilled,
so it does not get correctly restored, addinga undefined
behaviour where it shouldn't.

Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb

Reviewers: eli.friedman, thanm, efriedma

Reviewed By: efriedma

Subscribers: MatzeB, qcolombet, tpr, rnk, efriedma, hiraditya, llvm-commits, chill

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77767
2020-05-16 00:36:34 +01:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Alexandre Ganea 76c5f277f2 Re-land [Debug][CodeView] Emit fully qualified names for globals
Before this patch, S_[L|G][THREAD32|DATA32] records were emitted with a simple name, not the fully qualified name (namespace + class scope).

Differential Revision: https://reviews.llvm.org/D79447
2020-05-15 10:37:09 -04:00
David Sherwood fb1c55b57d [CodeGen] Fix FoldConstantVectorArithmetic for scalable vectors
For now I have changed FoldConstantVectorArithmetic to return early
if we encounter a scalable vector, since the subsequent code assumes
you can perform lane-wise constant folds. However, in future work we
should be able to extend this to look at splats of a constant value
and fold those if possible. I have also added the same code to
FoldConstantArithmetic, since that deals with vectors too.

The warnings I fixed in this patch were being generated by this
existing test:

  CodeGen/AArch64/sve-int-arith.ll

Differential Revision: https://reviews.llvm.org/D79421
2020-05-15 14:58:44 +01:00
Ties Stuij 8c24f33158 [IR][BFloat] Add BFloat IR type
Summary:
The BFloat IR type is introduced to provide support for, initially, the BFloat16
datatype introduced with the Armv8.6 architecture (optional from Armv8.2
onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE
754 floating point IR type.

This is part of a patch series upstreaming Armv8.6 features. Subsequent patches
will upstream intrinsics support and C-lang support for BFloat.

Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau

Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78190
2020-05-15 14:43:43 +01:00
Simon Pilgrim 9d4b4f344d DAGCombiner.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC.
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-15 12:41:35 +01:00
Konstantin Schwarz 5425cdc3ad [GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified
Summary:
D78319 introduced basic support for inline asm input operands in GlobalISel.
However, that patch did not handle the case where a memory input operand still needs to
be indirectified. Later code asserts that the memory operand is already indirect.

This patch adds an early return false to trigger the SelectionDAG fallback for now.

Reviewers: arsenm, paquette

Reviewed By: arsenm

Subscribers: thakis, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79955
2020-05-15 13:37:06 +02:00
David Sherwood 8ce4a8f6df [CodeGen] Refactor CreateStackTemporary
I've created a new variant of CreateStackTemporary that takes
TypeSize and Align arguments, and made the older instances of
CreateStackTemporary call this new function. This refactoring is
in preparation for more patches in this area related to scalable
vectors and improving the alignment calculations.

Differential Revision: https://reviews.llvm.org/D79933
2020-05-15 07:29:13 +01:00
Alok Kumar Sharma 4042ada1c1 [DebugInfo] support for DW_AT_data_location in llvm
This patch adds support for DWARF attribute DW_AT_data_location.

Summary:
Dynamic arrays in fortran are described by array descriptor and
data allocation address. Former is mapped to DW_AT_location and
later is mapped to DW_AT_data_location.

Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D79592
2020-05-15 11:33:17 +05:30
Alok Kumar Sharma ab699d78a2 [DebugInfo] llvm rejects DWARF operator DW_OP_push_object_address
llvm rejects DWARF operator DW_OP_push_object_address.This DWARF
operator is needed for Flang to support allocatable array.

Summary:
Currently llvm rejects DWARF operator DW_OP_push_object_address.
below error is produced when llvm finds this operator.

[..]
invalid expression
!DIExpression(151)
warning: ignoring invalid debug info in pushobj.ll
[..]

There are some parts missing in support of this operator, need to
be completed.

Testing
-added a unit testcase
-check-debuginfo
-check-llvm

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D79306
2020-05-15 11:10:35 +05:30
Kang Zhang aedb6615a8 [MachineVerifier] Use the for_range loop to instead llvm::any_of
Summary:
In the patch D78849, it uses llvm::any_of to instead of for loop to
simplify the function addRequired().
It's obvious that above code is not a NFC conversion. Because any_of
will return if any addRequired(Reg) is true immediately, but we want
every element to call addRequired(Reg).

This patch uses for_range loop to fix above any_of bug.

Reviewed By: MaskRay, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D79872
2020-05-15 02:35:33 +00:00
Nico Weber e0c1554274 Revert "[GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified"
This reverts commit 887dfeec53.
It broke irtranslator-inline-asm.ll on many bots, e.g.
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/38606/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Airtranslator-inline-asm.ll
2020-05-14 19:37:05 -04:00
Konstantin Schwarz 887dfeec53 [GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified
Summary:
D78319 introduced basic support for inline asm input operands in GlobalISel.
However, that patch did not handle the case where a memory input operand still needs to
be indirectified. Later code asserts that the memory operand is already indirect.

This patch adds an early return false to trigger the SelectionDAG fallback for now.

Reviewers: arsenm, paquette

Reviewed By: arsenm

Subscribers: wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79955
2020-05-14 23:42:31 +02:00
Stanislav Mekhanoshin 184b383457 Add v16f64 value type
We need to use it to handle <16 x double> indirect indexes
in the AMDGPU BE.

The only visible change from adding it is in ARM cost model.
To me it looks reasonable. With doubling a vector size it
quadruples the cost up to the size 8 and then it did only
double it. Now it also quadruples, which seems a logical
progression to me.

Actual AMDGPU code is to follow, this is a common part, plus
load/store legalization in the AMDGPU BE not to break what
works now.

Differential Revision: https://reviews.llvm.org/D79952
2020-05-14 14:28:00 -07:00
Eli Friedman accc6b5545 LoadInst should store Align, not MaybeAlign.
The fact that loads and stores can have the alignment missing is a
constant source of confusion: code that usually works can break down in
rare cases.  So fix the LoadInst API so the alignment is never missing.

To reduce the number of changes required to make this work, IRBuilder
and certain LoadInst constructors will grab the module's datalayout and
compute the alignment automatically.  This is the same alignment
instcombine would eventually apply anyway; we're just doing it earlier.
There's a minor risk that the way we're retrieving the datalayout
could break out-of-tree code, but I don't think that's likely.

This is the last in a series of patches, so most of the necessary
changes have already been merged.

Differential Revision: https://reviews.llvm.org/D77454
2020-05-14 13:19:21 -07:00
Eli Friedman 4532a50899 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
Simon Pilgrim acb6f1ae09 TargetLowering.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC.
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-14 18:13:58 +01:00
Jay Foad 17941437a2 [TargetLowering] Improve expansion of FSHL/FSHR
Use an extra shift-by-1 instead of a compare and select to handle the
shift-by-zero case. This sometimes saves one instruction (if the compare
couldn't be combined with a previous instruction). It also works better
on targets that don't have good select instructions.

Note that currently this change doesn't affect most targets because
expandFunnelShift is not used because funnel shift intrinsics are
lowered early in SelectionDAGBuilder. But there is work afoot to change
that; see D77152.

Differential Revision: https://reviews.llvm.org/D77301
2020-05-14 16:36:22 +01:00
Sanjay Patel 26e742fd84 [x86][CGP] improve sinking of splatted vector shift amount operand
Expands on the enablement of the shouldSinkOperands() TLI hook in:
D79718

The last codegen/IR test diff shows what I suspected could happen - we were
sinking all splat shift operands into a loop. But that's not what we want in
general; we only want to sink the *shift amount* operand if it is a splat.

Differential Revision: https://reviews.llvm.org/D79827
2020-05-14 08:36:03 -04:00
Simon Pilgrim 80715b7124 SelectionDAG.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC.
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-14 13:23:00 +01:00
Konstantin Schwarz 91063cf85a [GlobalISel][InlineAsm] Add support for basic input operand constraints
Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette

Reviewed By: arsenm

Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78319
2020-05-14 10:43:37 +02:00
Eric Christopher bfa200ebcf Remove an unused variable. 2020-05-13 15:13:02 -07:00
Eli Friedman ed428c429e [SelectionDAG] Require constant index for INSERT/EXTRACT_SUBVECTOR.
It sounds like an interesting idea in theory, but nothing is actually
taking advantage of it, and specifying/implementing the edge cases is
painful. So just forbid it.

Differential Revision: https://reviews.llvm.org/D79814
2020-05-13 13:08:59 -07:00
Craig Topper de92dc2850 [Statepoint] Mark FixupStatepointCallerSaved as preserving the CFG
I'm hoping this will restore some compile time lost by D75936 and D75937.

Differential Revision: https://reviews.llvm.org/D79813
2020-05-13 10:59:44 -07:00
Benjamin Kramer a8bf2deae4 [CodeGenPrepare] Remove a superflouos variable. NFC.
Fixes a -Wunused-variable warning in Release builds.
2020-05-13 18:25:20 +02:00
David Green fa15255d8a [ARM] Convert floating point splats to integer
Under MVE a vdup will always take a gpr register, not a floating point
value. During DAG combine we convert the types to a bitcast to an
integer in an attempt to fold the bitcast into other instructions. This
is OK, but only works inside the same basic block. To do the same trick
across a basic block boundary we need to convert the type in
codegenprepare, before the splat is sunk into the loop.

This adds a convertSplatType function to codegenprepare to do that,
putting bitcasts around the splat to force the type to an integer. There
is then some adjustment to the code in shouldSinkOperands to handle the
extra bitcasts.

Differential Revision: https://reviews.llvm.org/D78728
2020-05-13 15:24:16 +01:00
Sourabh Singh Tomar e59744fd9b [DebugInfo] Fortran module DebugInfo support in LLVM
This patch extends DIModule Debug metadata in LLVM to support
Fortran modules. DIModule is extended to contain File and Line
fields, these fields will be used by Flang FE to create debug
information necessary for representing Fortran modules at IR level.

Furthermore DW_TAG_module is also extended to contain these fields.
If these fields are missing, debuggers like GDB won't be able to
show Fortran modules information correctly.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D79484
2020-05-13 12:52:30 +05:30
Fangrui Song 66055230bf [TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for -fno-unique-section-names
GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee)

  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(SORT(.text.sorted.*))
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf.em.  */
    *(.gnu.warning)
  }

Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions.
gold's `-z keep-text-section-prefix` has the same problem.

In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem.

In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit"`
will go to the output section `.text` instead of `.text.exit` when linked by lld.
To address the problem, append a dot to become `.text.exit.`

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D79600
2020-05-12 14:14:17 -07:00
David Blaikie aa99da5ace Avoid binding pointers to "auto&" (by dereferencing the pointer that's non-null anyway)
Based on @djtodoro's 2552dc5317
2020-05-12 11:40:00 -07:00
Craig Topper 8c72b0271b [CodeGen] Use Align in MachineConstantPool. 2020-05-12 10:06:40 -07:00
Jay Foad 989be65b11 [GlobalISel][IRTranslator] Fix <1 x Ty> handling in ConstantExprs
Summary:
ConstantExprs involving operations on <1 x Ty> could translate into MIR
that failed to verify with:
*** Bad machine code: Reading virtual register without a def ***

The problem was that translate(const Constant &C, Register Reg) had
recursive calls that passed the same Reg in for the translation of a
subexpression, but without updating VMap for the subexpression first as
translate(const Constant &C, Register Reg) expects.

Fix this by using the same translateCopy helper function that we use for
translating Instructions. In some cases this causes extra G_COPY
MIR instructions to be generated.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45576

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78378
2020-05-12 16:51:03 +01:00
Jay Foad bd80a8bb87 [GlobalISel][IRTranslator] New helper function translateCopy. NFC.
Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78377
2020-05-12 16:51:03 +01:00
James Y Knight e9536795a3 Add comment for SelectionDAGBuilder::SL field. 2020-05-12 10:46:08 -04:00
Djordje Todorovic 8b7b84e99d Revert "[NFC][DwarfDebug] Prefer explicit to auto type deduction"
This wasn't proposed by the LLVM Style Guide.
Please see https://reviews.llvm.org/D79624.

This reverts commit rG2552dc5317e0.
2020-05-12 09:44:31 +02:00
Djordje Todorovic 41ca605813 Revert "[NFC][DwarfDebug] Avoid default capturing when using lambdas"
Reverting this because we found it isn't that useful.
Please see https://reviews.llvm.org/D79616.

This reverts commit rG45e5a32a8bd3.
2020-05-12 09:37:28 +02:00
David Sherwood 42c7a6d52b [CodeGen] Fix incorrect uses of getVectorNumElements()
I have fixed up some places in SelectionDAG::getNode() where we
used to assert that the number of vector elements for two types
are the same. I have changed such cases to assert that the
element counts are the same instead. I've added new tests that
exercise the code paths for all the truncations. All the extend
operations are covered by this existing test:

  CodeGen/AArch64/sve-sext-zext.ll

For the ISD::SETCC case I fixed this code path is exercised by
these existing tests:

  CodeGen/AArch64/sve-fcmp.ll
  CodeGen/AArch64/sve-intrinsics-int-compares-with-imm.ll

Differential Revision: https://reviews.llvm.org/D79399
2020-05-12 07:50:37 +01:00
Eli Friedman c9c930ae67 [SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment.
allocas in LLVM IR have a specified alignment. When that alignment is
specified, the alloca has at least that alignment at runtime.

If the specified type of the alloca has a higher preferred alignment,
SelectionDAG currently ignores that specified alignment, and increases
the alignment. It does this even if it would trigger stack realignment.
I don't think this makes sense, so this patch changes that.

I was looking into this for SVE in particular: for SVE, overaligning
vscale'ed types is extra expensive because it requires realigning the
stack multiple times, or using dynamic allocation. (This currently isn't
implemented.)

I updated the expected assembly for a couple tests; in particular, for
arg-copy-elide.ll, the optimization in question does not increase the
alignment the way SelectionDAG normally would. For the rest, I just
increased the specified alignment on the allocas to match what
SelectionDAG was inferring.

Differential Revision: https://reviews.llvm.org/D79532
2020-05-11 17:39:00 -07:00
Davide Italiano 288c9e8178 [GlobalISel] Remove debug locations when emitting G_FCONSTANT.
<rdar://problem/62991543>
2020-05-11 16:25:03 -07:00
Sanjay Patel 5f05c2f59a [CGP] remove duplicate function for finding a splat shuffle; NFC 2020-05-11 16:36:07 -04:00
Sam McCall 728cf6d86b Revert "[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression"
This reverts commit 3c44c441db.

Causes infloops on some inputs, see https://reviews.llvm.org/D77319 for repro
2020-05-11 16:44:01 +02:00
Djordje Todorovic 45e5a32a8b [NFC][DwarfDebug] Avoid default capturing when using lambdas
It is bad practice to capture by default (via [&] in this case) when
using lambdas, so we should avoid that as much as possible.

This patch fixes that in the getForwardingRegsDefinedByMI
from DwarfDebug module.

Differential Revision: https://reviews.llvm.org/D79616
2020-05-11 10:02:13 +02:00
Djordje Todorovic 2552dc5317 [NFC][DwarfDebug] Prefer explicit to auto type deduction
We should use explicit type instead of auto type deduction when
the type is so obvious. In addition, we remove ambiguity, since auto
type deduction sometimes is not that intuitive, so that could lead
us to some unwanted behavior.

This patch fixes that in the collectCallSiteParameters() from
DwarfDebug module.

Differential Revision: https://reviews.llvm.org/D79624
2020-05-11 09:12:58 +02:00
QingShan Zhang 3c44c441db [DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression
We have the getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression.
However, during negating the expression, the cost might change as we are changing the DAG,
and then, hit the assertion if we negated the wrong expression as the cost is not trustful anymore.

This patch is target to remove the getNegatibleCost to avoid the out of sync with getNegatedExpression,
and check the cost during negating the expression. It also reduce the duplicated code between
getNegatibleCost and getNegatedExpression. And fix the crash for the test in D76638

Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D77319
2020-05-11 02:41:10 +00:00
Matt Arsenault 3af85fa8f0 GlobalISel: Handle more cases in lowerUnmergeValues
Handle scalar sources, as well as vectors.
2020-05-09 19:33:32 -04:00
Craig Topper 24b3c2d058 [BreakFalseDeps] Harden pickBestRegisterForUndef against changing tied operands or physical registers that aren't renamable.
I don't have any test cases since X86 doesn't return any tied
operands from getUndefRegClearance today. But conceivably we could
want BreakFalseDeps to insert a dependency breaking XOR for
a tied operand in the future.
2020-05-09 15:37:31 -07:00
Matt Arsenault 69999605ee GlobalISel: Move code into lowering for G_MERGE_VALUES
Currently this code exists in widenScalar for G_MERGE_VALUE
sources. I'm not sure if the existing expansion in widenScalar should
be removed or not. The widenScalar variant tries to extend to the
requested size, but this just uses the original bitwidth.
2020-05-09 16:39:37 -04:00
Craig Topper bebdc62c3f [SelectionDAG] Remove ConstantPoolSDNode::getAlignment.
Use getAlign instead.

Differential Revision: https://reviews.llvm.org/D79459
2020-05-08 16:04:11 -07:00
Craig Topper d1119980e5 [SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.

Removing getAlignment() will be done as a follow up.

Differential Revision: https://reviews.llvm.org/D79436
2020-05-08 16:04:11 -07:00
Jessica Paquette f66309deab [GlobalISel] Don't add duplicate successors to MBBs when translating indirectbr
This fixes a verifier failure on a bot:

http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/

```
*** Bad machine code: MBB has duplicate entries in its successor list. ***
- function:    foo
- basic block: %bb.5 indirectgoto (0x7fe3d687ca08)
```

One of the GCC torture suite tests (pr70460.c) has an indirectbr instruction
which has duplicate blocks in its destination list.

According to the langref this is allowed:

> Blocks are allowed to occur multiple times in the destination list, though
> this isn’t particularly useful.
(https://www.llvm.org/docs/LangRef.html#indirectbr-instruction)

We don't allow this in MIR. So, when we translate such an instruction, the
verifier screams.

This patch makes `translateIndirectBr` check if a successor has already been
added to a block. If the successor is present, it is skipped rather than added
twice.

Differential Revision: https://reviews.llvm.org/D79609
2020-05-08 13:40:02 -07:00
Wei Mi aa2ddfc73d [SampleFDO] For functions without profiles, provide an option to put
them in a special text section.

For sampleFDO, because the optimized build uses profile generated from
previous release, previously we couldn't tell a function without profile
was truely cold or just newly created so we had to treat them conservatively
and put them in .text section instead of .text.unlikely. The result was when
we persuing the best performance by locking .text.hot and .text in memory,
we wasted a lot of memory to keep cold functions inside.

In https://reviews.llvm.org/D66374, we introduced profile symbol list to
discriminate functions being cold versus functions being newly added.
This mechanism works quite well for regular use cases in AutoFDO. However,
in some case, we can only have a partial profile when optimizing a target.
The partial profile may be an aggregated profile collected from many targets.
The profile symbol list method used for regular sampleFDO profile is not
applicable to partial profile use case because it may be too large and
introduce many false positives.

To solve the problem for partial profile use case, we provide an option called
--profile-unknown-in-special-section. For functions without profile, we will
still treat them conservatively in compiler optimizations -- for example,
treat them as warm instead of cold in inliner. When we use profile info to
add section prefix for functions, we will discriminate functions known to be
not cold versus functions without profile (being unknown), and we will put
functions being unknown in a special text section called .text.unknown.
Runtime system will have the flexibility to decide where to put the special
section in order to achieve a balance between performance and memory saving.

Differential Revision: https://reviews.llvm.org/D62540
2020-05-08 11:18:09 -07:00
Simon Pilgrim 70293ba26f [DAG] SimplifyMultipleUseDemandedBits - remove superfluous bitcasts
If the SimplifyMultipleUseDemandedBits calls BITCASTs that peek through back to the original type then we can remove the BITCASTs entirely.

Differential Revision: https://reviews.llvm.org/D79572
2020-05-08 19:04:49 +01:00
Fangrui Song befbc99a7f Reland D79501 "[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units."
With a fix to uninitialized EndOffset.

DW_OP_call_ref is the only operation that has an operand which depends
on the DWARF format. The patch fixes handling that operation in DWARF64
units.

Differential Revision: https://reviews.llvm.org/D79501
2020-05-08 09:35:54 -07:00
Krasimir Georgiev c5e0967e4c Revert "[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units."
This reverts commit 989ae9e848.

Newly added test fails:
FAIL: LLVM::DW_OP_call_ref_unexpected.s

http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/28298
2020-05-08 17:24:32 +02:00
Simon Pilgrim 9f726376e3 LiveIntervalCalc - remove unnecessary includes. NFC.
As we're inheriting from LiveRangeCalc, all the headers are already explicitly required by LiveRangeCalc.h
2020-05-08 14:57:35 +01:00
Igor Kudrin 989ae9e848 [DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units.
DW_OP_call_ref is the only operation that has an operand which depends
on the DWARF format. The patch fixes handling that operation in DWARF64
units.

Differential Revision: https://reviews.llvm.org/D79501
2020-05-08 15:14:42 +07:00
aartbik 771d30c647 [llvm] [CodeGen] Fixed vector halving bug for masked store
Summary:
Note that this fix is very similar to what has already been
done for the masked load in https://reviews.llvm.org/D78608

Bugs:
https://bugs.llvm.org/show_bug.cgi?id=45563
https://bugs.llvm.org/show_bug.cgi?id=45833

Reviewers: craig.topper, nicolasvasilache, mehdi_amini

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79611
2020-05-07 19:01:40 -07:00
James Y Knight 7af9d386da Correctly modify the CFG in IfConverter, and then remove the
CorrectExtraCFGEdges function.

The latter was a workaround for "Various pieces of code" leaving bogus
extra CFG edges in place. Where by "various" it meant only
IfConverter::MergeBlocks, which failed to clear all of the successors
of dead blocks it emptied out. This wouldn't matter a whole lot,
except that the dead blocks remained listed as predecessors of
still-useful blocks, inhibiting optimizations.

This fix slightly changed two thumb tests, because the correct CFG
successors allowed for the "diamond" if-conversion pattern to be
detected, when it could only use "simple" before.

Additionally, the removal of a now-redundant call to analyzeBranch
(with AllowModify=true) in BranchFolder::OptimizeFunction caused a
later check for an empty block in BranchFolder::OptimizeBlock to
fail. Correct this by moving the call to analyzeBranch in
OptimizeBlock higher.

Differential Revision: https://reviews.llvm.org/D79527
2020-05-07 18:17:07 -04:00
Hiroshi Yamauchi 1b4e3def03 [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare.
Summary:
This helps detect some missed BFI updates during CodeGenPrepare.

This is debug build only and disabled behind a flag.

Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77417
2020-05-07 11:58:00 -07:00
Thomas Raoux dc26dec331 [ModuloSchedule] Fix epilogue peeling with illegal phi.
When peeling out the epilogue we need to ignore illegal phis coming from stages
greater than the producer stage. Otherwise we end up with circular phi
dependencies.

Differential Revision: https://reviews.llvm.org/D79581
2020-05-07 10:04:05 -07:00
Kerry McLaughlin a31f4c52bf [SVE][CodeGen] Fix legalisation for scalable types
Summary:
This patch handles illegal scalable types when lowering IR operations,
addressing several places where the value of isScalableVector() is
ignored.

For types such as <vscale x 8 x i32>, this means splitting the
operations. In this example, we would split it into two
operations of type <vscale x 4 x i32> for the low and high halves.

In cases such as <vscale x 2 x i32>, the elements in the vector
will be promoted. In this case they will be promoted to
i64 (with a vector of type <vscale x 2 x i64>)

Reviewers: sdesmalen, efriedma, huntergr

Reviewed By: efriedma

Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78812
2020-05-07 10:01:31 +01:00
Craig Topper 7b9d6673bf [SelectionDAG] When splitting gather operands in type legalization, set MMO size to UnknownSize
I missed this case when I did the same for gather results and scatter
operands in c69a4d6bef.
2020-05-06 19:57:14 -07:00
Alexandre Ganea f78b674de4 Revert "[Debug][CodeView] Emit fully qualified names for globals"
This reverts commit 06591b6d19.
2020-05-06 15:23:58 -04:00
LemonBoy 7fa5abd343 [SelectionDAG] Fix assertion failure with big shift amounts
Calling getShiftAmountTy with LegalTypes set may return a type that's too narrow to hold the shift amount for integer type it's applied to.

Fixes the regression introduced by D79096

Differential Revision: https://reviews.llvm.org/D79405
2020-05-06 11:58:37 -07:00
Michael Liao 6533c1da7f Revert "[MIR] Fix a bug in MIR printer."
This reverts commit e38018b80d.
2020-05-06 11:26:42 -04:00
Michael Liao e38018b80d [MIR] Fix a bug in MIR printer.
- Need to skip the assignment of `ID`, which is used to index that two
  object arrays.
2020-05-06 10:33:45 -04:00
Sanjay Patel 2f1fe1864d [DAGCombiner] sink target-supported FP<->int cast op after concat vectors
Try to combine N short vector cast ops into 1 wide vector cast op:
concat (cast X), (cast Y)... -> cast (concat X, Y...)

This is part of solving PR45794:
https://bugs.llvm.org/show_bug.cgi?id=45794

As noted in the code comment, this is uglier than I was hoping because
the opcode determines whether we pass the source or destination type
to isOperationLegalOrCustom(). Also IIUC, there's no way to validate
what the other (dest or src) type is. Without the extra legality check
on that, there's an ARM regression test in:
test/CodeGen/ARM/isel-v8i32-crash.ll
...that will crash trying to lower an unsupported v8f32 to v8i16.

Differential Revision: https://reviews.llvm.org/D79360
2020-05-06 10:25:58 -04:00
Alexandre Ganea 06591b6d19 [Debug][CodeView] Emit fully qualified names for globals
Emit S_[L|G][THREAD32|DATA32] records with a fully qualified name (namespace + class scope).

Differential Revision: https://reviews.llvm.org/D79447
2020-05-06 09:12:00 -04:00
David Spickett 055ea585c7 Reland "[CodeGen] Make logic of CCState::resultsCompatible clearer"
This relands commit d782d1f898.
With a typo fixed, which was causing the x86 test failure.
2020-05-06 13:40:49 +01:00
David Spickett e1022cb5d4 Revert "[CodeGen] Make logic of CCState::resultsCompatible clearer"
This reverts commit d782d1f898
which caused test CodeGen/X86/sibcall.ll to fail.
2020-05-06 10:14:17 +01:00
David Spickett d782d1f898 [CodeGen] Make logic of CCState::resultsCompatible clearer 2020-05-06 09:48:58 +01:00
Konstantin Schwarz e82b0e9a8e [GlobalISel][InlineAsm] Add support for basic output operand constraints
Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette

Reviewed By: arsenm

Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78318
2020-05-06 10:06:13 +02:00
Puyan Lotfi 0c4aab27b3 [NFC] Outliner label name clean up.
Just simplifying how the label name is generated while using
std::to_string instead of Twine.

Differential Revision: https://reviews.llvm.org/D79464
2020-05-05 23:27:46 -04:00
Jinsong Ji 80b78a47e5 [MachinePipeliner] Add ORE for MachinePipeliner
This patch adds ORE for MachinePipeliner, so that people can anaylyze
their code using opt-viewer or other tools, then optimize the code to
catch more piplining opportunities.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D79368
2020-05-05 16:04:53 +00:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
David Sherwood cd3a54c55a [CodeGen] Fix warnings due to SelectionDAG::getSplatSourceVector
Summary:
I have fixed several places in getSplatSourceVector and isSplatValue
to work correctly with scalable vectors. I added new support for
the ISD::SPLAT_VECTOR DAG node as one of the obvious cases we can
support with scalable vectors. In other places I have tried to do
the sensible thing, such as bail out for vector types we don't yet
support or don't intend to support.

It's not possible to add IR test cases to cover these changes, since
they are currently only ever exercised on certain targets, e.g.
only X86 targets use the result of getSplatSourceVector. I've
assumed that X86 tests already exist to test these code paths for
fixed vectors. However, I have added some AArch64 unit tests that
test the specific functions I have changed.

Differential revision: https://reviews.llvm.org/D79083
2020-05-05 08:45:41 +01:00
Krzysztof Parzyszek 156092bbcc [RegisterCoalescer] Extend a subrange if needed when filling range gap
Register live ranges may have had gaps that after coalescing should be
removed. This is done by adding a new segment to the range, and merging
it with neighboring segments. When doing so, do not assume that each
subrange of the register ended at the same index. If a subrange ended
earlier, adding this segment could make the live range invalid.
Instead, if the subrange is not live at the start of the segment,
extend it first.
2020-05-04 16:49:59 -05:00
Snehasish Kumar c8ac29ab1d Descriptive symbol names for machine basic block sections.
Today symbol names generated for machine basic block sections use a
unary encoding to reduce bloat. This is essential when every basic block
in the binary is assigned a symbol however with basic block clusters
(rG05192e585ce175b55f2a26b83b4ed7882785c8e6) when we only need to
generate a few non-temporary symbols we can assign more descriptive
names making them more user friendly. With this change -

Cold cluster section for function foo is named "foo.cold"
Exception cluster section for function foo is named "foo.eh"
Other cluster sections identified by their ids are named "foo.ID"
Using this format works well with existing tools. It will demangle as
expected and works with existing symbolizers, profilers and debuggers
out of the box.

$ c++filt _Z3foov.cold
foo() [clone .cold]

$ c++filt _Z3foov.eh
foo() [clone .eh]

$c++filt _Z3foov.1234
foo() [clone 1234]

Tests for basicblock-sections are updated with some cleanup where
appropriate.

Differential Revision: https://reviews.llvm.org/D79221
2020-05-04 19:06:43 +00:00
Alexandre Ganea 721ea5b380 [DebugInfo][CodeView] Include namespace into emitted globals
Before this patch, global variables didn't have their namespace prepended in the Codeview debug symbol stream. This prevented Visual Studio from displaying them in the debugger (they appeared as 'unspecified error')

Differential Revision: https://reviews.llvm.org/D79028
2020-05-04 13:59:36 -04:00
Alex Richardson d1ff003fbb [SelectionDAGBuilder] Stop setting alignment to one for hidden sret values
We allocated a suitably aligned frame index so we know that all the values
have ABI alignment.
For MIPS this avoids using pair of lwl + lwr instructions instead of a
single lw. I found this when compiling CHERI pure capability code where
we can't use the lwl/lwr unaligned loads/stores and and were to falling
back to a byte load + shift + or sequence.

This should save a few instructions for MIPS and possibly other backends
that don't have fast unaligned loads/stores.
It also improves code generation for CodeGen/X86/pr34653.ll and
CodeGen/WebAssembly/offset.ll since they can now use aligned loads.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78999
2020-05-04 14:44:39 +01:00
Ten Tzen 21c1a0c730 Test Commit: add two head comments in WinEHPrepare.cpp
This is a Test commit.
2020-05-03 01:15:59 -07:00
LemonBoy 6d103ca855 [SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad
The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces.

The technique employed by `ExpandLoad`  is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines.

Differential Revision: https://reviews.llvm.org/D79096
2020-05-02 15:18:10 -07:00
Simon Pilgrim a09a3c6d3e Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call"
Causing buildbot failures
2020-05-02 20:08:33 +01:00
Simon Pilgrim 8e05ac0a51 [DAGCombine] visitTRUNCATE - remove GetDemandedBits call
rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).
2020-05-02 19:52:17 +01:00
Benjamin Kramer 97f92261df [MBP] tuple->pair. NFC.
std::pair has a trivial copy ctor, std::tuple doesn't.
2020-05-02 20:23:34 +02:00
Sam McCall d10c995b4d std::isspace -> llvm::isSpace (where locale should be ignored)
I've left out some cases where I wasn't totally sure this was right or
whether the include was ok (compiler-rt) or idiomatic (flang).
2020-05-02 15:36:04 +02:00
Simon Pilgrim 7cb5a51f38 [DAG] SimplifyDemandedVectorElts - add INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling 2020-05-01 16:20:51 +01:00
Simon Pilgrim 65d32a9892 [DAG] SimplifyDemandedVectorElts - remove INSERT_SUBVECTOR if we don't demand the subvector 2020-05-01 16:20:51 +01:00
Simon Pilgrim e3c0be596c [DAG] SimplifyDemandedVectorElts - add EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling 2020-05-01 13:48:07 +01:00
Craig Topper 6a1ad76dab [X86] Don't return true from isTruncateFree for vectors
Also fix some cost tables for vXi1 types to match the costs entries for the types they will be promoted to.

Differential Revision: https://reviews.llvm.org/D79045
2020-04-30 16:43:35 -07:00
Benjamin Kramer 31db4dbbbe Clean up warnings after a2c8cd1812 2020-04-30 17:01:30 +02:00
diggerlin a2c8cd1812 [AIX] emit .extern and .weak directive linkage
SUMMARY:

emit .extern and .weak directive linkage

Reviewers: hubert.reinterpretcast, Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76932
2020-04-30 09:54:10 -04:00
Simon Pilgrim 96238486ed [DAGCombine] Move the remaining X86 funnel shift patterns to DAGCombine
X86 matches several 'shift+xor' funnel shift patterns:

  fold (or (srl (srl x1, 1), (xor y, 31)), (shl x0, y))  -> (fshl x0, x1, y)
  fold (or (shl (shl x0, 1), (xor y, 31)), (srl x1, y))  -> (fshr x0, x1, y)
  fold (or (shl (add x0, x0), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y)

These patterns are also what we end up with the proposed expansion changes in D77301.

This patch moves these to DAGCombine's generic MatchFunnelPosNeg.

All existing X86 test cases still pass, and we just have a small codegen change in pr32282.ll.

Reviewed By: @spatel

Differential Revision: https://reviews.llvm.org/D78935
2020-04-30 12:57:17 +01:00
Simon Pilgrim 6547a5ceb2 [DAG] Add TODO comment regarding ADD(X,X) -> SHL(X,1) canonicalization
As discussed on D78935
2020-04-30 12:57:16 +01:00
David Sherwood 058cd8c5be [CodeGen] Add support for inserting elements into scalable vectors
Summary:
This patch tries to ensure that we do something sensible when
generating code for the ISD::INSERT_VECTOR_ELT DAG node when operating
on scalable vectors. Previously we always returned 'undef' when
inserting an element into an out-of-bounds lane index, whereas now
we only do this for fixed length vectors. For scalable vectors it
is assumed that the backend will do the right thing in the same way
that we have to deal with variable lane indices.

In this patch I have permitted a few basic combinations for scalable
vector types where it makes sense, but in general avoided most cases
for now as they currently require the use of BUILD_VECTOR nodes.

This patch includes tests for all scalable vector types when inserting
into lane 0, but I've only included one or two vector types for other
cases such as variable lane inserts.

Differential Revision: https://reviews.llvm.org/D78992
2020-04-30 11:14:04 +01:00
Puyan Lotfi ffd5e121d7 [NFCi] Iterative Outliner + clang-format refactoring.
Prior to D69446 I had done some NFC cleanup to make landing an iterative
outliner a cleaner more straight-forward patch. Since then, it seems that has
landed but I noticed some ways it could be cleaned up. Specifically:

1) doOutline was meant to be the re-runable function, but instead
   runOnceOnModule was created that just calls doOutline.
2) In D69446 we discussed that the flag allowing the re-run of the
   outliner should be a flag to tell how many additional times to run
   the outliner again, not the total number of times. I don't think it
   makes sense to introduce a flag, but print an error if the flag is
   set to 0.

This is an NFCi, the i being that I get rid of the way that the
machine-outline-runs flag could be used to tell the outliner to not run
at all, and because I renamed the flag to '-machine-outliner-reruns'.

Differential Revision: https://reviews.llvm.org/D79070
2020-04-29 18:36:47 -04:00
Davide Italiano dcdb1b94e1 [MachineVerifier] Remove an unused function. NFCI. 2020-04-29 09:58:27 -07:00
Simon Pilgrim 1be7f2de1b Revert rG5c4b4a62256876 "PseudoSourceValue.h - reduce GlobalValue.h include to forward declaration. NFC."
Causes buildbot failures.
2020-04-29 16:12:19 +01:00
Simon Pilgrim 5c4b4a6225 PseudoSourceValue.h - reduce GlobalValue.h include to forward declaration. NFC.
Fix MachineMemOperand.h implicit dependency on Type.h via PseudoSourceValue.h
2020-04-29 15:39:27 +01:00
QingShan Zhang b5f89744cc [DAGCombine] Checking the cost directly to improve the code readability
Call getNegatedExpression(Cost) and check the Cost to make the code more clear.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D78347
2020-04-29 01:49:39 +00:00
Casey Carter 68b30bc02b [NFC] Correct spelling of "ambiguous" 2020-04-28 14:51:37 -07:00
Krzysztof Parzyszek 25a4b1904c Handle part-word LL/SC in atomic expansion pass
Differential Revision: https://reviews.llvm.org/D77213
2020-04-28 10:07:39 -05:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Craig Topper e13c141a91 [SelectionDAGBuilder] Use CallBase::isInlineAsm in a couple places. NFC
These lines were just changed from using CallBase::getCalledValue
to getCallledOperand. Go aheand change them to isInlineAsm.
2020-04-27 23:00:44 -07:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
LemonBoy f30416fdde [AsmPrinter] Fix emission of non-standard integer constants for BE targets
The code assumed that zero-extending the integer constant to the
designated alloc size would be fine even for BE targets, but that's not
the case as that pulls in zeros from the MSB side while we actually
expect the padding zeros to go after the LSB.

I've changed the codepath handling the constant integers to use the
store size for both small(er than u64) and big constants and then add
zero padding right after that.

Differential Revision: https://reviews.llvm.org/D78011
2020-04-27 14:57:29 -07:00
Nick Desaulniers 59acdf0aca fix D78849 for g++ < 7.1
Summary:
Looks like g++ < 7.1 has a bug resolving calls to member functions without
`this->` in lamdas with `auto` types.  It looks like multiple build bots are
using g++-5.

https://stackoverflow.com/questions/32097759/calling-this-member-function-from-generic-lambda-clang-vs-gcc
https://godbolt.org/z/MiaRt-

Reviewers: MaskRay, efriedma, jyknight, craig.topper, rsmith

Reviewed By: rsmith

Subscribers: hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78962
2020-04-27 13:47:00 -07:00
Wei Mi 68d2301e12 Recommit "Generate Callee Saved Register (CSR) related cfi directives
like .cfi_restore"

Insert .cfi_offset/.cfi_register when IncomingCSRSaved of current block
is larger than OutgoingCSRSaved of its previous block.

Original commit message:
https://reviews.llvm.org/D42848 only handled CFA related cfi directives but
didn't handle CSR related cfi. The patch adds the CSR part. Basically it reuses
the framework created in D42848. For each basicblock, the patch tracks which
CSR set have been saved at its CFG predecessors's exits, and compare the CSR
set with the set at its previous basicblock's exit (The previous block is the
block laid before the current block). If the saved CSR set at its previous
basicblock's exit is larger, .cfi_restore will be inserted.

The patch also generates proper .cfi_restore in epilogue to make sure the
saved CSR set is consistent for the incoming edges of each block.

Differential Revision: https://reviews.llvm.org/D74303
2020-04-27 12:46:58 -07:00
Nick Desaulniers c695ea2afa [MachineVerifier] retrofit iterators with range for. NFC
Summary:
Reviewing failures identified in D78586, I was finding the identifiers
for these iterators hard to read.

Reviewers: efriedma, MaskRay, jyknight

Reviewed By: MaskRay

Subscribers: hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78849
2020-04-27 12:15:55 -07:00
Davide Italiano c8433a5b1b [GlobalISel] Remove debug locations when emitting constants.
The tl;dr story is that this causes jumps in the emitted line
tables, even at `-O0`. We could at some point consider more fancy
solutions to preserve locations, but it doesn't seem to be worth
the effort for now.

<rdar://problem/62460788>

Differential Revision:  https://reviews.llvm.org/D78947
2020-04-27 11:27:08 -07:00
David Sherwood 096b25a8d8 [CodeGen] Use SPLAT_VECTOR for zeroinitialiser with scalable types
Summary:
When generating code for the LLVM IR zeroinitialiser operation, if
the vector type is scalable we should be using SPLAT_VECTOR instead
of BUILD_VECTOR.

Differential Revision: https://reviews.llvm.org/D78636
2020-04-27 15:57:59 +01:00
QingShan Zhang 2957fa0cd1 [NFC][DAGCombine] Adding three helper functions and change the getNegatedExpression to negateExpression
This is a NFC patch for D77319. The idea is to hide the getNegatibleCost inside the getNegatedExpression()
to have it return null if the cost is expensive, and add some helper function for easy to use. And
rename the old getNegatedExpression to negateExpression to avoid the semantic conflict.

Reviewed By: RKSimon

Differential revision: https://reviews.llvm.org/D78291
2020-04-27 04:11:42 +00:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Benjamin Kramer 1d42764df7 Give helpers internal linkage. NFC. 2020-04-25 11:50:52 +02:00
Snehasish Kumar 0cc063a8ff Use .text.unlikely and .text.eh prefixes for MachineBasicBlock sections.
Summary:
Instead of adding a ".unlikely" or ".eh" suffix for machine basic blocks,
this change updates the behaviour to use an appropriate prefix
instead. This allows lld to group basic block sections together
when -z,keep-text-section-prefix is specified and matches the behaviour
observed in gcc.

Reviewers: tmsriram, mtrofin, efriedma

Reviewed By: tmsriram, efriedma

Subscribers: eli.friedman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78742
2020-04-24 15:07:38 -07:00
Fangrui Song 10bc12588d [XRay] Change Sled.Function to PC-relative for sled version 2 and make llvm-xray support sled version 2 addresses
Follow-up of D78082 and D78590.

Otherwise, because xray_instr_map is now read-only, the absolute
relocation used for Sled.Function will cause a text relocation.
2020-04-24 14:41:56 -07:00
Amara Emerson dbb0356771 [AArch64][GlobalISel] Fix sub-64b stack parameter passing on Darwin.
A previous bug fix for varargs introduced a regression where we would
incorrectly widen some stores to memory when passing i8/i16 parameters on the
stack. This didn't show up seemingly because it only happens when there is
no signext/zeroext parameter attribute, which I think for Darwin clang adds.

Swift however seems to be a different story, and a plain anyext on the parameter
triggered the bug.

To fix this, I've added a new ValueHandler::assignValueToAddress type override
which lets us distiguish between varargs and fixed args (we still need this
widening behaviour for varargs to fix the original bug in 2018).

rdar://61353552
2020-04-24 13:56:43 -07:00
Jean-Michel Gorius 505685a67a [llvm][CodeGen] Check for memory instructions when querying for alias status
Summary:
Add a check to make sure that MachineInstr::mayAlias returns prematurely if at least one of its instruction parameters does not access memory. This prevents calls to TargetInstrInfo::areMemAccessesTriviallyDisjoint with incompatible instructions.

A side effect of this change is to render the mayAlias helper in the AArch64 load/store optimizer obsolete. We can now directly call the MachineInstr::mayAlias member function.

Reviewers: hfinkel, t.p.northover, mcrosier, eli.friedman, efriedma

Reviewed By: efriedma

Subscribers: efriedma, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78823
2020-04-24 22:54:46 +02:00
Simon Pilgrim 628b0243c8 AllocationOrder.h - split MCRegisterInfo.h include. NFC.
We only require to include MCRegister.h and SmallVector.h.
2020-04-24 18:42:43 +01:00
Fangrui Song 25e22613df [XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address)
Follow-up of D78082 (x86-64).

This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le.

MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined.
Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well.

Tested on AArch64 Linux and ppc64le Linux.

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D78590
2020-04-24 08:35:43 -07:00
Simon Pilgrim f10835a034 DwarfDebug.h - remove unnecessary forward declarations. NFC.
We include their headers already.
2020-04-24 15:34:54 +01:00
aartbik 907871d9ad [llvm] [CodeGen] Fixed vector halving bug for masked load
Summary:
Given a VL=14 that is enveloped by a proper VL=16, splitting the
masked load using the enveloping halving VL=8/8 should yields
should eventually yield V=8/5. This fixes various assert failures
in getHalfNumVectorElementsVT() and IncrementMemoryAddress().

Note, I suspect similar fixes will be needed for other masked
operations, but for now I send out a fix for masked load only.

Bugzilla issue 45563
https://bugs.llvm.org/show_bug.cgi?id=45563

Reviewers: craig.topper, mehdi_amini, nicolasvasilache

Reviewed By: craig.topper

Subscribers: hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78608
2020-04-23 15:12:44 -07:00
Christopher Tetreault ccd623eae3 [SVE] Remove calls to isScalable from CodeGen
Reviewers: efriedma, sdesmalen, stoklund, sunfish

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77755
2020-04-23 12:58:52 -07:00
Alex Richardson bbcfce4bad Use FrameIndexTy for stack protector
Using getValueType() is not correct for architectures extended with CHERI since
we need a pointer type and not the value that is loaded. While stack
protector is useless when you have CHERI (since CHERI provides much
stronger security guarantees), we still have a test to check that we can
generate correct code for checks. Merging b281138a1b
into our tree broke this test. Fix by using TLI.getFrameIndexTy().

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D77785
2020-04-23 13:12:27 +01:00
Amara Emerson 613f12dd8e [AArch64][GlobalISel] Set the current debug loc when missing in some cases. 2020-04-23 01:34:57 -07:00
Aditya Nandakumar 3db893b371 [GISel]: Relax opcode checking at the top level to enable CSE
Loosen the restriction on what kinds of opcodes can be CSEd as
targets may want to CSE some generic target specific pseudos.
NFC as far as this change is concerned as CSEConfig still pretty much is
a subset of this check.

Differential Revision: https://reviews.llvm.org/D78684
2020-04-22 17:31:33 -07:00
Vedant Kumar f0b52beef3 [AArch64InstrInfo] Ignore debug insts in areCFlagsAccessedBetweenInstrs [7/14]
Summary:
Fix an issue where the presence of debug info could disable a peephole
optimization due to areCFlagsAccessedBetweenInstrs returning the wrong
result.

In test/CodeGen/AArch64/arm64-csel.ll, the issue was found in the
function @foo5, in which the first compare could successfully be
optimized but not the second.

Reviewers: t.p.northover, eastig, paquette

Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, dsanders, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78157
2020-04-22 17:03:40 -07:00
Vedant Kumar 26271c8384 [AArch64InstrInfo] Ignore debug insts in canInstrSubstituteCmpInstr [6/14]
Summary:
Fix an issue where the presence of debug info could disable a peephole
optimization in optimizeCompareInstr due to canInstrSubstituteCmpInstr
returning the wrong result.

Depends on D78137.

Reviewers: t.p.northover, eastig, paquette

Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, llvm-commits, dsanders

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78151
2020-04-22 17:03:40 -07:00
Vedant Kumar f1a71b5949 [GIsel][LegalizerHelper] Account for debug insts when creating mem libcalls [5/14]
Summary:
While lowering memory intrinsics, GIsel attempts to form a tail call to
a library routine.

There might be a DBG_LABEL or something after the intrinsic call,
though: in that case, GIsel should still be able to form the tail call,
and should also delete the debug insts after the tail call as the
transform makes them invalid.

Reviewers: dsanders, aemerson

Subscribers: hiraditya, aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78335
2020-04-22 17:03:40 -07:00
Vedant Kumar ba9db54505 [GIsel][CombinerHelper] Fix for missed ElideBrByInvertingCond/CombineIndexedLoadStore combines [4/14]
Summary:
Fix an issue which could result in ElideBrByInvertingCond or
CombineIndexedLoadStore being missed when debug info is present. In both
cases the fix is s/hasOneUse/hasOneNonDbgUse/.

Reviewers: aemerson, dsanders

Subscribers: hiraditya, aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78254
2020-04-22 17:03:40 -07:00
Vedant Kumar 5c04274dab [GIsel][CombinerHelper] Don't consider debug insts in dominance queries [3/14]
Summary:
This fixes several issues where the presence of debug instructions could
disable certain combines, due to dominance queries finding uses/defs that
don't actually exist.

Reviewers: dsanders, fhahn, paquette, aemerson

Subscribers: hiraditya, arphaman, aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78253
2020-04-22 17:03:40 -07:00
Vedant Kumar 5bae277584 [GISel][RegBankSelect] Hide assertion failure from LLT::getScalarSizeInBits [2/14]
Summary:
It looks like RegBankSelect can try to assign a bank based on a
DBG_VALUE instead of ignoring it. This eventually leads to an assert
in AArch64RegisterBankInfo::getInstrMapping because there is some info
missing from the DBG_VALUE MachineOperand (I see: `Assertion failed:
(RawData != 0 && "Invalid Type"), function getScalarSizeInBits`).

I'm not 100% sure it's safe to insert DBG_VALUE instructions right
before RegBankSelect (that's what -debugify-and-strip-all-safe is
doing). Any advice appreciated.

Depends on D78135.

Reviewers: ab, qcolombet, dsanders, aprantl

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78137
2020-04-22 17:03:39 -07:00
Vedant Kumar 10ce1bc8d0 [MachineBasicBlock] Add helpers for skipping debug instructions [1/14]
Summary:
These helpers are exercised by follow-up commits in this patch series,
which is all about removing CodeGen differences with vs. without debug
info in the AArch64 backend.

Reviewers: fhahn, aprantl, jpaquette, paquette

Subscribers: kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78260
2020-04-22 17:03:39 -07:00
Vedant Kumar 2a5675f11d [MachineDebugify] Insert synthetic DBG_VALUE instructions
Summary:
Teach MachineDebugify how to insert DBG_VALUE instructions.  This can
help find bugs causing CodeGen differences when debug info is present.
DBG_VALUE instructions are only emitted when -debugify-level is set to
locations+variables.

There is essentially no attempt made to match up DBG_VALUE register
operands with the local variables they ought to correspond to. I'm not
sure how to improve the situation. In some cases (MachineMemOperand?)
it's possible to find the IR instruction a MachineInstr corresponds to,
but in general this seems to call for "undoing" the work done by ISel.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78135
2020-04-22 17:03:39 -07:00
Mark Lacey 328bb446dd Add a policy to enable computing SchedDFSResult.
Summary:
Make GenericScheduler compute SchedDFSResult on initialization if
the policy is set. This makes it possible to create classes
that extend GenericScheduler and rely on the results of SchedDFSResult,
e.g. to perform subtree scheduling.

NFC unless the policy is set.

Subscribers: MatzeB, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78432
2020-04-22 16:36:11 -07:00
Eli Friedman 1a78b0bd38 [MachineOutliner] Teach outliner to set live-ins
Preserving liveness can be useful even late in the pipeline, if we're
doing substantial optimization work afterwards. (See, for example,
D76065.) Teach MachineOutliner how to correctly set live-ins on the
basic block in outlined functions.

Differential Revision: https://reviews.llvm.org/D78605
2020-04-22 14:19:26 -07:00
Puyan Lotfi 264c07ef77 [llvm][MIRVRegNamer] Avoid collisions across jump table indices.
Hash Jump Table Indices uniquely within a basic block for MIR
Canonicalizer / MIR VReg Renamer passes.

Differential Revision: https://reviews.llvm.org/D77966
2020-04-22 14:58:44 -04:00
Christopher Tetreault 2dea3f1298 [SVE] Add new VectorType subclasses
Summary:
Introduce new types for fixed width and scalable vectors.

Does not remove getNumElements yet so as to not break code during transition
period.

Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr

Reviewed By: sdesmalen

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D77587
2020-04-22 08:59:01 -07:00
Simon Pilgrim fc044530f7 BranchFolding.h - remove unused raw_ostream forward declaration. NFC. 2020-04-22 15:07:18 +01:00
Simon Pilgrim c3730ad8fc [AsmPrinter] Remove unused forward declarations. NFC. 2020-04-22 14:01:52 +01:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Eli Friedman 46a52ff9ed [TargetPassConfig] Run MachineVerifier after more passes.
We were disabling verification for no reason in a bunch of places; just
turn it on.

At this point, there are two key places where we don't run verification:
during register allocation, and after addPreEmitPass.  Regalloc probably
isn't worth messing with; it has its own invariants, and verifying
afterwards is probably good enough.  For after addPreEmitPass, it's
probably worth investigating improvements.
2020-04-21 21:05:07 -07:00
Fangrui Song c5d38924dc [XRay] xray_fn_idx: set SHF_WRITE to avoid text relocations
In a future change we should properly fix xray_fn_idx to use PC-relative
addresses as well, but for now let's keep absolute addresses until sled
addresses are all fixed.
2020-04-21 12:02:29 -07:00
Ana Pazos 66590e1e9e [MC][PGO][PGSO] Cleanup unused MBFI in AsmPrinter
Summary:
Machine Block Frequency Info (MBFI) is being computed but unused in AsmPrinter.

MBFI computation was introduced with PGO change D71149 and then its use was
removed in D71106. No need to keep computing it.

Reviewers: MaskRay, jyknight, skan, yamauchi, davidxl, efriedma, huihuiz

Reviewed By: MaskRay, skan, yamauchi

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78526
2020-04-21 10:01:56 -07:00
Fangrui Song 5771c98562 [XRay] Change xray_instr_map sled addresses from absolute to PC relative for x86-64
xray_instr_map contains absolute addresses of sleds, which are relocated
by `R_*_RELATIVE` when linked in -pie or -shared mode.

By making these addresses relative to PC, we can avoid the dynamic
relocations and remove the SHF_WRITE flag from xray_instr_map.  We can
thus save VM pages containg xray_instr_map (because they are not
modified).

This patch changes x86-64 and bumps the sled version to 2. Subsequent
changes will change powerpc64le and AArch64.

Reviewed By: dberris, ianlevesque

Differential Revision: https://reviews.llvm.org/D78082
2020-04-21 09:36:09 -07:00
Nick Desaulniers d3fdafae06 [InlineSpiller] simplify insertReload() NFC
Summary:
The repeated use of std::next() on a MachineBasicBlock::iterator was
clever, but we only need to reconstruct the iterator post creation of
the spill instruction.

This helps simplifying where we plan to place the spill, as discussed in
D77849.

From here, we can simplify the code a little by flipping the return code
of a helper.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: qcolombet, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78520
2020-04-21 08:31:20 -07:00
Fraser Cormack c3a292961d Let targets adjust physical output- and anti-deps
Differential Revision: https://reviews.llvm.org/D78380
2020-04-21 13:45:03 +01:00
Craig Topper 68b2e507e4 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 21:31:44 -07:00
Shengchen Kan c031378ce0 [MC][NFC] Use camelCase style for functions in MCObjectStreamer 2020-04-20 20:09:20 -07:00
Andrew Litteken 1488bef8fc [MachineOutliner] Annotation for outlined functions in AArch64
- Adding changes to support comments on outlined functions with outlining for the conditions through which it was outlined (e.g. Thunks, Tail calls)
- Adapts the emitFunctionHeader to print out a comment next to the header if the target specifies it based on information in MachineFunctionInfo
- Adds mir test for function annotiation

Differential Revision: https://reviews.llvm.org/D78062
2020-04-20 13:33:31 -07:00
Craig Topper fcc9d70260 Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign."
This is breaking the clang build.

This reverts commit 897409fb56.
2020-04-20 13:25:06 -07:00
Craig Topper 897409fb56 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 13:08:05 -07:00
Simon Pilgrim 6cb204eb64 BranchFolding.h - cleanup includes and forward declarations. NFC.
Push MBFIWrapper.h include down to BranchFolding.cpp/IfConversion.cpp
2020-04-20 15:59:39 +01:00
Simon Pilgrim 9036fcd25f MIRVRegNamerUtils.h - remove unnecessary includes. NFC.
Replace with forward declarations or push down to MIRVRegNamerUtils.cpp where necessary.
2020-04-20 15:59:39 +01:00
Konstantin Schwarz 12030494fc [GlobalISel] Introduce InlineAsmLowering class
Summary:
Similar to the CallLowering class used for lowering LLVM IR calls to MIR calls,
we introduce a separate class for lowering LLVM IR inline asm to MIR INLINEASM.

There is no functional change yet, all existing tests should pass.

Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette

Reviewed By: aemerson

Subscribers: gargaroff, wdng, mgorny, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78316
2020-04-20 15:10:18 +02:00
Kang Zhang a8e15ee04a [CodeGen] Support freeze expand for ppc_fp128
Summary:
The patch D29014 has added the new ISD::FREEZE and can deal with the
integer.
The patch D76980 has added SoftenFloatRes_FREEZE for float point.
But we still lack of expand for ppc_fp128, this will cause assertion for
some cases.
This patch is to support freeze expand for ppc_fp128.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78278
2020-04-20 07:27:41 +00:00
Simon Pilgrim 46de0d5fe9 SelectionDAGBuilder.h - remove unused includes + forward declarations. NFC.
Replace SelectionDAG.h include with SelectionDAG forward declaration.
2020-04-19 12:38:41 +01:00
Simon Pilgrim 032738d17e InstrEmitter.h - reduce SelectionDAG.h include to SelectionDAGNodes.h include.
Add SDDbgLabel/TargetLowering forward declarations.
Add the full SelectionDAG.h include to InstrEmitter.cpp.
2020-04-19 11:52:31 +01:00
LemonBoy aad3d578da [DebugInfo] Change DIEnumerator payload type from int64_t to APInt
This allows the representation of arbitrarily large enumeration values.
See https://lists.llvm.org/pipermail/llvm-dev/2017-December/119475.html for context.

Reviewed By: andrewrk, aprantl, MaskRay

Differential Revision: https://reviews.llvm.org/D62475
2020-04-18 12:49:31 -07:00
Simon Pilgrim 2333ea1e70 [cmake] LLVMMIRParser - add include/llvm/CodeGen/LLVMMIRParser header path
Pick up the CodeGen/MIRParser headers in MSVC projects
2020-04-18 12:31:41 +01:00
Simon Pilgrim 5c16da387e [cmake] LLVMGlobalISel - add include/llvm/CodeGen/GlobalISel header path
Pick up the GlobalISel headers in MSVC projects
2020-04-18 12:31:40 +01:00
Andrew Litteken 8d5024f7fe fix to outline cfi instruction when can be grouped in a tail call
[MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining

New test to check that we only outline CFI instruction if all CFI
Instructions in the function would be captured by the outlining

adding x86 tests analagous to AARCH64 cfi tests

Revision: https://reviews.llvm.org/D77852
2020-04-17 22:26:34 -07:00
Daniel Sanders 14ad8dc076 Don't accidentally create MachineFunctions in mir-debugify/mir-strip-debugify
We should only modify existing ones. Previously, we were creating
MachineFunctions for externally-available functions. AFAICT this was benign
in tree but ultimately led to asan bugs in our out of tree target.
2020-04-17 14:28:41 -07:00
Christopher Tetreault c858debebc Remove asserting getters from base Type
Summary:
Remove asserting vector getters from Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: dexonsmith, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: cfe-commits, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D77278
2020-04-17 14:03:31 -07:00
Daniel Sanders 701af684f6 [globalisel][legalizer] Expect to lose DebugLocs in dead code
There's not really anything else that can be done with them.
Fortunately, this dead code cleanup doesn't seem to trigger
very often.
2020-04-17 13:45:44 -07:00
Daniel Sanders 5ef64bbf7a [globalisel][legalizer] Include newly-dead code in artifact combine checks for DebugLoc loss
This dead code deletion is part of the combine and the combine
results should account for their locations.
2020-04-17 13:45:44 -07:00
Daniel Sanders 7f7f98b154 [globalisel][legalizer] Fix --verify-legalizer-debug-locs values
It was using the enum class name, like so:
    =DebugLocVerifyLevel::None                                         -   No verification
Changed it to:
    =none                                                              -   No verification
2020-04-17 13:45:44 -07:00
Dominik Montada 55e3a7c6b2 [GlobalISel][AMDGPU] add legalization for G_FREEZE
Summary:
Copy the legalization rules from SelectionDAG:
-widenScalar using anyext
-narrowScalar using intermediate merges
-scalarize/fewerElements using unmerge
-moreElements using G_IMPLICIT_DEF and insert

Add G_FREEZE legalization actions to AMDGPULegalizerInfo.
Use the same legalization actions as G_IMPLICIT_DEF.

Depends on D77795.

Reviewers: dsanders, arsenm, aqjune, aditya_nandakumar, t.p.northover, lebedev.ri, paquette, aemerson

Reviewed By: arsenm

Subscribers: kzhuravl, yaxunl, dstuttard, tpr, t-tye, jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78092
2020-04-17 16:44:46 +02:00
jasonliu 77618cc237 [XCOFF][AIX] Fix getSymbol to return the correct qualname when necessary
Summary:
AIX symbol have qualname and unqualified name. The stock getSymbol
could only return unqualified name, which leads us to patch many
caller side(lowerConstant, getMCSymbolForTOCPseudoMO).
So we should try to address this problem in the callee
side(getSymbol) and clean up the caller side instead.

Note: this is a "mostly" NFC patch, with a fix for the original
lowerConstant behavior.

Differential Revision: https://reviews.llvm.org/D78045
2020-04-17 13:45:14 +00:00
Fraser Cormack c819ef9653 Provide operand indices to adjustSchedDependency
This allows targets to know exactly which operands are contributing to
the dependency, which is required for targets with per-operand
scheduling models.

Differential Revision: https://reviews.llvm.org/D77135
2020-04-17 11:08:44 +01:00
Craig Topper 944cc5e0ab [SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP.
I've always found the "findValue" a little odd and
inconsistent with other things in SDB.

This simplfifies the code in SDB to just handle a splat constant
address or a 2 operand GEP in the same BB. This removes the
need for "findValue" since the operands to the GEP are
guaranteed to be available. The splat constant handling is
new, but was needed to avoid regressions due to constant
folding combining GEPs created in CGP.

CGP is now responsible for canonicalizing gather/scatters into
this form. The pattern I'm using for scalarizing, a scalar GEP
followed by a GEP with an all zeroes index, seems to be subject
to constant folding that the insertelement+shufflevector was not.

Differential Revision: https://reviews.llvm.org/D76947
2020-04-16 17:49:22 -07:00
Wouter van Oortmerssen 48139ebc3a [WebAssembly] Add int32 DW_OP_WASM_location variant
This to allow us to add reloctable global indices as a symbol.
Also adds R_WASM_GLOBAL_INDEX_I32 relocation type to support it.

See discussion in https://github.com/WebAssembly/debugging/issues/12
2020-04-16 16:32:17 -07:00
David Green 8e8c3c3408 [ARM] Mir test for machine sinking multiple def instructions. NFC 2020-04-16 20:58:14 +01:00
bd1976llvm 86478d3de9 [MC][ELF] Put explicit section name symbols into entry size compatible sections
Ensure that symbols explicitly* assigned a section name are placed into
a section with a compatible entry size.

This is done by creating multiple sections with the same name** if
incompatible symbols are explicitly given the name of an incompatible
section, whilst:

  - Avoiding using uniqued sections where possible (for readability and
    to maximize compatibly with assemblers).

  - Creating as few SHF_MERGE sections as possible (for efficiency).

Given that each symbol is assigned to a section in a single pass, we
must decide which section each symbol is assigned to without seeing the
properties of all symbols. A stable and easy to understand assignment is
desirable. The following rules facilitate this: The "generic" section
for a given section name will be mergeable if the name is a mergeable
"default" section name (such as .debug_str), a mergeable "implicit"
section name (such as .rodata.str2.2), or MC has already created a
mergeable "generic" section for the given section name (e.g. in response
to a section directive in inline assembly). Otherwise, the "generic"
section for a given name is non-mergeable; and, non-mergeable symbols
are assigned to the "generic" section, while mergeable symbols are
assigned to uniqued sections.

Terminology:
"default" sections are those always created by MC initially, e.g. .text
or .debug_str.

"implicit" sections are those created normally by MC in response to the
symbols that it encounters, i.e. in the absence of an explicit section
name assignment on the symbol, e.g. a function foo might be placed into
a .text.foo section.

"generic" sections are those that are referred to when a unique section
ID is not supplied, e.g. if there are multiple unique .bob sections then
".quad .bob" will reference the generic .bob section. Typically, the
generic section is just the first section of a given name to be created.
Default sections are always generic.

* Typically, section names might be explicitly assigned in source code
using a language extension e.g. a section attribute: _attribute_
((section ("section-name"))) -
https://clang.llvm.org/docs/AttributeReference.html

** I refer to such sections as unique/uniqued sections. In assembly the
", unique," assembly syntax is used to express such sections.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43457.

See https://reviews.llvm.org/D68101 for previous discussions leading to
this patch.

Some minor fixes were required to LLVM's tests, for tests had been using
the old behavior - which allowed for explicitly assigning globals with
incompatible entry sizes to a section.

This fix relies on the ",unique ," assembly feature. This feature is not
available until bintuils version 2.35
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the
integrated assembler is not being used then we avoid using this feature
for compatibility and instead try to place mergeable symbols into
non-mergeable sections or issue an error otherwise.

Differential Revision: https://reviews.llvm.org/D72194
2020-04-16 19:12:49 +00:00
Amy Huang 2b8c6acc39 Reland "[codeview] Reference types in type parent scopes"
Summary:
Original description (https://reviews.llvm/org/D69924)
Without this change, when a nested tag type of any kind (enum, class,
struct, union) is used as a variable type, it is emitted without
emitting the parent type. In CodeView, parent types point to their inner
types, and inner types do not point back to their parents. We already
walk over all of the parent scopes to build the fully qualified name.
This change simply requests their type indices as we go along to enusre
they are all emitted.

Now, while walking over the parent scopes, add the types to
DeferredCompleteTypes, since they might already be in the process of
being emitted.

Fixes PR43905

Reviewers: rnk, amccarth

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78249
2020-04-16 12:08:52 -07:00
Daniel Sanders d9085f65db [globalisel] Add lost debug locations verifier
Summary:
This verifier tries to ensure that DebugLoc's don't just disappear as
we transform the MIR. It observes the instructions created, erased, and
changed and at checkpoints chosen by the client algorithm verifies the
locations affected by those changes.

In particular, it verifies that:
* Every DebugLoc for an erased/changing instruction is still present on
  at least one new/changed instruction
* Failing that, that there is a line-0 location in the new/changed
  instructions. It's not possible to confirm which locations were merged so
  it conservatively assumes all unaccounted for locations are accounted
  for by any line-0 location to avoid false positives.
If that fails, it prints the lost locations in the debug output along with
the instructions that should have accounted for them.

In theory, this is usable by the legalizer, combiner, selector and any other
pass that performs incremental changes to the MIR. However, it has so far
only really been tested on the legalizer (not including the artifact
combiner) where it has caught lots of lost locations, particularly in Custom
legalizations. There's only one example here as my initial testing was on an
out-of-tree target and I haven't done a pass over the in-tree targets yet.

Depends on D77575, D77446

Reviewers: bogner, aprantl, vsk

Subscribers: jvesely, nhaehnle, mgorny, rovka, hiraditya, volkan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77576
2020-04-16 10:43:35 -07:00
Daniel Sanders 7c6ca18fff [globalisel] Allow backends to report an issue without triggering fallback. NFC
Summary:
This will allow us to fix the issue where the lost locations
verifier causes CodeGen changes on lost locations because it
falls back on DAGISel

Reviewers: qcolombet, bogner, aprantl, vsk, paquette

Subscribers: rovka, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78261
2020-04-16 10:43:35 -07:00
David Green 44c4ba34d0 [MachineSink] Fix for breaking phi edges with instructions with multiple defs
BreakPHIEdge would be set based on whether the instruction needs to
insert a new critical edge to allow sinking into a block where the uses
are PHI nodes. But for instructions with multiple defs it would be reset
on the second def, allowing the instruciton to sink where it should not.

Fixes PR44981

Differential Revision: https://reviews.llvm.org/D78087
2020-04-16 16:42:07 +01:00
Konstantin Schwarz 1a3e89aa2b [MIR] Add comments to INLINEASM immediate flag MachineOperands
Summary:
The INLINEASM MIR instructions use immediate operands to encode the values of some operands.
The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature.

Reviewers: SjoerdMeijer, arsenm, efriedma

Reviewed By: arsenm

Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78088
2020-04-16 13:46:14 +02:00
Carl Ritson 43e2460a89 [LiveIntervals] Replace handleMoveIntoBundle
Summary:
The current handleMoveIntoBundle implementation is unusable,
it attempts to access the slot indexes of bundled instructions.
It also leaves bundled instructions with slot indexes assigned.

Replace handleMoveIntoBundle this with a more explicit
handleMoveIntoNewBundle function which recalculates the live
intervals for all instructions moved into a newly formed bundle,
and removes slot indexes from these instructions.

Reviewers: arsenm, MaskRay, kariddi, tpr, qcolombet

Reviewed By: qcolombet

Subscribers: MatzeB, wdng, hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77969
2020-04-16 19:58:19 +09:00
Jeremy Morse c8d6fa5134 [LiveDebugValues] Terminate open ranges on DBG_VALUE $noreg
In D68209, LiveDebugValues::transferDebugValue had a call to
OpenRanges.erase shifted, and by accident this led to a code path where
DBG_VALUEs of $noreg would not have their open range terminated, allowing
variable locations to extend past blocks where they were terminated.

This patch correctly terminates the open range, if present, when such a
DBG_VAUE is encountered, and adds a test for this behaviour.

Differential Revision: https://reviews.llvm.org/D78218
2020-04-16 10:26:47 +01:00
Craig Topper 8e1408695c [CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC
Differential Revision: https://reviews.llvm.org/D78083
2020-04-15 22:43:41 -07:00
Fangrui Song 7d1ff446b6 [MC] Rename MCSection*::getSectionName() to getName(). NFC
A pending change will merge MCSection*::getName() to MCSection::getName().
2020-04-15 16:48:14 -07:00
Josh Stone 5a0d8c31a3 [NFC] correct "thier" to "their" 2020-04-15 14:38:52 -07:00
Eli Friedman 7c10541e56 [SelectionDAG] Fix usage of Align constructing MachineMemOperands.
The "Align" passed into getMachineMemOperand etc. is the alignment of
the MachinePointerInfo, not the alignment of the memory operation.
(getAlign() on a MachineMemOperand automatically reduces the alignment
to account for this.)

We were passing on wrong (overconservative) alignment in a bunch of
places. Fix a bunch of these, mostly in legalization.  And while I'm
here, switch to the new Align APIs.

The test changes are all scheduling changes: the biggest effect of
preserving large alignments is that it improves alias analysis, so the
scheduler has more freedom.

(I was originally just trying to do a minor cleanup in
SelectionDAGBuilder, but I accidentally went deeper down the rabbit
hole.)

Differential Revision: https://reviews.llvm.org/D77687
2020-04-15 13:01:41 -07:00
Dominik Montada 443c244cff [GlobalISel] translate freeze to new generic G_FREEZE
Summary:
As a follow up to https://reviews.llvm.org/D29014, add translation
support for freeze.

Introduce a new generic instruction G_FREEZE and translate freeze to it.

Reviewers: dsanders, aqjune, arsenm, aditya_nandakumar, t.p.northover, lebedev.ri, paquette, aemerson

Reviewed By: aqjune, arsenm

Subscribers: fhahn, lebedev.ri, wdng, rovka, hiraditya, jfb, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77795
2020-04-15 16:47:05 +02:00
Benjamin Kramer d790bd3999 Unbreak the build 2020-04-15 15:54:47 +02:00
Victor Campos d85b3877dc [CodeGen][ARM] Error when writing to specific reserved registers in inline asm
Summary:
No error or warning is emitted when specific reserved registers are
written to in inline assembly. Therefore, writes to the program counter
or to the frame pointer, for instance, were permitted, which could have
led to undesirable behaviour.

Example:
  int foo() {
    register int a __asm__("r7"); // r7 = frame-pointer in M-class ARM
    __asm__ __volatile__("mov %0, r1" : "=r"(a) : : );
    return a;
  }

In contrast, GCC issues an error in the same scenario.

This patch detects writes to specific reserved registers in inline
assembly for ARM and emits an error in such case. The detection works
for output and input operands. Clobber operands are not handled here:
they are already covered at a later point in
AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers
covered are: program counter, frame pointer and base pointer.

This is ARM only. Therefore the implementation of other targets'
counterparts remain open to do.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76848
2020-04-15 14:40:42 +01:00
Denis Antrushin edbb27ccb6 [Statepoint] Add getters to StatepointOpers.
To simplify future work on statepoint representation, hide
direct access to statepoint field indices and provide getters
for them. Add getters for couple more statepoint fields.

This also fixes two bugs in MachineVerifier for statepoint:
First, the `break` statement was falling out of `if` statement
scope, thus disabling following checks.
Second, it was incorrectly accessing some fields like CallingConv -
StatepointOpers gives index to their value directly, not to
preceeding field type encoding.

Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D78119
2020-04-15 14:31:42 +03:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
QingShan Zhang c9f9c79c5a [NFC][DAGCombine] Change the value of NegatibleCost to make it align with the semantics
This is a minor NFC change to make the code more clear. We have the NegatibleCost that
has cheaper, neutral, and expensive. Typically, the smaller one means the less cost.
It is inverse for current implementation, which makes following code not easy to read.
If (CostX > CostY) negate(X)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D77993
2020-04-15 02:20:58 +00:00
Sam Clegg 3ea1c62cba [WebAssembly] Emit .llvmcmd and .llvmbc as custom sections
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45362

Differential Revision: https://reviews.llvm.org/D77115
2020-04-14 13:24:18 -07:00
Thomas Raoux c228c717aa [AntidepBreaker] Move AntiDepBreaker to include folder.
This allows AntiDepBreaker to be used in target specific postRA
scheduler.

Differential Revision: https://reviews.llvm.org/D78047
2020-04-14 11:40:57 -07:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Craig Topper 3043093822 [CallSite removal][CodeGen] Replace ImmutableCallSite with CallBase in isInTailCallPosition. 2020-04-13 23:04:57 -07:00
Mircea Trofin 4aae4e3f48 [llvm][NFC] CallSite removal from inliner-related files
Summary: This removes CallSite from inliner files. Some dependencies where thus affected.

Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77991
2020-04-13 21:28:58 -07:00
Craig Topper 113f37a1f9 [CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase
Differential Revision: https://reviews.llvm.org/D77995
2020-04-13 13:50:15 -07:00
Rahman Lavaee 05192e585c Extend BasicBlock sections to allow specifying clusters of basic blocks in the same section.
Differential Revision: https://reviews.llvm.org/D76954
2020-04-13 12:19:59 -07:00
Rahman Lavaee 4ddf7ab454 Revert "Extend BasicBlock sections to allow specifying clusters of basic blocks"
This reverts commit 0d4ec16d3d Because
tests were not added to the commit.
2020-04-13 12:19:59 -07:00
Rahman Lavaee 0d4ec16d3d Extend BasicBlock sections to allow specifying clusters of basic blocks
in the same section.

This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
2020-04-13 11:46:11 -07:00
Vedant Kumar 122a6bfb07 [Debugify] Strip added metadata in the -debugify-each pipeline
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.

As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77915
2020-04-13 10:55:17 -07:00
Craig Topper 68eb08646c [CallSite removal][GlobalISel] Use CallBase instead of CallSite in lowerCall and translateCallBase.
Differential Revision: https://reviews.llvm.org/D78001
2020-04-13 10:31:30 -07:00
Matt Arsenault e6605a209c DAG: Fix wrong legality check for ISD::FMAD
Since 1725f28841, this should check
isFMADLegalForFAddFSub rather than the the plain isOperationLegal.

This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.

The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
2020-04-13 10:25:39 -07:00
Craig Topper f06cf9da89 [CallSite removal][CodeGen] Use CallBase instead of CallSite in getNoopInput in Analysis.cpp. NFC 2020-04-13 00:20:12 -07:00
Craig Topper 5889c5a814 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in TargetFrameLoweringInfo. NFC 2020-04-13 00:20:12 -07:00
Craig Topper e59162960c [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in IntrinsicLowering. NFC 2020-04-13 00:19:27 -07:00
Craig Topper 83208cdd57 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in WinEHPrepare. NFC 2020-04-13 00:19:27 -07:00
Craig Topper 42487eafa6 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in SwiftErrorValueTracking. NFC 2020-04-13 00:19:27 -07:00
Craig Topper dbb272b0a3 [CallSite removal][FastISel] Use CallBase instead of CallSite in fastLowerCall. 2020-04-12 18:02:24 -07:00
Craig Topper 95192f548d [CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface.
Differential Revision: https://reviews.llvm.org/D77929
2020-04-12 11:26:25 -07:00
Jonathan Roelofs 41f13f1f64 reland: [DAG] Fix PR45049: LegalizeTypes crash
Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG
does, leading to accidental SDValue removal before its reference count was
truly zero.

Differential Revision: https://reviews.llvm.org/D76994

Reviewed-By: bjope

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049

Reverted in 3ce77142a6 because the previous patch
broke the expensive-checks bots. The new patch removes the broken check.
2020-04-12 09:52:17 -06:00
Craig Topper 5b42399029 [CallSite removal][FastISel] Remove uses of CallSite.
Differential Revision: https://reviews.llvm.org/D77933
2020-04-11 20:52:45 -07:00
Craig Topper 806763efcf [CallSite removal][SelectionDAGBuilder] Use CallBase instead of ImmutableCallSite in visitPatchpoint.
Differential Revision: https://reviews.llvm.org/D77932
2020-04-11 13:07:31 -07:00
Matt Arsenault 1747ba25b2 GlobalISel: Fix typo in assert message 2020-04-11 16:02:26 -04:00
Hongtao Yu 11455a7905 [CodeGen] Allow partial tail duplication in Machine Block Placement.
Summary: A count profile may affect tail duplication's heuristic causing a block to be duplicated in only a part of its predecessors. This is not allowed in the Machine Block Placement pass where an assert will go off. I'm removing the assert and making the optimization bail out when such case happens.

Reviewers: wenlei, davidxl, Carrot

Reviewed By: Carrot

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77748
2020-04-11 12:20:31 -07:00
Sanjay Patel 1318ddbc14 [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Simon Pilgrim 89f6ca05b7 CodeGen/EdgeBundles - move Twine.h include down into EdgeBundles.cpp. NFC.
EdgeBundles.h has no use for it.
2020-04-11 12:21:04 +01:00
Craig Topper 9c1842d8af Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This is the same as what was done to the CallLoweringInfo in
TargetLowering.h in r309159.

This is just a step on the way to replacing this with CallBase.
2020-04-10 23:45:36 -07:00
Craig Topper f49f6cf91e [CallSite removal][SelectionDAGBuilder] Remove most CallSite usage from visitInlineAsm.
I only left it at the interface to ParseConstraints since that
needs updates to other callers in different files. I'll do that
as a follow up.

Differential Revision: https://reviews.llvm.org/D77892
2020-04-10 19:23:33 -07:00
Matt Arsenault 49ae0fc2f0 GlobalISel: Fix incorrect lowering G_FCOPYSIGN
In the basic case, this was reading the sign from the wrong operand.
2020-04-10 21:00:25 -04:00
Daniel Sanders f71350f05a Add -debugify-and-strip-all to add debug info before a pass and remove it after
Summary:
This allows us to test each backend pass under the presence
of debug info using pre-existing tests. The tests should not
fail as a result of this so long as it's true that debug info
does not affect CodeGen.

In practice, a few tests are sensitive to this:
* Tests that check the pass structure (e.g. O0-pipeline.ll)
* Tests that check --debug output. Specifically instruction
  dumps containing MMO's (e.g. prelegalizercombiner-extends.ll)
* Tests that contain debugify metadata as mir-strip-debug will
  remove it (e.g. fastisel-debugvalue-undef.ll)
* Tests with partial debug info (e.g.
  patchable-function-entry-empty.mir had debug info but no
  !llvm.dbg.cu)
* Tests that check optimization remarks overly strictly (e.g.
  prologue-epilogue-remarks.mir)
* Tests that would inject the pass in an unsafe region (e.g.
  seqpairspill.mir would inject between register alloc and
  virt reg rewriter)
In all cases, the checks can either be updated or
--debugify-and-strip-all-safe=0 can be used to avoid being
affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe'

I tested this without the lost debug locations verifier to
confirm that AArch64 behaviour is unaffected (with the fixes
in this patch) and with it to confirm it finds the problems
without the additional RUN lines we had before.

Depends on D77886, D77887, D77747

Reviewers: aprantl, vsk, bogner

Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77888
2020-04-10 16:36:07 -07:00
Daniel Sanders dfca98d6a8 [mir-strip-debug] Optionally preserve debug info that wasn't from debugify/mir-debugify
Summary:
A few tests start out with debug info and expect it to reach
the output. For these tests we shouldn't strip the debug info

Reviewers: aprantl, vsk, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77886
2020-04-10 15:24:14 -07:00
Christopher Tetreault 889f6606ed Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: stoklund, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77272
2020-04-10 14:53:43 -07:00
Daniel Sanders c162bc2aed Make TargetPassConfig and llc add pre/post passes the same way. NFC
Summary:
At the moment, any changes we make to the passes that can be
injected before/after others (e.g. -verify-machineinstrs and
-print-after-all) have to be duplicated in both
TargetPassConfig (for normal execution, -start-before/
-stop-before/etc) and llc (for -run-pass). Unify this pass
injection into addMachinePrePass/addMachinePostPass that both
TargetPassConfig and llc can use.

Reviewers: vsk, aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77887
2020-04-10 13:46:53 -07:00
Marcello Maggioni ea11f4726f Split LiveRangeCalc in LiveRangeCalc/LiveIntervalCalc. NFC
Summary:
Refactor LiveRangeCalc such that it is now split into two classes

The objective is to split all the "register specific" logic away
from LiveRangeCalc.
The two new classes created are:

- LiveRangeCalc - is meant as a generic class to compute and modify
  live ranges in a generic way. This class should deal only with
  SlotIndices and VNInfo objects.

- LiveIntervalCals - is meant to be equivalent to the old LiveRangeCalc.
  It computes the liveness virtual registers tracked by a LiveInterval
  object.

With this refactoring LiveRangeCalc can be used to implement tracking of
liveness of LiveRanges that represent other things than just registers.

Subscribers: MatzeB, qcolombet, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76584
2020-04-10 11:26:21 -07:00
Sumanth Gundapaneni a04ab2ec08 [Pipeliner] Fix the bug in pragma that disables the pipeliner.
Differential Revision: https://reviews.llvm.org/D76303.
2020-04-10 12:52:16 -05:00
Simon Pilgrim a88cc20456 ProfileSummaryInfo.h - remove unnecessary includes. NFC
Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration.

This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.
2020-04-10 16:25:48 +01:00
Serguei Katkov 4275eb1331 Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values.
The change introduces the usage of physical registers for non-gc deopt values.
This require runtime support to know how to take a value from register.
By default usage is off and can be switched on by option.

The change also introduces additional fix-up patch which forces the spilling
of caller saved registers (clobbered after the call) and re-writes statepoint
to use spill slots instead of caller saved registers.

Reviewers: reames, danstrushin
Reviewed By: dantrushin
Subscribers: mgorny, hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D77797
2020-04-10 10:13:39 +07:00
Francesco Petrogalli c846d2682b [llvm][Codegen] Make `getVectorTypeBreakdownMVT` work with scalable types.
Reviewers: efriedma, andwar, sdesmalen

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77434
2020-04-10 00:48:27 +01:00
Daniel Sanders a79b2fc44b Add pass to strip debug info from MIR
Summary:
Removes:
* All LLVM-IR level debug info using StripDebugInfo()
* All debugify metadata
* 'Debug Info Version' module flag
* All (valid*) DEBUG_VALUE MachineInstrs
* All DebugLocs from MachineInstrs

This is a more complete solution than the previous MIRPrinter
option that just causes it to neglect to print debug-locations.

* The qualifier 'valid' is used here because AArch64 emits
  an invalid one and tests depend on it

Reviewers: vsk, aprantl, bogner

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77747
2020-04-09 15:44:38 -07:00
Serguei Katkov 44f0d7f136 Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values."
This reverts commit a0275705bb.

It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.
2020-04-09 18:24:47 +07:00
Serguei Katkov a0275705bb [Codegen/Statepoint] Allow usage of registers for non gc deopt values.
The change introduces the usage of physical registers for non-gc deopt values.
This require runtime support to know how to take a value from register.
By default usage is off and can be switched on by option.

The change also introduces additional fix-up patch which forces the spilling
of caller saved registers (clobbered after the call) and re-writes statepoint
to use spill slots instead of caller saved registers.

Reviewers: reames, dantrushin
Reviewed By: reames, dantrushin
Subscribers: mgorny, hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D77371
2020-04-09 16:57:35 +07:00
Jay Foad bf730e1686 [CodeGen] Fix a simple FIXME. NFC. 2020-04-09 10:54:03 +01:00
Jay Foad c63aed890e [KnownBits] Move AND, OR and XOR logic into KnownBits
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.

This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74060
2020-04-09 10:10:37 +01:00
Matt Arsenault 0aa0d70067 MIR: Use Register 2020-04-08 22:07:26 -04:00
Amara Emerson befc788cfa GlobalISel: Add a setInstrAndDebugLoc(MachineInstr&) convenience helper to MachineIRBuilder. NFC.
This saves doing two separate calls to set the Instr and DebugLoc from an existing MI.
2020-04-08 14:38:33 -07:00
Matt Arsenault e49e33b610 CodeGen: Use Register in MachineInstrBuilder 2020-04-08 17:03:53 -04:00
Matt Arsenault c42cc7fd24 CodeGen: Use Register in MachineSSAUpdater 2020-04-08 14:29:01 -04:00
Vedant Kumar 48e65fc630 MachineFunction: Copy call site info when duplicating insts
Summary:
Preserve call site info for duplicated instructions. We copy over the
call site info in CloneMachineInstrBundle to avoid repeated calls to
copyCallSiteInfo in CloneMachineInstr.

(Alternatively, we could copy call site info higher up the stack, e.g.
into TargetInstrInfo::duplicate, or even into individual backend passes.
However, I don't see how that would be safer or more general than the
current approach.)

Reviewers: aprantl, djtodoro, dstenb

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77685
2020-04-08 11:06:14 -07:00
Matt Arsenault 586769cce2 DAG: Use Register 2020-04-08 13:44:31 -04:00
Matt Arsenault dcce3ef1d2 FastISel: Partially use Register
Doesn't try to convert the cases that depend on generated code.
2020-04-08 12:10:58 -04:00
Matt Arsenault 7a46e36d51 CodeGen: Use Register more in CallLowering
Some of these MCPhysReg uses should probably be MCRegister, but right
now this would require more invasive changes.
2020-04-08 12:10:58 -04:00
Matt Arsenault ca0ace7298 CodeGen: Use Register in MachineBasicBlock 2020-04-08 12:10:58 -04:00
Jeremy Morse c77887e4d1 [DebugInfo][NFC] Early-exit when analyzing for single-location variables
This is a performance patch that hoists two conditions in DwarfDebug's
validThroughout to avoid a linear-scan of all instructions in a block. We
now exit early if validThrougout will never return true for the variable
location.

The first added clause filters for the two circumstances where
validThroughout will return true. The second added clause should be
identical to the one that's deleted from after the linear-scan.

Differential Revision: https://reviews.llvm.org/D77639
2020-04-08 12:27:11 +01:00
Mikael Holmen 893df2032d [IfConversion] Disallow TrueBB == FalseBB for valid diamonds
Summary:
This fixes PR45302.

Previously the case

     BB1
     / \
    |   |
   TBB FBB
    |   |
     \ /
     BB2

was treated as a valid diamond also when TBB and FBB was the same basic
block. This then lead to a failed assertion in IfConvertDiamond.

Since TBB == FBB is quite a degenerated case of a diamond, we now
don't treat it as a valid diamond anymore, and thus we will avoid the
trouble of making IfConvertDiamond handle it correctly.

Reviewers: efriedma, kparzysz

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77651
2020-04-08 12:50:36 +02:00
Dominik Montada 35950fea8d [GlobalISel] support narrow G_IMPLICIT_DEF for DstSize % NarrowSize != 0
Summary:
When narrowing G_IMPLICIT_DEF where the original size is not a multiple
of the narrow size, emit a smaller G_IMPLICIT_DEF and use G_ANYEXT.

To prevent a potential endless loop in the legalizer, the condition
to combine G_ANYEXT(G_IMPLICIT_DEF) is changed from isInstUnsupported
to !isInstLegal, since in this case the combine is only valid if
consequent legalization of the newly combined G_IMPLICIT_DEF does not
introduce G_ANYEXT due to narrowing.

Although this legalization for G_IMPLICIT_DEF would also be valid for
the general case, it actually caused a lot of code regressions when
tried due to superfluous COPYs and combines not getting hit anymore.

Reviewers: dsanders, aemerson, volkan, arsenm, aditya_nandakumar

Reviewed By: arsenm

Subscribers: jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76598
2020-04-08 11:00:07 +02:00
Daniel Sanders 1adeeabb79 Add MIR-level debugify with only locations support for now
Summary:
Re-used the IR-level debugify for the most part. The MIR-level code then
adds locations to the MachineInstrs afterwards based on the LLVM-IR debug
info.

It's worth mentioning that the resulting locations make little sense as
the range of line numbers used in a Function at the MIR level exceeds that
of the equivelent IR level function. As such, MachineInstrs can appear to
originate from outside the subprogram scope (and from other subprogram
scopes). However, it doesn't seem worth worrying about as the source is
imaginary anyway.

There's a few high level goals this pass works towards:
* We should be able to debugify our .ll/.mir in the lit tests without
  changing the checks and still pass them. I.e. Debug info should not change
  codegen. Combining this with a strip-debug pass should enable this. The
  main issue I ran into without the strip-debug pass was instructions with MMO's and
  checks on both the instruction and the MMO as the debug-location is
  between them. I currently have a simple hack in the MIRPrinter to
  resolve that but the more general solution is a proper strip-debug pass.
* We should be able to test that GlobalISel does not lose debug info. I
  recently found that the legalizer can be unexpectedly lossy in seemingly
  simple cases (e.g. expanding one instr into many). I have a verifier
  (will be posted separately) that can be integrated with passes that use
  the observer interface and will catch location loss (it does not verify
  correctness, just that there's zero lossage). It is a little conservative
  as the line-0 locations that arise from conflicts do not track the
  conflicting locations but it can still catch a fair bit.

Depends on D77439, D77438

Reviewers: aprantl, bogner, vsk

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77446
2020-04-07 16:25:13 -07:00
Matt Arsenault 6011627f51 CodeGen: More conversions to use Register 2020-04-07 18:54:36 -04:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Matt Arsenault aa26dd9858 CodeGen: Use Register in more places 2020-04-07 15:59:40 -04:00
Craig Topper c41685b16f [SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector.
This removes a call to getScalarType from a bunch of call sites.
It also makes the behavior consistent with SIGN_EXTEND_INREG.

Differential Revision: https://reviews.llvm.org/D77631
2020-04-07 11:34:08 -07:00
Matt Arsenault b281138a1b DAG: Use the correct getPointerTy in a few places
These should not be assuming address space 0. Calling getPointerTy is
generally the wrong thing to do, since you should already know the
type from the incoming IR.
2020-04-07 12:45:41 -04:00
Nikita Popov 259649a519 [RDA] Avoid full reprocessing of blocks in loops (NFCI)
RDA sometimes needs to visit blocks twice, to take into account
reaching defs coming in along loop back edges. Currently it handles
repeated visitation the same way as usual, which means that it will
scan through all instructions and their reg unit defs again. Not
only is this very inefficient, it also means that all reaching defs
in loops are going to be inserted twice.

We can do much better than this. The only thing we need to handle
is a new reaching def from a predecessor, which either needs to be
prepended to the reaching definitions (if there was no reaching def
from a predecessor), or needs to replace an existing predecessor
reaching def, if it is more recent. Since D77508 we only store the
most recent predecessor reaching def, so that's the only one that
may need updating.

This also has the nice side-effect that reaching definitions are
now automatically sorted and unique, so drop the llvm::sort() call
in favor of an assertion.

Differential Revision: https://reviews.llvm.org/D77511
2020-04-07 17:55:37 +02:00
Nikita Popov 76e987b372 [RDA] Don't pass down TraversedMBB (NFC)
Only pass the MachineBasicBlock itself down to helper methods,
they don't need to know about traversal. Move the debug print
into the main method.
2020-04-07 17:53:04 +02:00
Nikita Popov 361c29d7ba [RDA] Avoid inserting duplicate reaching defs (NFCI)
An instruction may define the same reg unit multiple times,
avoid inserting the same reaching def multiple times in that case.

Also print the reg unit, rather than the super-register, in the
debug code.
2020-04-07 17:50:38 +02:00
Serguei Katkov b7e3759e17 [DAG] Consolidate require spill slot logic in lambda. NFC.
Move the logic whether lowering of deopt value requires a spill slot in
a separate lambda.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77629
2020-04-07 16:43:47 +07:00
Pierre-vh 4fc59a468f Revert "[CodeGen][SelectionDAG] Flip Booleans More Often"
This reverts commit 23342bdcc8.
2020-04-07 09:09:10 +01:00
Pierre-vh 23342bdcc8 [CodeGen][SelectionDAG] Flip Booleans More Often
Differential Revision: https://reviews.llvm.org/D77201
2020-04-07 08:19:57 +01:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Davide Italiano 8115e08b05 [MachineCSE] Don't carry the wrong location when hoisting
PR: 45425
<rdar://problem/61359768>

Differential Revision:  https://reviews.llvm.org/D77604
2020-04-06 16:36:22 -07:00
Daniel Sanders f27cea721e Add way to omit debug-location from MIR output
Summary:
In lieu of a proper pass that strips debug info, add a way
to omit debug-locations from the MIR output so that
instructions with MMO's continue to match CHECK's when
mir-debugify is used

Reviewers: aprantl, bogner, vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77575
2020-04-06 16:22:01 -07:00
Daniel Sanders 35b7b0851b Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify)
Summary:
To debugify MIR, we need to be able to create metadata and to do that, we
need a non-const Module. However, MachineFunction only had a const reference
to the Function preventing this.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77439
2020-04-06 15:19:21 -07:00
Leonard Chan a0222ac1f9 [AsmPrinter] Do not define local aliases for global objects in a comdat
A global symbol that is defined in a comdat should not generate an alias since
call sites that would've referred to that symbol will refer to their own
independent local aliases rather than the surviving global comdat one. This
could result in something that looks like:

```
ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o)
>>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o)
>>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169)
>>>               minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a
```

We ran into this when experimenting with a new C++ ABI for fuchsia
(refer to D72959) which takes relative offsets between comdat'd functions
which is why the normal C++ user wouldn't run into this.

Differential Revision: https://reviews.llvm.org/D77429
2020-04-06 13:48:05 -07:00
Nick Desaulniers 5bc291be71 [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent
Summary:
A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
 predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two.  This requires careful
handing of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handing of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated.  The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

    for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
      MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
      if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
        continue;
      PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was `continue`ing because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Reviewers: void, craig.topper, efriedma

Reviewed By: void, efriedma

Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76961
2020-04-06 13:46:39 -07:00
Francesco Petrogalli 53b7abdd23 [llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`.
Reviewers: sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77317
2020-04-06 19:46:11 +01:00
Craig Topper 07ed1fb597 [SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types.
The previous code used the type of the first field for the VT
passed to getNode for every field.

I've based the implementation here off what is done in visitSelect
as it removes the need to special case aggregates.

Differential Revision: https://reviews.llvm.org/D77093
2020-04-06 11:03:40 -07:00
Nikita Popov e8b83f7ddc [RDA] Only store most recent reaching def from predecessors (NFCI)
When entering a basic block, RDA inserts reaching definitions coming
from predecessor blocks (which will be negative numbers) in a rather
peculiar way. If you have incoming reaching definitions -4, -3, -2, -1,
it will insert those. If you have incoming reaching definitions
-1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken
at each step. That's probably not what was intended...

However, RDA only actually cares about the most recent reaching
definition from a predecessor (to calculate clearance), so this
ends up working fine as far as behavior is concerned. It does
waste memory on unnecessary reaching definitions though.

This patch changes the implementation to first compute the most
recent reaching definition in one loop, and then insert only that
one in a separate loop.

Differential Revision: https://reviews.llvm.org/D77508
2020-04-06 18:39:09 +02:00
Nikita Popov 8d75df1438 [RDA] Don't adjust ReachingDefDefaultVal (NFCI)
At the end of a basic block, RDA adjusts all the reaching defs it
found to be relative to the end of the basic block, rather than the
start of it. However, it also does this to registers which don't
have a reaching def, indicated by ReachingDefDefaultVal. This means
that code checking against ReachingDefDefaultVal will not skip them,
and may insert them into the reaching definition list. This is
ultimately harmless, but causes unnecessary work and is logically
not right.

Differential Revision: https://reviews.llvm.org/D77506
2020-04-06 18:36:29 +02:00
Matt Arsenault 70726cec5b DAG: Combine extract_vector_elt of concat_vectors
Fixes extra canonicalize regressions when legalizing
vector fminnum/fmaxnum.
2020-04-06 09:26:29 -04:00
Sourabh Singh Tomar 5d7e9adce2 [DWARF5] Added support for emission of debug_macro section.
Summary:
This patch adds support for emission of following DWARFv5 macro forms
in .debug_macro section.

1. DW_MACRO_start_file
2. DW_MACRO_end_file
3. DW_MACRO_define_strp
4. DW_MACRO_undef_strp.

Reviewed By: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D72828
2020-04-06 17:45:10 +05:30
Guillaume Chatelet ff858d7781 [Alignment][NFC] Add DebugStr and operator*
Summary:
This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately)
Differences from D77394:
 - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)`
   - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll)
 - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum)

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77537
2020-04-06 12:09:45 +00:00
Oliver Stannard a294d9eb21 Revert "[IPRA][ARM] Spill extra registers at -Oz"
Reverting because this is causing failures on bots with expensive checks
enabled.

This reverts commit 73cea83a6f.
2020-04-06 10:34:59 +01:00
Guillaume Chatelet 6000478f39 Revert "[Alignment][NFC] Add DebugStr and operator*"
This reverts commit 1e34ab98fc.
2020-04-06 07:55:25 +00:00
Guillaume Chatelet 1e34ab98fc [Alignment][NFC] Add DebugStr and operator*
Summary:
Also updates files to use them.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77394
2020-04-06 07:12:46 +00:00
Craig Topper 97e57f3b24 [DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine.
We're ANDing with 1 right after which will cause the SIGN_EXTEND to
be combined to ANY_EXTEND later. Might as well just start with an
ANY_EXTEND.

While there replace create the AND using the getZeroExtendInReg
helper to remove the need to explicitly create the VecOnes constant.
2020-04-05 22:44:45 -07:00
Craig Topper 586c051a27 [DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition its trying to protect.
This code is replacing a shift with a new shift on an extended type.
If the shift amount type can't represent the maximum shift amount
for the new type, the amount needs to be extended to a type that
can.

Previously, the code just hardcoded a check for 256 bits which
seems to have been an assumption that the original shift amount
was MVT::i8. But that seems more catered to a specific target
like X86 that uses i8 as its legal shift amount type. Other
targets may use different types.

This commit changes the code to look at the real type of the shift
amount and makes sure it has enough bits for the Log2 of the
new type. There are similar checks to this in SelectionDAGBuilder
and LegalizeIntegerTypes.
2020-04-05 20:35:57 -07:00
Sourabh Singh Tomar 0d71782f4e [DebugInfo]: Allow DwarfCompileUnit to have line table symbol
Previously line table symbol was represented as `DIE::value_iterator`
inside `DwarfCompileUnit` and subsequent function `intStmtList`
was used to create a local `MCSymbol` to initialize it.

This patch removes `DIE::value_iterator` from `DwarfCompileUnit`
and intoduce `MCSymbol` for representing this units symbol for
`debug_line` section. As a result `applyStmtList` is also modified
 to utilize this. Further more a helper function `getLineTableStartSym`
is also introduced to get this symbol, this would be used by clients
which need to access this line table, i.e `debug_macro`.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D77489
2020-04-06 00:14:29 +05:30
Zuojian Lin a58c8a7866 Remove the additional constant which requires an extra register for statepoint lowering.
The newly-created constant zero will need an extra register to hold it
in the current statepoint lowering implementation. Remove it if there
exists one.
2020-04-05 11:22:09 -04:00
Jonathan Roelofs 3ce77142a6 Revert "[DAG] Fix PR45049: LegalizeTypes crash"
This reverts commit 17673ae0b2.
2020-04-04 13:47:22 -06:00
Jonathan Roelofs 17673ae0b2 [DAG] Fix PR45049: LegalizeTypes crash
Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG
does, leading to accidental SDValue removal before its reference count was
truly zero.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049

https://reviews.llvm.org/D76994
2020-04-04 13:36:22 -06:00
Heejin Ahn fc5d8b672b [WebAssembly] Fix a sanitizer error in WasmEHPrepare
Summary:
D77423 started using a dominator tree in WasmEHPrepare, but we deleted
BBs in `prepareThrows` before we used the domtree in `prepareEHPads`,
and those CFG changes were not reflected in the domtree. This uses
`DomTreeUpdater` to make sure we update the domtree every time we delete
BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in
LLVM buildbots.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77465
2020-04-04 09:57:07 -07:00
Heejin Ahn 2e9839729d [WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare
Summary:
When we insert a call to the personality function wrapper
(`_Unwind_CallPersonality`) for a catch pad, we store some necessary
info in `__wasm_lpad_context` struct and pass it. One of the info is the
LSDA address for the function. For this, we insert a call to
`wasm.lsda()`, which will be lowered down to the address of LSDA, and
store it in a field in `__wasm_lpad_context`.

There are exceptions to this personality call insertion: catchpads for
`catch (...)` and cleanuppads (for destructors) don't need personality
function calls, because we don't need to figure out whether the current
exception should be caught or not. (They always should.)

There was a little optimization to `wasm.lsda()` call insertion. Because
the LSDA address is the same throughout a function, we don't need to
insert a store of `wasm.lsda()` return value in every catchpad. For
example:
```
try {
  foo();
} catch (int) {
  // wasm.lsda() call and a store are inserted here, like, in
  // pseudocode,
  // %lsda = wasm.lsda();
  // store %lsda to a field in __wasm_lpad_context
  try {
    foo();
  } catch (int) {
    // We don't need to insert the wasm.lsda() and store again, because
    // to arrive here, we have already stored the LSDA address to
    // __wasm_lpad_context in the outer catch.
  }
}
```
So the previous algorithm checked if the current catch has a parent EH
pad, we didn't insert a call to `wasm.lsda()` and its store.

But this was incorrect, because what if the outer catch is `catch (...)`
or a cleanuppad?
```
try {
  foo();
} catch (...) {
  // wasm.lsda() call and a store are NOT inserted here
  try {
    foo();
  } catch (int) {
    // We need wasm.lsda() here!
  }
}
```
In this case we need to insert `wasm.lsda()` in the inner catchpad,
because the outer catchpad does not have one.

To minimize the number of inserted `wasm.lsda()` calls and stores, we
need a way to figure out whether we have encountered `wasm.lsda()` call
in any of EH pads that dominates the current EH pad. To figure that
out, we now visit EH pads in BFS order in the dominator tree so that we
visit parent BBs first before visiting its child BBs in the domtree.

We keep a set named `ExecutedLSDA`, which basically means "Do we have
`wasm.lsda()` either in the current EH pad or any of its parent EH
pads in the dominator tree?". This is to prevent scanning the domtree up
to the root in the worst case every time we examine an EH pad: each EH
pad only needs to examine its immediate parent EH pad.

- If any of its parent EH pads in the domtree has `wasm.lsda()`, this
  means we don't need `wasm.lsda()` in the current EH pad. We also insert
  the current EH pad in `ExecutedLSDA` set.
- If none of its parent EH pad has `wasm.lsda()`
  - If the current EH pad is a `catch (...)` or a cleanuppad, done.
  - If the current EH pad is neither a `catch (...)` nor a cleanuppad,
    add `wasm.lsda()` and the store in the current EH pad, and add the
    current EH pad to `ExecutedLSDA` set.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77423
2020-04-04 07:02:50 -07:00
Matt Arsenault 30ebafaa56 CodeGen: Convert some TII hooks to use Register 2020-04-03 14:52:54 -04:00
jasonliu d65557d15d [NFC][XCOFF][AIX] Refactor get/setContainingCsect
Summary:
For current architect, we always require setContainingCsect to be
called on every MCSymbol got used in XCOFF context.
This is very hard to achieve because symbols gets created everywhere
 and other MCSymbol types(ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF specialized code around some common code area.

This patch intendeds to do
1. Rely on getFragment().getParent() to get csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if symbol itself represents a csect.

Reviewers: DiggerLin, hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D77080
2020-04-03 13:33:12 +00:00
Guillaume Chatelet 9068bccbae [Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77312
2020-04-03 13:21:58 +00:00
Guillaume Chatelet ca11c480e7 [Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align
Summary:
The change in IRTranslator is not trivial but is NFC as far as I can tell.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77292
2020-04-03 09:05:19 +00:00
Guillaume Chatelet 9f5c786876 [NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation
Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values.

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77372
2020-04-03 08:12:39 +00:00
Serguei Katkov bd1d70bf0e [DAG] Change isGCValue detection for statepoint lowering
isGCValue should detect whether the deopt value is a GC pointer.
Currently it checks by finding the value in SI.Bases and SI.Ptrs.
However these data structures contain only those values which
have corresponding gc.relocate call. So we can miss GC value if it
does not have gc.relocate call (dead after the call).

Check GC strategy whether pointer is GC one or consider any pointer
to be GC one conservatively.

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77130
2020-04-03 12:36:13 +07:00
Simon Pilgrim b02c7a8152 Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI.
The shift of 1 by an amount that is never more than 31 means that the warning is a false positive but is safe and fixes Werror builds.
2020-04-02 12:02:04 +01:00
Guillaume Chatelet 96cae168fa [NFC] Preparatory work for D77292 2020-04-02 09:30:33 +00:00
Guillaume Chatelet 189d2e215f [Alignment][NFC] Use more Align versions of various functions
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77291
2020-04-02 09:00:53 +00:00
OCHyams 550ab58bc1 [NFC] Fix performance issue in LiveDebugVariables
When compiling AMDGPUDisassembler.cpp in a stage 1 trunk build with
CMAKE_BUILD_TYPE=RelWithDebInfo LLVM_USE_SANITIZER=Address LiveDebugVariables
accounts for 21.5% wall clock time. This fix reduces that to 1.2% by switching
out a linked list lookup with a map lookup.

Note that the linked list is still used to group UserValues by vreg. The vreg
lookups don't cause any problems in this pathological case.

This is the same idea as D68816, which was reverted, except that it is a less
intrusive fix.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D77226
2020-04-02 09:39:33 +01:00
Daniel Sanders e65e677ee4 [globalisel][legalizer] Fix DebugLoc bugs caught by a prototype lost-location verifier
The legalizer has a tendency to lose DebugLoc's when expanding or
combining instructions. The verifier that detected these isn't ready
for upstreaming yet but this patch fixes the cases that came up when
applying it to our out-of-tree backend's CodeGen tests.

This pattern comes up a few more times in this file and probably in
the backends too but I'd prefer to fix the others separately (and
preferably when the lost-location verifier detects them).
2020-04-01 12:50:18 -07:00
Jessica Clarke 616289ed29 [LegalizeTypes][RISCV] Correctly sign-extend comparison for ATOMIC_CMP_XCHG
Summary:
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefind. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it. We introduce a new
getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since
many targets have compare-and-swap instructions (or pseudos) that
correctly handle an any-extend input, and the existing function
determines the extension of the result, whereas we are concerned with
the input.

This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.

Reviewers: asb, lenary, efriedma

Reviewed By: asb

Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74453
2020-04-01 15:51:26 +01:00
Guillaume Chatelet 1dffa2550b [Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign()
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77215
2020-04-01 14:08:28 +00:00
Guillaume Chatelet 3a78f44daf [Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77212
2020-04-01 13:22:11 +00:00
Guillaume Chatelet bf573bea19 [Alignment][NFC] Convert MIR Yaml to MaybeAlign
Summary:
Although it may look like non NFC it is. especially the MIRParser may set `0` to the MachineFrameInfo and MachineFunction, but they all deal with `Align` internally and assume that `0` means `1`.
93fc0ba145/llvm/include/llvm/CodeGen/MachineFrameInfo.h (L483)

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77203
2020-04-01 12:26:31 +00:00
Guillaume Chatelet c7468c1696 [Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77149
2020-04-01 09:32:05 +00:00
Qiu Chaofan 95bcab8272 [DAGCombiner] Require ninf for sqrt recip estimation
Currently, DAG combiner uses (fmul (rsqrt x) x) to estimate square
root of x. However, this method would return NaN if x is +Inf, which
is incorrect.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D76853
2020-04-01 16:23:43 +08:00
Craig Topper f92563f907 [VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template

Reviewers: spatel, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77183
2020-04-01 00:46:48 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Guozhi Wei 6d20937c29 [CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall
The attached test case is simplified from tcmalloc. Both function calls should be optimized as tailcall. But llvm can only optimize the first call. The second call can't be optimized because function dupRetToEnableTailCallOpts failed to duplicate ret into block case2.

There 2 problems blocked the duplication:

  1 Intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts.
  2 The control flow is more complex than expected, dupRetToEnableTailCallOpts can only duplicate ret into its predecessor, but here we have an intermediate block between call and ret.

The solutions:

  1 Since CodeGenPrepare is already at the end of LLVM IR phase, we can simply delete the intrinsic call to llvm.assume.
  2 A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late, there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it.

Differential Revision: https://reviews.llvm.org/D76539
2020-03-31 11:55:51 -07:00
Guillaume Chatelet 998118c3d3 [Alignment][NFC] Deprecate MachineMemOperand::getMachineMemOperand version that takes an untyped alignement.
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77138
2020-03-31 16:05:31 +00:00
Guillaume Chatelet b9810988b2 [Alignment][NFC] Transitionning more getMachineMemOperand call sites
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77127
2020-03-31 11:04:10 +00:00
Denis Antrushin 47107dc3bd [Statepoint] Fix StatepointLoweringInfo::GCTransitionArgs initialization
Summary:
In method SelectionDAGBuilder::LowerStatepoint, array SI.GCTransitionArgs
is initialized from wrong part of ImmutableStatepoint class.
We copy gc args instead of transitions args.

Reviewers: reames, skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77075
2020-03-31 11:45:06 +03:00
Guillaume Chatelet c9d5c19597 [Alignment][NFC] Transitionning more getMachineMemOperand call sites
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, Jim, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77121
2020-03-31 08:36:18 +00:00
Guillaume Chatelet d2d6c9f591 [Alignment][NFC] GlobalIsel Utils inferAlignFromPtrInfo
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: rovka, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77079
2020-03-31 06:58:57 +00:00
Guillaume Chatelet af3c52d558 [Alignment][NFC] Simplify IRTranslator::getMemOpAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77078
2020-03-31 06:57:13 +00:00
Craig Topper 2a07221cf3 [SelectionDAG] Add an assert that the input VT and output VT for ISD::FREEZE are the same.
Differential Revision: https://reviews.llvm.org/D77092
2020-03-30 23:21:58 -07:00
Jessica Paquette d5ee72065b [GlobalISel] Implement identity transforms for x op x -> x
When we have

```
a = G_OR x, x
```

or

```
b = G_AND y, y
```

We can drop the G_OR/G_AND and just use x/y respectively.

Also update arm64-fallback.ll because there was an or in there which hits this
transformation.

Differential Revision: https://reviews.llvm.org/D77105
2020-03-30 18:22:37 -07:00
Juneyoung Lee 519f5c3796 [LegalizeTypes] Add SoftenFloatRes_FREEZE
Summary: This adds SoftenFloatRes_FREEZE.

Reviewers: bkramer, JamesNagurne, craig.topper, efriedma

Reviewed By: craig.topper

Subscribers: AbigailLinden, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76980
2020-03-31 10:16:38 +09:00
Jessica Paquette 63d70ea6a0 [GlobalISel] Combine (x op 0) -> x for operations with a right identity of 0
Implement identity combines for operations like the following:

```
%a = G_SUB %b, 0
```

This can just be replaced with %b.

Over CTMark, this gives some minor size improvements at -O3.

Differential Revision: https://reviews.llvm.org/D76640
2020-03-30 16:49:52 -07:00
Eli Friedman cf36f9855a [SVE][SelectionDAG] Fix dumping of EVTs to use correct API for element count.
This makes "-debug" output for SVE SelectionDAG readable.
2020-03-30 16:47:53 -07:00
Matt Arsenault b8fc192d42 Revert "[GISel]: Fix incorrect IRTranslation while translating null pointer types"
This reverts commit b3297ef051.

This change is incorrect. The current semantic of null in the IR is a
pointer with the bitvalue 0. It is not a cast from an integer 0, so
this should preserve the pointer type.
2020-03-30 19:30:42 -04:00
Nick Desaulniers f086941765 [SelectionDAGISel] small cleanup to INLINEASM_BR selection. NFC
Summary:
This code was throwing away the opcode for a boolean, which was then
reconstructing the opcode from that boolean.  Just pass the opcode, and
forget the boolean.

Reviewers: srhines

Reviewed By: srhines

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77100
2020-03-30 15:32:06 -07:00
Matt Arsenault 4919f2e1c5 AMDGPU/GlobalISel: Basic legalize rules for G_FSHR
Only handles easy 32-bit cases.
2020-03-30 11:53:01 -07:00
Matt Arsenault 23da702d69 GlobalISel: Translate llvm.fshl/llvm.fshr 2020-03-30 11:34:42 -07:00
Jakub Kuderski 77ce2e21a8 [AMDGPU] Add Relocation Constant Support
Summary:
This change adds amdgcn.reloc.constant intrinsic to the amdgpu backend, which will compile into a relocation entry in the resulting elf.

The intrinsics takes a MetadataNode (String) as its only argument, which specifies the symbol name of the relocation entry.

`SelectionDAGBuilder::getValueImpl` is changed to allow metadata operands passed through to ISel.

Author: csyonghe <yonghe@google.com>

Reviewers: tpr, nhaehnle

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76440
2020-03-30 13:49:20 -04:00
Guillaume Chatelet bdf77209b9 [Alignment][NFC] Use Align version of getMachineMemOperand
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77059
2020-03-30 15:46:27 +00:00
Matt Arsenault cc3b5590d2 GlobalISel: Minor cleanups 2020-03-30 11:26:22 -04:00
Guillaume Chatelet 01ba2ad9ef [Alignment][NFC] Provide tightened up functions in SelectionDAG, MachineFunction and MachineMemOperand
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77046
2020-03-30 13:03:27 +00:00
Guillaume Chatelet b91535f6c7 [Alignment][NFC] Return Align for SelectionDAGNodes::getOriginalAlignment/getAlignment
Summary:
Also deprecate getOriginalAlignment, getAlignment will take much more time as it is pervasive through the codebase (including TableGened files).

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76933
2020-03-30 07:26:48 +00:00
Reid Kleckner e5bf5037d8 [CodeGen] Fix sinking local values in lpads with phis
There was already a test case for landingpads to handle this case, but I
had forgotten to consider PHI instructions preceding the EH_LABEL in the
landingpad.

PR45261
2020-03-28 11:10:33 -07:00
Martin Storsjö e6112a56dd [AsmPrinter] Emit .weak directive for weak linkage on COFF for symbols without a comdat
MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.

Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.

Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.

Add a new coff-weak.ll which tests the new directive emission.

Based on an previous patch by Shoaib Meenai.

Differential Revision: https://reviews.llvm.org/D44543
2020-03-28 18:48:58 +02:00
Jessica Paquette 98d05f88d5 [GlobalISel] Fix equality for copies from physregs in matchEqualDefs
When we see this:

```
%a = COPY $physreg
...
SOMETHING implicit-def $physreg
...
%b = COPY $physreg
```

The two copies are not equivalent, and so we shouldn't perform any folding
on them.

When we have two instructions which use a physical register check that they
define the same virtual register(s) as well.

e.g., if we run into this case

```
%a = COPY $physreg
...
%b = COPY %a
```

we can say that the two copies are the same, and can be folded.

Differential Revision: https://reviews.llvm.org/D76890
2020-03-27 17:52:21 -07:00
Nemanja Ivanovic 4821411347 [DAGCombine] Fix splitting indexed loads in ForwardStoreValueToDirectLoad()
In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed
load. However, we don't do the same checking in ForwardStoreValueToDirectLoad()
which can lead to failures later during combining
(see: https://bugs.llvm.org/show_bug.cgi?id=45301).

This patch just adds the same checks to this function as well.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301

Differential revision: https://reviews.llvm.org/D76778
2020-03-27 18:03:47 -05:00
Matt Arsenault a8cc9047de CodeGen: Add -denormal-fp-math-f32 flag
Make the set of FP related attributes and command flags closer.
2020-03-27 14:00:39 -07:00
Matt Arsenault 0ab5b5b858 Fix denormal-fp-math flag and attribute interaction
Make these behave the same way unsafe-fp-math and co. The command line
flag should add the attribute to functions that do not already have
it, and leave existing attributes. The attribute is the actual
implementation, but the flag is useful in some testing situations.

AMDGPU has a variety of tests with denormals enabled/disabled that
would require a painful level of test duplication without a flag. This
doesn't expose setting the separate input/output modes, or add a flag
for the f32 version yet.

Tests will be included in future patch.
2020-03-27 12:48:58 -07:00
Guillaume Chatelet 74eac9031a [Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76925
2020-03-27 15:49:13 +00:00
Guillaume Chatelet a98662f4c1 [Alignment][NFC] Update MachineMemOperand implementation to use MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76625
2020-03-27 08:06:10 +00:00
Juneyoung Lee 1bcc500b48 [DAGCombine] Add basic optimizations for FREEZE in SelDag
Summary: This patch is the first effort to adding basic optimizations for FREEZE in SelDag.

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76707
2020-03-27 12:20:39 +09:00
Craig Topper 9f7d4150b9 [X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.
These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This pass adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649
2020-03-26 14:10:20 -07:00
diggerlin fdfe411e7c [AIX] discard the label in the csect of function description and use qualname for linkage
SUMMARY:

SUMMARY
for a source file  "test.c"

void foo() {};

llc will generate assembly code as (assembly patch)
     .globl  foo
     .globl  .foo
     .csect foo[DS]
foo:

        .long   .foo
        .long   TOC[TC0]
        .long   0

   and symbol table as (xcoff object file)
   [4]     m   0x00000004     .data     1  unamex                    foo
   [5]     a4  0x0000000c       0    0     SD       DS    0    0
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     LD       DS    0    0

   After first patch, the assembly will be as

        .globl  foo[DS]                 # -- Begin function foo
        .globl  .foo
        .align  2
        .csect foo[DS]
        .long   .foo
        .long   TOC[TC0]
        .long   0

    and symbol table will as
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     DS      DS    0    0
Change the code for the assembly path and xcoff objectfile patch for llc.

Reviewers: Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76162
2020-03-26 15:46:52 -04:00
Dominik Montada 9fedb6900d [GlobalISel] add helper function to create arbitrary libcalls
Summary:
The existing helper function can only create a libcall to functions available in
RTLIB. Add a helper function that can create a libcall to a given function name
using the provided calling convention.

Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders

Reviewed By: arsenm

Subscribers: wdng, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76845
2020-03-26 16:11:13 +01:00
Qiu Chaofan 172456c775 [Legalizer] Fix some flags miss in vector results
In some scalarize/split result methods (unary, binary, ...), flags in
SDNode were not passed down, which may lead to unexpected results in
unsafe float-point optimization. This patch fixes them. (maybe not
complete)

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D76832
2020-03-26 22:01:19 +08:00
Juneyoung Lee 453eac3f77 Minor fixes to a comment in CodeGenPrepare 2020-03-25 16:34:43 +09:00
Matt Arsenault 39c55cef21 GlobalISel: Introduce bitcast legalize action
For some operations, the type is unimportant and only the number of
bits matters. For example I don't want to treat <4 x s8> as a legal
type, but I also don't want to decompose loads of this into smaller
pieces to get legal register types.

On AMDGPU in SelectionDAG, we legalize a number of operations (most
notably load and store) by coercing all types to vectors of i32. For
GlobalISel, I'm trying very hard to avoid doing this for every type,
but I don't think this strategy can be completely avoided. I'm trying
to avoid bitcasts for any legitimately legal type we can operate on,
since the intervening bitcasts have proven to be a hassle.

For loads, I think I can get away without ever casting the result
type, and handling any arbitrary bitwidth during selection (I will
eventually want new tablegen support to help with this, rather than
having to add every possible type as legal). The unmerge required to
do anything with the value should expand to the expected shifts. This
is trickier for stores, since it would now require handling a wide
array of truncates during selection which I don't want.

Future potentially interesting case are for vector indexing, where
sub-dword type should be indexed in s32 pieces.
2020-03-24 19:33:33 -04:00
Vedant Kumar f7052da6db [DWARF] Emit DW_AT_call_pc for tail calls
Record the address of a tail-calling branch instruction within its call
site entry using DW_AT_call_pc. This allows a debugger to determine the
address to use when creating aritificial frames.

This creates an extra attribute + relocation at tail call sites, which
constitute 3-5% of all call sites in xnu/clang respectively.

rdar://60307600

Differential Revision: https://reviews.llvm.org/D76336
2020-03-24 12:01:55 -07:00
Benjamin Kramer 0019c2f194 [SelectionDAG] Don't crash when freezing illegal float types 2020-03-24 19:45:19 +01:00
Hiroshi Yamauchi c3417592c8 Revert "Include static prof data when collecting loop BBs"
This reverts commit 129c911efa.

Due to an internal benchmark regression.
2020-03-24 09:41:16 -07:00
Lama 4a6ebc03ba [MachinePipeliner] Fix a bug in Output Dependency chains
The current implementation collects all Preds/Succs of a Dep of kind Output, creating a long chain and subsequently a schedule with an unnecessarily large II.
Was this done on purpose for a reason I'm missing?

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D75424
2020-03-24 14:37:50 +00:00
Juneyoung Lee 7802be4a3d [SelDag] Add FREEZE
Summary:
- Add FREEZE node to SelDag
- Lower FreezeInst (in IR) to FREEZE node
- Add Legalization for FREEZE node

Reviewers: qcolombet, bogner, efriedma, lebedev.ri, nlopes, craig.topper, arsenm

Reviewed By: lebedev.ri

Subscribers: wdng, xbolva00, Petar.Avramovic, liuz, lkail, dylanmckay, hiraditya, Jim, arsenm, craig.topper, RKSimon, spatel, lebedev.ri, regehr, trentxintong, nlopes, mkuper, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D29014
2020-03-24 23:04:58 +09:00
Jinsong Ji 816ad48c82 [NFC][RUIP] Small debug output refine
Add a new line, so that we always print MI in a new line,
before and after UpdateRegMask, for easier check..
2020-03-24 03:29:45 +00:00
Jessica Paquette 02187ed45a [GlobalISel] Combine G_SELECTs of the form (cond ? x : x) into x
When we find something like this:

```
%a:_(s32) = G_SOMETHING ...
...
%select:_(s32) = G_SELECT %cond(s1), %a, %a
```

We can remove the select and just replace it entirely with `%a` because it's
always going to result in `%a`.

Same if we have

```
%select:_(s32) = G_SELECT %cond(s1), %a, %b
```

where we can deduce that `%a == %b`.

This implements the following cases:

- `%select:_(s32) = G_SELECT %cond(s1), %a, %a` -> `%a`

- `%select:_(s32) = G_SELECT %cond(s1), %a, %some_copy_from_a` -> `%a`

- `%select:_(s32) = G_SELECT %cond(s1), %a, %b` -> `%a` when `%a` and `%b`
   are defined by identical instructions

This gives a few minor code size improvements on CTMark at -O3 for AArch64.

Differential Revision: https://reviews.llvm.org/D76523
2020-03-23 16:46:03 -07:00
Matt Arsenault aa63eb6a46 GlobalISel: Add computeKnownBitsForTargetInstr
I think we can save the MRI argument from these since it's in
GISelKnownBits already, but currently not accessible.

Implementation deferred to avoid dependency on other patches.
2020-03-23 15:02:30 -04:00
Reid Kleckner 5ff5ddd0ad [Win64] Insert int3 into trailing empty BBs
Otherwise, the Win64 unwinder considers direct branches to such empty
trailing BBs to be a branch out of the function. It treats such a branch
as a tail call, which can only be part of an epilogue. If the unwinder
misclassifies such a branch as part of the epilogue, it will fail to
unwind the stack further. This can lead to bad stack traces, or failure
to handle exceptions properly. This is described in
https://llvm.org/PR45064#c4, and by the comment at the top of the
X86AvoidTrailingCallPass.cpp file.

It should be safe to insert int3 for such blocks. An empty trailing BB
that reaches this pass is pretty much guaranteed to be unreachable.  If
a program executed such a block, it would fall off the end of the
function.

Most of the complexity in this patch comes from threading through the
"EHFuncletEntry" boolean on the MIRParser and registering the pass so we
can stop and start codegen around it. I used an MIR test because we
should teach LLVM to optimize away these branches as a follow-up.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D76531
2020-03-23 08:50:37 -07:00
Jay Foad 0444d16a16 [GlobalISel] Add generic opcodes for saturating add/subtract
Summary:
Add new generic MIR opcodes G_SADDSAT etc. Add support in IRTranslator
for translating the saturating add/subtract intrinsics to the new
opcodes.

Reviewers: aemerson, dsanders, paquette, arsenm

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76600
2020-03-23 15:16:45 +00:00
Sanjay Patel 0eeee83d75 [VectorUtils] move x86's scaleShuffleMask to generic VectorUtils
We have some long-standing missing shuffle optimizations that could
use this transform via VectorCombine now:
https://bugs.llvm.org/show_bug.cgi?id=35454
(and we still don't get that case in the backend either)

This function is apparently templated because there's existing code
in IR that treats mask values as unsigned and backend code that
treats masks values as signed.

The mask values are not endian-dependent (as shown by the existing
bitcast transform from DAGCombiner).

Differential Revision: https://reviews.llvm.org/D76508
2020-03-23 09:58:55 -04:00
Guillaume Chatelet 3ba550a05a [Alignment][NFC] Use TFL::getStackAlign()
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: dylanmckay, sdardis, nemanjai, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76551
2020-03-23 13:48:29 +01:00
Guillaume Chatelet ea64ee0edb [Alignment][NFC] Deprecate ensureMaxAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76368
2020-03-23 11:31:33 +01:00
Jay Foad 7cdbf1ed4b Make use of APInt::countLeadingOnes. NFC. 2020-03-23 09:08:20 +00:00
Sam Parker 62fdb1f534 [DAGCombine] Skip PostInc combine with later users
When decided whether to generate a post-inc load/store, look at the
other memory nodes that use the same base address and, if any proceed
the current node, then don't do the combine.
The change only seems to be affecting the Arm backend, which I was
surprised at, but it appears to fix a lot of our issues around MVE
masked load/stores having to store a temporary address after an early
post-increment on a shared base address.

Differential Revision: https://reviews.llvm.org/D75847
2020-03-23 08:39:53 +00:00
Sam Parker 8e45eaf1da [NFC][DAGCombine] Refactor post-inc logic
Extract the decision to combine into a post-inc address into a
couple of functions to make the logic more clear and re-usable.

Differential Revision: https://reviews.llvm.org/D76060
2020-03-23 08:32:20 +00:00
Dominik Montada ccf49b9ef0 [GlobalISel] support widen unmerge if WideTy > SrcTy
Summary:
Widening G_UNMERGE_VALUES to a type which is larger than the
original source type is the same as widening it to the same
type as the source type: in both cases, G_UNMERGE_VALUES has
to be replaced with bit arithmetic which. Although the arithmetic
itself is independent of whether the source type is smaller
or equal to the widen type, widening the source type to the
widen type should result in less artifacts being emitted,
since this is the type that the user explicitly requested.

Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar

Reviewed By: arsenm, dsanders

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76494
2020-03-23 09:16:45 +01:00
Qiu Chaofan 763871053c [DAGCombiner] Require nsz for aggressive fma fold
For folding pattern `x-(fma y,z,u*v) -> (fma -y,z,(fma -u,v,x))`, if
`yz` is 1, `uv` is -1 and `x` is -0, sign of result would be changed.

Differential Revision: https://reviews.llvm.org/D76419
2020-03-22 23:10:07 +08:00
Fangrui Song 71f8b78d89 [AsmPrinter] Simplify AsmPrinter::emitXXStructorList after D61547 2020-03-21 23:18:23 -07:00
Simon Pilgrim c5fd9e3888 [DAG] Don't permit EXTLOAD when combining FSHL/FSHR consecutive loads (PR45265)
Technically we can permit EXTLOAD of the LHS operand but only if all the extended bits are shifted out. Until we test coverage for that case, I'm just disabling this to fix PR45265.
2020-03-21 10:52:41 +00:00
Fangrui Song 85c30f3374 [X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile
-fuse-init-array is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false.
The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called.

clang -target i386 -c a.c
clang -target x86_64 -c a.c

This cleanup fixes this as a bonus.

X86SpeculativeLoadHardeningPass::tracePredStateThroughCall can call
MCContext::createTempSymbol before TargetLoweringObjectFileELF::Initialize().
We need to call TargetLoweringObjectFileELF::Initialize() ealier.

test/CodeGen/X86/speculative-load-hardening-indirect.ll

Differential Revision: https://reviews.llvm.org/D71360
2020-03-20 21:57:34 -07:00
Eric Christopher fc7233d774 Temporarily Revert "[X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile"
as it's causing msan failures.

This reverts commit 7899fe9da8.
2020-03-20 17:36:12 -07:00
Vedant Kumar a245943355 [LiveDebugValues] Speed up collectIDsForRegs, NFC
Use the advanceToLowerBound operation available on CoalescingBitVector
iterators to speed up collection of variables which reside within some
set of registers.

The speedup comes from avoiding repeated top-down traversals in
IntervalMap::find. The linear scan forward from one register interval to
the next is unlikely to be as expensive as a full IntervalMap search
starting from the root.

This reduces time spent in LiveDebugValues when compiling sqlite3 by
200ms (about 0.1% - 0.2% of the total User Time).

Depends on D76466.

rdar://60046261

Differential Revision: https://reviews.llvm.org/D76467
2020-03-20 12:18:26 -07:00
Vedant Kumar 4716ebb823 [ADT] CoalescingBitVector: Avoid initial heap allocation, NFC
Avoid making a heap allocation when constructing a CoalescingBitVector.

This reduces time spent in LiveDebugValues when compiling sqlite3 by
700ms (0.5% of the total User Time).

rdar://60046261

Differential Revision: https://reviews.llvm.org/D76465
2020-03-20 12:18:25 -07:00
Fangrui Song 7899fe9da8 [X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile
UseInitArray is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false.
The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called.

clang -target i386 -c a.c
clang -target x86_64 -c a.c

This cleanup fixes this as a bonus.

Differential Revision: https://reviews.llvm.org/D71360
2020-03-20 11:18:36 -07:00
Vedant Kumar 636665331b PR45181: Fix another invalid DIExpression combination
The original test case from PR45181 triggers a DIExpression combination
that wasn't fixed in D76164.
2020-03-20 11:18:05 -07:00
Pirama Arumuga Nainar edcfb47ff6 [DAGCombiner] Do not fold truncate(build_vector(..)) if it creates an illegal type
Summary:
It can be the case that a vector type is legal but the corresponding
scalar type is not legal for an architecture (i8 vs. v16i8 on AArch64).
Check if the scalar type created when folding
  truncate(build_vector(x,y)) -> build_vector(truncate(x),truncate(y))

is legal if we are running after the type legalizer.

This fixes https://github.com/android/ndk/issues/1207.

Reviewers: RKSimon, srhines

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76312
2020-03-20 09:20:16 -07:00
Bjorn Pettersson d168b77780 [DAGCombiner] Fix non-determinism problem related to argument evaluation order in visitFDIV
Summary:
For some reason the order in which we call getNegatedExpression
for the involved operands, after a call to isCheaperToUseNegatedFPOps,
seem to matter. This patch includes a new test case in
test/CodeGen/X86/fdiv.ll that crashes if we reverse the order of
those calls. Before this patch that could happen depending on
which compiler that were used when buildind llvm. With my GCC
version (7.4.0) I got the crash, because it seems like it is
using a different order for the argument evaluation compared
to clang.

All other users of isCheaperToUseNegatedFPOps already used this
pattern with unfolded/ordered calls to getNegatedExpression, so
this patch is aligning visitFDIV with the other use cases.

This patch simply deals with the non-determinism for FDIV. While
the underlying problem with getNegatedExpression is discussed
further in D76439.

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76319
2020-03-20 16:11:17 +01:00
Adrian Kuegel baa6f6a782 Revert "[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes"
This reverts commit e9f22fd429.

When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70
failing tests with this revision, and 29 without this revision.
2020-03-20 11:02:50 +01:00
Wei Mi a035726e5a Revert "Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore."
This reverts commit 3c96d01d2e. Got report that it caused test failures in libc++.
2020-03-19 22:45:27 -07:00
Jessica Paquette c999084619 [GlobalISel] Port some basic shufflevector undef combines from the DAGCombiner
Port over the following:

- shuffle undef, undef, any_mask -> undef
- shuffle anything, anything, undef_mask -> undef

This sort of thing shows up a lot when you try to bugpoint code containing
shufflevector.

Differential Revision: https://reviews.llvm.org/D76382
2020-03-19 16:46:06 -07:00
Sanjay Patel 56da41393d [SDAG] reduce code duplication in getNegatedExpression(); NFCI 2020-03-19 13:55:15 -04:00
Djordje Todorovic d9b9621009 Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with the D76164.
2020-03-19 13:57:30 +01:00
Cullen Rhodes 5ce38fcbac [ValueTypes] Add support for scalable EVTs
Summary:
* Remove a bunch of asserts checking for unsupported scalable types and
  add some more now that they are supported.
* Propagate the scalable flag where necessary.
* Add another `EVT::getExtendedVectorVT` method that takes an
  ElementCount parameter.
* Add `EVT::isExtendedScalableVector` and
  `EVT::getExtendedVectorElementCount` - latter is currently unused.

Reviewers: sdesmalen, efriedma, rengolin, craig.topper, huntergr

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75672
2020-03-19 11:04:15 +00:00
Cullen Rhodes 5c296df0c0 [ValueTypes] Add EVT::isFixedLengthVector
Summary:
Related to D75672, this patch adds EVT::isFixedLengthVector to determine
if the underlying vector type is of fixed length.

An assert is introduced in EVT::getVectorNumElements that triggers for
types that aren't fixed length. This is currently guarded by a flag
added D75297 that is off by default and has been renamed to the more
generic ENABLE_STRICT_FIXED_SIZE_VECTORS.

Ideally we want to get rid of getVectorNumElements but a quick grep
shows there are >350 uses in lib/CodeGen and 75 in lib/Target/AArch64
alone. All of these probably aren't EVT::getVectorNumElements (some may
be the MVT equivalent), but there are many places to fixup and having
the assert on by default would make the SVE upstreaming effort
difficult.

Reviewers: sdesmalen, efriedma, ctetreau, huntergr, rengolin

Reviewed By: efriedma

Subscribers: mgorny, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76376
2020-03-19 10:08:17 +00:00
Craig Topper c69a4d6bef [SelectionDAG] When splitting gathers/scatters in type legalization, set MMO size to UnknownSize
Gather/scatter don't access one memory location, they access multiple disjoint locations. So using a fixed size isn't accurate. But we don't have a way to represent the true behavior so just use UnknownSize.

Previously we "split" the memory VT and use that size for the MMO of each half. But the memory VT is scalar so splitting usually just returned the original scalar VT, but on 32-bit X86 if the scalar VT was i64 it probably returned i32?

Differential Revision: https://reviews.llvm.org/D76388
2020-03-18 16:07:15 -07:00
Eli Friedman e24e95fe90 Remove CompositeType class.
The existence of the class is more confusing than helpful, I think; the
commonality is mostly just "GEP is legal", which can be queried using
APIs on GetElementPtrInst.

Differential Revision: https://reviews.llvm.org/D75660
2020-03-18 13:53:17 -07:00
Craig Topper 498b53890d [SelectionDAGBuilder][FPEnv] Take into account SelectionDAG continuous CSE when setting the nofpexcept flag for constrained intrinsics
SelectionDAG CSEs nodes based on their result type and operands, but not their flags. The flags are expected to be intersected when they are CSEd. In SelectionDAGBuilder, for FP nodes we manage both the fast math flags and the nofpexcept flag after the nodes have already been CSEd when they were created with getNode. The management of the fastmath flags before the constrained nodes prevents the nofpexcept management from working correctly.

This commit moves the FMF handling for constrained intrinsics into their visitor and disables the common FMF handling for these nodes.

Differential Revision: https://reviews.llvm.org/D75224
2020-03-18 13:37:17 -07:00
lewis-revill e9f22fd429 [TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes
This patch generates TableGen descriptions for the specified register
banks which contain a list of register sizes corresponding to the
available HwModes. The appropriate size is used during codegen according
to the current HwMode. As this HwMode was not available on generation,
it is set upon construction of the RegisterBankInfo class. Targets
simply need to provide the HwMode argument to the
<target>GenRegisterBankInfo constructor.

The RISC-V RegisterBankInfo constructor has been updated accordingly
(plus an unused argument removed).

Differential Revision: https://reviews.llvm.org/D76007
2020-03-18 19:52:23 +00:00
Simon Pilgrim 746bd860c9 Replace get*Alignment() methods with get*Align() equivalents.
Fixes deprecation warning in EXPENSIVE_CHECKS builds.
2020-03-18 18:25:07 +00:00
Jessica Paquette dc5f982639 [GlobalISel] Port some basic undef combines from DAGCombiner.cpp
This ports some combines from DAGCombiner.cpp which perform some trivial
transformations on instructions with undef operands.

Not having these can make it extremely annoying to find out where we differ
from SelectionDAG by looking at existing lit tests. Without them, we tend to
produce pretty bad code generation when we run into instructions which use
undef operands.

Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since
we no longer fall back on the add.

Differential Revision: https://reviews.llvm.org/D76339
2020-03-18 11:05:44 -07:00
Jin Lin 0d896278c8 Support repeated machine outlining
Summary: The following change is to allow the machine outlining can be applied for Nth times, where N is specified by the compiler option. By default the value of N is 1. The motivation is that the repeated machine outlining can further reduce code size.  Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019.

Reviewers: aschwaighofer, tellenbach, paquette

Reviewed By: paquette

Subscribers: tellenbach, hiraditya, llvm-commits, jinlin

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71027
2020-03-18 10:48:52 -07:00
Oliver Stannard 73cea83a6f [IPRA][ARM] Spill extra registers at -Oz
When optimising for code size at the expense of performance, it is often
worth saving and restoring some of r0-r3, if IPRA will be able to take
advantage of them. This doesn't cost any extra code size if we already
have a PUSH/POP pair, and increases the number of available registers
across any calls to the function.

We already have an optimisation which tries fold the subtract/add of the
SP into the PUSH/POP by using extra registers, which somewhat conflicts
with this. I've made the new optimisation less aggressive in cases where
the existing one is likely to trigger, which gives better results than
either of these optimisations by themselves.

Differential revision: https://reviews.llvm.org/D69936
2020-03-18 13:51:16 +00:00