Commit Graph

25285 Commits

Author SHA1 Message Date
Sumanth Gundapaneni 8916e438c4 [Pipeliner] Fix the Schedule DAG topoligical order.
This patch updates the DAG change to reflect in the topological ordering
of the nodes.

Differential Revision: https://reviews.llvm.org/D53105

llvm-svn: 344282
2018-10-11 19:42:46 +00:00
Zachary Turner e8a6c3eb96 Revert SymbolFileNativePDB plugin.
This was originally causing some test failures on non-Windows
platforms, which required fixes in the compiler and linker.  After
those fixes, however, other tests started failing.  Reverting
temporarily until I can address everything.

llvm-svn: 344279
2018-10-11 18:45:44 +00:00
Artem Dergachev 2ce1d6faf8 Revert r344197 "[MC][ELF] compute entity size for explicit sections"
Revert r344206 "[MC][ELF] Fix section_mergeable_size.ll"

They were causing failures on too many important buildbots for too long.
Please revert eagerly if your fix takes more than a couple of hours to land!

llvm-svn: 344278
2018-10-11 18:43:08 +00:00
Nirav Dave f1f2a2a31a [DAG] Fix Big Endian in Load-Store forwarding
Summary:
Correct offset calculation in load-store forwarding for big-endian
targets.

Reviewers: rnk, RKSimon, waltl

Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D53147

llvm-svn: 344272
2018-10-11 18:28:59 +00:00
Zachary Turner e502f8b315 Better support for POSIX paths in PDBs.
While it doesn't make a *ton* of sense for POSIX paths to be
in PDBs, it's possible to occur in real scenarios involving
cross compilation.

The tools need to be able to handle this, because certain types
of debugging scenarios are possible without a running process
and so don't necessarily require you to be on a Windows system.
These include post-mortem debugging and binary forensics (e.g.
using a debugger to disassemble functions and examine symbols
without running the process).

There's changes in clang, LLD, and lldb in this patch.  After
this the cross-platform disassembly and source-list tests pass
on Linux.

Furthermore, the behavior of LLD can now be summarized by a much
simpler rule than before: Unless you specify /pdbsourcepath and
/pdbaltpath, the PDB ends up with paths that are valid within
the context of the machine that the link is performed on.

Differential Revision: https://reviews.llvm.org/D53149

llvm-svn: 344269
2018-10-11 18:01:55 +00:00
Sanjay Patel 4875662e57 [DAGCombiner] move comment closer to the corresponding code; NFC
llvm-svn: 344255
2018-10-11 16:07:25 +00:00
Nick Desaulniers 335315697a [MC][ELF] compute entity size for explicit sections
Summary:
Global variables might declare themselves to be in explicit sections.
Calculate the entity size always to prevent assembler warnings
"entity size for SHF_MERGE not specified" when sections are to be
marked merge-able.

Fixes PR31828.

Reviewers: rnk, echristo

Reviewed By: rnk

Subscribers: llvm-commits, pirama, srhines

Differential Revision: https://reviews.llvm.org/D53056

llvm-svn: 344197
2018-10-10 22:52:32 +00:00
George Burgess IV 6ef8002c2c Replace most users of UnknownSize with LocationSize::unknown(); NFC
Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).

This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.

llvm-svn: 344186
2018-10-10 21:28:44 +00:00
Nirav Dave 07acc992dc [DAGCombine] Improve Load-Store Forwarding
Summary:
Extend analysis forwarding loads from preceeding stores to work with
extended loads and truncated stores to the same address so long as the
load is fully subsumed by the store.

Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are
deleted as they've no longer seem to be relevant.

Reviewers: RKSimon, rnk, kparzysz, javed.absar

Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D49200

llvm-svn: 344142
2018-10-10 14:15:52 +00:00
Simon Pilgrim bc7b6251b6 [TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI.
Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used.

llvm-svn: 344138
2018-10-10 13:00:49 +00:00
Simon Pilgrim 53a503f6ac [TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI.
Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support.

llvm-svn: 344136
2018-10-10 12:32:13 +00:00
Simon Pilgrim 5cb3a82892 [TargetLowering] Add root node back to work list after successful SimplifyDemandedBits/SimplifyDemandedVectorElts
Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful.

Differential Revision: https://reviews.llvm.org/D53026

llvm-svn: 344132
2018-10-10 10:44:15 +00:00
Nemanja Ivanovic 72d4866e57 [DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops
We already do the following combines:
(bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X
(bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X

When the target has "bit preserving fp logic". This patch just extends it
to also combine:
(bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X)

As some targets have fnabs and even those that don't can efficiently lower
both the fabs and the fneg.

Differential revision: https://reviews.llvm.org/D44548

llvm-svn: 344093
2018-10-09 23:20:11 +00:00
Simon Pilgrim 23f880317a [SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to SimplifyDemandedBits
Fix for AVX1 masked load/store regression on D52964

llvm-svn: 344043
2018-10-09 13:13:35 +00:00
Matthias Braun 1c96c9687f ExpandPostRAPseudos: Fix alldefsAreDead() not removing operands
One case left around nonsensical operands for the KILL instruction
which the machine verifier checks for nowadays. While this should not
hurt in release builds we should fix the machine verifier errors anyway.

llvm-svn: 344008
2018-10-09 00:07:34 +00:00
Matthias Braun c9dac6b9b4 TwoAddressInstructionPass: Modernize/fix some comments; NFC
llvm-svn: 344006
2018-10-08 23:47:35 +00:00
Matthias Braun 66badb38b2 PHIElimination: Remove wrong comment; NFC
The comment was contradicting the code. Looking at history the feature
was implemented a day after the comment was written without dropping the
comment.

llvm-svn: 344005
2018-10-08 23:47:35 +00:00
Matthias Braun 7d88f0e386 MachineFunctionPrinterPass: Declare SlotIndexes as used if available; NFC
This makes print-machineinstrs print the slot indexes in more
situations. NFC for normal compilation.

llvm-svn: 344004
2018-10-08 23:47:34 +00:00
Sanjay Patel b64c0d7b53 [DAGCombiner] simplify code for fmul with constant fold; NFCI
llvm-svn: 343997
2018-10-08 21:17:20 +00:00
Alex Bradbury f27c67af12 [SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTy
r126518 introduced a a type parameter to the getShiftAmountTy target hook. It 
produces the type of the shift (RHSTy), parameterised by the type of the value 
being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather 
than LHSTy and this patch corrects this. The change is a no-op because in LLVM 
IR the LHS and RHS types for a shift must be equal anyway.

llvm-svn: 343955
2018-10-08 06:24:59 +00:00
Craig Topper 98dd9d6896 Revert r343948 "[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC"
The assert is failing some asan tests on the bots.

llvm-svn: 343950
2018-10-08 03:12:12 +00:00
Craig Topper c058a68784 [LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC
llvm-svn: 343948
2018-10-08 02:02:08 +00:00
Craig Topper cd38de8b15 [LegalizeDAG] Move legalization of scatter and masked store from LegalizeVectorOps to LegalizeDAG.
This is where we legalize gather and masked load so this is consistent.

Since these ops are always on vectors I've chosen to go with LegalizeDAG since that's what we do for other vector only ops like BUILD_VECTOR, VECTOR_SHUFFLE, etc. The ScalarizeMaskedMemIntrinsic pass should take care of scalarizing these before SelectionDAG so hopefully we don't need to worry about illegally typed scalar ops being emitted in the legalizing. If we did we would need to do this in LegalizeVectorOps so we could get the second type legalization that runs between LegalizeVectorOps and LegalizeDAG.

llvm-svn: 343947
2018-10-08 00:04:55 +00:00
Sanjay Patel ecc8af61e7 [DAGCombiner] allow undef elts in vector fadd matching
llvm-svn: 343945
2018-10-07 16:30:42 +00:00
Sanjay Patel ef76e27985 [DAGCombiner] allow undefs when matching vector splats for fmul folds
llvm-svn: 343942
2018-10-07 16:05:37 +00:00
Sanjay Patel 0b74c840dd [DAGCombiner] allow undef elts in vector fabs/fneg matching
This change is proposed as a part of D44548, but we
need this independently to avoid regressions from improved
undef propagation in SimplifyDemandedVectorElts().

llvm-svn: 343940
2018-10-07 15:32:06 +00:00
Sanjay Patel 46a9dc2e3e [DAGCombiner] shorten code for bitcast+fabs fold; NFC
llvm-svn: 343939
2018-10-07 15:18:30 +00:00
Simon Pilgrim 3b04a4e322 [SelectionDAG] Respect multiple uses in SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....).

Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ.

Thanks to Jan Vesely (@jvesely) for bisecting the bug.

llvm-svn: 343935
2018-10-07 11:45:46 +00:00
Craig Topper e4d199e360 [LegalizeVectorOps] Make ExpandStrictFPOp return the result corresponding to the result number of the SDValue passed in.
It was always returning the chain which seems to be the result number of the SDValue in the lit tests we have. But I don't know if that's guaranteed.

llvm-svn: 343933
2018-10-07 07:16:44 +00:00
Simon Pilgrim 9c9c97bcf4 [SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector.

There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify.

As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.).

Fixes PR39178

Differential Revision: https://reviews.llvm.org/D52935

llvm-svn: 343913
2018-10-06 10:20:04 +00:00
Vedant Kumar 8c46668b6e [LiveDebugValues] Extend var ranges through artificial blocks
ASan often introduces basic blocks consisting exclusively of
instructions without debug locations, or with line 0 debug locations.

LiveDebugValues needs to extend variable ranges through these artificial
blocks. Otherwise, a lot of variables disappear -- even at -O0.

Typically, LiveDebugValues does not extend a variable's range into a
block unless the block is essentially "part of" the variable's scope
(for a precise definition, see LexicalScopes::dominates). This patch
relaxes the lexical dominance check for artificial blocks.

This makes the following Swift program debuggable at -O0:
```
  1| var x = 100
  2| print("x = \(x)")
```

rdar://39127144

Differential Revision: https://reviews.llvm.org/D52921

llvm-svn: 343890
2018-10-05 21:44:15 +00:00
Vedant Kumar 9b558380dd Clarify debug output in LiveDebugValues
MachineBasicBlocks often do not have names, so it helps to refer to them
by block number when printing debug messages.

llvm-svn: 343889
2018-10-05 21:44:00 +00:00
Jessica Paquette b328d95333 [GlobalIsel] Add llvm.invariant.start and llvm.invariant.end
Port over the implementation in SelectionDAGBuilder.cpp into the IRTranslator
and update the arm64-irtranslator test.

These were causing fallbacks in CTMark/Bullet (-Rpass-missed=gisel-select),
and this patch fixes that.

https://reviews.llvm.org/D52945

llvm-svn: 343885
2018-10-05 21:02:46 +00:00
Vedant Kumar 5931b4e5b5 [DebugInfo] Add support for DWARF5 call site-related attributes
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.

Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:

  https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf

This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.

Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.

rdar://42001377

Differential Revision: https://reviews.llvm.org/D49887

llvm-svn: 343883
2018-10-05 20:37:17 +00:00
Matthias Braun fb43114ba2 DwarfDebug: Pick next location in case of missing location at block begin
Context: Compiler generated instructions do not have a debug location
assigned to them. However emitting 0-line records for all of them bloats
the line tables for very little benefit so we usually avoid doing that.

Not emitting anything will lead to the previous debug location getting
applied to the locationless instructions. This is not desirable for
block begin and after labels. Previously we would emit simply emit
line-0 records in this case, this patch changes the behavior to do a
forward search for a debug location in these cases before emitting a
line-0 record to further reduce line table bloat.

Inspired by the discussion in https://reviews.llvm.org/D52862

llvm-svn: 343874
2018-10-05 18:29:24 +00:00
Sanjay Patel f6a160a102 [SelectionDAG] allow undefs when matching splat constants
And use that to transform fsub with zero constant operands.
The integer part isn't used yet, but it is proposed for use in
D44548, so adding both enhancements here makes that 
patch simpler.

llvm-svn: 343865
2018-10-05 17:42:19 +00:00
Jonas Paulsson faad1b3056 [TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints()
Finally all targets are enabling multiple regalloc hints, so the hook to
disable this can now be removed.

NFC.

Review: Simon Pilgrim
https://reviews.llvm.org/D52316

llvm-svn: 343851
2018-10-05 14:23:11 +00:00
Daniel Sanders a464ffd52c [globalisel][combine] When placing truncates, handle the case when the BB is empty
GlobalISel uses MIR with implicit fallthrough on each basic block. As a result,
getFirstNonPhi() can return end().

llvm-svn: 343829
2018-10-04 23:47:37 +00:00
Daniel Sanders ab358bfd09 [globalisel][combine] Fix a rare crash when encountering an instruction whose op0 isn't a reg
The simplest instance of this is an intrinsic with no results which will have the
intrinsic ID as operand 0.

Also fix some benign incorrectness when op0 is a reg but isn't a def that was
guarded against by checking for the extension opcodes.

llvm-svn: 343821
2018-10-04 21:44:32 +00:00
Craig Topper 7d2155e3f9 [X86][LegalizeVectorOps] Use MERGE_VALUES to return two results from LowerLoad. Remove special case code in LegalizeVectorOps that allowed us to only return one result.
Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled.

This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine.

llvm-svn: 343817
2018-10-04 21:24:24 +00:00
Daniel Sanders a05c7583c9 [globalisel][combine] Improve the truncate placement for the extending-loads combine
This brings the extending loads patch back to the original intent but minus the
PHI bug and with another small improvement to de-dupe truncates that are
inserted into the same block.

The truncates are sunk to their uses unless this would require inserting before a
phi in which case it sinks to the _beginning_ of the predecessor block for that
path (but no earlier than the def).

The reason for choosing the beginning of the predecessor is that it makes de-duping
multiple truncates in the same block simple, and optimized code is going to run a
scheduler at some point which will likely change the position anyway.

llvm-svn: 343804
2018-10-04 18:44:58 +00:00
Craig Topper 08ae6774eb [LegalizeIntegerTypes] Fix typo in comment. NFC
llvm-svn: 343750
2018-10-04 02:40:35 +00:00
Daniel Sanders 1b493739e0 [machineverifier] Detect PHI's that are preceeded by non-PHI's
If present, PHI nodes must appear before non-PHI nodes in a basic block. The
register allocator relies on this and will fail to eliminate PHI's that do not
meet this requirement.

llvm-svn: 343731
2018-10-03 22:05:31 +00:00
Heejin Ahn 9d224346b2 Make meanings of variables clearer in action table generation (NFC)
Summary:

Reviewers: kristina, zhmu, dschuff, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52680

llvm-svn: 343724
2018-10-03 21:30:15 +00:00
Matthew Voss f8ab35a4f4 Emit template type and value parameter DIEs for template variables.
Summary:
Ensure the TemplateParam attribute of the DIGlobalVariable node is translated into the proper DIEs.

Resolves https://bugs.llvm.org/show_bug.cgi?id=22119

Reviewers: dblaikie, probinson, aprantl, JDevlieghere, clayborg, whitequark, deadalnix

Reviewed By: dblaikie

Subscribers: llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D52057

llvm-svn: 343706
2018-10-03 18:44:53 +00:00
Daniel Sanders 10dedc00d0 Correct implementation of -verify-machineinstrs such that it's still overridable for EXPENSIVE_CHECKS
-verify-machineinstrs was implemented as a simple bool. As a result, the
'VerifyMachineCode == cl::BOU_UNSET' used by EXPENSIVE_CHECKS to make it on by
default but possible to disable didn't work as intended. Changed
-verify-machineinstrs to a boolOrDefault to correct this.

llvm-svn: 343696
2018-10-03 16:29:24 +00:00
Daniel Sanders fb9b99b26e [globalisel][combines] Don't sink G_TRUNC down to use if that use is a G_PHI
This fixes a problem where the register allocator fails to eliminate a PHI
because there's a non-PHI in the middle of the PHI instructions at the start
of a BB.

This G_TRUNC can be better placed but this at least fixes the correctness issue
quickly. I'll follow up with a patch to the verifier to catch this kind of bug
in future.

llvm-svn: 343693
2018-10-03 15:43:39 +00:00
Jonas Paulsson fb3a97bec0 [RA CopyHints] Fix compile-time regression
This patch makes sure that a register is only hinted once to RA. In extreme
cases the same register can otherwise be hinted numerous times and cause a
compile time slowdown.

Review: Simon Pilgrim
https://reviews.llvm.org/D52826

llvm-svn: 343686
2018-10-03 12:51:19 +00:00
Jonas Toth 602e3a640f [CodeGen] NFC fix pedantic warning from extra semicolon
llvm-svn: 343674
2018-10-03 10:59:19 +00:00
Daniel Sanders c973ad1878 Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

The previous commit failed portions of the test-suite on GreenDragon due to
duplicate COPY instructions and iterator invalidation. Both issues have now
been fixed. To assist with this, a helper (cloneVirtualRegister) has been added
to MachineRegisterInfo that can be used to get another register that has the same
type and class/bank as an existing one.

llvm-svn: 343654
2018-10-03 02:12:17 +00:00
Aaron Smith da0602c154 [CodeView] Only add the Scoped flag for an enum type when it has an immediate function scope to match MSVC
Reviewers: rnk, zturner, llvm-commits

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D52706

llvm-svn: 343627
2018-10-02 20:28:15 +00:00
Aaron Smith 802b033d78 [CodeView] Emit function options for subprogram and member functions
Summary:
Use the newly added DebugInfo (DI) Trivial flag, which indicates if a C++ record is trivial or not, to determine Codeview::FunctionOptions.

Clang and MSVC generate slightly different Codeview for C++ records. For example, here is the C++ code for a class with a defaulted ctor,

       class C {
       public:
         C() = default;
       };

Clang will produce a LF for the defaulted ctor while MSVC does not. For more details, refer to FIXMEs in the test cases in "function-options.ll" included with this set of changes.


Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov

Reviewed By: rnk

Subscribers: Hui, JDevlieghere

Differential Revision: https://reviews.llvm.org/D45123

llvm-svn: 343626
2018-10-02 20:21:05 +00:00
Daniel Sanders 74de21d06f [globalisel][verifier] Run the MachineVerifier from IRTranslator onwards
-verify-machineinstrs inserts the MachineVerifier after every MachineInstr-based
pass. However, GlobalISel creates MachineInstr-based passes earlier than DAGISel
and the corresponding verifiers are not being added. This patch fixes that.

If GlobalISel triggers the fallback path then the MIR can be left in a bad
state that is going to be cleared by ResetMachineFunctions. In this situation
verifying between GlobalISel passes will prevent the fallback path from
recovering from this. As a result, we bail out of verifying a function if the
FailedISel attribute is present.

llvm-svn: 343613
2018-10-02 17:56:58 +00:00
Reid Kleckner d5e4ec74e3 [codeview] Fix 32-bit x86 variable locations in realigned stack frames
Add the .cv_fpo_stackalign directive so that we can define $T0, or the
VFRAME virtual register, with it. This was overlooked in the initial
implementation because unlike MSVC, we push CSRs before allocating stack
space, so this value is only needed to describe local variable
locations. Variables that the compiler now addresses via ESP are instead
described as being stored at offsets from VFRAME, which for us is ESP
after alignment in the prologue.

This adds tests that show that we use the VFRAME register properly in
our S_DEFRANGE records, and that we emit the correct FPO data to define
it.

Fixes PR38857

llvm-svn: 343603
2018-10-02 16:43:52 +00:00
Daniel Sanders 33f42f97af Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
There's a strange assertion on two of the Green Dragon bots that goes away when
this is reverted. The assertion is in RegBankAlloc and if it is this commit then
-verify-machine-instrs should have caught it earlier in the pipeline.

llvm-svn: 343546
2018-10-01 22:32:08 +00:00
Reid Kleckner 8d7c421a70 [codeview] Simplify S_DEFRANGE emission code, NFC
These assembler directives are still pretty unreadable and it would be
nice to clean them up at some point.

llvm-svn: 343544
2018-10-01 22:25:49 +00:00
Reid Kleckner 9ea2c01264 [codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
Summary:
Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit.  Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Reviewers: hans, zturner, javed.absar

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D52217

llvm-svn: 343543
2018-10-01 21:59:45 +00:00
Reid Kleckner 7a6966ec27 Fix the Windows build in GlobalISel
Clang-cl was complaining about some sort of constexpr narrowing bug:

C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  error: non-constant-expression cannot be narrowed from type 'llvm::TargetOpcode::(anonymous enum at C:\src\llvm-project\llvm\include\llvm/CodeGen/TargetOpcodes.h:22:1)' to 'unsigned int' in initializer list [-Wc++11-narrowing]
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  note: insert an explicit cast to silence this issue
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              static_cast<unsigned int>(

llvm-svn: 343541
2018-10-01 21:39:39 +00:00
Daniel Sanders 9659bfda5a [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 343521
2018-10-01 18:56:47 +00:00
Matthias Braun 7159daa68e MIRParser: Check that instructions only reference DILocation metadata
llvm-svn: 343505
2018-10-01 17:50:52 +00:00
Matthias Braun 004fe6bf83 DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types
This fixes a case of bad index calculation when merging mismatching
vector types. This changes the existing code to just use the existing
extract_{subvector|element} and a bitcast (instead of bitcast first and
then newly created extract_xxx) so we don't need to adjust any indices
in the first place.

rdar://44584718

Differential Revision: https://reviews.llvm.org/D52681

llvm-svn: 343493
2018-10-01 16:25:50 +00:00
Carlos Alberto Enciso 81d8ef2196 [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.
When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed.  It causes the debugger to display wrong variable value.

Differential Revision: https://reviews.llvm.org/D52614

llvm-svn: 343445
2018-10-01 08:14:44 +00:00
Fangrui Song 3507c6e884 Use the container form llvm::sort(C, ...)
There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)

llvm-svn: 343426
2018-09-30 22:31:29 +00:00
Bjorn Pettersson c2fc53ac90 [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF
Summary:
The lowering of PHI nodes used to detect if all inputs originated
from IMPLICIT_DEF's. If so the PHI node was replaced by an
IMPLICIT_DEF. Now we also consider undef uses when checking the
inputs. So if all inputs are implicitly defined or undef we
lower the PHI to an IMPLICIT_DEF. This makes
PHIElimination::LowerPHINode more consistent as it checks
both implicit and undef properties at later stages.

Reviewers: MatzeB, tstellar

Reviewed By: MatzeB

Subscribers: jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52558

llvm-svn: 343417
2018-09-30 17:26:58 +00:00
Bjorn Pettersson 4af7f57bdf [PHIElimination] Update the regression test for PR16508
Summary:
When PR16508 was solved (in rL185363) a regression test was
added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll.
I discovered that the test case no longer reproduced the
scenario from PR16508. This problem could have been amended
by adding an extra RUN line with "-O1" (or possibly "-O0"),
but instead I added a mir-reproducer
  test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir
to get a reproducer that is less sensitive to changes in
earlier passes (including O-level).

While being at it I also corrected a code comment in
PHIElimination::EliminatePHINodes that has been incorrect
since the related bugfix from rL185363.

Reviewers: MatzeB, hfinkel

Reviewed By: MatzeB

Subscribers: nemanjai, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52553

llvm-svn: 343416
2018-09-30 17:23:21 +00:00
Simon Pilgrim 818cfc40ff [DAG] Don't perform SINT_TO_FP<->UINT_TO_FP custom conversion after legalization
The SINT_TO_FP<->UINT_TO_FP combines for non-negative integers should only occur for legal ops once LegalOperations = true

No test case to hand, noticed when investigating PR38226 + PR38970

llvm-svn: 343405
2018-09-30 12:46:42 +00:00
Heejin Ahn 5e174e7474 Fix comment indentation in addLandingPad
rL343018 messed up the comment indentation while moving it.

llvm-svn: 343371
2018-09-29 09:22:25 +00:00
Heejin Ahn ec3d65b870 [WebAssembly] Fix memory leak on WasmEHFuncInfo
Summary: WasmEHFuncInfo objects were not being properly deleted.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52582

llvm-svn: 343362
2018-09-28 20:54:04 +00:00
David Bolvansky 8e90bad63d [DAGCombiner] [NFC] Improve X div/rem 1 fold
Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52661

llvm-svn: 343349
2018-09-28 18:40:30 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Hiroshi Inoue 69bfa40200 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Craig Topper bb50c38635 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Craig Topper fdf4c76ca0 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper 8b4f0e1b8c [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
Its possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.

llvm-svn: 343278
2018-09-27 22:31:42 +00:00
Craig Topper 10ec021621 [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
cast will take care of asserting internally.

llvm-svn: 343277
2018-09-27 22:31:40 +00:00
Craig Topper 6911bfe263 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper 7d234d6628 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper dfc0f289fa [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper dfe460db57 [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
llvm-svn: 343268
2018-09-27 21:28:41 +00:00
Craig Topper 49dad8b8af [ScalarizeMaskedMemIntrin] Cleanup comments. NFC
llvm-svn: 343267
2018-09-27 21:28:39 +00:00
Craig Topper 0423681d4a [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Luke Cheeseman f6844b307a Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
2018-09-27 10:39:20 +00:00
Hans Wennborg 5008d9014b Revert r342942 "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units"
It seems to have broken several targets, see comments on the llvm-commits thread.

> Change the copy tracker to keep a single map of register units instead
> of 3 maps of registers. This gives a very significant compile time
> performance improvement to the pass. I measured a 30-40% decrease in
> time spent in MCP on x86 and AArch64 and much more significant
> improvements on out of tree targets with more registers.
>
> Differential Revision: https://reviews.llvm.org/D52374

llvm-svn: 343189
2018-09-27 09:59:27 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Simon Pilgrim b0189289bf [DAG] SelectionDAGLegalize::ExpandLegalINT_TO_FP - use getFPExtendOrRound helper. NFCI.
Handles SrcVT == DstVT as well.

llvm-svn: 343121
2018-09-26 16:24:07 +00:00
Luke Cheeseman 77aaa22081 Revert r343112 as CallFrameString API change has broken lldb builds
llvm-svn: 343114
2018-09-26 14:48:03 +00:00
Luke Cheeseman 03ad8812f5 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll

llvm-svn: 343112
2018-09-26 14:30:29 +00:00
Francis Visoiu Mistrih 6acaa18afc [CodeGen] Always print register ties in MI::dump()
It was the case when calling MO::dump(), but MI::dump() was still
depending on hasComplexRegisterTies().

The MIR output is not affected.

llvm-svn: 343107
2018-09-26 13:33:09 +00:00
Hans Wennborg 00b88bbcaf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343103
2018-09-26 12:57:45 +00:00
Simon Pilgrim e2437689a8 [DAG] ExpandLegalINT_TO_FP - pull out repeated getValueType() call. NFCI.
llvm-svn: 343101
2018-09-26 12:42:19 +00:00
David Green 353cb3d4e5 [CodeGen] Enable tail calls for functions with NonNull attributes.
Adding NonNull as attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no affect on the call sequence.

Differential Revision: https://reviews.llvm.org/D52238

llvm-svn: 343091
2018-09-26 10:46:18 +00:00
Yury Gribov 67572004df Fixes removal of dead elements from PressureDiff (PR37252).
Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51495

llvm-svn: 343090
2018-09-26 10:42:41 +00:00
Luke Cheeseman f755e687fc [AArch64] - Return address signing dwarf support
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
  state the is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
  i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
  (0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
  https://patchwork.ozlabs.org/patch/800271/

Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343089
2018-09-26 10:14:15 +00:00
Mikael Nilsson 9c8e35174e Run VerifyDAGDiverence in debug only
VerifyDAGDiverence costs compilation time, avoid running it in non-debug
builds.

Differential Revision: https://reviews.llvm.org/D52454

llvm-svn: 343086
2018-09-26 09:25:45 +00:00
Mikael Holmen e4d61182cd Silence compiler warning about unused variable introduced in r343018
Since the body of the "else if" contains
 // TODO
I suppose someone will need the variable again at some point, but with
-Werror the warning made it not compile at all.

llvm-svn: 343071
2018-09-26 06:19:08 +00:00
Hsiangkai Wang 55321d82bd [DebugInfo] Do not generate address info for removed debug labels.
In some senario, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In the case, these debug labels will have
address zero as their address. It is not legal address for debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Fix build failed in BuildBot, clang-stage1-cmake-RA-incremental, on macOS.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 343062
2018-09-26 04:19:23 +00:00
Craig Topper b2a00acb24 [DAGCombiner] Remove unnecessary check for visitSDIVLike/visitUDIVLike returning a UDIVREM or SDIVREM node.
This shouldn't be possible and is a leftover from when we used to recursively call combine here.

llvm-svn: 343049
2018-09-25 23:52:07 +00:00
Heejin Ahn e41be38efd Unify landing pad information adding routines (NFC)
Summary:
We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`,
both of which add landing pad information to populate `LandingPadInfo`
but are called from different locations, which was confusing. This patch
unifies them with one `MachineFunction::addLandingPad` function, which
now has functionlities of both functions.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52428

llvm-svn: 343018
2018-09-25 19:56:44 +00:00
Sanjay Patel 10c11b867a [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449)
This is the final (I hope!) problem pattern mentioned in PR37749:
https://bugs.llvm.org/show_bug.cgi?id=37749

We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. 
We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like 
extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches 
that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op.

The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, 
we have this vector-type-legalized sequence:

        t29: v8i32 = concat_vectors t27, t28
      t30: v4i64 = bitcast t29
        t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ...
      t31: v4i64 = bitcast t18
    t32: v4i64 = xor t30, t31
      t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ...
    t34: v4i64 = bitcast t9
  t35: v4i64 = and t32, t34
t36: v8i32 = bitcast t35
      t37: v4i32 = extract_subvector t36, Constant:i64<0>
      t38: v4i32 = extract_subvector t36, Constant:i64<4>

Differential Revision: https://reviews.llvm.org/D52318

llvm-svn: 343008
2018-09-25 19:09:34 +00:00
Daniil Fukalov 349b5943b4 [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled
For the AMDGPU target if a MBB contains exec mask restore preamble, SplitEditor may get state when it cannot insert a spill instruction.

E.g. for a MIR

bb.100:
    %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec
and if the regalloc will try to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert spill instruction before the S_OR_SAVEEXEC_B64 instruction.
But it is not possible since can generate incorrect code in terms of exec mask.

The change makes regalloc to ignore such physreg candidates.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D52052

llvm-svn: 343004
2018-09-25 18:37:38 +00:00
Justin Bogner ef2ae740c6 Revert "[DebugInfo] Do not generate address info for removed debug labels."
The added test is failing on macOS:

  http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/

This reverts r342943.

llvm-svn: 342993
2018-09-25 17:29:30 +00:00
Nirav Dave a2f514d672 [LegalizeDAG] Prune Predecessor check in ExpandExtractFromVectorThroughStack. NFCI.
llvm-svn: 342985
2018-09-25 15:29:57 +00:00
Nirav Dave f445a67be4 [DAGCombine] Improve Predecessor check in SimplifySelectOps. NFCI.
Reuse search space bookkeeping across multiple predecessor checks
qdone to avoid redundancy. This should cut search cost by ~4x.

llvm-svn: 342984
2018-09-25 15:29:30 +00:00
Nirav Dave 7373d5e646 [DAGCombine] Share predecessor bookkeeping in CombineToPostIndexedLoadStore. NFCI.
llvm-svn: 342983
2018-09-25 15:29:04 +00:00
Nirav Dave 46ab89a0d0 [DAGCombine] Don't fold dependent loads across SELECT_CC.
DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC
after the select, resulting in a select of an address and a single
load after.

If either of the loads depend on the other, this is not legal as it
could introduce cycles. However, it only checked this if the opcode
was a SELECT, and not for a SELECT_CC.

Unfortunately, the only reproducer I have for this is for our
downstream target. I've tried getting it to trigger on an upstream one
but haven't been successful.

Patch thanks to Bevin Hansson.

llvm-svn: 342980
2018-09-25 14:43:05 +00:00
Fangrui Song 10a2162588 Use unique_ptr to hold AsmInfo,MRI,MII,STI
Reviewers: pcc, dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52389

llvm-svn: 342945
2018-09-25 06:19:31 +00:00
Mikael Holmen adf5e0d91d Use TRI->regsOverlap() in MachineBasicBlock::computeRegisterLiveness
Summary:
For the loop that used MCRegAliasIterator this should be NFC.

For the loop that previously used MCSubRegIterator we should
now detect more cases where the register is actually live out that
we previously missed.

Reviewers: MatzeB, arsenm

Reviewed By: MatzeB

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D52410

llvm-svn: 342944
2018-09-25 06:10:04 +00:00
Hsiangkai Wang 9c2463622d [DebugInfo] Do not generate address info for removed debug labels.
In some senario, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In the case, these debug labels will have
address zero as their address. It is not legal address for debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 342943
2018-09-25 06:09:50 +00:00
Justin Bogner e152483623 [MachineCopyPropagation] Reimplement CopyTracker in terms of register units
Change the copy tracker to keep a single map of register units instead
of 3 maps of registers. This gives a very significant compile time
performance improvement to the pass. I measured a 30-40% decrease in
time spent in MCP on x86 and AArch64 and much more significant
improvements on out of tree targets with more registers.

Differential Revision: https://reviews.llvm.org/D52374

llvm-svn: 342942
2018-09-25 05:16:44 +00:00
Justin Bogner db02d3d4b3 [MachineCopyPropagation] Rework how we manage RegMask clobbers
Instead of updating the CopyTracker's maps each time we come across a
RegMask, defer checking for this kind of interference until we're
actually trying to propagate a copy. This avoids the need to
repeatedly iterate over maps in the cases where we don't end up doing
any work.

This is a slight compile time improvement for MachineCopyPropagation
as is, but it also enables a much bigger improvement that I'll follow
up with soon.

Differential Revision: https://reviews.llvm.org/D52370

llvm-svn: 342940
2018-09-25 04:45:25 +00:00
Fedor Sergeev 662e5686fe [New PM][PassInstrumentation] IR printing support for New Pass Manager
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.

- PrintIR routines implement printing callbacks.

- StandardInstrumentations class provides a central place to manage all
  the "standard" in-tree pass instrumentations. Currently it registers
  PrintIR callbacks.

Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923

llvm-svn: 342896
2018-09-24 16:08:15 +00:00
Sanjay Patel 2c901742ca [DAGCombiner] use UADDO to optimize saturated unsigned add
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for 
use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively 
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in 
CGP after D8889 without checking target lowering to see if the op is supported. 
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with 
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned 
saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only 
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title 
(unlike x86 which sees improvements for all sizes because all sizes are 'custom'). 
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. 
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, 
so this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom 
check altogether because we're assuming that a UADDO sequence is canonical/optimal 
before we ever reach here. But that seems like a bug to me. If the target doesn't 
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining 
using a UADDO node. This is similar justification for why we don't canonicalize IR to 
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first 
place.

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886
2018-09-24 14:47:15 +00:00
Hans Wennborg 83d15dfe2d Remove debug printf leftover from r342397
llvm-svn: 342863
2018-09-24 08:18:47 +00:00
Craig Topper 5bef27e808 [DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR
This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.

llvm-svn: 342856
2018-09-24 02:03:11 +00:00
Craig Topper b3b94a8e8b [DAGCombiner] Clarify a comment. NFC
This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.

llvm-svn: 342851
2018-09-23 21:17:56 +00:00
Craig Topper bec5967176 [LegalizeTypes] Fix bad indentation. NFC
llvm-svn: 342850
2018-09-23 21:17:55 +00:00
Sanjay Patel 0027946915 [DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation
This is an alternative to https://reviews.llvm.org/D37896. We can't decompose 
multiplies generically without a target hook to tell us when it's profitable.

ARM and AArch64 may be able to remove some existing code that overlaps with
this transform.

This extends D52195 and may resolve PR34474: 
https://bugs.llvm.org/show_bug.cgi?id=34474
(still an open question about transforming legal vector multiplies, but we
could open another bug report for those)

llvm-svn: 342844
2018-09-23 18:41:38 +00:00
Craig Topper 81f67f7afb [DAGCombiner] Simplify some code in visitBITCAST. NFCI
llvm-svn: 342826
2018-09-22 23:12:34 +00:00
Craig Topper e79a588cac [DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCI
llvm-svn: 342809
2018-09-22 18:03:14 +00:00
Justin Bogner 45b3ddc5a4 [MachineCopyPropagation] Refactor copy tracking into a class. NFC
This is a bit easier to follow than handling the copy and src maps
directly in the pass, and will make upcoming changes to how this is
done easier to follow.

llvm-svn: 342703
2018-09-21 00:51:04 +00:00
Justin Bogner 927b75dfba [MachineCopyPropagation] Minor clang-formatting. NFC
llvm-svn: 342700
2018-09-21 00:08:33 +00:00
Aditya Nandakumar e5909431b5 Add the ability to register callbacks for removal and insertion of MachineInstrs
https://reviews.llvm.org/D52127

This patch adds the ability to watch for insertions/deletions of
MachineInstructions similar to MachineRegisterInfo.

llvm-svn: 342696
2018-09-20 23:01:56 +00:00
Jessica Paquette b320ca2642 [MachineOutliner][NFC] Don't add MBBs with a size < 2 to the search space
The suffix tree won't ever consider sequences with a length less than 2.

Therefore, we really ought to not even consider them in the first place.

Also add a FIXME explaining that this should be defined in terms of the size
in B of an outlined call versus the size in B of the MBB.

llvm-svn: 342688
2018-09-20 21:53:25 +00:00
Walter Lee f75e803679 [RegAllocGreedy] Fix crash in tryLocalSplit
tryLocalSplit only handles a single use block, but an interval may
have multiple use blocks.  So don't crash in that case.  This fixes
PR38795.

Differential revision: https://reviews.llvm.org/D52277

llvm-svn: 342682
2018-09-20 20:05:57 +00:00
Jessica Paquette cc06a782ba [MachineOutliner][NFC] Move debug info emission to createOutlinedFunction
When you create an outlined function, you know everything you need to know
to decide if debug info should be created. If we emit debug info in
createOutlinedFunction, then we don't need to keep track of every IR function
we create.

llvm-svn: 342677
2018-09-20 18:53:53 +00:00
Sanjay Patel 8a1227ccc8 [SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCI
x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant.
Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. 
No functional change intended.

I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the
helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be 
there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps.

Differential Revision: https://reviews.llvm.org/D52285

llvm-svn: 342669
2018-09-20 17:34:08 +00:00
George Rimar 425f75172f [DWARF] - Emit the correct value for DW_AT_addr_base.
Currently, we emit DW_AT_addr_base that points to the beginning of
the .debug_addr section. That is not correct for the DWARF5 case because address
table contains the header and the attribute should point to the first entry
following the header.

This is currently the reason why LLDB does not work with such executables correctly.
Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D52168

llvm-svn: 342635
2018-09-20 09:17:36 +00:00
Bjorn Pettersson b2154af25f [MachineVerifier] Relax checkLivenessAtDef regarding dead subreg defs
Summary:
Consider an instruction that has multiple defs of the same
vreg, but defining different subregs:
  %7.sub1:rc, dead %7.sub2:rc = inst

Calling checkLivenessAtDef for the live interval associated
with %7 incorrectly reported "live range continues after a
dead def". The live range for %7 has a dead def at the slot
index for "inst" even if the live range continues (given that
there are later uses of %7.sub1).

This patch adjusts MachineVerifier::checkLivenessAtDef
to allow dead subregister definitions, unless we are checking
a subrange (when tracking subregister liveness).

A limitation is that we do not detect the situation when the
live range continues past an instruction that defines the
full virtual register by multiple dead subreg defines.

I also removed some dead code related to physical register
in checkLivenessAtDef. Wwe only call that method for virtual
registers, so I added an assertion instead.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52237

llvm-svn: 342618
2018-09-20 06:59:18 +00:00
Sanjay Patel fdc0de19cb [SelectionDAG] allow vector types with isBitwiseNot()
The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits,
and the test diff in add.ll is from a DAGCombiner transform.

llvm-svn: 342594
2018-09-19 21:48:30 +00:00
Matthias Braun 3136e42039 MachineScheduler: Add -misched-print-dags flag
Add a flag to dump the schedule DAG to the debug stream. This will be
used in upcoming commits to test schedule DAG mutations such as macro
fusion.

llvm-svn: 342589
2018-09-19 20:50:49 +00:00
Michael Berg 894c39f770 Copy utilities updated and added for MI flags
Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path.  Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw.

Reviewers: spatel, wristow, arsenm

Reviewed By: arsenm

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D52006

llvm-svn: 342576
2018-09-19 18:52:08 +00:00
Sanjay Patel 4fd2e2a498 [DAGCombiner][x86] add transform/hook to decompose integer multiply into shift/add
This is an alternative to D37896. I don't see a way to decompose multiplies 
generically without a target hook to tell us when it's profitable. 

ARM and AArch64 may be able to remove some duplicate code that overlaps with 
this transform.

As a first step, we're only getting the most clear wins on the vector examples
requested in PR34474:
https://bugs.llvm.org/show_bug.cgi?id=34474

As noted in the code comment, it's likely that the x86 constraints are tighter
than necessary, but it may not always be a win to replace a pmullw/pmulld.

Differential Revision: https://reviews.llvm.org/D52195

llvm-svn: 342554
2018-09-19 15:57:40 +00:00
Alex Bradbury 79518b02cd [AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR
This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have 
updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they 
will see no functional change. Previously this hook returned bool, but it now 
returns AtomicExpansionKind.

This hook allows targets to select how a given cmpxchg is to be expanded. 
D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic.

See my associated RFC for more info on the motivation for this change 
<http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>.

Differential Revision: https://reviews.llvm.org/D48130

llvm-svn: 342550
2018-09-19 14:51:42 +00:00
Alex Bradbury 21aea51e71 [RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A
Introduce a new RISCVExpandPseudoInsts pass to expand atomic 
pseudo-instructions after register allocation. This is necessary in order to 
ensure that register spills aren't introduced between LL and SC, thus breaking 
the forward progress guarantee for the operation. AArch64 does something 
similar for CmpXchg (though only at O0), and Mips is moving towards this 
approach (see D31287). See also [this mailing list 
post](http://lists.llvm.org/pipermail/llvm-dev/2016-May/099490.html) from 
James Knight, which summarises the issues with lowering to ll/sc in IR or 
pre-RA.

See the [accompanying RFC 
thread](http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html) for an 
overview of the lowering strategy.

Differential Revision: https://reviews.llvm.org/D47882

llvm-svn: 342534
2018-09-19 10:54:22 +00:00
Matthias Braun 726e12cf0c ScheduleDAG: Cleanup dumping code; NFC
- Instead of having both `SUnit::dump(ScheduleDAG*)` and
  `ScheduleDAG::dumpNode(ScheduleDAG*)`, just keep the latter around.
- Add `ScheduleDAG::dump()` and avoid code duplication in several
  places. Implement it for different ScheduleDAG variants.
- Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()`
  functions. They were only ever used for debug dumping and putting the
  function into ScheduleDAG is consistent with the `dumpNode()` change.

llvm-svn: 342520
2018-09-19 00:23:35 +00:00
Krzysztof Parzyszek c1e2f39b35 [PostRASink] Make sure to remove subregisters from live-ins as well
llvm-svn: 342492
2018-09-18 16:10:51 +00:00
Hans Wennborg 01c3154971 Revert r342457 "Fixes removal of dead elements from PressureDiff (PR37252)."
This broke the lit tests on a bunch of buildbots, e.g.
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/36679

> Reviewed By: MatzeB
>
> Differential Revision: https://reviews.llvm.org/D51495

llvm-svn: 342482
2018-09-18 14:12:54 +00:00
John Brawn 83d7414e19 [TargetLowering] Android has sincos functions
Since Android API version 9 the Android libm has had the sincos functions, so
they should be recognised as libcalls and sincos optimisation should be applied.

Differential Revision: https://reviews.llvm.org/D52025

llvm-svn: 342471
2018-09-18 13:18:21 +00:00
Yury Gribov 53db663afb Fixes removal of dead elements from PressureDiff (PR37252).
Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51495

llvm-svn: 342457
2018-09-18 09:53:42 +00:00
Jessica Paquette bd72988c3a [MachineOutliner][NFC] Don't map more illegal instrs than you have to
We were mapping an instruction every time we saw something we couldn't map
before this. Since each illegal mapping is unique, we only have to do this once.

This makes it so that we don't map illegal instructions when the previous
mapped instruction was illegal.

In CTMark (AArch64), this results in 240 fewer instruction mappings on
average over 619 files in total. The largest improvement is 12576 fewer
mappings in one file, and the smallest is 0. The median improvement is 101
fewer mappings.

llvm-svn: 342405
2018-09-17 18:40:21 +00:00
Amara Emerson 91c2913522 Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index.""
Fixed the assertion failure.

llvm-svn: 342397
2018-09-17 14:40:13 +00:00
Kristina Brooks 46c6d3fe75 [DebugInfo] Fix build when std::vector::iterator is a pointer
std::vector::iterator type may be a pointer, then
iterator::value_type fails to compile since iterator is not a class,
namespace, or enumeration.

Patch by orivej (Orivej Desh)

Differential Revision: https://reviews.llvm.org/D52142

llvm-svn: 342354
2018-09-16 22:21:59 +00:00
Sanjay Patel 3eaf500a6d [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x)
This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040.

Copying the motivational statement by @evandro from that patch:
"This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 
447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. 
Otherwise, no regressions on x86-64 or A64."

I'm proposing to add only the minimum support for a DAG node here. Since we don't have an 
LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I 
don't think we need to worry about DAG builder, legalization, a strict variant, etc. We 
should be able to expand as needed when adding more functionality/transforms. For reference, 
these are transform suggestions currently listed in SimplifyLibCalls.cpp:

//   * cbrt(expN(X))  -> expN(x/3)
//   * cbrt(sqrt(x))  -> pow(x,1/6)
//   * cbrt(cbrt(x))  -> pow(x,1/9)

Also, given that we bail out on long double for now, there should not be any logical 
differences between platforms (unless there's some platform out there that has pow()
but not cbrt()).

Differential Revision: https://reviews.llvm.org/D51753

llvm-svn: 342348
2018-09-16 16:50:26 +00:00
Vedant Kumar 1b02dad9f2 [CodeGenPrepare] Preserve debug locs in OptimizeExtractBits
CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it
easier for the backend to emit fancy extract-bits instructions (e.g UBFX).

Teach it to preserve debug locations and salvage debug values.

llvm-svn: 342319
2018-09-15 04:08:52 +00:00
Craig Topper 5692ac6ce7 [BreakFalseDeps] Fix bad formatting. NFC
llvm-svn: 342293
2018-09-14 22:26:09 +00:00
Reid Kleckner b3d456a79e [codeview] Remove dead code
llvm-svn: 342285
2018-09-14 21:14:08 +00:00
Reid Kleckner 4d1b75c6b7 Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."
Causes 'isVector() && "Invalid vector type!"' assertion when building
Skia in Chrome.

llvm-svn: 342265
2018-09-14 19:39:40 +00:00
Adrian Prantl 16f58d1850 Fix debug info for SelectionDAG legalization of DAG nodes with two results.
This patch fixes the debug info handling for SelectionDAG legalization
of DAG nodes with two results. When an replaced SDNode has more than
one result, transferDbgValues was always copying the SDDbgValue from
the first result and attaching them to all members. In reality
SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes
(though the type signature doesn't make this obvious (cf. the call
site code in ReplaceNode()).

rdar://problem/44162227

Differential Revision: https://reviews.llvm.org/D52112

llvm-svn: 342264
2018-09-14 19:38:45 +00:00
Adrian Prantl 66945cf6e3 fix noasserts build
llvm-svn: 342247
2018-09-14 17:32:52 +00:00
Adrian Prantl 55b8756b8a SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose output
llvm-svn: 342245
2018-09-14 17:08:02 +00:00
Adrian Prantl 86497ad2af fix typos
llvm-svn: 342241
2018-09-14 16:12:14 +00:00
Amara Emerson ef600cbd86 [DAGCombine] Fix crash when store merging created an extract_subvector with invalid index.
Differential Revision: https://reviews.llvm.org/D51831

llvm-svn: 342183
2018-09-13 21:28:58 +00:00
Craig Topper 2f88006ced [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand.
INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense.

This patch fixes this by preventing the removal when a flag operand is present.

The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands.

Differential Revision: https://reviews.llvm.org/D51829

llvm-svn: 342176
2018-09-13 20:51:27 +00:00
Matt Arsenault 842cda6312 DAG: Fix expansion of unaligned FP loads and stores
This was trying to scalarizing a scalar FP type,
resulting in an assert.

Fixes unaligned f64 stack stores for AMDGPU.

llvm-svn: 342132
2018-09-13 12:14:23 +00:00
Tim Northover c15d47bb01 ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.
The Technical Reference Manuals for these two CPUs state that branching
to an unaligned 32-bit instruction incurs an extra pipeline reload
penalty. That's bad.

This also enables the optimization at -Os since it costs on average one
byte per loop in return for 1 cycle per iteration, which is pretty good
going.

llvm-svn: 342127
2018-09-13 10:28:05 +00:00
Sanjay Patel 8a478b79dc [DAGCombiner] improve formatting for select+setcc code; NFC
llvm-svn: 342095
2018-09-12 23:03:50 +00:00
David Green e27e87cdcb [CGP] Ensure splitgep gives deterministic output
The output of splitLargeGEPOffsets does not appear to be deterministic because
of the way that we iterate over a DenseMap. I've changed it to a MapVector for
consistent output.

The test here isn't particularly great, only showing a consmetic difference in
output. The original reproducer is much larger but show a diffierence in
instruction ordering, leading to different codegen.

Differential Revision: https://reviews.llvm.org/D51851

llvm-svn: 342043
2018-09-12 10:19:10 +00:00
Craig Topper 26a3799858 [SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles UpdateNodeOperands returning an existing node instead of updating.
I suspect this became unecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and the promote the mask of a separate gather node so they become the same. But that seems pretty unlikely.

llvm-svn: 342022
2018-09-12 05:25:41 +00:00
Jessica Paquette 2386eab360 [MachineOutliner] Add codegen size remarks to the MachineOutliner
Since the outliner is a module pass, it doesn't get codegen size remarks like
the other codegen passes do. This adds size remarks *to* the outliner.

This is kind of a workaround, so it's peppered with FIXMEs; size remarks
really ought to not ever be handled by the pass itself. However, since the
outliner is the only "MachineModulePass", this works for now. Since the
entire purpose of the MachineOutliner is to produce code size savings, it
really ought to be included in codgen size remarks.

If we ever go ahead and make a MachineModulePass (say, something similar to
MachineFunctionPass), then all of this ought to be moved there.

llvm-svn: 342009
2018-09-11 23:05:34 +00:00
Michael Berg c72a7259be add IR flags to MI
Summary: Initial support for nsw, nuw and exact flags in MI

Reviewers: spatel, hfinkel, wristow

Reviewed By: spatel

Subscribers: nlopes

Differential Revision: https://reviews.llvm.org/D51738

llvm-svn: 341996
2018-09-11 21:35:32 +00:00
Josh Stone f446facab0 [GlobalISel] Lower dbg.declare into indirect DBG_VALUE
Summary:
D31439 changed the semantics of dbg.declare to take the address of a
variable as the first argument, making it indirect.  It specifically
updated FastISel for this change here:

https://reviews.llvm.org/D31439#change-WVArzi177jPl

GlobalISel needs to follow suit, or else it will be missing a level of
indirection in the generated debuginfo.  This problem was seen in a Rust
debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64.

https://github.com/rust-lang/rust/issues/49807
https://bugzilla.redhat.com/show_bug.cgi?id=1611597
https://bugzilla.redhat.com/show_bug.cgi?id=1625768

Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk

Reviewed By: rnk

Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar

Differential Revision: https://reviews.llvm.org/D51749

llvm-svn: 341969
2018-09-11 17:52:01 +00:00
Jessica Paquette 050d1ac4a6 [MachineOutliner][NFC] Factor out instruction mapping into its own function
Just some tidy-up. Pull the mapper stuff into `populateMapper`. This makes it
a bit easier to read what's going on in `runOnModule`.

llvm-svn: 341959
2018-09-11 16:33:46 +00:00
Jessica Paquette 54fbfaeace Add size remarks to MachineFunctionPass
This adds per-function size remarks to codegen, similar to what we have in the
IR layer as of r341588. This only impacts MachineFunctionPasses.

This does the same thing, but for `MachineInstr`s instead of just
`Instructions`. After this, when a `MachineFunctionPass` modifies the number of
`MachineInstr`s in the function it ran on, you'll get a remark.

To enable this, use the size-info analysis remark as before.

llvm-svn: 341876
2018-09-10 22:24:10 +00:00
Benjamin Kramer 28559a2605 Don't create a temporary vector of loop blocks just to iterate over them.
Loop's getBlocks returns an ArrayRef.

llvm-svn: 341821
2018-09-10 12:32:06 +00:00
Matt Arsenault 57b5966dad DAG: Handle odd vector sizes in calling conv splitting
This already worked if only one register piece was used,
but didn't if a type was split into multiple, unequal
sized pieces.

Fixes not splitting 3i16/v3f16 into two registers for
AMDGPU.

This will also allow fixing the ABI for 16-bit vectors
in a future commit so that it's the same for all subtargets.

llvm-svn: 341801
2018-09-10 11:49:23 +00:00
Sanjay Patel 6ebf218e4c [SelectionDAG] enhance vector demanded elements to look at a vector select condition operand
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.

The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit 
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed 
to be running there?

Differential Revision: https://reviews.llvm.org/D51696

llvm-svn: 341762
2018-09-09 14:13:22 +00:00
Fangrui Song 58963e4396 Fix typos. NFC
llvm-svn: 341740
2018-09-08 02:04:20 +00:00
Adrian Prantl 609bf36952 Remove addBlockByrefAddress(), it is dead code as far as clang is concerned.
This patch removes addBlockByrefAddress(), it is dead code as far as
clang is concerned: Every byref block capture is emitted with a
complex expression that is equivalent to what this function does.

rdar://problem/31629055

Differential Revision: https://reviews.llvm.org/D51763

llvm-svn: 341737
2018-09-08 00:21:55 +00:00
Reid Kleckner f803b23879 [COFF] Implement llvm.global_ctors priorities for MSVC COFF targets
Summary:
MSVC and LLD sort sections ASCII-betically, so we need to use section
names that sort between .CRT$XCA (the start) and .CRT$XCU (the default
priority).

In the general case, use .CRT$XCT12345 as the section name, and let the
linker sort the zero-padded digits.

Users with low priorities typically want to initialize as early as
possible, so use .CRT$XCA00199 for prioties less than 200. This number
is arbitrary.

Implements PR38552.

Reviewers: majnemer, mstorsjo

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51820

llvm-svn: 341727
2018-09-07 23:07:55 +00:00
David Stenberg 45acc9610b [DebugInfo] Handle stack slot offsets for spilled sub-registers in LDV
Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().

With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:

 * {DW_OP_constu #fp-offset, DW_OP_minus,
    DW_OP_plus_uconst #subreg-offset}

 * {DW_OP_plus_const #fp-offset,
    DW_OP_minus, DW_OP_plus_uconst #subreg-offset}

The two offset operations should ideally be merged.

Reviewers: rnk, aprantl, stoklund

Reviewed By: aprantl

Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51612

llvm-svn: 341659
2018-09-07 13:54:07 +00:00
Simon Pilgrim 96d6b9c2e2 [DAGCombiner] foldBitcastedFPLogic - Add basic vector support
Add support for bitcasts from float type to an integer type of the same element bitwidth.

There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.

llvm-svn: 341652
2018-09-07 12:13:45 +00:00
Sven van Haastregt abe3295cbd Fix argument type in MachineInstr::hasPropertyInBundle
The MCID::Flag enumeration now has more than 32 items, this means that
the hasPropertyBundle argument 'Mask' can overflow.

This patch changes the argument to be 64 bits instead.

Patch by Mikael Nilsson.

Differential Revision: https://reviews.llvm.org/D51596

llvm-svn: 341536
2018-09-06 10:25:59 +00:00
Hsiangkai Wang 760c1ab199 [DebugInfo] Do not generate label debug info if it has been processed.
In DwarfDebug::collectEntityInfo(), if the label entity is processed in
DbgLabels list, it means the label is not optimized out. There is no
need to generate debug info for it with null position.

llvm-svn: 341513
2018-09-06 02:22:06 +00:00
Sanjay Patel dbf52837fe [DAGCombiner] try to convert pow(x, 0.25) to sqrt(sqrt(x))
This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization. 
Here, we only do the transform when the target tells us that sqrt can be lowered with inline code.

This is the basic case. Some potential enhancements are in the TODO comments:

1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper).
2. If we have less fast-math-flags, generate code to avoid -0.0 and/or INF.
3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right).

Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with 
refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty 
of test coverage for that already, so I didn't bother to include extra testing for that here. AArch uses 
its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should 
also be covered by existing tests).

Differential Revision: https://reviews.llvm.org/D51630 

llvm-svn: 341481
2018-09-05 17:01:56 +00:00
Jonas Devlieghere 965b598b2a [DebugInfo] Normalize common kinds of DWARF sub-expressions.
Normalize common kinds of DWARF sub-expressions to make debug info
encoding a bit more compact:

  DW_OP_constu [X < 32] -> DW_OP_litX
  DW_OP_constu [all ones] -> DW_OP_lit0, DW_OP_not (64-bit only)

Differential revision: https://reviews.llvm.org/D51640

llvm-svn: 341457
2018-09-05 10:18:36 +00:00
Sander de Smalen c91b27d9ee Remove FrameAccess struct from hasLoadFromStackSlot
This removes the FrameAccess struct that was added to the interface
in D51537, since the PseudoValue from the MachineMemoryOperand
can be safely casted to a FixedStackPseudoSourceValue.

Reviewers: MatzeB, thegameg, javed.absar

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51617

llvm-svn: 341454
2018-09-05 08:59:50 +00:00
Hsiangkai Wang b2b7f5f6d7 [DebugInfo] Fix bug in LiveDebugVariables.
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().

Differential Revision: https://reviews.llvm.org/D50621

llvm-svn: 341446
2018-09-05 05:58:53 +00:00
Matt Arsenault a6f32f4015 DAG: Factor out helper function for odd vector sizes
llvm-svn: 341392
2018-09-04 18:47:43 +00:00
Scott Linder cab029f474 [CodeGen] Fix remaining zext() assertions in SelectionDAG
Fix remaining cases not committed in https://reviews.llvm.org/D49574

Differential Revision: https://reviews.llvm.org/D50659

llvm-svn: 341380
2018-09-04 16:33:34 +00:00
Matt Arsenault ca25b58957 DAG: Handle extract_vector_elt in isKnownNeverNaN
llvm-svn: 341317
2018-09-03 14:01:03 +00:00
Sander de Smalen 0c78da5132 Fix issue introduced by r341301 that broke buildbot.
A condition in isSpillInstruction() updates a small vector rather
than the 'FI' by-ref parameter, which was used in a subsequent
call to 'isSpillSlotObjectIndex()'. This patch fixes the condition
to check the FIs in the vector instead.

llvm-svn: 341305
2018-09-03 10:23:34 +00:00
Sander de Smalen 6cab60fa06 Extend hasStoreToStackSlot with list of FI accesses.
For instructions that spill/fill to and from multiple frame-indices
in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot
should return an array of accesses, rather than just the first encounter
of such an access.

This better describes FI accesses for AArch64 (paired) LDP/STP
instructions.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51537

llvm-svn: 341301
2018-09-03 09:15:58 +00:00
Hsiangkai Wang e0dcc28a4d Revert "[DebugInfo] Fix bug in LiveDebugVariables."
This reverts commit 8f548ff2a1819e1bc051e8218584f1a3d2cf178a.

buildbot failure in LLVM on clang-ppc64be-linux
http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/19765

llvm-svn: 341290
2018-09-02 16:35:42 +00:00
Hsiangkai Wang 1368434b49 [DebugInfo] Fix bug in LiveDebugVariables.
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().

Differential Revision: https://reviews.llvm.org/D50621

llvm-svn: 341289
2018-09-02 15:57:22 +00:00
Roman Lebedev d7a6244475 [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle inverted pattern
Summary:
A follow-up for D49266 / rL337166 + D49497 / rL338044.

This is still the same pattern to check for the [lack of]
signed truncation, but in this case the constants and the predicate
are negated.

https://rise4fun.com/Alive/BDV
https://rise4fun.com/Alive/n7Z

Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51532

llvm-svn: 341287
2018-09-02 13:56:22 +00:00
Vlad Tsyrklevich 2499aeead9 SafeStack: Prevent OOB reads with mem intrinsics
Summary:
Currently, the SafeStack analysis disallows out-of-bounds writes but not
out-of-bounds reads for mem intrinsics like llvm.memcpy. This could
cause leaks of pointers to the safe stack by leaking spilled registers/
frame pointers. Check for allocas used as source or destination pointers
to mem intrinsics.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: pcc, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D51334

llvm-svn: 341116
2018-08-30 20:44:51 +00:00
Craig Topper 6666861158 [DAGCombiner] Fix bad identation. NFC
llvm-svn: 341103
2018-08-30 19:35:40 +00:00
Vladimir Stefanovic 7e58ebf6b8 Allow inconsistent offsets for 'noreturn' basic blocks when '-verify-cfiinstrs'
With r295105, some 'noreturn' blocks (those that don't return and have no
successors) may be merged.
If such blocks' predecessors have different outgoing offset or register, don't
report an error in CFIInstrInserter verify().

Thanks to Vlad Tsyrklevich for reporting the issue.

Differential Revision: https://reviews.llvm.org/D51161

llvm-svn: 341087
2018-08-30 17:31:38 +00:00
Nicolai Haehnle 35617ed4cb [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Ties Stuij 9c16d809d2 [CodeGen] emit inline asm clobber list warnings for reserved (cont)
Summary:
This is a continuation of https://reviews.llvm.org/D49727
Below the original text, current changes in the comments:

Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.

For example:

  extern int bar(int[]);
  
  int foo(int i) {
    int a[i]; // VLA
    asm volatile(
        "mov r7, #1"
      :
      :
      : "r7"
    );
  
    return 1 + bar(a);
  }

Compiled for thumb, this gives:

  $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
  ...
  foo:
          .fnstart
  @ %bb.0:                                @ %entry
          .save   {r4, r5, r6, r7, lr}
          push    {r4, r5, r6, r7, lr}
          .setfp  r7, sp, #12
          add     r7, sp, #12
          .pad    #4
          sub     sp, #4
          movs    r1, #7
          add.w   r0, r1, r0, lsl #2
          bic     r0, r0, #7
          sub.w   r0, sp, r0
          mov     sp, r0
          @APP
          mov.w   r7, #1
          @NO_APP
          bl      bar
          adds    r0, #1
          sub.w   r4, r7, #12
          mov     sp, r4
          pop     {r4, r5, r6, r7, pc}
  ...

r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.

This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.

The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.

If we find a reserved register, we print a warning:

  repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
        "mov r7, #1"
        ^

Reviewers: efriedma, olista01, javed.absar

Reviewed By: efriedma

Subscribers: eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D51165

llvm-svn: 341062
2018-08-30 12:52:35 +00:00
Matt Arsenault f2edba8e43 Don't count debug instructions towards neighborhood count
In computeRegisterLiveness, the max instructions to search
was counting dbg_value instructions, which could potentially
cause an observable codegen change from the presence of debug
info.

llvm-svn: 341028
2018-08-30 07:18:19 +00:00
Matt Arsenault 015a147c9f CodeGen: Make computeRegisterLiveness search forward first
If there is an unused def, this would previously
report that the register was live. Check for uses
first so that it is reported as dead if never used.

llvm-svn: 341027
2018-08-30 07:18:10 +00:00
Matt Arsenault eba9e9a266 CodeGen: Make computeRegisterLiveness consider successors
If the end of the block is reached during the scan, check
the live ins of the successors. This was already done in the
other direction if the block entry was reached.

llvm-svn: 341026
2018-08-30 07:17:51 +00:00
Carlos Alberto Enciso 06adfa1718 [DWARF] Missing location debug information with -O2.
Check that Machine CSE correctly handles during the transformation, the
debug location information for local variables.

Differential Revision: https://reviews.llvm.org/D50887

llvm-svn: 341025
2018-08-30 07:17:41 +00:00
Matt Arsenault 167601e629 DAG: Don't use ABI copies in some contexts
If an ABI-like value is used in a different block,
the type split used is not necessarily the same as
the call's ABI. The value is used through an intermediate
copy virtual registers from the other block. This
resulted in copies with inconsistent sizes later.

Fixes regressions since r338197 when AMDGPU started
splitting vector types for calls.

llvm-svn: 341018
2018-08-30 05:49:28 +00:00
Huihui Zhang 2f4106592d [GlobalMerge] Fix GlobalMerge on bss external global variables.
Summary:
Global variables that are external and zero initialized are
supposed to be merged with global variables in the bss section
rather than the data section.

Reviewers: efriedma, rengolin, t.p.northover, javed.absar, asl, john.brawn, pcc

Reviewed By: efriedma

Subscribers: dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D51379

llvm-svn: 341008
2018-08-30 00:49:50 +00:00
Eli Friedman 3769639335 [NFC] Make getPreferredAlignment honor section markings.
This should more accurately reflect what the AsmPrinter will actually
do.

This is NFC, as far as I can tell; all the places that might be affected
already have an extra check to avoid using the result of
getPreferredAlignment in this situation.

Differential Revision: https://reviews.llvm.org/D51377

llvm-svn: 340999
2018-08-29 23:46:26 +00:00
Matthias Braun b7b5860657 Reverse subregister saved loops in register usage info collector; NFC
On AMDGPU we have 70 register classes, so iterating over all 70
each time and exiting is costly on the CPU, this flips the loop
around so that it loops over the 70 register classes first,
and exits without doing the inner loop if needed.

On my test just starting radv this takes
RegUsageInfoCollector::runOnMachineFunction
from 6.0% of total time to 2.7% of total time,
and reduces the startup from 2.24s to 2.19s

Patch by David Airlie!

Differential Revision: https://reviews.llvm.org/D48582

llvm-svn: 340993
2018-08-29 23:12:42 +00:00
Martin Storsjo 489993db94 [MinGW] [X86] Add stubs for references to data variables that might end up imported from a dll
Variables declared with the dllimport attribute are accessed via a
stub variable named __imp_<var>. In MinGW configurations, variables that
aren't declared with a dllimport attribute might still end up imported
from another DLL with runtime pseudo relocs.

For x86_64, this avoids the risk that the target is out of range
for a 32 bit PC relative reference, in case the target DLL is loaded
further than 4 GB from the reference. It also avoids having to make the
text section writable at runtime when doing the runtime fixups, which
makes it worthwhile to do for i386 as well.

Add stub variables for all dso local data references where a definition
of the variable isn't visible within the module, since the DLL data
autoimporting might make them imported even though they are marked as
dso local within LLVM.

Don't do this for variables that actually are defined within the same
module, since we then know for sure that it actually is dso local.

Don't do this for references to functions, since there's no need for
runtime pseudo relocations for autoimporting them; if a function from
a different DLL is called without the appropriate dllimport attribute,
the call just gets routed via a thunk instead.

GCC does something similar since 4.9 (when compiling with -mcmodel=medium
or large; from that version, medium is the default code model for x86_64
mingw), but only for x86_64.

Differential Revision: https://reviews.llvm.org/D51288

llvm-svn: 340942
2018-08-29 17:28:34 +00:00
Simon Pilgrim b49d5f3b53 [DAGCombiner] Add X / X -> 1 & X % X -> 0 folds
Adds more divrem folds to try and get in sync with InstructionSimplify

Differential Revision: https://reviews.llvm.org/D50636

llvm-svn: 340919
2018-08-29 11:30:16 +00:00
George Rimar 9fbecc97ae Revert r340904 "[llvm-mc] - Allow to set custom flags for debug sections."
It broke PPC64 BB:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/23252

llvm-svn: 340906
2018-08-29 09:04:52 +00:00
George Rimar 999d1ce517 [llvm-mc] - Allow to set custom flags for debug sections.
I am experimenting with a single split dwarf (.dwo sections in .o files).
I want to make linker to ignore .dwo sections in .o, for that I am trying to add
SHF_EXCLUDE flag ("E") for them in my asm sample.

I found that currently, it is impossible to add any flag for debug sections using llvm-mc.

That happens because we have a set of predefined unique sections created early with default flags:
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCObjectFileInfo.cpp#L391

This patch allows a user to add any flags he wants.

I had to edit TargetLoweringObjectFileImpl.cpp to set MetaData type for debug sections.
Their kind was Data by default (so they were allocatable) and so after changes introduced by
this patch the SHF_ALLOC flag was applied for them, what does not make sense for debug sections.
One of OrcJITTests tests failed because of that.

Differential revision: https://reviews.llvm.org/D51361

llvm-svn: 340904
2018-08-29 08:42:02 +00:00
Aditya Nandakumar 6d47a41520 [GISel]: Add legalization support for Widening UADDO/USUBO
https://reviews.llvm.org/D51384

Added code in LegalizerHelper to widen UADDO/USUBO along with unit
tests.

Reviewed by volkan.

llvm-svn: 340892
2018-08-29 03:17:08 +00:00
Craig Topper 9f42726cc7 [X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization
Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp

Reviewers: RKSimon, spatel, efriedma

Reviewed By: efriedma

Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D51337

llvm-svn: 340891
2018-08-29 02:12:49 +00:00
Aditya Nandakumar 6b4d343e13 [GISel]: Add missing opcodes for overflow intrinsics
https://reviews.llvm.org/D51197

Currently, IRTranslator (and GISel) seems to be arbitrarily picking
which overflow intrinsics get mapped into opcodes which either have a
carry as an input or not.
For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an
opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM
IR).

This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO,
G_SSUBE and G_SADDE.

llvm-svn: 340865
2018-08-28 18:54:10 +00:00
Nirav Dave 11e39fb6fb [DAGCombine] Rework MERGE_VALUES to inline in single pass. NFCI.
Avoid hyperlinear cost of inlining MERGE_VALUE node by constructing
temporary vector and doing a single replacement.

llvm-svn: 340853
2018-08-28 18:13:26 +00:00
Nirav Dave 113f2b9058 [DAG] Avoid recomputing Divergence checks. NFCI.
When making multiple updates to the same SDNode, recompute node
divergence only once after all changes have been made.

llvm-svn: 340852
2018-08-28 18:13:00 +00:00
Nirav Dave 0b8cb46e0b [DAG] Fix updateDivergence calculation
Check correct SDNode when deciding if we should update the divergence
property.

llvm-svn: 340851
2018-08-28 18:12:35 +00:00
Craig Topper c7506b28c1 [DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target.
Summary:
I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there...

I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))).

Reviewers: efriedma, atanasyan, arsenm

Reviewed By: efriedma

Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50491

llvm-svn: 340797
2018-08-28 03:47:20 +00:00
David Blaikie 7d30653259 Revert "[CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC)"
This causes crashes due to the interleaved dbg.value intrinsics being
left at the end of basic blocks, causing the actual terminators (br,
etc) to be not where they should be (not at the end of the block),
leading to later crashes.

Further discussion on the original commit thread.

This reverts commit r340368.

llvm-svn: 340794
2018-08-28 00:55:19 +00:00
Brendon Cahoon e3841eea87 [Pipeliner] Fix incorrect phi values in the epilog and kernel
The code that generates the loop definition operand for phis
in the epilog and kernel is incorrect in some cases.

In the kernel, when a phi refers to another phi, the code that
updates PhiOp2 needs to include the stage difference between
the two phis.

In the epilog, the check for using the loop definition instead
of the phi definition uses the StageDiffAdj value (the difference
between the phi stage and the loop definition stage), but the
adjustment is not needed to determine if the current stage
contains an iteration with the loop definition.

Differential Revision: https://reviews.llvm.org/D51167

llvm-svn: 340782
2018-08-27 22:04:50 +00:00
Matt Arsenault cea7c6969d DAG: Check transformed type for forming fminnum/fmaxnum from vselect
Follow up to r340655 to fix vector types which are split.

llvm-svn: 340766
2018-08-27 18:11:31 +00:00
Matt Arsenault 9eb3dda0b2 MachineVerifier: Fix assert on implicit virtreg use
If the liveness of a physical register was invalid, this
was attempting to iterate the subregisters of all register
uses of the instruction, which would assert when it
encountered an implicit virtual register operand.

llvm-svn: 340763
2018-08-27 17:40:09 +00:00
Sanjay Patel f645927875 [SelectionDAG] add helper query for binops; NFC
We will also use this in a planned enhancement for vector insertelement.

llvm-svn: 340741
2018-08-27 14:20:15 +00:00
Sanjay Patel 113cac3b15 [SelectionDAG][x86] turn insertelement into undef with variable index into splat
I noticed this along with the patterns in D51125, but when the index is variable, 
we don't convert insertelement into a build_vector.

For x86, that means these get expanded at legalization time into the loading/spilling 
code that we see in the tests. I think it's always better to avoid going to memory on 
these, and we get the optimal 'broadcast' if it's available.

I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have 
regression tests that would be affected (although I did not check what would happen in 
those cases). In the most basic cases shown here, AArch64 would probably do much 
better with a splat.

Differential Revision: https://reviews.llvm.org/D51186

llvm-svn: 340705
2018-08-26 18:20:41 +00:00
Chandler Carruth 9ae926b973 [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

llvm-svn: 340701
2018-08-26 09:51:22 +00:00
Chandler Carruth 96fc1de77d [IR] Begin removal of TerminatorInst by removing successor manipulation.
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

llvm-svn: 340698
2018-08-26 08:41:15 +00:00
Craig Topper a11a3b3818 [SelectionDAG][X86] Reorder the operands the MaskedStoreSDNode to put the value first.
Summary:
Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions.

This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly.

X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree.

Reviewers: RKSimon, delena, hfinkel, eli.friedman

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50402

llvm-svn: 340689
2018-08-25 17:48:17 +00:00
Bjorn Pettersson 7ded6a909b [CodeGen] Set FrameSetup/FrameDestroy on BUNDLE instructions
Summary:
If any of the bundled instructions are marked as FrameSetup
or FrameDestroy, then that property is set on the BUNDLE
instruction as well.

As long as the scheduler/packetizer aren't mixing
prologue/epilogue instructions (i.e. all the bundled
instructions have the same property) then this simply gives
the bundle the correct property (so when using a bundle
iterator in late passes a bundle will be correctly identified
as FrameSetup/FrameDestroy).

When for example bundling a mix of FrameSetup instructions
with non-FrameSetup instructions it could be discussed if
the bundle should have the property or not. The choice here
has been to set these properties on the BUNDLE instruction if
any of the bundled instructions have the property set.

Reviewers: #debug-info, kparzysz

Reviewed By: kparzysz

Subscribers: vsk, thegameg, llvm-commits

Differential Revision: https://reviews.llvm.org/D50637

llvm-svn: 340680
2018-08-25 11:26:17 +00:00
Bjorn Pettersson 8483004723 [LiveDebugVariables] Avoid faulty addDefsFromCopies in computeIntervals
Summary:
When computeIntervals is looking through COPY instruction to
extend the location mapping for a debug variable it did not
handle subregisters correctly.

For example
    DBG_VALUE debug-use %0.sub_8bit_hi, ...
    %1:gr16 = COPY %0
was transformed into
    DBG_VALUE debug-use %0.sub_8bit_hi, ...
    %1:gr16 = COPY %0
    DBG_VALUE debug-use %1, ...
So the subregister index was missing in the added DBG_VALUE.

As long as the subreg refered to the least significant bits
of the superreg, then I guess we could get the correct
result in a debugger even when referring to the superreg.
But as in the example above when the subreg refers to other
parts of the superreg, then debuginfo would be incorrect.

I'm not sure exactly how to fix this properly, so this patch
just avoids looking through the COPY when there is a subreg
involved (for more info, see the FIXME added in the code).

Reviewers: rnk, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D50788

llvm-svn: 340679
2018-08-25 10:02:03 +00:00
Matt Arsenault 5b9ef39bdd DAG: Allow matching fminnum/fmaxnum from vselect
llvm-svn: 340655
2018-08-24 21:24:18 +00:00
Eli Friedman 59de37ba6c [SafeStack] Set debug location for calls to __safestack_pointer_address.
Otherwise, the debug info is incorrect.  On its own, this is mostly
harmless, but the safe-stack also later inlines the call to
__safestack_pointer_address, which leads to debug info with the wrong
scope, which eventually causes an assertion failure (and incorrect debug
info in release mode).

Differential Revision: https://reviews.llvm.org/D51075

llvm-svn: 340651
2018-08-24 20:42:32 +00:00
Peter Collingbourne 3f792230cb CodeGen: Add two more conditions for adding symbols to the address-significance table.
Firstly, require the symbol to be used within the module. If a
symbol is unused within a module, then by definition it cannot be
address-significant within that module. This condition is useful on all
platforms because it could make symbol tables smaller -- without this
change, emitting an address-significance table could cause otherwise
unused undefined symbols to be added to the object file.

But this change is necessary with COFF specifically in order to
preserve the property that an unreferenced undefined symbol in an IR
module does not result in a link failure. This is already the case for
ELF because ELF linkers only reject links with unresolved symbols if
there is a relocation to that symbol, but COFF linkers require all
undefined symbols to be resolved regardless of relocations. So if
a module contains an unreferenced undefined symbol, we need to make
sure not to add it to the address-significance table (and thus the
symbol table) in case it doesn't end up resolved at link time.

Secondly, do not add dllimport symbols to the table. These symbols
won't be able to be resolved because their definitions live in another
module and are accessed via the IAT, and the address-significance
table has no effect on other modules anyway. It wouldn't make sense
to add the IAT entry symbol to the address-significance table either
because the IAT entry isn't address-significant -- the generated code
never takes its address.

Differential Revision: https://reviews.llvm.org/D51199

llvm-svn: 340648
2018-08-24 20:37:09 +00:00
David Blaikie 6dd452b514 DebugInfo: Fix skipping CUs in DWARFv5 debug_names table
My previoust test case had skipped CUs from one TU out of a two-TU LTO
scenario, which meant the CU index wasn't needed (as it was unambiguous
which CU a table entry applied to) - expanding the test to use 3 TUs,
skipping one (so long as it's not the last one) shows the indexes are
miscomputed. Fix that with a little indirection for the index.

llvm-svn: 340646
2018-08-24 20:31:05 +00:00
Craig Topper d8e91c3e8d [DAGCombiner][Mips] Don't combine bitcast+store after LegalOperations when the store is volatile, if the resulting store isn't Legal
Previously we allowed the store to be Custom. But without knowing for sure that the Custom handling won't split the store, we shouldn't convert a volatile store. We also probably shouldn't be creating a store the requires custom handling after LegalizeOps. This could lead to an infinite loop if the custom handling was to insert a bitcast. Though I guess isStoreBitCastBeneficial could be used to block such a loop.

The test changes here are due to the volatile part of this. The stores in the test are all volatile and i32 stores are marked custom, So we are no longer converting them

This is related to D50491 where I was trying to allow some bitcasting of volatile loads

Differential Revision: https://reviews.llvm.org/D50578

llvm-svn: 340626
2018-08-24 17:48:25 +00:00
Justin Bogner fbbd4366a6 [SDAG] Add versions of computeKnownBits that return a value
Having the KnownBits as an output parameter is kind of awkward to use
and a holdover from when it was two separate APInts. Instead, just
return a KnownBits object.

I'm leaving the existing interface in place for now, since updating
the callers all at once would be thousands of lines of diff.

llvm-svn: 340594
2018-08-24 02:42:24 +00:00
Tim Renouf 4be70ba94a [RegisterCoalescer] Fix for assert in removePartialRedundancy
Summary:
I got "Use not jointly dominated by defs" when removePartialRedundancy
attempted to prune then re-extend a subrange whose only liveness was a
dead def at the copy being removed.

V2: Removed junk from test. Improved comment.
V3: Addressed minor review comments.

Subscribers: MatzeB, qcolombet, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D50914

Change-Id: I6f894e9f517f71e921e0c6d81d28c5f344db8dad
llvm-svn: 340549
2018-08-23 17:28:33 +00:00
Chandler Carruth 8505dcf745 Revert r340508: [DebugInfo] Fix bug in LiveDebugVariables.
This patch's test case relies on debug prints which isn't generally an
OK way to test stuff in LLVM and fails whenever asserts aren't enabled.
I've send a heads-up to the commit and detailed comments on the review.

llvm-svn: 340513
2018-08-23 05:39:02 +00:00
Hsiangkai Wang 97edcbc4e0 [DebugInfo] Fix bug in LiveDebugVariables.
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().

Differential Revision: https://reviews.llvm.org/D50621

llvm-svn: 340508
2018-08-23 03:28:24 +00:00
Sanjay Patel ed1b9695ee [SelectionDAG] unroll unsupported vector FP ops earlier to avoid libcalls on undef elements (PR38527)
This solves the motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38527

If we are legalizing an FP vector op that maps to 1 of the LLVM intrinsics that mimic libm calls, 
but we're going to end up with scalar libcalls for that vector type anyway, then we should unroll 
the vector op into scalars before widening. This avoids libcalls because we've lost the knowledge 
that some of the scalar elements are undef.

Differential Revision: https://reviews.llvm.org/D50791

llvm-svn: 340469
2018-08-22 22:52:05 +00:00
Eli Friedman 96e3cd85bd [ARM] Lower llvm.ctlz.i32 to a libcall when clz is not available.
The inline sequence is very long (about 70 bytes on Thumb1), so it's
not really a good idea to inline it, especially when optimizing for
size.

Differential Revision: https://reviews.llvm.org/D47917

llvm-svn: 340458
2018-08-22 21:47:14 +00:00
Eli Friedman f3c39a7c79 [SafeStack] Handle unreachable code with safe stack coloring.
Instead of asserting that the function doesn't have any unreachable
code, just ignore it for the purpose of computing liveness.

Differential Revision: https://reviews.llvm.org/D51070

llvm-svn: 340456
2018-08-22 21:38:57 +00:00
Vedant Kumar a85ca3de66 [CodeGenPrepare] Set debug locs when folding a comparison into a uadd.with.overflow
CGP can replace a branch + select with a uadd.with.overflow. Teach it to
set debug locations as it does this.

llvm-svn: 340432
2018-08-22 18:15:03 +00:00
Aditya Nandakumar c106183518 [GISel]: Add legalization support for widening bit counting operations
https://reviews.llvm.org/D51053

Added legalization for WidenScalar of various bitcounting opcodes.

Reviewed by arsenm.

llvm-svn: 340429
2018-08-22 17:59:18 +00:00
Vedant Kumar 4760686823 [CodeGenPrepare] Set debug loc when widening a switch condition
Set a debug location on the cast instruction used to widen a switch
condition.

llvm-svn: 340379
2018-08-22 01:23:31 +00:00
Vedant Kumar 1e8a2c963c [CodeGenPrepare] Set debug locations when splitting selects
When splitting a select into a diamond, set debug locations on
newly-created branch instructions and phi nodes.

llvm-svn: 340371
2018-08-22 00:10:37 +00:00
Vedant Kumar 30406fd789 [CodeGenPrepare] Clean up dbg.value use-before-def as late as possible
CodeGenPrepare has a strategy for moving dbg.values so that a value's
definition always dominates its debug users. This cleanup was happening
too early (before certain CGP transforms were run), resulting in some
dbg.value use-before-def errors.

Perform this cleanup as late as possible to avoid use-before-def.

llvm-svn: 340370
2018-08-21 23:43:08 +00:00
Vedant Kumar 00e7558edd [CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC)
In optimizeSelectInst, when scanning for candidate selects to rewrite
into branches, scan past debug intrinsics. This makes the debug-enabled
and non-debug paths through optimizeSelectInst more congruent.

NFC because every select is eventually visited either way.

llvm-svn: 340368
2018-08-21 23:42:38 +00:00
Vedant Kumar fbc3873be9 [CodeGenPrepare] Exit earlier when optimizing selects (NFC)
When optimizing for size, this allows optimizeSelectInst to skip a
linear scan and exit early.

llvm-svn: 340367
2018-08-21 23:42:23 +00:00
Tom Stellard ecd6aa5be2 MachineScheduler: Refactor setPolicy() to limit computing remaining latency
Summary:
Computing the remaining latency can be very expensive especially
on graphs of N nodes where the number of edges approaches N^2.

This reduces the compile time of a pathological case with the
AMDGPU backend from ~7.5 seconds to ~3 seconds.  This test case has
a basic block with 2655 stores, each with somewhere between 500
and 1500 successors and predecessors.

Reviewers: atrick, MatzeB, airlied, mareko

Reviewed By: mareko

Subscribers: tpr, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50486

llvm-svn: 340346
2018-08-21 21:48:43 +00:00
Heejin Ahn 9cd7f88a35 [WebAssembly] Don't make wasm cleanuppads into funclet entries
Summary:
Catchpads and cleanuppads are not funclet entries; they are only EH
scope entries. We already dont't set `isEHFuncletEntry` for catchpads.
This patch does the same thing for cleanuppads.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50654

llvm-svn: 340330
2018-08-21 20:04:42 +00:00
Bjorn Pettersson e06321382b [RegisterCoalescer] Use substPhysReg in reMaterializeTrivialDef
Summary:
When RegisterCoalescer::reMaterializeTrivialDef is substituting
a register use in a DBG_VALUE instruction, and the old register
is a subreg, and the new register is a physical register,
then we need to use substPhysReg in order to extract the correct
subreg.

Reviewers: wmi, aprantl

Reviewed By: wmi

Subscribers: hiraditya, MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50844

llvm-svn: 340326
2018-08-21 19:47:32 +00:00
Heejin Ahn ed5e06b0a7 [WebAssembly] Add isEHScopeReturn instruction property
Summary:
So far, `isReturn` property is used to mean both a return instruction
from a functon and the end of an EH scope, a scope that starts with a EH
scope entry BB and ends with a catchret or a cleanupret instruction.
Because WinEH uses funclets, all EH-scope-ending instructions are also
real return instruction from a function. But for wasm, they only serve
as the end marker of an EH scope but not a return instruction that
exits a function. This mismatch caused incorrect prolog and epilog
generation in wasm EH scopes. This patch fixes this.

This patch is in the same vein with rL333045, which splits
`MachineBasicBlock::isEHFuncletEntry` into `isEHFuncletEntry` and
`isEHScopeEntry`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50653

llvm-svn: 340325
2018-08-21 19:44:11 +00:00
Krzysztof Parzyszek b211434a78 [RegisterCoalscer] Manually remove leftover segments when commuting def
In removeCopyByCommutingDef, segments from the source live range are
copied into (and merged with) the segments of the target live range.
This is performed for all subranges of the source interval. It can
happen that there will be subranges of the target interval that had
no corresponding subranges in the source interval, and in such cases
these subrages will not be updated. Since the copy being coalesced
is about to be removed, these ranges need to be updated by removing
the segments that are started by the copy.

llvm-svn: 340318
2018-08-21 19:01:26 +00:00
Yury Delendik 132fc5a861 Update DBG_VALUE register operand during LiveInterval operations
Summary:
Handling of DBG_VALUE in ConnectedVNInfoEqClasses::Distribute() was fixed in
PR16110. However DBG_VALUE register operands are not getting updated. This
patch properly resolves the value location.

Reviewers: MatzeB, vsk

Reviewed By: MatzeB

Subscribers: kparzysz, thegameg, vsk, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D48994

llvm-svn: 340310
2018-08-21 17:48:28 +00:00
Aditya Nandakumar c0333f7184 Revert "Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations""
This reverts commit d1341152d91398e9a882ba2ee924147ea2f9b589.

This patch originally made use of Nested MachineIRBuilder buildInstr
calls, and since order of argument processing is not well defined, the
instructions were built slightly in a different order (still correct).
I've removed the nested buildInstr calls to have a defined order now.

Patch was tested by Mikael.

llvm-svn: 340309
2018-08-21 17:30:31 +00:00
Bjorn Pettersson d378a39603 Change how finalizeBundle selects debug location for the BUNDLE instruction
Summary:
Previously a BUNDLE instruction inherited the DebugLoc from the
first instruction in the bundle, even if that DebugLoc had no
DILocation. With this commit this is changed into selecting the
first DebugLoc that has a DILocation, by searching among the
bundled instructions.

The idea is to reduce amount of bundles that are lacking
debug locations.

Reviewers: #debug-info, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: JDevlieghere, mattd, llvm-commits

Differential Revision: https://reviews.llvm.org/D50639

llvm-svn: 340267
2018-08-21 10:59:50 +00:00