Commit Graph

23755 Commits

Author SHA1 Message Date
Daniel Sanders df39cbae2f Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Hopefully fixed the ambiguous constructor that a large number of bots reported.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315869
2017-10-15 18:22:54 +00:00
Daniel Sanders bb082a36d3 Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
A large number of bots are failing on an ambiguous constructor call.

llvm-svn: 315866
2017-10-15 17:51:07 +00:00
Daniel Sanders b95b867dd8 [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315863
2017-10-15 17:03:36 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Quentin Colombet 4a6b75012d [RegisterBankInfo] Cache the getMinimalPhysRegClass information
TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive
because it has to iterate over all the register classes.
Cache this information as we need and get it so that we limit its usage.
Right now, we heavily rely on it, because this is how we get the mapping
for vregs defined by copies from physreg (i.e., the one that are ABI
related).

Improve compile time by up to 10% for that pass.

NFC

llvm-svn: 315759
2017-10-13 21:16:15 +00:00
Quentin Colombet d58265ad55 [Legalizer] Use SmallSetVector instead of SetVector.
NFC

llvm-svn: 315758
2017-10-13 21:16:14 +00:00
Quentin Colombet b1a3bd1529 [LegalizerInfo] Don't evaluate end boundary every time through the loop
Match the LLVM coding standard for loop conditions.

NFC.

llvm-svn: 315757
2017-10-13 21:16:13 +00:00
Quentin Colombet 86220488b4 [Legalizer] Only allocate the SetVectors once per function.
Prior to this patch we used to create SetVectors in temporaries that
were created and destroyed for each instruction. Now, instead we create
and destroyed them only once, but clear the content for each
instruction.
This speeds up the pass by ~25%.

NFC.

llvm-svn: 315756
2017-10-13 21:16:05 +00:00
Matt Arsenault f2db97d8fa DAG: Add opcode and source type to isFPExtFree
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740
2017-10-13 19:55:45 +00:00
Matt Arsenault 846dd6b2fa DAG: Add flags to dumps
llvm-svn: 315690
2017-10-13 15:41:40 +00:00
Craig Topper 6794eb8ba1 [SelectionDAG] Cleanup the SIGN_EXTEND_INREG handling in computeKnownBits. NFCI
Use less temporary APInts. Use bit counting more. Don't call getScalarSizeInBits so many places, just capture it once.

llvm-svn: 315671
2017-10-13 05:35:35 +00:00
Craig Topper bde7abe9c9 [SelectionDAG] Fix typo in comment. NFC
llvm-svn: 315670
2017-10-13 05:35:34 +00:00
Craig Topper d6630b9889 [SelectionDAG] Correct the early out in SelectionDAG::getZeroExtendInReg to work properly for vector types.
I don't know if we ever hit this case or not. Turning it into an assert only fired on expanding some atomic operation in a SystemZ lit test.

llvm-svn: 315648
2017-10-13 00:18:58 +00:00
Adam Nemet 01104aee1a Add DK_Remark to SMDiagnostic
Swift uses SMDiagnostic for diagnostic messages. For
https://github.com/apple/swift/pull/12294, we need remark support.

I picked the color that clang uses to display them.

Differential Revision: https://reviews.llvm.org/D38865

llvm-svn: 315642
2017-10-12 23:56:02 +00:00
Craig Topper 0e8211ea57 [SelectionDAG] Const-correct the DemandedMask argument to one of the overloads of SimplifyDemandedBits. NFC
llvm-svn: 315641
2017-10-12 23:46:05 +00:00
Matthias Braun bb8507e63c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Adrian Prantl a2f9e3a60f Deprecate DwarfUnit::addBlockByrefAddress().
The clang frontend already creates a DIExpression that replicates the
logic in addBlockByrefAddress() exactly, thus making this function
effectively unreachable. To guard against human error I'm hereby
marking the function with an assertion and let it hit the bots before
eventually removing it.

rdar://problem/31629055

llvm-svn: 315636
2017-10-12 22:54:36 +00:00
Matthias Braun 3a9c114b24 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Craig Topper d52cf99f53 [SelectionDAG] Simplify the ISD::SIGN_EXTEND/ZERO_EXTEND handling to use less temporary APInts by counting bits instead. NFCI
llvm-svn: 315628
2017-10-12 21:58:25 +00:00
Eli Friedman 48e7549691 [DWARF] Fix bad comparator in sortGlobalExprs.
The comparator passed to std::sort must provide a strict weak ordering;
otherwise, the behavior is undefined.

Fixes an assertion failure generating debug info for globals
split by GlobalOpt. I have a testcase, but not sure how to reduce it,
so not included here.  (Someone else came up with a testcase, but I
can't reproduce the crash with it, presumably because my version of LLVM
ends up sorting the array differently.)

This isn't really a complete fix (see the FIXME in the patch), but at
least it doesn't have undefined behavior.

Differential Revision: https://reviews.llvm.org/D38830

llvm-svn: 315619
2017-10-12 20:54:08 +00:00
Wei Ding 5676acad9e Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Diana Picus 4a5f522d4d MachineInstr: Make isEqual agree with getHashValue in MachineInstrExpressionTrait
MachineInstr::isIdenticalTo has a lot of logic for dealing with register
Defs (i.e. deciding whether to take them into account or ignore them).
This logic gets things wrong in some obscure cases, for instance if an
operand is not a Def for both the current MI and the one we are
comparing to.

I'm not sure if it's possible for this to happen for regular register
operands, but it may happen in the ARM backend for special operands
which use sentinel values for the register (i.e. 0, which is neither a
physical register nor a virtual one).

This causes MachineInstrExpressionTrait::isEqual (which uses
MachineInstr::isIdenticalTo) to return true for the following
instructions, which are the same except for the fact that one sets the
flags and the other one doesn't:
%1114 = ADDrsi %1113, %216, 17, 14, _, def _
%1115 = ADDrsi %1113, %216, 17, 14, _, _

OTOH, MachineInstrExpressionTrait::getHashValue returns different values
for the 2 instructions due to the different isDef on the last operand.
In practice this means that when trying to add those instructions to a
DenseMap, they will be considered different because of their different
hash values, but when growing the map we might get an assertion while
copying from the old buckets to the new buckets because isEqual
misleadingly returns true.

This patch makes sure that isEqual and getHashValue agree, by improving
the checks in MachineInstr::isIdenticalTo when we are ignoring virtual
register definitions (which is what the Trait uses). Firstly, instead of
checking isPhysicalRegister, we use !isVirtualRegister, so that we cover
both physical registers and sentinel values. Secondly, instead of
checking MachineOperand::isReg, we use MachineOperand::isIdenticalTo,
which checks isReg, isSubReg and isDef, which are the same values that
the hash function uses to compute the hash.

Note that the function is symmetric with this change, since if the
current operand is not a Def, we check MachineOperand::isIdenticalTo,
which returns false if the operands have different isDef's.

Differential Revision: https://reviews.llvm.org/D38789

llvm-svn: 315579
2017-10-12 13:59:51 +00:00
Daniel Jasper 4d93120273 Reinstantiate old/bad deduplication logic that was removed in r315279.
While this shouldn't be necessary anymore, we have cases where we run
into the assertion below, i.e. cases with two non-fragment entries for the
same variable at different frame indices.

This should be fixed, but for now, we should revert to a version that
does not trigger asserts.

llvm-svn: 315576
2017-10-12 13:25:05 +00:00
Hiroshi Inoue b49b015bed [ScheduleDAGInstrs] fix behavior of getUnderlyingObjectsForCodeGen when no identifiable object found
This patch fixes the bug introduced in https://reviews.llvm.org/D35907; the bug is reported by http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171002/491452.html.

Before D35907, when GetUnderlyingObjects fails to find an identifiable object, allMMOsOkay lambda in getUnderlyingObjectsForInstr returns false and Objects vector is cleared. This behavior is unintentionally changed by D35907.

This patch makes the behavior for such case same as the previous behavior.
Since D35907 introduced a wrapper function getUnderlyingObjectsForCodeGen around GetUnderlyingObjects, getUnderlyingObjectsForCodeGen is modified to return a boolean value to ask the caller to clear the Objects vector.

Differential Revision: https://reviews.llvm.org/D38735

llvm-svn: 315565
2017-10-12 06:26:04 +00:00
Mikael Holmen a079ef68e3 [RegisterCoalescer] Don't set read-undef in pruneValues, only clear
Summary:
The comments in the code said

 // Remove <def,read-undef> flags. This def is now a partial redef.

but the code didn't just remove read-undef, it could introduce new ones which
could cause errors.

E.g. if we have something like

%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7

and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.

Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38616

llvm-svn: 315564
2017-10-12 06:21:28 +00:00
Wei Mi 1736efd16a Revert r307036 because of PR34919.
llvm-svn: 315540
2017-10-12 00:24:52 +00:00
Lang Hames 2241ffa43c [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315531
2017-10-11 23:34:47 +00:00
Reid Kleckner 9cdd4df81a [codeview] Implement FPO data assembler directives
Summary:
This adds a set of new directives that describe 32-bit x86 prologues.
The directives are limited and do not expose the full complexity of
codeview FPO data. They are merely a convenience for the compiler to
generate more readable assembly so we don't need to generate tons of
labels in CodeGen. If our prologue emission changes in the future, we
can change the set of available directives to suit our needs. These are
modelled after the .seh_ directives, which use a different format that
interacts with exception handling.

The directives are:
  .cv_fpo_proc _foo
  .cv_fpo_pushreg ebp/ebx/etc
  .cv_fpo_setframe ebp/esi/etc
  .cv_fpo_stackalloc 200
  .cv_fpo_endprologue
  .cv_fpo_endproc
  .cv_fpo_data _foo

I tried to follow the implementation of ARM EHABI CFI directives by
sinking most directives out of MCStreamer and into X86TargetStreamer.
This helps avoid polluting non-X86 code with WinCOFF specific logic.

I used cdb to confirm that this can show locals in parent CSRs in a few
cases, most importantly the one where we use ESI as a frame pointer,
i.e. the one in http://crbug.com/756153#c28

Once we have cdb integration in debuginfo-tests, we can add integration
tests there.

Reviewers: majnemer, hans

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D38776

llvm-svn: 315513
2017-10-11 21:24:33 +00:00
Florian Hahn e52abba277 [MachineCombiner] Fix initialisation of LastUpdate for incremental update.
Summary:
Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled.

Patch by Paul Walker.

Reviewers: fhahn, Gerolf, efriedma, MatzeB

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38734

llvm-svn: 315502
2017-10-11 20:25:58 +00:00
Vivek Pandya 9590658fb8 [NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38285

llvm-svn: 315476
2017-10-11 17:12:59 +00:00
Krzysztof Parzyszek 12bdcab59c [Pipeliner] Fix offset value for instrs dependent on post-inc load/stores
The software pipeliner and the packetizer try to break dependence
between the post-increment instruction and the dependent memory
instructions by changing the base register and the offset value.
However, in some cases, the existing logic didn't work properly
and created incorrect offset value.

Patch by Jyotsna Verma.

llvm-svn: 315468
2017-10-11 15:59:51 +00:00
Krzysztof Parzyszek 8f174dde92 [Pipeliner] Improve serialization order for post-increments
The pipeliner is generating a serial sequence that causes poor
register allocation when a post-increment instruction appears
prior to the use of the post-increment register. This occurs when
there is a circular set of dependences involved with a sequence
of instructions in the same cycle. In this case, there is no
serialization of the parallel semantics that will not cause an
additional register to be allocated.

This patch fixes the problem by changing the instructions so that
the post-increment instruction is used by the subsequent
instruction, which enables the register allocator to make a
better decision and not require another register.

Patch by Brendon Cahoon.

llvm-svn: 315466
2017-10-11 15:51:44 +00:00
Sanjay Patel 34fd5eaaf0 [DAGCombiner] convert insertelement of bitcasted vector into shuffle
Eg:
insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7}

This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. 
We may want to abandon that one if we can't find value in squashing the more specific pattern sooner.

We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles.

There may be room for improvement in the shuffle lowering here, but that would be follow-up work.

Differential Revision: https://reviews.llvm.org/D38388

llvm-svn: 315460
2017-10-11 14:12:16 +00:00
Alex Bradbury 4d275f0dfe [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).

Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.

If your backend uses the IsFixed property or directly accesses NumFixedArgs, 
it is _possible_ this change could result in codegen changes (although the 
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.

Differential Revision: https://reviews.llvm.org/D37898

llvm-svn: 315457
2017-10-11 13:48:45 +00:00
Lang Hames 02d330548d [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315410
2017-10-11 01:57:21 +00:00
Justin Bogner fdf9bf4f16 CodeGen: Minor cleanups to use MachineInstr::getMF. NFC
Since r315388 we have a shorter way to say this, so we'll replace
MI->getParent()->getParent() with MI->getMF() in a few places.

llvm-svn: 315390
2017-10-10 23:50:49 +00:00
Justin Bogner ec7cba53e6 CodeGen: Add MachineInstr::getMF(). NFC
Similarly to how Instruction has getFunction, this adds a less verbose
way to write MI->getParent()->getParent(). I'll follow up shortly with
a change that changes a bunch of the uses.

llvm-svn: 315388
2017-10-10 23:34:01 +00:00
Eugene Zelenko 149178d92b [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315380
2017-10-10 22:33:29 +00:00
Adrian Prantl 3a3ba77ba3 Convert condition to an early exit (NFC).
<rdar://problem/34689604>

llvm-svn: 315359
2017-10-10 20:33:43 +00:00
David Stuttard 51c1b22806 [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors
Summary:
See https://llvm.org/PR33743 for more details

It seems that for non-power of 2 vector sizes, the algorithm can produce
non-matching sizes for input and result causing an assert.

This usually isn't a problem as the isAnyExtend check will weed these out, but
in some cases (most often with lots of undefined values for the mask indices) it
can pass this check for non power of 2 vectors.

Adding in an extra check that ensures that bit size will match for the result
and input (as required)

Subscribers: nhaehnle

Differential Revision: https://reviews.llvm.org/D35241

llvm-svn: 315307
2017-10-10 12:45:45 +00:00
Bjorn Steinbrink d36bbe9c89 Ignore all duplicate frame index expression
Some passes might duplicate calls to llvm.dbg.declare creating
duplicate frame index expression which currently trigger an assertion
which is meant to catch erroneous, overlapping fragment declarations.
But identical frame index expressions are just redundant and don't
actually conflict with each other, so we can be more lenient and just
ignore the duplicates.

Reviewers: aprantl, rnk

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38540

llvm-svn: 315279
2017-10-10 07:46:17 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Aditya Nandakumar c3bfc81a1f [GISel]: Fix generation of illegal COPYs during CallLowering
We end up creating COPY's that are either truncating/extending and this
should be illegal.

https://reviews.llvm.org/D37640

Patch for X86 and ARM by igorb, rovka

llvm-svn: 315240
2017-10-09 20:07:43 +00:00
Sanjay Patel 2a61a821a0 [DAG] combine assertsexts around a trunc
This was a suggested follow-up to:
D37017 / https://reviews.llvm.org/rL313577

llvm-svn: 315206
2017-10-09 15:22:20 +00:00
Benjamin Kramer 16610028ea Remove unused variables. No functionality change.
llvm-svn: 315185
2017-10-08 19:11:02 +00:00
Craig Topper 90b76211d3 [SelectionDAG} Use KnownBits::isUnknown and hasConflict. NFC
llvm-svn: 315154
2017-10-07 17:07:48 +00:00
Jessica Paquette 13593843f6 [MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2.
Say that the outliner outlines A,B,C from one function, and D,E,F from another
function (where letters are instructions). Now those functions are not
identical, and cannot be deduped. Locally to M1 and M2, these outlining
choices would be good-- to the whole program, however, this might not be true!

To mitigate this, this commit makes it so that the outliner sees linkonceodr
functions as unsafe to outline from. It also adds a flag,
-enable-linkonceodr-outlining, which allows the user to specify that they
want to outline from such functions when they know what they're doing.

Changing this handles most code size regressions in the test suite caused by
competing with linker dedupe. It also doesn't have a huge impact on the code
size improvements from the outliner. There are 6 tests that regress > 5% from
outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most
tests either improve or are not impacted.

Not outlined vs outlined without linkonceodrs:
https://hastebin.com/raw/qeguxavuda

Not outlined vs outlined with linkonceodrs:
https://hastebin.com/raw/edepoqoqic

Outlined with linkonceodrs vs outlined without linkonceodrs:
https://hastebin.com/raw/awiqifiheb

Numbers generated using compare.py with -m size.__text. Tests run for AArch64
with -Oz -mllvm -enable-machine-outliner -mno-red-zone.

llvm-svn: 315136
2017-10-07 00:16:34 +00:00
Reid Kleckner 813c577cc2 [PEI] Remove required properties and use 'if' instead of std::function
Summary:
After r303360, we initialize UsesCalleeSaves in runOnMachineFunction,
which runs after getRequiredProperties. UsesCalleeSaves was initialized
to 'false', so getRequiredProperties would always return an empty set.
We don't have a TargetMachine available early anymore after r303360.
Just removing the requirement of NoVRegs seems to make things work, so
let's do that.

Reviewers: thegameg, dschuff, MatzeB

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38597

llvm-svn: 315089
2017-10-06 18:21:19 +00:00
Xin Tong 27e66fb579 [MBP] Remove an invalid assert.
The patch that this assert comes with is fixing a bug in MBP. The assert is
invalid however.

Thanks to @sergey.k.okunev for finding this

Currently this fails SPECCPU2006 LTO. I will add a test case when I do more
investigation and have one.

llvm-svn: 315032
2017-10-05 23:00:04 +00:00
Karl-Johan Karlsson 8d8d201c17 [DebugInfo] Insert DEBUG_VALUEs after each register redefinition
Summary:
When reinserting debug values after register allocation, make sure to
insert debug values after each redefinition of debug value register in
the slot index range. The reason for this is that DwarfDebug will end
the range of a debug variable when the physical reg is defined. For
instructions with e.g. tied operands this result in prematurely ended
debug range.

This resolves pr34545

Patch by Karl-Johan Karlsson and Bjorn Pettersson

Reviewers: rnk, aprantl

Reviewed By: rnk

Subscribers: bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D38229

llvm-svn: 314974
2017-10-05 08:37:31 +00:00
Mikael Holmen 0ec1d25d33 Minor refactoring regarding Cast::isNoopCast(), NFC
Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.

With the above done, the remaining isNoopCast() could then be simplified
a bit more.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D38497

llvm-svn: 314969
2017-10-05 07:07:09 +00:00
Xin Tong d8d97972de [MachineBlockPlacement] Make sure PreferredLoopExit is cleared everytime new loop is processed
Summary: Rotate on exit that actually exits the current loop.

Reviewers: davidxl, danielcdh, iteratee, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38563

llvm-svn: 314937
2017-10-04 21:39:25 +00:00
Sanjay Patel 4c33d5213b [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Hans Wennborg 2a6c9adb2f Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"
It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919
2017-10-04 17:54:06 +00:00
Adam Nemet 6c381b7a2e [OptRemark] Move YAML writing to IR
Before the patch this was in Analysis.  Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager.  Jessica is planning to use this to
report function size after each pass.  The same could be used for time
reports.

Tested with BUILD_SHARED_LIBS=On.

llvm-svn: 314909
2017-10-04 15:18:11 +00:00
Adam Nemet f1bea0a6ab Also update MachineORE after r314874.
llvm-svn: 314908
2017-10-04 15:18:07 +00:00
Jatin Bhateja 3c29bacd43 [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886
2017-10-04 09:02:10 +00:00
Mikael Holmen a1a3f5c5e6 Recommit [UnreachableBlockElim] Use COPY if PHI input is undef
This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314882
2017-10-04 07:42:45 +00:00
Mikael Holmen 75b1992f78 Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"
Build-bots broke on the new testcase. I'll investigate and fix.

llvm-svn: 314880
2017-10-04 06:39:22 +00:00
Mikael Holmen 65eb2f394c [UnreachableBlockElim] Use COPY if PHI input is undef
Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314879
2017-10-04 06:06:31 +00:00
Jessica Paquette acc15e1265 [MachineOutliner] Fix off-by-one in cost model
This commit does two things. Firstly, it cleans up some of the benefit
calculation wrt outlined functions and candidates. Secondly, it fixes an
off-by-one bug in the cost model which was caused by the benefit value of
an OutlinedFunction and Candidate differing by 1. It updates the remarks test
to reflect this change.

llvm-svn: 314836
2017-10-03 20:32:55 +00:00
Reid Kleckner b4569de739 Implement David Blaikie's suggestion for comparison operators
llvm-svn: 314822
2017-10-03 18:30:11 +00:00
Reid Kleckner 04e25e00b7 [DebugInfo] Correctly coalesce DBG_VALUEs that mix direct and indirect values
Summary:
This should fix a regression introduced by r313786, which switched from
MachineInstr::isIndirectDebugValue() to checking if operand 1 is an
immediate. I didn't have a test case for it until now.

A single UserValue, which approximates a user variable, may have many
DBG_VALUE instructions that disagree about whether the variable is in
memory or in a virtual register. This will become much more common once
we have llvm.dbg.addr, but you can construct such a test case manually
today with llvm.dbg.value.

Before this change, we would get two UserValues: one for direct and one
for indirect DBG_VALUE instructions describing the same variable. If we
build separate interval maps for direct and indirect locations, we will
end up accidentally coalescing identical DBG_VALUE intervals that need
to remain separate because they are broken up by intervals of the
opposite direct-ness.

Reviewers: aprantl

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37932

llvm-svn: 314819
2017-10-03 17:59:02 +00:00
Geoff Berry fabedbad11 Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.

Another bug has been encountered in an out-of-tree target reported by Quentin.

llvm-svn: 314814
2017-10-03 16:59:13 +00:00
John Brawn 736bf0020f [CGP] Make optimizeMemoryInst capable of handling multiple AddrModes
Currently optimizeMemoryInst requires that all of the AddrModes it sees are
identical. This patch makes it capable of tracking multiple AddrModes, so long
as they differ in at most one field.

This patch does nothing by itself, but later patches will make use of it to
insert or reuse phi or select instructions for the differing fields.

Differential Revision: https://reviews.llvm.org/D38278

llvm-svn: 314795
2017-10-03 13:08:22 +00:00
John Brawn eb83c7554e [CGP] In optimizeMemoryInst handle select similarly to phi
This lets us optimize away selects that perform the same address computation in
two different ways and is also the first step towards being able to handle
selects between two different, but compatible, address computations.

Differential Revision: https://reviews.llvm.org/D38242

llvm-svn: 314794
2017-10-03 13:04:15 +00:00
Sam Clegg b2b019f727 [WebAssembly] MC: Support for init_array and fini_array
Differential Revision: https://reviews.llvm.org/D37757

llvm-svn: 314783
2017-10-03 11:20:28 +00:00
Bjorn Pettersson 90cc1b53d0 [DebugInfo] Handle endianness when moving debug info for split integer values (reapplied)
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Original patch was reverted due to using -stop-after=isel in
the test case (but that is only working when AMDGPU target
is included in the llc build). The test case has now been
updated to use -stop-before=expand-isel-pseudos instead.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314781
2017-10-03 11:03:02 +00:00
Javed Absar e485b143ea [MiSched] - Simplify ProcResEntry access
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38447

llvm-svn: 314775
2017-10-03 09:35:04 +00:00
Sjoerd Meijer 7a22a4948f ISel type legalization: add debug messages. NFCI.
This adds some more debug messages to the type legalizer and functions
like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make
the debug messages a little bit more informative and useful.

Differential Revision: https://reviews.llvm.org/D38450

llvm-svn: 314773
2017-10-03 08:54:15 +00:00
Quentin Colombet c2f3cea608 [Legalizer] Add support for G_OR NarrowScalar.
Legalize bitwise OR:
 A = BinOp<Ty> B, C
into:
 B1, ..., BN = G_UNMERGE_VALUES B
 C1, ..., CN = G_UNMERGE_VALUES C
 A1 = BinOp<Ty/N> B1, C2
 ...
 AN = BinOp<Ty/N> BN, CN
 A = G_MERGE_VALUES A1, ..., AN

llvm-svn: 314760
2017-10-03 04:53:56 +00:00
Tim Shen 59465d29f8 [PowerPC] Revert r314666.
See https://reviews.llvm.org/D38172.

I tried to XFAIL it, but sometimes XPASS triggers the bot. Simply
revert it.

llvm-svn: 314739
2017-10-02 23:20:06 +00:00
Geoff Berry bfc5fb4571 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
  in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
  explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 314729
2017-10-02 22:01:37 +00:00
Michael Liao 70356574bb Remove trailing whitespace to trigger re-cmaking
llvm-svn: 314728
2017-10-02 21:54:38 +00:00
Stanislav Mekhanoshin 6deb212522 Eliminate ftrunc if source is know to be rounded
Differential Revision: https://reviews.llvm.org/D38421

llvm-svn: 314688
2017-10-02 16:57:07 +00:00
Sanjay Patel 232669d6b3 use range-for-loops; NFCI
llvm-svn: 314676
2017-10-02 15:02:06 +00:00
Sanjay Patel 7998d37076 remove duplicate comments, reposition related functions; NFC
llvm-svn: 314669
2017-10-02 14:03:17 +00:00
Bjorn Pettersson 839775a277 [Debug info] Handle endianness when moving debug info for split integer values
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314666
2017-10-02 12:46:32 +00:00
Yaxun Liu b33607e5a1 CodeGen: Fix pointer info in expandUnalignedLoad/Store
Currently expandUnalignedLoad/Store uses place holder pointer info for temporary memory operand
in stack, which does not have correct address space. This causes unaligned private double16 load/store to be
lowered to flat_load instead of buffer_load for amdgcn target.

This fixes failures of OpenCL conformance test basic/vload_private/vstore_private on target amdgcn---amdgizcl.

Differential Revision: https://reviews.llvm.org/D35361

llvm-svn: 314566
2017-09-29 23:31:14 +00:00
Eugene Zelenko 4f81cdd818 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314559
2017-09-29 21:55:49 +00:00
Jonas Paulsson c9e363ac69 [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Jessica Paquette 919991690c [MachineOutliner][NFC] Simplify logic in pruneCandidates
This commit yanks out the repeated sections of code in pruneCandidates into
two lambdas: ShouldSkipCandidate and Prune. This simplifies the logic in
pruneCandidates significantly, and reduces the chance of introducing bugs by
folding all of the shared logic into one place.

llvm-svn: 314475
2017-09-28 23:39:36 +00:00
Matthias Braun 5c3e8a450e MIR: Serialize CaleeSavedInfo Restored flag
llvm-svn: 314449
2017-09-28 18:52:14 +00:00
Adrian Prantl 99fdb9d927 llvm-dwarfdump: implement --find for .apple_names
This patch implements the dwarfdump option --find=<name>.  This option
looks for a DIE in the accelerator tables and dumps it if found.  This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.

Differential Revision: https://reviews.llvm.org/D38282

llvm-svn: 314439
2017-09-28 18:10:52 +00:00
Bjorn Pettersson 715a5efaad [DebugInfo] Do not extend range for physreg in LiveDebugVariables
Summary:
A DBG_VALUE that is referring to a physical register is
valid up until the next def of the register, or the end
of the basic block that it belongs to.

LiveDebugVariables is computing live intervals (slot index
ranges) for DBG_VALUE instructions, before regalloc, in order
to be able to re-insert DBG_VALUE instructions again after
regalloc. When the DBG_VALUE is mapping a variable to a
physical register we do not need to compute the range. We
should simply re-insert the DBG_VALUE at the start position.

The problem that was found, resulting in this patch, was a
situation when the DBG_VALUE was the last real use of the
physical register. The computeIntervals/extendDef methods
extended the range to cover the whole basic block, even though
the physical register very well could be allocated to some
virtual register inside the basic block. So the extended
range could not be trusted.

This patch is a preparation for https://reviews.llvm.org/D38229,
where the goal is to insert DBG_VALUE after each new definition
of a variable, even if the virtual registers that the variable
was connected to has been coalesced into using the same physical
register (e.g. due to two address instructions). For more info
see https://bugs.llvm.org/show_bug.cgi?id=34545

Reviewers: aprantl, rnk, echristo

Reviewed By: aprantl

Subscribers: Ka-Ka, llvm-commits

Differential Revision: https://reviews.llvm.org/D38140

llvm-svn: 314414
2017-09-28 13:10:06 +00:00
Alex Bradbury 5518cbfc41 Teach TargetInstrInfo::getInlineAsmLength to parse .space directives with integer arguments
It's currently quite difficult to test passes like branch relaxation, which
requires branches with large displacement to be generated. The .space assembler
directive makes it easy to create arbitrarily large basic blocks, but
getInlineAsmLength is not able to parse it and so the size of the block is not
correctly estimated. Other backends (AArch64, AMDGPU) introduce options just
for testing that artificially restrict the ranges of branch instructions (e.g.
aarch64-tbz-offset-bits). Although parsing a single form of the .space
directive feels inelegant, it does allow a more direct testing approach.

This patch adapts the .space parsing code from
Mips16InstrInfo::getInlineAsmLength and removes it now the extra functionality
is provided by the base implementation. I want to move this functionality to
the generic getInlineAsmLength as 1) I need the same for RISC-V, and 2) I feel
other backends will benefit from more direct testing of large branch
displacements.

Differential Revision: https://reviews.llvm.org/D37798

llvm-svn: 314393
2017-09-28 09:31:46 +00:00
Mikael Holmen 07f1e2e2b3 [RegAllocGreedy]: Allow recoloring of done register if it's non-tied
Summary:
If we have a non-allocated register, we allow us to try recoloring of an
already allocated and "Done" register, even if they are of the same
register class, if the non-allocated register has at least one tied def
and the allocated one has none.

It should be easier to recolor the non-tied register than the tied one, so
it might be an improvement even if they use the same regclasses.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D38309

llvm-svn: 314388
2017-09-28 08:22:35 +00:00
George Burgess IV f8e11b803d [DAGCombiner] Fix an off-by-one error in vector logic
Without this, we could end up trying to get the Nth (0-indexed) element
from a subvector of size N.

Differential Revision: https://reviews.llvm.org/D37880

llvm-svn: 314380
2017-09-28 06:17:19 +00:00
Eugene Zelenko fa57bd0ced [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314363
2017-09-27 23:26:01 +00:00
Don Hinton 53eb637115 Cleanup some problems with LLVM_ENABLE_DUMP in release builds, and
always set LLVM_ENABLE_DUMP=ON for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D38306

llvm-svn: 314346
2017-09-27 21:19:56 +00:00
Jessica Paquette 4cf187b5b4 [MachineOutliner] AArch64: Avoid saving + restoring LR if possible
This commit allows the outliner to avoid saving and restoring the link register
on AArch64 when it is dead within an entire class of candidates.

This introduces changes to the way the outliner interfaces with the target.
For example, the target now interfaces with the outliner using a
MachineOutlinerInfo struct rather than by using getOutliningCallOverhead and
getOutliningFrameOverhead.

This also improves several comments on the outliner's cost model.

https://reviews.llvm.org/D36721

llvm-svn: 314341
2017-09-27 20:47:39 +00:00
Than McIntosh dee2cf67ea [CodeGen] Emit necessary .note sections for -fsplit-stack
Summary:
According to https://gcc.gnu.org/wiki/SplitStacks, the linker expects a zero-sized .note.GNU-split-stack section if split-stack is used (and also .note.GNU-no-split-stack section if it also contains non-split-stack functions), so it can handle the cases where a split-stack function calls non-split-stack function.

This change adds the sections if needed.

Fixes PR #34670.

Reviewers: thanm, rnk, luqmana

Reviewed By: rnk

Subscribers: llvm-commits

Patch by Cherry Zhang <cherryyz@google.com>

Differential Revision: https://reviews.llvm.org/D38051

llvm-svn: 314335
2017-09-27 19:34:00 +00:00
Sanjay Patel 0f9b4773c1 [SimplifyCFG] add a struct to house optional folds (PR34603)
This was intended to be no-functional-change, but it's not - there's a test diff.

So I thought I should stop here and post it as-is to see if this looks like what was expected 
based on the discussion in PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

Notes:
 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried 
    through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. 
    The parameter isn't passed down, so we pick up the default value from the function signature 
    after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 
    'SimplifyCFG' calls.

 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. 
    This would theoretically allow us to differentiate the transforms controlled by those params 
    independently.

 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. 
    I just stopped here to minimize the diffs.

 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that 
    could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG 
    set to true no matter where in the pipeline it's creating SimplifyCFG passes?

    // Create an early function pass manager to cleanup the output of the
    // frontend.
    EarlyFPM.addPass(SimplifyCFGPass());

    -->

    /// \brief Construct a pass with the default thresholds
    /// and switch optimizations.
    SimplifyCFGPass::SimplifyCFGPass()
       : BonusInstThreshold(UserBonusInstThreshold),
         LateSimplifyCFG(true) {}   <-- switches get converted to lookup tables and loops may not be in canonical form

    If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' 
    setting via recursion was masking this bug.

Differential Revision: https://reviews.llvm.org/D38138

llvm-svn: 314308
2017-09-27 14:54:16 +00:00
Mikael Holmen 3bcc9f0c1f [RegAllocGreedy] Fix spelling error, "inteference" -> "interference", NFC
llvm-svn: 314299
2017-09-27 11:27:50 +00:00
Javed Absar 1a77bcc0d2 [Misched]: Remove double call getMicroOpFactor.NFC.
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38176

llvm-svn: 314296
2017-09-27 10:31:58 +00:00
Craig Topper 1c781338ee [SelectionDAG] Make NewSDValueDbgMsg print target specific nodes correctly by passing in the SelectionDAG.
llvm-svn: 314271
2017-09-27 05:17:14 +00:00
Craig Topper 5bc10ede53 [SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors
This teach simplifyDemandedBits to handle constant splat vector shifts.

This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount.

I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here.

I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change.

Differential Revision: https://reviews.llvm.org/D37665

llvm-svn: 314139
2017-09-25 19:26:08 +00:00
Reid Kleckner 8898cd8dcf [DebugInfo] Sort the SDDbgValue list before assuming it is in IR order
Summary:
This code iterates the 'Orders' vector in parallel with the DbgValue
list, emitting all DBG_VALUEs that occurred between the last IR order
insertion point and the next insertion point. This assumes the
SDDbgValue list is sorted in IR order, which it usually is. However, it
is not sorted when a node with a debug value is replaced with another
one. When this happens, TransferDbgValues is called, and the new value
is added to the end of the list.

The problem can be solved by stably sorting the list by IR order.

Reviewers: aprantl, Ka-Ka

Reviewed By: aprantl

Subscribers: MatzeB, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38197

llvm-svn: 314114
2017-09-25 16:14:53 +00:00
Reid Kleckner 09e75c9399 Use {} instead of make_pair and an iterator for the insertion point, NFC
llvm-svn: 314113
2017-09-25 16:14:39 +00:00
Clement Courbet 2807c0a442 [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.
Summary:
Right now there are two functions with the same name, one does the work
and the other one returns true if expansion is needed. Rename
TargetTransformInfo::expandMemCmp to make it more consistent with other
members of TargetTransformInfo.

Remove the unused Instruction* parameter.

Differential Revision: https://reviews.llvm.org/D38165

llvm-svn: 314096
2017-09-25 06:35:16 +00:00
Eugene Zelenko 8e30a1c607 [CodeGen] Fix build bots which uses old Clang broken in r314046. (NFC)
llvm-svn: 314049
2017-09-22 23:55:32 +00:00
Eugene Zelenko f193332994 [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314046
2017-09-22 23:46:57 +00:00
Tim Shen cee7536188 [XRay] support conditional return on PPC.
Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest.

Reviewers: dberris, echristo

Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D38102

llvm-svn: 314005
2017-09-22 18:30:02 +00:00
Eugene Zelenko fb7f792f55 [CodeGen] Fix some Clang-tidy modernize-use-bool-literals and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313941
2017-09-21 23:20:16 +00:00
Craig Topper de4379251e [DAGCombiner] Slightly simplify some code by using APInt::isMask() and countTrailingOnes instead of getting active bits and checking if all the bits below that make a mask.
At least for the 64-bit and less case, we should be able to determine if we even have a mask without counting any bits. This also removes the need to explicitly check for 0 active bits, isMask will return false for 0.

llvm-svn: 313908
2017-09-21 20:12:19 +00:00
Reid Kleckner 0fe506bc5e Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
The fix is to avoid invalidating our insertion point in
replaceDbgDeclare:
     Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore);
+    if (DII == InsertBefore)
+      InsertBefore = &*std::next(InsertBefore->getIterator());
     DII->eraseFromParent();

I had to write a unit tests for this instead of a lit test because the
use list order matters in order to trigger the bug.

The reduced C test case for this was:
  void useit(int*);
  static inline void inlineme() {
    int x[2];
    useit(x);
  }
  void f() {
    inlineme();
    inlineme();
  }

llvm-svn: 313905
2017-09-21 19:52:03 +00:00
Bjorn Pettersson 0dde08c3cb [SelectionDAG] Pick correct frame index in LowerArguments
Summary:
SelectionDAGISel::LowerArguments is associating arguments
with frame indices (FuncInfo->setArgumentFrameIndex). That
information is later on used by EmitFuncArgumentDbgValue to
create DBG_VALUE instructions that denotes that a variable
can be found on the stack.

I discovered that for our (big endian) out-of-tree target
the association created by SelectionDAGISel::LowerArguments
sometimes is wrong. I've seen this happen when a 64-bit value
is passed on the stack. The argument will occupy two stack
slots (frame index X, and frame index X+1). The fault is
that a call to setArgumentFrameIndex is associating the
64-bit argument with frame index X+1. The effect is that the
debug information (DBG_VALUE) will point at the least significant
part of the arguement on the stack. When printing the
argument in a debugger I will get the wrong value.

I managed to create a test case for PowerPC that seems to
show the same kind of problem.

The bugfix will look at the datalayout, taking endianness into
account when examining a BUILD_PAIR node, assuming that the
least significant part is in the first operand of the BUILD_PAIR.
For big endian targets we should use the frame index from
the second operand, as the most significant part will be stored
at the lower address (using the highest frame index).

Reviewers: bogner, rnk, hfinkel, sdardis, aprantl

Reviewed By: aprantl

Subscribers: nemanjai, aprantl, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37740

llvm-svn: 313901
2017-09-21 18:52:08 +00:00
Craig Topper 280f133773 [DAGCombiner] Remove duplicate code from visitZERO_EXTEND
This exact block of code exists right below.

Differential Revision: https://reviews.llvm.org/D38122

llvm-svn: 313891
2017-09-21 17:30:02 +00:00
Daniel Jasper 7d2f38d600 Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
.. as well as the two subsequent changes r313826 and r313875.

This leads to segfaults in combination with ASAN. Will forward repro
instructions to the original author (rnk).

llvm-svn: 313876
2017-09-21 12:07:33 +00:00
Strahinja Petrovic 29202f6dc1 Fixed reverted commit rL312318
This patch contains fix for reverted commit
rL312318 which was causing failure due to use
of unchecked dyn_cast to CIInit.

Patch by: Nikola Prica.

llvm-svn: 313870
2017-09-21 10:04:02 +00:00
Craig Topper f0ba300332 [SelectionDAG] Replace a flag that can never be true with an assert.
llvm-svn: 313847
2017-09-21 00:18:46 +00:00
Craig Topper eb0f71f232 [SelectionDAG] Use APInt::getActivebits instead of Bitwidth - leading zeros.
llvm-svn: 313839
2017-09-20 23:48:56 +00:00
Reid Kleckner 3f547e87b2 [IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare
Summary:
This implements the design discussed on llvm-dev for better tracking of
variables that live in memory through optimizations:
  http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html

This is tracked as PR34136

llvm.dbg.addr is intended to be produced and used in almost precisely
the same way as llvm.dbg.declare is today, with the exception that it is
control-dependent. That means that dbg.addr should always have a
position in the instruction stream, and it will allow passes that
optimize memory operations on local variables to insert llvm.dbg.value
calls to reflect deleted stores. See SourceLevelDebugging.rst for more
details.

The main drawback to generating DBG_VALUE machine instrs is that they
usually cause LLVM to emit a location list for DW_AT_location. The next
step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE
ranges as not needing a location list, and possibly start setting
DW_AT_start_offset for variables whose lifetimes begin mid-scope.

Reviewers: aprantl, dblaikie, probinson

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37768

llvm-svn: 313825
2017-09-20 21:52:33 +00:00
Saleem Abdulrasool 432b88e5f4 CodeGen: support SwiftError SwiftCC on Windows x64
Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

llvm-svn: 313791
2017-09-20 18:40:59 +00:00
Reid Kleckner 4e04028791 Re-land "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
After r313775, it's easier to maintain a parallel BitVector of spilled
locations indexed by location number.

I wasn't able to build a good reduced test case for this iteration of
the bug, but I added a more direct assertion that spilled values must
use frame index locations. If this bug reappears, it won't only fire on
the NEON vector code that we detected it on, but on medium-sized
integer-only programs as well.

llvm-svn: 313786
2017-09-20 18:19:08 +00:00
Reid Kleckner 92687d45db [DebugInfo] Use a MapVector to coalesce MachineOperand locations
Summary:
The new code should be linear in the number of DBG_VALUEs, while the old
code was quadratic. NFC intended.

This is also hopefully a more direct expression of the problem, which is
to:

1. Rewrite all virtual register operands to stack slots or physical
   registers
2. Uniquely number those machine operands, assigning them location
   numbers
3. Rewrite all uses of the old location numbers in the interval map to
   use the new location numbers

In r313400, I attempted to track which locations were spilled in a
parallel bitvector indexed by location number. My code was broken
because these location numbers are not stable during rewriting.

Reviewers: aprantl, hans

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38068

llvm-svn: 313775
2017-09-20 17:32:54 +00:00
Florian Hahn ceb4494786 Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751
2017-09-20 11:54:37 +00:00
Quentin Colombet d652aeb144 [MIRPrinter] Print empty successor lists when they cannot be guessed
This re-applies commit r313685, this time with the proper updates to
the test cases.

Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print

         entry
        /      \
   true (def)   false (no list of successors)
       |
 split.true (use)

The MIR parser would understand this:

         entry
        /      \
   true (def)   false
       |        /  <-- invalid edge
 split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313696
2017-09-19 23:34:12 +00:00
Saleem Abdulrasool 399a4e9b0b CodeGen: use range based for loops (NFC)
Simplify the RPOT traversal by using a range based for loop for the
iterator dereference.

llvm-svn: 313687
2017-09-19 22:10:20 +00:00
Quentin Colombet 6888dbcda7 Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed"
This reverts commit r313685.

I thought I had ran ninja check, but apparently I didn't...
Need to update a bunch of mir tests.

llvm-svn: 313686
2017-09-19 22:03:50 +00:00
Quentin Colombet 7fdaa5e641 [MIRPrinter] Print empty successor lists when they cannot be guessed
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print
          entry
         /      \
    true (def)   false (no list of successors)
        |
  split.true (use)

The MIR parser would understand this:
          entry
         /      \
    true (def)   false
        |        /  <-- invalid edge
  split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313685
2017-09-19 21:55:51 +00:00
Reid Kleckner ffdf087499 Revert "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
This reverts r313640, originally r313400, one more time for essentially
the same issue. My BitVector of spilled location numbers isn't working
because we coalesce identical DBG_VALUE locations as we rewrite them,
invalidating the location numbers used to index the BitVector.

llvm-svn: 313679
2017-09-19 21:18:32 +00:00
Reid Kleckner 26fa1bf4da Re-land "Fix Bug 30978 by emitting cv file checksums."
This reverts r313431 and brings back r313374 with a fix to write
checksums as binary data and not ASCII hex strings.

llvm-svn: 313657
2017-09-19 18:14:45 +00:00
Reid Kleckner 86c74dd791 Re-land r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
I forgot to zero out the BitVector when reusing it between UserValues.

Later uses of the same location number for a different UserValue would
falsely indicate that they were spilled. Usually this would lead to
incorrect debug info, but in some cases they would indicate something
nonsensical like a memory location based on a vector register (Q8 on
ARM).

llvm-svn: 313640
2017-09-19 16:32:15 +00:00
Hans Wennborg 4de620ab3d Revert r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
This caused asserts in Chromium. See http://crbug.com/766261

> Summary:
> This comes up in optimized debug info for C++ programs that pass and
> return objects indirectly by address. In these programs,
> llvm.dbg.declare survives optimization, which causes us to emit indirect
> DBG_VALUE instructions. The fast register allocator knows to insert
> DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
> LiveDebugVariables did not until this change.
>
> This fixes part of PR34513. I need to look into why this doesn't work at
> -O0 and I'll send follow up patches to handle that.
>
> Reviewers: aprantl, dblaikie, probinson
>
> Subscribers: qcolombet, hiraditya, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313589
2017-09-18 23:08:42 +00:00
Sanjay Patel f31b1a00ea [DAGCombiner] fold assertzexts separated by trunc
If we have an AssertZext of a truncated value that has already been AssertZext'ed, 
we can assert on the wider source op to improve the zext-y knowledge:
 assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN

This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.

Differential Revision: https://reviews.llvm.org/D37017

llvm-svn: 313577
2017-09-18 22:05:35 +00:00
Sanjay Patel 7765c93be2 [DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987

llvm-svn: 313564
2017-09-18 20:54:26 +00:00
Ahmed Bougacha d630a92b0e [GlobalISel] Only build expensive remarks if they're enabled. NFC.
r313390 taught 'allowExtraAnalysis' to check whether remarks are
enabled at all.  Use that to only do the expensive instruction printing
if they are.

llvm-svn: 313552
2017-09-18 18:50:09 +00:00
Simon Pilgrim 0b21ef1fa3 [SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits.
For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements.

We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.

Differential Revision: https://reviews.llvm.org/D37849

llvm-svn: 313543
2017-09-18 16:45:05 +00:00
Eric Beckmann 913213c8ae Revert "Fix Bug 30978 by emitting cv file checksums."
This reverts commit 6389e7aa724ea7671d096f4770f016c3d86b0d54.

There is a bug in this implementation where the string value of the
checksum is outputted, instead of the actual hex bytes.  Therefore the
checksum is incorrect, and this prevent pdbs from being loaded by visual
studio.  Revert this until the checksum is emitted correctly.

llvm-svn: 313431
2017-09-16 01:14:36 +00:00
Reid Kleckner eed0973270 Name the sentinel value used for the location number of the undefined register NFC
llvm-svn: 313405
2017-09-15 22:08:50 +00:00
Reid Kleckner 3a66c1cb58 [DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs
Summary:
This comes up in optimized debug info for C++ programs that pass and
return objects indirectly by address. In these programs,
llvm.dbg.declare survives optimization, which causes us to emit indirect
DBG_VALUE instructions. The fast register allocator knows to insert
DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
LiveDebugVariables did not until this change.

This fixes part of PR34513. I need to look into why this doesn't work at
-O0 and I'll send follow up patches to handle that.

Reviewers: aprantl, dblaikie, probinson

Subscribers: qcolombet, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313400
2017-09-15 21:54:38 +00:00
Reid Kleckner 9e6c309ef3 [DebugInfo] Add missing DW_OP_deref when an NRVO pointer is spilled
Summary:
Fixes PR34513.

Indirect DBG_VALUEs typically come from dbg.declares of non-trivially
copyable C++ objects that must be passed by address. We were already
handling the case where the virtual register gets allocated to a
physical register and is later spilled. That's what usually happens for
normal parameters that aren't NRVO variables: they usually appear in
physical register parameters, and are spilled later in the function,
which would correctly add deref.

NRVO variables are different because the dbg.declare can come much later
after earlier instructions cause the incoming virtual register to be
spilled.

Also, clean up this code. We only need to look at the first operand of a
DBG_VALUE, which eliminates the operand loop.

Reviewers: aprantl, dblaikie, probinson

Subscribers: MatzeB, qcolombet, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37929

llvm-svn: 313399
2017-09-15 21:49:56 +00:00
Sam Clegg 759631c77b [WebAssembly] MC: Create wasm data segments based on MCSections
This means that we can honor -fdata-sections rather than
always creating a segment for each symbol.

It also allows for a followup change to add .init_array and friends.

Differential Revision: https://reviews.llvm.org/D37876

llvm-svn: 313395
2017-09-15 20:54:59 +00:00
Sam Clegg 66a99e41cd Change encodeU/SLEB128 to pad to certain number of bytes
Previously the 'Padding' argument was the number of padding
bytes to add. However most callers that use 'Padding' know
how many overall bytes they need to write.  With the previous
code this would mean encoding the LEB once to find out how
many bytes it would occupy and then using this to calulate
the 'Padding' value.

See: https://reviews.llvm.org/D36595

Differential Revision: https://reviews.llvm.org/D37494

llvm-svn: 313393
2017-09-15 20:34:47 +00:00
Hans Wennborg 534bfbd3ba Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."
This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376
2017-09-15 18:40:26 +00:00
Eric Beckmann 349746f044 Fix Bug 30978 by emitting cv file checksums.
Summary:
The checksums had already been placed in the IR, this patch allows
MCCodeView to actually write it out to an MCStreamer.

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37157

llvm-svn: 313374
2017-09-15 18:20:28 +00:00
Jonas Paulsson 6188f326eb Recommit "[RegAlloc] Make sure live-ranges reflect the state of the IR when
removing them"

This was temporarily reverted, but now that the fix has been commited (r313197)
it should be put back in place.

https://bugs.llvm.org/show_bug.cgi?id=34502

This reverts commit 9ef93d9dc4c51568e858cf8203cd2c5ce8dca796.

llvm-svn: 313349
2017-09-15 07:47:38 +00:00
Jatin Bhateja 908c8b37c2 [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343
2017-09-15 05:29:51 +00:00
Reid Kleckner 87288b98b6 [codeview] Use a type index of zero for static method "this" types
Otherwise VS won't show anything in the autos or watch window of static
methods.

llvm-svn: 313329
2017-09-15 00:59:07 +00:00
Jan Sjodin 312ccf761c Add AddresSpace to PseudoSourceValue.
Differential Revision: https://reviews.llvm.org/D35089

llvm-svn: 313297
2017-09-14 20:53:51 +00:00
Benjamin Kramer 591aac7cdf Remove usages of deprecated std::unary_function and std::binary_function.
These are removed in C++17. We still have some users of
unary_function::argument_type, so just spell that typedef out. No
functionality change intended.

Note that many of the argument types are actually wrong :)

llvm-svn: 313287
2017-09-14 18:33:25 +00:00
Krzysztof Parzyszek 779d98e1c0 TableGen support for parameterized register class information
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.

This affects the way that types and type sets are printed, and the
tests relying on that have been updated.

There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)

For more information, please refer to the review page.

Differential Revision: https://reviews.llvm.org/D31951

llvm-svn: 313271
2017-09-14 16:56:21 +00:00
Krzysztof Parzyszek 6ca02b25a7 [IfConversion] More simple, correct dead/kill liveness handling
Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D37611

llvm-svn: 313268
2017-09-14 15:53:11 +00:00
Simon Pilgrim 8bd2d8780a [DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.

Original patch by @tstellar

Differential Revision: https://reviews.llvm.org/D19325

llvm-svn: 313251
2017-09-14 10:38:30 +00:00
Simon Pilgrim 523483e0bd [SelectionDAG] ComputeNumSignBits - cleanup ROTL/ROTR wrapping to match DAGCombine etc.
Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1)

TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines.

llvm-svn: 313245
2017-09-14 10:28:01 +00:00
Dean Michael Berris 01fd7c8bd4 [XRay][CodeGen] Use the current function symbol as the associated symbol for the instrumentation map
Summary:
XRay had been assuming that the previous section is the "text" section
of the function when lowering the instrumentation map. Unfortunately
this is not a safe assumption, because we may be coming from lowering
debug type information for the function being lowered.

This fixes an issue with combining -gsplit-dwarf, -generate-type-units,
-debug-compile and -fxray-instrument for sole member functions. When the
split dwarf section is stripped, we're left with references from the
xray_instr_map to the debug section. The change now uses the function's
symbol instead of the previous section's start symbol.

We found the bug while attempting to strip the split debug sections off
an XRay-instrumented object file, which had a peculiar edge-case for
single-function classes where the single function is being lowered.
Because XRay had assocaited the instrumentation map for a function to
the debug types section instead of the function's section, the objcopy
call will fail due to the misplaced reference from the xray_instr_map
section.

Reviewers: pcc, dblaikie, echristo

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D37791

llvm-svn: 313233
2017-09-14 07:08:23 +00:00
Reid Kleckner cd7bba0264 [codeview] Fold FIXME into comment, there's nothing to do. NFC
llvm-svn: 313214
2017-09-13 23:30:01 +00:00
Hans Wennborg 06e2a384c2 Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."
This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213
2017-09-13 23:23:09 +00:00
Stanislav Mekhanoshin 7fe9a5d9b4 Allow target to decide when to cluster loads/stores in misched
MachineScheduler when clustering loads or stores checks if base
pointers point to the same memory. This check is done through
comparison of base registers of two memory instructions. This
works fine when instructions have separate offset operand. If
they require a full calculated pointer such instructions can
never be clustered according to such logic.

Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.

Differential Revision: https://reviews.llvm.org/D37698

llvm-svn: 313208
2017-09-13 22:20:47 +00:00
Reid Kleckner 89af112cf5 [codeview] VLAs and unsized arrays should use a size of zero
Previously we used a size of '1' for VLAs because we weren't sure what
MSVC did. However, MSVC does support declaring an array without a size,
for which it emits an array type with a size of zero. Clang emits the
same DI metadata for VLAs and arrays without bound, so we would describe
arrays without bound as having one element. This lead to Microsoft
debuggers only printing a single element.

Emitting a size of zero appears to cause these debuggers to search the
symbol information to find a definition of the variable with accurate
array bounds.

Fixes http://crbug.com/763580

llvm-svn: 313203
2017-09-13 21:54:20 +00:00
Wei Mi c0d066468e [RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper.
This is to fix PR34502. After rL311401, the live range of spilled vreg will be
cleared. HoistSpill need to use the live range of the original vreg before splitting
to know the moving range of the spills. The patch saves a copy of live interval for
the spilled vreg inside of HoistSpillHelper.

Differential Revision: https://reviews.llvm.org/D37578

llvm-svn: 313197
2017-09-13 21:41:30 +00:00
Eugene Zelenko 618c555bbe [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313194
2017-09-13 21:15:20 +00:00
Adrian McCarthy d91bf3998f Mark static member functions as static in CodeViewDebug
Summary:
To improve CodeView quality for static member functions, we need to make the
static explicit.  In addition to a small change in LLVM's CodeViewDebug to
return the appropriate MethodKind, this requires a small change in Clang to
note the staticness in the debug info metadata.

Subscribers: aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D37715

llvm-svn: 313192
2017-09-13 20:53:55 +00:00
Mikael Holmen 4eb2a96e7f [MachineScheduler] Put SchedRegion in an anonymous namespace.
Summary: It pollutes the global namespace otherwise.

Patch by: Bevin Hansson

Reviewers: jonpa

Reviewed By: jonpa

Subscribers: MatzeB, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D37555

llvm-svn: 313148
2017-09-13 14:07:47 +00:00
Peter Collingbourne 876da0294a Remove -generate-dwarf-pub-sections flag.
This flag is unnecessary for testing because we can get the coverage
we need by adjusting CU attributes.

Differential Revision: https://reviews.llvm.org/D37725

llvm-svn: 313079
2017-09-12 21:50:55 +00:00
Peter Collingbourne b52e23669c IR: Represent -ggnu-pubnames with a flag on the DICompileUnit.
This allows the flag to be persisted through to LTO.

Differential Revision: https://reviews.llvm.org/D37655

llvm-svn: 313078
2017-09-12 21:50:41 +00:00
Lei Huang 34e6621724 Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass.  Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.

Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce.

Differential Revision : https: // reviews.llvm.org/D32776

llvm-svn: 313061
2017-09-12 18:39:11 +00:00
Sam Clegg 2176a9f2a3 [WebAssembly] Remove flags from MCSectionWasm
Looks like these were copied from the ELF sections but
don't apply to Wasm and were not used anywhere.

Also remove unused Wasm methods in MCContext.

Differential Revision: https://reviews.llvm.org/D37633

llvm-svn: 313058
2017-09-12 18:31:24 +00:00
Robert Lougher 51529eb0c2 Revert "[DWARF] Incorrect prologue end line record."
This reverts commit r313047 as it is causing buildbot failure (lldb inline
stepping tests).

llvm-svn: 313057
2017-09-12 18:23:15 +00:00
Robert Lougher f696a22d3c [DWARF] Incorrect prologue end line record.
A prologue-end line record is emitted with an incorrect associated address,
which causes a debugger to show the beginning of function body to be inside
the prologue.

Patch written by Carlos Alberto Enciso.

Differential Revision: https://reviews.llvm.org/D37625

llvm-svn: 313047
2017-09-12 16:35:25 +00:00
Eugene Zelenko 32a4056438 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 312971
2017-09-11 23:00:48 +00:00
Hiroshi Yamauchi 9364432cec Unmerge GEPs to reduce register pressure on IndirectBr edges.
Summary:
GEP merging can sometimes increase the number of live values and register
pressure across control edges and cause performance problems particularly if the
increased register pressure results in spills.

This change implements GEP unmerging around an IndirectBr in certain cases to
mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP
merging has happened.)

With this patch, the Python interpreter loop runs faster by ~5%.

Reviewers: sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: eastig, junbuml, llvm-commits

Differential Revision: https://reviews.llvm.org/D36772

llvm-svn: 312930
2017-09-11 17:52:08 +00:00
Craig Topper 8dff57a0ed [SelectionDAG] Remove a check for type being a vector type after calling getShiftAmountTy. NFCI
getShiftAmountTy already returns the vector type when called for vectors.

llvm-svn: 312924
2017-09-11 16:15:39 +00:00
Matt Arsenault eb4474cf20 Fix typo
llvm-svn: 312919
2017-09-11 15:23:22 +00:00
Elena Demikhovsky cc477bbcea Fixed a bug in splitting Scatter operation in the Type Legalizer.
After the split of the Scatter operation, the order of the new instructions is well defined - Lo goes before Hi. Otherwise the semantic of Scatter (from LSB to MSB) is broken.
I'm chaining 2 nodes to prevent reordering.

Differential Revision https://reviews.llvm.org/D37670

llvm-svn: 312894
2017-09-11 06:18:15 +00:00
Matthias Braun 6b2b88b071 RegAllocFast: Fix warning; NFC
llvm-svn: 312852
2017-09-09 01:16:59 +00:00
Matthias Braun 864cf585ff RegAllocFast: Cleanup; NFC
- Use range based for
- Variable names should start with upper case
- Add `const`
- Change class name to match filename
- Fix doxygen comments
- Use MCPhysReg instead of unsigned
- Use references instead of pointers where things cannot be nullptr
- Misc coding style improvements

llvm-svn: 312846
2017-09-09 00:52:46 +00:00
Matthias Braun a09d18deb0 RegAllocFast: Move vector to class level to avoid reallocation; NFC
llvm-svn: 312845
2017-09-09 00:52:45 +00:00
Matthias Braun a5225e8cb0 RegAllocFast: Remove write-only set; NFC
llvm-svn: 312844
2017-09-09 00:52:42 +00:00
Wei Mi 5d84d9b35c Fix a bug for rL312641.
rL312641 Allowed llvm.memcpy/memset/memmove to be tail calls when parent
function return the intrinsics's first argument. However on arm-none-eabi
platform, llvm.memcpy will be expanded to __aeabi_memcpy which doesn't
have return value. The fix is to check the libcall name after expansion
to match "memcpy/memset/memmove" before allowing those intrinsic to be
tail calls.

llvm-svn: 312799
2017-09-08 16:44:52 +00:00
Krzysztof Parzyszek f78eca8fb5 Preserve existing regs when adding pristines to LivePhysRegs/LiveRegUnits
Differential Revision: https://reviews.llvm.org/D37600

llvm-svn: 312797
2017-09-08 16:29:50 +00:00
Adrian Prantl 99ba97726a Fix a crash when emitting debug info for multi-reg function arguments
by reusing more of the existing machinery

This is a follow-up to r312169.
Thanks to Björn Pettersson for the testcase!

llvm-svn: 312773
2017-09-08 02:31:37 +00:00
Dean Michael Berris 711dec260f [XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPC
Summary:
This fixes code-gen for XRay in PPC. The regression wasn't caught by
codegen tests  which we add in this change.

What happened was the following:

- For tail exits, we used to unconditionally prepend the returns/exits
  with a pseudo-instruction that gets lowered to the instrumentation
  sled (and leave the actual return/exit instruction as-is).
- Changes to the XRay instrumentation pass caused the tail exits to
  suddenly also emit the tail exit pseudo-instruction, since the check
  for whether a return instruction was also a call instruction meant it
  was a tail exit instruction.
- None of the tests caught the regression either due to non-existent
  tests, or the tests being disabled/removed for continuous breakage.

This change re-introduces some of the basic tests and verifies that
we're back to a state that allows the back-end to generate appropriate
XRay instrumented binaries for PPC in the presence of tail exits.

Reviewers: echristo, timshen

Subscribers: nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D37570

llvm-svn: 312772
2017-09-08 01:47:56 +00:00
Reid Kleckner 0e8c4bb055 Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.

llvm-svn: 312759
2017-09-07 23:27:44 +00:00
Richard Trieu c7828ebea4 Revert r312318, r312325, r312424, r312489
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318

Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490

llvm-svn: 312758
2017-09-07 23:20:35 +00:00
Paul Robinson bb92137080 [DWARF] Line 0 should not have a discriminator.
It's meaningless and takes up extra space in the line table.

Differential Revision: https://reviews.llvm.org/D37364

llvm-svn: 312751
2017-09-07 22:15:44 +00:00
Matt Arsenault 61ec738b60 DAG: Allow creating extract_vector_elt post-legalize
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.

This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.

llvm-svn: 312730
2017-09-07 17:24:43 +00:00
Florian Hahn d39b8a3533 [MachineCombiner] Update instruction depths incrementally for large BBs.
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719
2017-09-07 12:49:39 +00:00
Florian Hahn cf0cdd4c02 [MachineTraceMetrics] Add computeDepth function (NFCI).
Summary:
This function is used in D36619 to update the instruction depths
incrementally.

Reviewers: efriedma, Gerolf, MatzeB, fhahn

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36696

llvm-svn: 312714
2017-09-07 11:51:30 +00:00
Jonas Paulsson 0f056352a8 Revert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them"
This temporarily reverts commit 463fa38 (r311401).

See https://bugs.llvm.org/show_bug.cgi?id=34502

llvm-svn: 312708
2017-09-07 09:13:17 +00:00
Matthias Braun c9056b834d Insert IMPLICIT_DEFS for undef uses in tail merging
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
  two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D37034

llvm-svn: 312668
2017-09-06 20:45:24 +00:00
Krzysztof Parzyszek a3017aa2ab [IfConversion] Remove kill flags from common instructions as well
When if-converting a diamond, two separate blocks will be placed back
to back to form a straight line code. To ensure correctness of the
liveness information, any registers that are live in the second block
should not be killed in the first block, even if they were in the
original code.
Additionally, when the two blocks share common instructions at the
beginning, these instructions will not be duplicated, but only placed
once, before both of the blocks. Since the function "isIdenticalTo"
(as used here) ignores kill flags, the common initial code in one
block may have a kill flag for a register that is live in the other
block.
Because the code that removes kill flags only runs for the non-common
parts of the predicated blocks, a kill flag mismatch in the common
code could still lead to a live register being killed prematurely.

llvm-svn: 312654
2017-09-06 17:57:13 +00:00
Wei Mi 818d50a93d [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent
function return the intrinsics's first argument.

llvm.memcpy/memset/memmove return void but they will return the first
argument after they are expanded as libcalls. Now if the parent function
has any return value, llvm.memcpy cannot be turned into tail call after
expansion.

The patch is to handle that case in SelectionDAGBuilder so when caller
function return the same value as the first argument of llvm.memcpy,
tail call is allowed.

Differential Revision: https://reviews.llvm.org/D37406

llvm-svn: 312641
2017-09-06 16:05:17 +00:00
Craig Topper 761bb1b53d [DAGCombiner] When combining EXTRACT_SUBVECTOR of a BUILD_VECTOR, make sure we don't create a BUILD_VECTOR with an illegal type after type legalization.
llvm-svn: 312621
2017-09-06 06:50:03 +00:00
Zachary Turner 37c747498d [CodeView] Don't output S_UDTs for nested typedefs.
S_UDT records are basically the "bridge" between the debugger's
expression evaluator and the type information. If you type
(Foo*)nullptr into the watch window, the debugger looks for an
S_UDT record named Foo. If it can find one, it displays your type.
Otherwise you get an error.

We have always understood this to mean that if you have code like
this:

  struct A {
    int X;
  };

  struct B {
    typedef A AT;
    AT Member;
  };

that you will get 3 S_UDT records. "A", "B", and "B::AT". Because
if you were to type (B::AT*)nullptr into the debugger, it would
need to find an S_UDT record named "B::AT".

But "B::AT" is actually the S_UDT record that would be generated
if B were a namespace, not a struct. So the debugger needs to be
able to distinguish this case. So what it does is:

  1. Look for an S_UDT named "B::AT". If it finds one, it knows
     that AT is in a namespace.
  2. If it doesn't find one, split at the scope resolution operator,
     and look for an S_UDT named B. If it finds one, look up the type
     for B, and then look for AT as one of its members.

With this algorithm, S_UDT records for nested typedefs are not just
unnecessary, but actually wrong!

The results of implementing this in clang are dramatic. It cuts
our /DEBUG:FASTLINK PDB sizes by more than 50%, and we go from
being ~20% larger than MSVC PDBs on average, to ~40% smaller.

It also slightly speeds up link time. We get about 10% faster
links than without this patch.

Differential Revision: https://reviews.llvm.org/D37410

llvm-svn: 312583
2017-09-05 22:06:39 +00:00
Reid Kleckner e33c94f1b0 Add llvm.codeview.annotation to implement MSVC __annotation
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.

Reviewers: majnemer

Subscribers: eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36904

llvm-svn: 312569
2017-09-05 20:14:58 +00:00
Sam McCall f71bb198ed Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This crashes on boringSSL on PPC (will send reduced testcase)

This reverts commit r312328.

llvm-svn: 312490
2017-09-04 15:47:00 +00:00
Dean Michael Berris ebc1659016 [XRay][CodeGen] Use PIC-friendly code in XRay sleds and remove synthetic references in .text
Summary:
This is a re-roll of D36615 which uses PLT relocations in the back-end
to the call to __xray_CustomEvent() when building in -fPIC and
-fxray-instrument mode.

Reviewers: pcc, djasper, bkramer

Subscribers: sdardis, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D37373

llvm-svn: 312466
2017-09-04 05:34:58 +00:00
Ayman Musa 44cde94935 [X86] Fix crash on assert of non-simple type after type-legalization
The function combineShuffleToVectorExtend in DAGCombine might generate an illegal typed node after "legalize types" phase, causing assertion on non-simple type to fail afterwards.

Adding a type check in case the combine is running after the type legalize pass.

Differential Revision: https://reviews.llvm.org/D37330

llvm-svn: 312438
2017-09-03 09:09:16 +00:00
Jessica Paquette b0d17d99dd [MIParser] Ensure getHexUint doesn't produce APInts with a bitwidth of 0
If getHexUint reads in a hex 0, it will create an APInt with a value of 0.
The number of active bits on this APInt is used to calculate the bitwidth of
Result. The number of active bits is defined as an APInt's bitwidth - its
number of leading 0s. Since this APInt is 0, its bitwidth and number of leading
0s are equal.

Thus, Result is constructed with a bitwidth of 0, triggering an APInt assert.

This commit fixes that by checking if the APInt is equal to 0, and setting the
bitwidth to 32 if it is. Otherwise, it sets the bitwidth using getActiveBits.

This caused issues when compiling MIR files with successor probabilities. In
the case that a successor is tagged with a probability of 0, this assert would
fire on debug builds.

https://reviews.llvm.org/D37401

llvm-svn: 312387
2017-09-01 22:17:14 +00:00
Matthias Braun cebdb17522 LiveIntervalAnalysis: Fix alias regunit reserved definition
A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.

LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased register
there. Looking at ARM:

FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved,
however given that we uses of the reserved FPSCR potentially violate
some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.

This patch:
- Defines a register unit as reserved when: At least for one root
  register, the root register and all its super registers are reserved.
- Adjust LiveIntervals::computeRegUnitRange() for new reserved
  definition.
- Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way
  of testing.
- Stop computing LiveRanges for reserved register units in HMEditor even
  with UpdateFlags enabled.
- Skip verification of uses of reserved reg units in the machine
  verifier (this usually didn't happen because there would be no cached
  liverange but there is no guarantee for that and I would run into this
  case before the HMEditor tweak, so may as well fix the verifier too).

Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are hexagons
P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved
registers in an alias.

Differential Revision: https://reviews.llvm.org/D37356

llvm-svn: 312348
2017-09-01 18:36:26 +00:00
Geoff Berry 65528f2991 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 312328
2017-09-01 14:27:20 +00:00
Clement Courbet 65130e2d8d Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
Add missing header.

This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.

llvm-svn: 312322
2017-09-01 10:56:34 +00:00
Strahinja Petrovic 676fd0b022 Debug info for variables whose type is shrinked to bool
This patch provides such debug information for integer
variables whose type is shrinked to bool by providing 
dwarf expression which returns either constant initial 
value or other value.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D35994

llvm-svn: 312318
2017-09-01 10:05:27 +00:00
Clement Courbet 316212575b Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer"
Break build

This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.

llvm-svn: 312317
2017-09-01 09:43:08 +00:00
Clement Courbet 9473c01e96 [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
comparisons into memcmp.

Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().

For now this is disabled by default until:
 - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
 - Benchmarks show that this is always useful.

Differential Revision:
https://reviews.llvm.org/D33987

llvm-svn: 312315
2017-09-01 09:07:05 +00:00
Jessica Paquette ffe4abc51b [MachineOutliner] Recommit r312194, missed optimization remarks
Before, this commit caused a buildbot failure:

http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll

This was caused by the Key value in DiagnosticInfoOptimizationBase being
deallocated before emitting the remarks defined in MachineOutliner.cpp. As of
r312277 this should no longer be an issue.
 

llvm-svn: 312280
2017-08-31 21:02:45 +00:00
Craig Topper 7081f029f3 [DAGCombiner] Do a better job of ensuring we don't split elements when combining an extract_subvector of a bitcasted build_vector.
llvm-svn: 312253
2017-08-31 17:02:22 +00:00
Reid Kleckner 08f5fd51cc [codeview] Generalize DIExpression parsing to handle load chains
Summary:
Hopefully this also clarifies exactly when and why we're rewriting
certiain S_LOCALs using reference types: We're using the reference type
to stand in for a zero-offset load.

Reviewers: inglorion

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37309

llvm-svn: 312247
2017-08-31 15:56:49 +00:00
Daniel Jasper c0a976d417 Revert r311525: "[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text"
Breaks builds internally. Will forward repo instructions to author.

llvm-svn: 312243
2017-08-31 15:17:17 +00:00
Daniel Jasper b8198f02e6 Revert r312194: "[MachineOutliner] Add missed optimization remarks for the outliner."
Breaks on buildbot:
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll

llvm-svn: 312219
2017-08-31 06:22:35 +00:00
Eric Christopher e42ac21499 Temporarily revert "Update branch coalescing to be a PowerPC specific pass"
From comments and code review it wasn't intended to be enabled by default yet.

This reverts commit r311588.

llvm-svn: 312214
2017-08-31 05:56:16 +00:00
Jessica Paquette 65d953e0b1 [MachineOutliner] Add missed optimization remarks for the outliner.
This adds missed optimization remarks which report viable candidates that
were not outlined because they would increase code size.

Other remarks will come in separate commits.

This will help to diagnose code size regressions and changes in outliner
behaviour in projects using the outliner.

https://reviews.llvm.org/D37085

llvm-svn: 312194
2017-08-30 23:31:49 +00:00
Hans Wennborg 24775a0a6c Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")

> Issues identified by buildbots addressed since original review:
> - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
> - The pass no longer forwards COPYs to physical register uses, since
>   doing so can break code that implicitly relies on the physical
>   register number of the use.
> - The pass no longer forwards COPYs to undef uses, since doing so
>   can break the machine verifier by creating LiveRanges that don't
>   end on a use (since the undef operand is not considered a use).
>
>   [MachineCopyPropagation] Extend pass to do COPY source forwarding
>
>   This change extends MachineCopyPropagation to do COPY source forwarding.
>
>   This change also extends the MachineCopyPropagation pass to be able to
>   be run during register allocation, after physical registers have been
>   assigned, but before the virtual registers have been re-written, which
>   allows it to remove virtual register COPY LiveIntervals that become dead
>   through the forwarding of all of their uses.

llvm-svn: 312178
2017-08-30 22:11:37 +00:00
Adrian Prantl 608df027fd SelectionDAG: Emit correct debug info for multi-register function arguments.
Previously we would just describe the first register and then call it
quits. This patch emits fragment expressions for each register.

<rdar://problem/34075307>

llvm-svn: 312169
2017-08-30 20:51:20 +00:00
Adrian Prantl b192b545c1 Refactor DIBuilder::createFragmentExpression into a static DIExpression member
NFC

llvm-svn: 312165
2017-08-30 20:04:17 +00:00
Aditya Nandakumar c6615f56f5 [GISel]: Add a clean up combiner during legalization.
Added a combiner which can clean up truncs/extends that are created in
order to make the types work during legalization.

Also moved the combineMerges to the LegalizeCombiner.

https://reviews.llvm.org/D36880

llvm-svn: 312158
2017-08-30 19:32:59 +00:00
Geoff Berry feffb0c8af Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues identified by buildbots addressed since original review:
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 312154
2017-08-30 18:41:07 +00:00
Bob Haarman 1a4cbbe49f [codeview] make DbgVariableLocation::extractFromMachineInstruction use Optional
Summary:
DbgVariableLocation::extractFromMachineInstruction originally
returned a boolean indicating success. This change makes it return
an Optional<DbgVariableLocation> so we cannot try to access the fields
of the struct if they aren't valid.

Reviewers: aprantl, rnk, zturner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37279

llvm-svn: 312143
2017-08-30 17:50:21 +00:00
Balaram Makam 42adadfca0 Re-land MachineInstr: Reason locally about some memory objects before going to AA.
Summary:
Reverts r311008 to reinstate r310825 with a fix.

Refine alias checking for pseudo vs value to be conservative.
This fixes the original failure in builtbot unittest SingleSource/UnitTests/2003-07-09-SignedArgs.

Reviewers: hfinkel, nemanjai, efriedma

Reviewed By: efriedma

Subscribers: bjope, mcrosier, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D36900

llvm-svn: 312126
2017-08-30 14:57:12 +00:00
Bob Haarman 68e460194a [codeview] add missing break in CodeGen/AsmPrinter/DebugHandlerBase.cpp
llvm-svn: 312055
2017-08-29 22:54:31 +00:00
Eugene Zelenko 900b633560 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 312053
2017-08-29 22:32:07 +00:00
Bob Haarman a88bce1bce [NFC] clang-format llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
llvm-svn: 312035
2017-08-29 21:01:55 +00:00
Bob Haarman 223303c5a3 Reland r311957 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

The reland adds two extra checks to the original: It checks if the
DbgVariableLocation is valid before checking any of its fields, and
it only emits ranges with nonzero registers.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 312034
2017-08-29 20:59:25 +00:00
Hans Wennborg e7becd7e85 [DAG] Bound loop dependence check in merge optimization.
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).

Landing on behalf of Nirav Dave to unblock the 5.0.0 release.

Differential Revision: https://reviews.llvm.org/D37220

llvm-svn: 312022
2017-08-29 18:41:00 +00:00
Sanjay Patel 674d2c23ea [Instruction] add moveAfter() convenience function; NFCI
As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(),
but implemented using moveBefore() for symmetry/efficiency.

Differential Revision: https://reviews.llvm.org/D37239

llvm-svn: 312001
2017-08-29 14:07:48 +00:00
Bob Haarman 0bf0d66682 Revert "[codeview] support more DW_OPs for more complete debug info"
This reverts commit e160912f53f047bc97e572add179e08e33f4df48.

llvm-svn: 311977
2017-08-29 04:08:31 +00:00
Bob Haarman 858d098383 Revert "[codeview] don't try to emit variable locations without registers"
This reverts commit a256fbcacf448ee793d23552c46ed2971bf9eff5.

llvm-svn: 311976
2017-08-29 04:08:16 +00:00
Bob Haarman a8d0d1ab91 [codeview] don't try to emit variable locations without registers
This fixes a problem introduced 311957, where the compiler would crash
with "fatal error: error in backend: unknown codeview register".

llvm-svn: 311969
2017-08-29 01:45:54 +00:00
Bob Haarman b2a04a1513 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 311957
2017-08-29 00:06:59 +00:00
Adrian Prantl 4cae108561 Fix a logic error in DwarfExpression::addMachineReg()
This fixes PR34323 and thus splitting undescribable registers into
smaller, describable sub-registers.

https://bugs.llvm.org/show_bug.cgi?id=34323

llvm-svn: 311951
2017-08-28 23:07:43 +00:00
Zachary Turner a7b041748d [CodeView] Don't output S_UDT symbols for forward decls.
S_UDT symbols are the debugger's "index" for all the structs,
typedefs, classes, and enums in a program.  If any of those
structs/classes don't have a complete declaration, or if there
is a typedef to something that doesn't have a complete definition,
then emitting the S_UDT is unhelpful because it doesn't give
the debugger enough information to do anything useful.  On the
other hand, it results in a huge size blow-up in the resulting
PDB, which is exacerbated by an order of magnitude when linking
with /DEBUG:FASTLINK.

With this patch, we drop S_UDT records for types that refer either
directly or indirectly (e.g. through a typedef, pointer, etc) to
a class/struct/union/enum without a complete definition.  This
brings us about 50% of the way towards parity with /DEBUG:FASTLINK
PDBs generated from cl-compiled object files.

Differential Revision: https://reviews.llvm.org/D37162

llvm-svn: 311904
2017-08-28 18:49:04 +00:00
Craig Topper 029a21dfdc [DAGCombiner] Teach visitEXTRACT_SUBVECTOR to turn extracts of BUILD_VECTOR into smaller BUILD_VECTORs
Only do this before operations are legalized of BUILD_VECTOR is Legal for the target.

Differential Revision: https://reviews.llvm.org/D37186

llvm-svn: 311892
2017-08-28 15:28:33 +00:00
NAKAMURA Takumi a1e97a77f5 Untabify.
llvm-svn: 311875
2017-08-28 06:47:47 +00:00
Sanjay Patel a7a61d9768 [DAGCombiner] allow undef shuffle operands when eliminating bitcasts (PR34111)
As noted in the FIXME, this could be improved more, but this is the smallest fix
that helps:
https://bugs.llvm.org/show_bug.cgi?id=34111

llvm-svn: 311853
2017-08-27 17:29:30 +00:00
Jatin Bhateja e4ca95d6aa [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.

This will also fix PR 33784

Reviewers: zvi, delena, RKSimon, thakis

Reviewed By: RKSimon

Subscribers: chandlerc, eladcohen, llvm-commits

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311833
2017-08-26 19:02:36 +00:00
Jatin Bhateja b60cfbefac Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311832
2017-08-26 19:02:17 +00:00
Hiroshi Yamauchi 63e17ebf8b Add options to dump block frequency/branch probability info in text.
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.

This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37165

llvm-svn: 311822
2017-08-26 00:31:00 +00:00
David Blaikie 196f53b295 Fix unused-lambda-capture warning by using default capture-by-ref
Since the lambda isn't escaped (via a std::function or similar) it's
fine/better to use default capture-by-ref to provide semantics similar
to language-level nested scopes (if/for/while/etc).

llvm-svn: 311782
2017-08-25 16:46:07 +00:00
Matt Morehouse 355f6f7444 Fix buildbot breakage from r311763. Remove unused lambda capture.
llvm-svn: 311781
2017-08-25 16:19:26 +00:00
Aditya Nandakumar 892979effc [GISel]: Implement widenScalar for Legalizing G_PHI
https://reviews.llvm.org/D37018

llvm-svn: 311763
2017-08-25 04:57:27 +00:00
Sanjay Patel e404cbff66 [DAG] convert vector select-of-constants to logic/math
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.

Steps in this patch:

  1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
     enable this transform. Other targets will probably want that anyway to distinguish scalars
     from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.

  2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
     vector ext is often free.

Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.

Differential Revision: https://reviews.llvm.org/D36840

llvm-svn: 311731
2017-08-24 23:24:43 +00:00
Eugene Zelenko 5df3d89009 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311703
2017-08-24 21:21:39 +00:00
Victor Leschuk 6aedf785c5 Remove duplicate code
llvm-svn: 311675
2017-08-24 17:02:38 +00:00
Victor Leschuk 471579b52e Add missing break in switch
llvm-svn: 311673
2017-08-24 16:57:10 +00:00
Matt Arsenault d664315ae8 IPRA: Don't assume called function is first call operand
Fixes not finding the called global for AMDGPU
call pseudoinstructions, which prevented IPRA
from doing much.

llvm-svn: 311637
2017-08-24 07:55:15 +00:00
Matt Arsenault 00459e4a06 IPRA: Exit early on functions without calls
llvm-svn: 311636
2017-08-24 07:55:13 +00:00
Wei Ding a131d3fb29 Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Differential Revision: http://reviews.llvm.org/D36335

llvm-svn: 311629
2017-08-24 04:18:24 +00:00
Hans Wennborg c39ec95d88 [DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.

Fixes PR34137.

Landing on behalf of Nirav.

Differential Revision: https://reviews.llvm.org/D36581

llvm-svn: 311623
2017-08-24 01:08:27 +00:00
Adrian Prantl 7db6b5e2b3 Retire the llvm.dbg.mir hack after r311594.
llvm-svn: 311610
2017-08-23 22:02:36 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Reid Kleckner 950567aac4 Attempt to fix the BUILD_SHARED_LIBS build after the DIExpression change
llvm-svn: 311595
2017-08-23 20:39:35 +00:00
Reid Kleckner 6d353348e5 Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

llvm-svn: 311594
2017-08-23 20:31:27 +00:00
Lei Huang 0cb591fc4c Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass.  Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.

Differential Revision : https: // reviews.llvm.org/D32776

llvm-svn: 311588
2017-08-23 19:25:04 +00:00
Dean Michael Berris 0884b73220 [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text
Summary:
This change achieves two things:

  - Redefine the Custom Event handling instrumentation points emitted by
    the compiler to not require dynamic relocation of references to the
    __xray_CustomEvent trampoline.

  - Remove the synthetic reference we emit at the end of a function that
    we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER
    associated with the section where the function is defined.

To achieve the custom event handling change, we've had to introduce the
concept of sled versioning -- this will need to be supported by the
runtime to allow us to understand how to turn on/off the new version of
the custom event handling sleds. That change has to land first before we
change the way we write the sleds.

To remove the synthetic reference, we rely on a relatively new linker
feature that preserves the sections that are associated with each other.
This allows us to limit the effects on the .text section of ELF
binaries.

Because we're still using absolute references that are resolved at
runtime for the instrumentation map (and function index) maps, we mark
these sections write-able. In the future we can re-define the entries in
the map to use relative relocations instead that can be statically
determined by the linker. That change will be a bit more invasive so we
defer this for later.

Depends on D36816.

Reviewers: dblaikie, echristo, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36615

llvm-svn: 311525
2017-08-23 04:49:41 +00:00
Matthias Braun 8426d1342d Add test case for r311511
This also changes the TailDuplicator to be configured explicitely
pre/post regalloc rather than relying on the isSSA() flag. This was
necessary to have `llc -run-pass` work reliably.

llvm-svn: 311520
2017-08-23 03:17:59 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Craig Topper 35189d5221 [SelectionDAG] Make ISD::isConstantSplatVector always return an element sized APInt.
This partially reverts r311429 in favor of making ISD::isConstantSplatVector do something not confusing. Turns out the only other user of it was also having to deal with the weird property of it returning a smaller size.

So rather than continue to deal with this quirk everywhere, just make the interface do something sane.

Differential Revision: https://reviews.llvm.org/D37039

llvm-svn: 311510
2017-08-22 23:54:13 +00:00
Jonas Devlieghere a680a8f5f8 [Debug info] Add new DbgValues after looping over DAG
I was contacted by Jesper Antonsson from Ericsson who ran into problems
with r311181 in their test suites with for an out-of-tree target.
Because of the latter I don't have a reproducer, but we definitely don't
want to modify the data structure on which we are iterating inside the
loop.

llvm-svn: 311466
2017-08-22 16:28:07 +00:00
Renato Golin c070c73d5e [ARM] Avoid creating duplicate ANDs in SelectionDAG
When expanding a BRCOND into a BR_CC, do not create an AND 1
if one already exists.

Review: D36705

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311447
2017-08-22 11:02:45 +00:00
Sjoerd Meijer e0c933f5d6 [SelectionDAG] Add getNode debug messages
This adds debug messages to various functions that create new SDValue nodes.
This is e.g. useful to have during legalization, as otherwise it can prints
legalization info of nodes that did not appear in the dumps before.

Differential Revision: https://reviews.llvm.org/D36984

llvm-svn: 311444
2017-08-22 10:43:51 +00:00
Craig Topper b49f0893b2 [X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996

llvm-svn: 311429
2017-08-22 05:40:17 +00:00
Quentin Colombet 4056e80719 [RegAlloc] Make sure live-ranges reflect the state of the IR when removing them
When removing a live-range we used to not touch them making debug
prints harder to read because the IR was not matching what the
live-ranges information was saying.

This only affects debug printing and allows to put stronger asserts in
the code (see r308906 for instance).

llvm-svn: 311401
2017-08-21 22:56:18 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Jatin Bhateja 6b4c205685 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311255
2017-08-19 18:08:59 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Jonas Devlieghere e101b07a1d [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311181
2017-08-18 18:07:00 +00:00
Craig Topper e3edd9c9be [DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC
llvm-svn: 311144
2017-08-18 04:52:46 +00:00
Geoff Berry bd47e8a4f7 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

llvm-svn: 311142
2017-08-18 01:43:11 +00:00
Richard Smith c0541dfa3e Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775

llvm-svn: 311139
2017-08-17 23:38:41 +00:00
Geoff Berry 51f52c4fca Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311135
2017-08-17 23:06:55 +00:00
Eugene Zelenko 6e07bfd0d9 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311124
2017-08-17 21:26:39 +00:00
Jonas Devlieghere 30756da212 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

llvm-svn: 311111
2017-08-17 17:58:33 +00:00
Jonas Devlieghere 622fedc001 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311102
2017-08-17 17:06:48 +00:00
Adrian Prantl 6a57daad81 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

llvm-svn: 311097
2017-08-17 16:57:13 +00:00
Simon Pilgrim 8be9f4af4f [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c
llvm-svn: 311083
2017-08-17 13:03:34 +00:00
Jonas Paulsson 57a705d9d0 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072
2017-08-17 08:33:44 +00:00
Elad Cohen 124d32829c [SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

llvm-svn: 311071
2017-08-17 08:06:36 +00:00
Serguei Katkov 9e5604dbe1 [CGP] Fix the rematerialization of gc.relocates
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only check for basic block is present. However it is possible that both
relocation are in the same basic block but relocation of derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462

llvm-svn: 311067
2017-08-17 05:48:30 +00:00
Geoff Berry 4e38e02e6f Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

llvm-svn: 311062
2017-08-17 04:04:11 +00:00
Geoff Berry 87f8d25150 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311038
2017-08-16 20:50:01 +00:00
Balaram Makam c5698befb6 Revert "MachineInstr: Reason locally about some memory objects before going to AA."
r310825 caused the clang-ppc64le-linux-lnt bot to go red
(http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712)
because of a test-suite failure of
SingleSource/UnitTests/2003-07-09-SignedArgs

This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996.

llvm-svn: 311008
2017-08-16 14:17:43 +00:00
Quentin Colombet 647b482c1e [VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.

The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.

E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1

Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
                                                      ; is believed to come
                                                      ; from the previous
                                                      ; instruction

Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107

Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
                                                      ; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>

The bug has been here forever, it got exposed recently because the register
scavenger got smarter.

Fixes llvm.org/PR34107

llvm-svn: 310979
2017-08-16 00:17:05 +00:00
Jessica Paquette 95c1107f4c [MachineOutliner] Only outline candidates of length >= 2
Since we don't factor in instruction lengths into outlining calculations
right now, it's never the case that a candidate could have length < 2.

Thus, we should quit early when we see such candidates.

llvm-svn: 310894
2017-08-14 22:57:41 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Matt Arsenault 81da0d45f8 IPRA: Allow target to enable IPRA by default
llvm-svn: 310876
2017-08-14 19:54:47 +00:00
Matt Arsenault f9273c81d6 IPRA: Run RegUsageInfoPropagate much later
This was running immediately after isel, before
isel pseudos were even expanded which is really
unreasonable. Move this to before pre-reglloc
passes in case some other pre-regalloc pass wants to
use the updated regmask info.

Fixes one of the reasons IPRA doesn't do anything on
AMDGPU currently. Tests will be included with future
patch after a few more are fixed.

llvm-svn: 310875
2017-08-14 19:54:45 +00:00
Amaury Sechet 9c529b6be3 [DAGCombine] Do not try to deduplicate commutative operations if both operand are the same.
Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33840

llvm-svn: 310832
2017-08-14 11:44:03 +00:00
Elad Cohen 6a9edda356 [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))
into vextract(vNiX,Idx) when creating vextract with getNode().
This case appeared in AVX512 after fixing pr33349 in r310552.

Differential revision: https://reviews.llvm.org/D36571

llvm-svn: 310828
2017-08-14 10:49:45 +00:00
Balaram Makam d9f53414de MachineInstr: Reason locally about some memory objects before going to AA.
This addresses a FIXME in MachineInstr::mayAlias.

llvm-svn: 310825
2017-08-14 09:41:40 +00:00
Elad Cohen 3a90a0c10d Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)"
This reverts commit r310782.

llvm-svn: 310822
2017-08-14 09:06:00 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Simon Pilgrim 5a86f0e717 [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Reapplied with fix to only work with simple value types.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310782
2017-08-12 17:43:25 +00:00
Sanjay Patel 169dae70a6 [x86] use more shift or LEA for select-of-constants (2nd try)
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.

We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the 
   push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a 
   post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
   that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310717
2017-08-11 15:44:14 +00:00
Nirav Dave 0a48e5d506 Improve handling of insert_subvector of bitcast values
Fix insert_subvector / extract_subvector merges of bitcast values.

Reviewers: efriedma, craig.topper, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D34571

llvm-svn: 310711
2017-08-11 13:21:41 +00:00
Nirav Dave d1b3f09faa [X86][DAG] Switch X86 Target to post-legalized store merge
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.

Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.

Reviewers: craig.topper, efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34559

llvm-svn: 310710
2017-08-11 13:21:35 +00:00
Simon Pilgrim 83cf3a29b5 [DAGCombiner] Remove shuffle support from simplifyShuffleMask
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.

Removing support until I can triage the problem

llvm-svn: 310699
2017-08-11 08:37:00 +00:00
Mikael Holmen 8b10680922 [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

In PR32721 we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.

One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.

One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000), %bb.2(0x00000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

  bb.2:

There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.

With the patch applied we instead get the expected:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

Since bb.2 had no predecessor at all, it was removed.

Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.

Finally added a couple of new test cases:

* PR32721_ifcvt_triangle_unanalyzable.mir:
  Regression test for the original problem dexcribed in PR 32721.

* ifcvt_triangleWoCvtToNextEdge.mir:
  Regression test for problem that caused a revert of my first attempt
  to solve PR 32721.

* ifcvt_simple_bad_zero_prob_succ.mir:
  Test case showing the problem where a wrong successor with 0% probability
  was previously left.

* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
  Very simple test cases for the simple and (forked) diamond cases
  involving unanalyzable branches that can be nice to have as a base if
  wanting to write more complicated tests.

Reviewers: iteratee, MatzeB, grosser, kparzysz

Reviewed By: kparzysz

Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34099

llvm-svn: 310697
2017-08-11 06:57:08 +00:00
Nirav Dave f8556ad48f Revert "[DAG] Cleanup unused nodes after store merge. NFCI."
This reverts commit r310648 which causes an unexpected assertion failure

llvm-svn: 310659
2017-08-10 21:03:36 +00:00
Nirav Dave 4d28c0ff4f [DAG] Relax type restriction for store merge
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.

Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34569

llvm-svn: 310655
2017-08-10 19:52:45 +00:00
Nirav Dave 99d9d24553 [DAG] Cleanup unused nodes after store merge. NFCI.
llvm-svn: 310648
2017-08-10 18:53:14 +00:00
Taewook Oh f5040b9685 Make .file directive to have basename only
Summary:
Currently LLVM puts directory along with the filename in .file directive, but this behavior doesn't match gcc. There's a no clear description about which one is right (https://sourceware.org/binutils/docs/as/File.html#File), but one document (https://sourceware.org/gdb/current/onlinedocs/stabs/ELF-Linker-Relocation.html) suggests that STT_FILE symbol in elf file is expected to have basename only, which should have a same sting file .file directive according to (https://docs.oracle.com/cd/E26502_01/html/E28388/eoiyg.html).

This also affects badly on the build system that uses hashing, as the directory info could be differnt from developer to developer even when they're working on same file.

Reviewers: pcc, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36018

llvm-svn: 310642
2017-08-10 18:17:11 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Nirav Dave 06242a99ce [DAG] Rewrite expression. NFC.
llvm-svn: 310608
2017-08-10 15:29:33 +00:00
Nirav Dave 926e2d39bf [X86] Keep dependencies when constructing loads in combineStore
Summary:
Preserve chain dependecies between old and new loads constructed to
prevent loads from reordering below later stores.

Fixes PR34088.

Reviewers: craig.topper, spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36528

llvm-svn: 310604
2017-08-10 15:12:32 +00:00
Guy Blank 136b543745 [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR nodes.
In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements.

This is similar to what is done in FoldConstantVectorArithmetic.

Differential Revision:
https://reviews.llvm.org/D36506

llvm-svn: 310593
2017-08-10 14:09:50 +00:00
Elad Cohen 22ba97a0a6 [SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.

When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.

The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.

This fixes pr33349.

Differential revision: https://reviews.llvm.org/D36511

llvm-svn: 310552
2017-08-10 07:44:23 +00:00
David Blaikie 76fb649b0d Reduce variable scope by moving declaration into if clause
llvm-svn: 310506
2017-08-09 18:34:18 +00:00
Nirav Dave 6110d3ad00 [DAG] Explicitly cleanup merged load values during store merge. NFCI.
llvm-svn: 310474
2017-08-09 13:37:07 +00:00
Serguei Katkov 6ea2e81cf6 [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.

Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392

llvm-svn: 310443
2017-08-09 05:17:02 +00:00
Reid Kleckner e2e82061f9 [codeview] Emit nested enums and typedefs from classes
Previously we limited ourselves to only emitting nested classes, but we
need other kinds of types as well.

This fixes the Visual Studio STL visualizers, so that users can
visualize std::string and other objects.

llvm-svn: 310410
2017-08-08 20:30:14 +00:00
Nirav Dave 8a813cf646 [DAG] Introduce peekThroughBitcast function. NFCI.
llvm-svn: 310405
2017-08-08 20:01:18 +00:00
Nirav Dave 515116d7c2 [DAG] Update comments. NFC.
llvm-svn: 310404
2017-08-08 19:52:19 +00:00
Simon Pilgrim 91b7b991d4 [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR
Minor extension to D36393

llvm-svn: 310372
2017-08-08 16:10:33 +00:00
Simon Pilgrim ef44228acb [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF
Fixes one of the cases in PR34041.

Differential Revision: https://reviews.llvm.org/D36393

llvm-svn: 310344
2017-08-08 11:03:30 +00:00
Sanjay Patel 807f92b8ff [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097)
llvm-svn: 310264
2017-08-07 15:47:48 +00:00
Nirav Dave 3d3bde7682 [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Relanding after case to insert explicit truncation as necessary.

Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 310256
2017-08-07 14:07:49 +00:00
Guy Blank 5ca01695f7 [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks
The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types.
But before calling CodeGenAndEmitDAG we build the DAG for the basic block.
So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.

This patch sets the flag to false before SDAG building each basic block.

Differential Revision:
https://reviews.llvm.org/D33435

llvm-svn: 310239
2017-08-07 05:51:14 +00:00
Sanjay Patel a923c2ee95 [x86] use more shift or LEA for select-of-constants
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies 
(they always become shl/lea) to avoid cmov or branching. The current code misses 
cases where we have a negative constant and a positive constant, so this is just 
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start 
with a select in IR, create a select DAG node, convert it into a sext, convert it 
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I 
   think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on 
   a different reg? That's a post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if 
   that's a regression, but I think those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs.
5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310208
2017-08-06 16:27:07 +00:00
Matt Arsenault b94972cb82 IPRA: Don't crash on null getCallPreservedMask
Kernels aren't callable, so they don't have a call preserved mask.

llvm-svn: 310172
2017-08-05 07:50:18 +00:00
Kyle Butt 74f61dd8ef BlockPlacement: add a flag to force cold block outlining w/o a profile.
NFC.

llvm-svn: 310129
2017-08-04 21:13:41 +00:00
Nico Weber b24df62bb6 Revert r310058, it caused PR34073.
llvm-svn: 310118
2017-08-04 20:24:13 +00:00
Quentin Colombet c7652e22d7 [GlobalISel] Remove a stall comment in CMake.
Thanks to Diana Picus <diana.picus@linaro.org> for noticing.

NFC

llvm-svn: 310114
2017-08-04 20:15:41 +00:00
Marcello Maggioni 8de4bbdaa5 [MachineOperand] Add ChangeToTargetIndex method. NFC
Differential Revision: https://reviews.llvm.org/D36301

llvm-svn: 310083
2017-08-04 18:24:09 +00:00
Simon Pilgrim 5c63586489 [DAGCombiner] Extending pattern detection for vector shuffle.
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310058
2017-08-04 12:46:35 +00:00
Eric Christopher 52854dcd34 Fix typo.
llvm-svn: 309997
2017-08-03 22:41:12 +00:00
Matt Arsenault a52391f2db DAG: Provide access to Pass instance from SelectionDAG
This allows accessing an analysis pass during lowering.

llvm-svn: 309991
2017-08-03 21:54:00 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Nirav Dave 3fc1c2365c [DAG] Allow merging of stores of vector loads
Remove restriction disallowing merging of stores vector loads into
larger store of larger vector load.

Reviewers: RKSimon, efriedma, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36158

llvm-svn: 309951
2017-08-03 15:51:20 +00:00
Robert Lougher 10f740df4d [LiveDebugVariables] Use lexical scope to trim debug value live intervals
The debug value live intervals computed by Live Debug Variables may extend
beyond the range of the debug location's lexical scope. In this case,
splitting of an interval can result in an interval outside of the scope being
created, causing extra unnecessary DBG_VALUEs to be emitted. To prevent this,
trim the intervals to the lexical scope.

This resolves PR33730.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D35953

llvm-svn: 309933
2017-08-03 11:54:02 +00:00
Simon Dardis 51296593a8 [SelectionDAG] Resolve PR33978.
rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.

This lead to isDereferenceableAndAlignedPointer() get truncated values or
values which would be sign extended within that function leading to
incorrect results.

Thanks to Alex Crichton for reporting the issue!

This resolves PR33978.

Reviewers: inouehrs

Differential Revision: https://reviews.llvm.org/D36236

llvm-svn: 309930
2017-08-03 09:38:46 +00:00
Sameer AbuAsal c8befe687f [RegisterCoalescer] Add wrapper for Erasing Instructions
Summary:
      To delete an instruction the coalescer needs to call eraseFromParent()
      on the MachineInstr, insert it in the ErasedInstrs list and update the
      Live Ranges structure. This patch re-factors the code to do all that in
      one function. This will also fix cases where previous code wasn't
      inserting deleted instructions in the ErasedList.

Reviewers: qcolombet, kparzysz

Reviewed By: qcolombet

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: https://reviews.llvm.org/D36204

llvm-svn: 309915
2017-08-03 02:41:17 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Hiroshi Inoue 0bd906ec8f [StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.

Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

This patch fixes PR33928.

llvm-svn: 309849
2017-08-02 18:16:32 +00:00
Adrian Prantl bd6d291c59 Assert that the offset of a DBG_VALUE is always 0. (NFC)
llvm-svn: 309834
2017-08-02 17:19:13 +00:00
Adrian Prantl 5577363a2c Remove the unused Offset field from MachineLocation (NFC)
rdar://problem/33580047

llvm-svn: 309831
2017-08-02 17:07:38 +00:00
Nirav Dave e44032f7e7 [DAG] Improve candidate pruning in store merge failure case. NFCI
During store merge we construct a sorted list of consecutive store
candidates and consider subsequences for merging into a single
store. For each subsequence we check if the stored value type is legal
the merged store would have valid and fast and if the constructed
value to be stored is valid. The only properties that affect this
check between subsequences is the size of the subsequence, the
alignment of the first store, the alignment of the stored load value
(when merging stores-of-loads), and whether the merged value is a
constant zero.

If we do not find a viable mergeable subsequence starting from the
first store of length N, we know that a subsequence starting at a
later store of length N will also fail unless the new store's
alignment, the new load's alignment (if we're merging store-of-loads),
or we've dropped stores of nonzero value and could construct a merged
stores of zero (for merging constants).

As a result if we fail to find a valid subsequence starting from the
first store we can safely skip considering subsequences that start
with subsequent stores unless one of the above properties is
true. This significantly (2x) improves compile time in some
pathological cases.

Reviewers: RKSimon, efriedma, zvi, spatel, waltl

Subscribers: grandinj, llvm-commits

Differential Revision: https://reviews.llvm.org/D35901

llvm-svn: 309830
2017-08-02 16:35:58 +00:00
Adrian Prantl 61b39b2aec Remove unused includes of MachineLocation.h (NFC)
llvm-svn: 309824
2017-08-02 15:32:18 +00:00
Adrian Prantl 2049c0d734 Remove unreachable code. (NFC)
MachineLocation::getOffset() always returns 0.

rdar://problem/33580047

llvm-svn: 309823
2017-08-02 15:22:17 +00:00
Diana Picus d5a00b0ff6 [MIR] Print target-specific constant pools
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:

constants:
 - id:              0
   value:           'g(GOT_PREL)-(LPC0+8-.)'
   alignment:       4
   isTargetSpecific: true

I intend to use this to test PIC support in GlobalISel for ARM.

This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.

We do test however that the parser reports a nice error when
encountering a target-specific constant pool.

Differential Revision: https://reviews.llvm.org/D36092

llvm-svn: 309806
2017-08-02 11:09:30 +00:00
Nirav Dave 15894f8dec [DAG] Refactor store merge subexpressions. NFC.
Distribute various expressions across ifs.

llvm-svn: 309777
2017-08-02 01:08:38 +00:00
Matt Arsenault acc5e82b0e DAG: Undo and->or combine with FrameIndexes
This pattern shows up when lowering byval copies on AMDGPU.

The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.

This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.

llvm-svn: 309775
2017-08-02 00:43:42 +00:00
Adrian Prantl 83ca4fc7bc Update LiveDebugValues to generate DIExpressions for spill offsets
instead of using the deprecated offset field of DBG_VALUE.

This has no observable effect on the generated DWARF, but the
assembler comments will look different.

rdar://problem/33580047

llvm-svn: 309773
2017-08-02 00:16:56 +00:00
Adrian Prantl aac78ce47e Use helper function instead of manually constructing DBG_VALUEs (NFC)
rdar://problem/33580047

llvm-svn: 309757
2017-08-01 22:37:35 +00:00
Adrian Prantl 032d2381bf Remove PrologEpilogInserter's usage of DBG_VALUE's offset field
In the last half-dozen commits to LLVM I removed code that became dead
after removing the offset parameter from llvm.dbg.value gradually
proceeding from IR towards the backend. Before I can move on to
DwarfDebug and friends there is one last side-called offset I need to
remove:  This patch modifies PrologEpilogInserter's use of the
DBG_VALUE's offset argument to use a DIExpression instead. Because the
PrologEpilogInserter runs at the Machine level I had to play a little
trick with a named llvm.dbg.mir node to get the DIExpressions to print
in MIR dumps (which print the llvm::Module followed by the
MachineFunction dump).

I also had to add rudimentary DwarfExpression support to CodeView and
as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo
was supposed to give up on variables with complex DIExpressions, but
would fail to do so for fragments, which are also modeled as
DIExpressions).

With this last holdover removed we will have only one canonical way of
representing offsets to debug locations which will simplify the code
in DwarfDebug (and future versions of CodeViewDebug once it starts
handling more complex expressions) and make it easier to reason about.

This patch is NFC-ish: All test case changes are for assembler
comments and the binary output does not change.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D36125

llvm-svn: 309751
2017-08-01 21:45:24 +00:00
Nirav Dave dcc5afaad9 [DAG] Factor out common expressions. NFC.
llvm-svn: 309740
2017-08-01 20:30:52 +00:00
Reid Kleckner 29c675247d [DebugInfo] Don't turn dbg.declare into DBG_VALUE for static allocas
Summary:
We already have information about static alloca stack locations in our
side table. Emitting instructions for them is inefficient, and it only
happens when the address of the alloca has been materialized within the
current block, which isn't often.

Reviewers: aprantl, probinson, dblaikie

Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D36117

llvm-svn: 309729
2017-08-01 19:45:09 +00:00
Nirav Dave 35dd1ac29c Pull out VectorNumElements value. NFC.
llvm-svn: 309719
2017-08-01 18:19:56 +00:00
Nirav Dave 27a6605bdc Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.

llvm-svn: 309717
2017-08-01 18:09:25 +00:00
Sanjay Patel 4dbdd470a8 [CGP] use narrower types in memcmp expansion when possible
This only affects very small memcmp on x86 for now, but it
will become more important if we allow vector-sized load and
compares.

llvm-svn: 309711
2017-08-01 17:24:54 +00:00
Nirav Dave f54c8370e5 [DAG] Convert extload check to equivalent type check. NFC.
Replace check with check that consuming store has the same type.

llvm-svn: 309708
2017-08-01 17:19:41 +00:00
Nirav Dave b5a0af6b6e [DAG] Move extload check in store merge. NFC.
Move candidate check from later check to initial candidate check.

llvm-svn: 309698
2017-08-01 16:00:47 +00:00
Manoj Gupta 7ebbe655ed [X86] Fix a crash in FEntryInserter Pass.
Summary:
FEntryInserter pass unconditionally derefs the first Instruction
in the first Basic Block. The pass crashes when the first
BasicBlock is empty. Fix the crash by not dereferencing the basic
Block iterator. This fixes an issue observed when building Linux kernel
4.4 with clang.

Fixes PR33971.

Reviewers: hfinkel, niravd, dblaikie

Reviewed By: niravd

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35979

llvm-svn: 309694
2017-08-01 15:39:12 +00:00
David Blaikie c2be863681 DebugInfo: Update flag description that'd been copypasted from another
Post-commit review feedback from Paul Robinson on r309630. Thanks Paul!

llvm-svn: 309685
2017-08-01 14:50:50 +00:00
Nirav Dave b5cb48c6ae [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 309680
2017-08-01 13:45:35 +00:00
Andrew V. Tischenko d56595184b Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries.
Differential Revision https://reviews.llvm.org/D35997

llvm-svn: 309666
2017-08-01 09:15:43 +00:00
Hiroshi Inoue b9417dbd48 [StackColoring] Update AliasAnalysis information in stack coloring pass
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

llvm-svn: 309651
2017-08-01 03:32:15 +00:00
Eli Friedman 37f41d1e7c [ScheduleDAG] Don't schedule node with physical register interference
https://reviews.llvm.org/D31536 didn't really solve the problem it was
trying to solve; it got rid of the assertion failure, but we were still
scheduling the DAG incorrectly (mixing together instructions from
different calls), leading to a MachineVerifier failure.

In order to schedule the DAG correctly, we have to make sure we don't
schedule a node which should be blocked by an interference. Fix
ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node
like that.

The added call to FindAvailableNode() is the key change here; this makes
sure we don't try to schedule a call while we're in the middle of
scheduling a different call. I'm not sure this is the right approach; in
particular, I'm not sure how to prove we don't end up with an infinite
loop of repeatedly backtracking.

This also reverts the code change from D31536. It doesn't do anything
useful: we should never schedule an ADJCALLSTACKDOWN unless we've
already scheduled the corresponding ADJCALLSTACKUP.

Differential Revision: https://reviews.llvm.org/D33818

llvm-svn: 309642
2017-08-01 00:28:40 +00:00
David Blaikie 038e28a5a7 DebugInfo: Put range base specifier entry functionality behind a flag
Chromium's gold build seems to have trouble with this (gold produces
errors) - not sure if it's gold that's not coping with the valid
representation, or a bug in the implementation in LLVM, etc.

llvm-svn: 309630
2017-07-31 21:48:42 +00:00
Reid Kleckner 2de471dca9 [codeview] Ignore DBG_VALUEs when choosing a BB start source loc
When the first instruction of a basic block has no location (consider a
LEA materializing the address of an alloca for a call), we want to start
the line table for the block with the first valid source location in the
block.  We need to ignore DBG_VALUE instructions during this scan to get
decent line tables.

llvm-svn: 309628
2017-07-31 21:03:08 +00:00
Quentin Colombet 15f6ffbf7c [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

llvm-svn: 309599
2017-07-31 18:24:07 +00:00
Sanjay Patel fea731a4aa [CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require 
a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the 
subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). 
   There is discussion about canonicalizing IR to the select form 
   ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), 
   so we probably need to add DAG transforms for those patterns anyway, but this improves the 
   memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
   that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
   if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
2017-07-31 18:08:24 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Florian Hahn 1b09a37c07 Extend ifndef to printDebugLoc.
GCC7 did not warn about that, but Clang does.

llvm-svn: 309573
2017-07-31 16:29:00 +00:00
Florian Hahn 94b8a87c8e Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Simon Dardis 8930b4a049 [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765

llvm-svn: 309561
2017-07-31 14:06:58 +00:00
Florian Hahn 6b3216aad8 Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
David Blaikie 4dd663752d DebugInfo: Fix r309526, ensure resetting base address selection entries are used
Missed the resetting base address selections when going from a base
address version to zero base address for non-base-addressed entries.

llvm-svn: 309529
2017-07-31 00:18:24 +00:00
David Blaikie 89c81a0b91 DebugInfo: Use base address selection entries in debug_ranges to reduce relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.

This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).

llvm-svn: 309526
2017-07-30 22:10:00 +00:00
Simon Pilgrim 718cb0ea62 [SelectionDAG][X86] CombineBT - more aggressively determine demanded bits
This patch is in 2 parts:

1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.

2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.

Differential Revision: https://reviews.llvm.org/D35896

llvm-svn: 309486
2017-07-29 14:50:25 +00:00
Jessica Paquette d87f54493d [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
Adrian Prantl 359846fe1c Remove the unused offset field from LiveDebugValues (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309455
2017-07-28 23:25:51 +00:00
Adrian Prantl b2811e50f9 Remove the unused offset field from LiveDebugVariables (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309451
2017-07-28 23:06:50 +00:00
Adrian Prantl 8b9bb534a1 Remove the unused offset from DBG_VALUE (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
2017-07-28 23:00:45 +00:00
Adrian Prantl d92ac5a259 Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309449
2017-07-28 22:46:20 +00:00
Adrian Prantl 1b63dc54b0 Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309446
2017-07-28 22:36:55 +00:00
Adrian Prantl a617576bb1 Remove the unused dbg.value offset from SelectionDAG (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309436
2017-07-28 21:27:35 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Reid Kleckner 9be82c3169 Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:

BB#1: derived from LLVM BB %entry
    Live Ins: %EFLAGS
    Predecessors according to CFG: BB#0
        JE_1 <BB#5>, %EFLAGS<imp-use,kill>
        JMP_1 <BB#5>

BB#5: derived from LLVM BB %sw.epilog
    Predecessors according to CFG: BB#1
        TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.

This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.

Fixes PR33980

llvm-svn: 309422
2017-07-28 19:48:40 +00:00
Jessica Paquette 4602c3437c [MachineOutliner] NFC: Comment tidying
The comment on describing the suffix tree had some pruning
stuff that was out of date in it.

Also fixed some typos.

llvm-svn: 309365
2017-07-28 05:59:30 +00:00
Jessica Paquette 809d708b8a [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
David Blaikie 89daf77a11 DebugInfo: Consider a CU containing only local imported entities to be 'empty'
This can come up in ThinLTO & wastes space & makes degenerate IR.

As per the added FIXME, ultimately, local imported entities should hang
off the function and that way the imported entity list on the CU can be
tested for emptiness like all the other CU lists.

(function-attached local imported entities are probably also the best
path forward for fixing how imported entities are handled both in
cross-module use (currently, while ThinLTO preserves the imported
entities, they would not get used at the imported inlined location -
only in the abstract origin that appears in the partial CU created by
the import (which isn't emitted under Fission due to cross-CU
limitations there)) and to reduce the number of points where imported
entities are emitted (they're currently emitted into every inlined
instance, concrete instance, and abstract origin - they should only go
in teh abstract origin if there is one, otherwise in the concrete
instance - but this requires lots of delayed handling and wiring up,
same as abstract variables & subprograms))

llvm-svn: 309354
2017-07-28 03:06:25 +00:00
Jessica Paquette 78681be2a4 [MachineOutliner] Cleanup: move findCandidates out of suffix tree
Doing some cleanup in preparation for some functional changes.
This commit moves findCandidates out of the suffix tree and into the
MachineOutliner class. This is much easier to follow, and removes
the burden of candidate choice from the suffix tree.

It also adds a couple FIXMEs and simplifies building outlined function
names.

llvm-svn: 309334
2017-07-27 23:24:43 +00:00
Simon Pilgrim ac84850ea6 [SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.

llvm-svn: 309302
2017-07-27 18:15:54 +00:00
Adam Nemet 6374331a8c [OptRemark] Allow streaming of 64-bit integers
llvm-svn: 309293
2017-07-27 16:54:13 +00:00
Simon Pilgrim 64a795c5f5 [SelectionDAG] Avoid repeated calls to getNumOperands in for loop. NFCI.
llvm-svn: 309283
2017-07-27 15:42:21 +00:00
Adrian Prantl 960e7663f3 remove redundant check
llvm-svn: 309280
2017-07-27 15:24:20 +00:00
Simon Pilgrim 0d543b5921 [SelectionDAG] Tidyup mask creation. NFCI.
Assign all concat elements to UNDEF and then just replace the first element, instead of copying everything individually.

llvm-svn: 309277
2017-07-27 15:08:53 +00:00
David Blaikie 2195e13676 DebugInfo: Ensure imported entities at the top level of an inlined function don't cause degenerate concrete definitions
Local imported entities at the top level of a subprogram were being
handled differently from those in nested scopes - that different
handling would cause pseudo concrete out-of-line definitions to be
created (but without any of their attributes, nor an abstract_origin) in
the case where there was no real concrete definition.

These local imported entities also only appeared in the concrete
definition where those imported entities in nested scopes appear in all
cases (abstract, concrete, and inlined). This change at least makes top
level case handle the same as the others - though there's a FIXME to
improve this to /only/ emit them into the abstract origin (though this
requires more plumbing - like the abstract subprogram and variable
handling that must defer population until the end of the unit to
discover if there is an abstract origin, or only a standalone concrete
definition).

llvm-svn: 309237
2017-07-27 00:06:53 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Andrew V. Tischenko d1fefa3d7c This patch returns proper value to indicate the case when instruction throughput can't be calculated.
Differential revision https://reviews.llvm.org/D35831

llvm-svn: 309156
2017-07-26 18:55:14 +00:00
Adrian Prantl 833ad37c90 Do a better job at emitting prefrabricated skeleton CUs.
This is a better fix than r308708 for the problem introduced in
r304020. It restores the skeleton CU testcases modified by that commit
to their original form and most importantly ensures that
frontend-generated skeleton CUs (such as used to point to Clang
modules) come after the regular CUs. This broke for DICompileUnit
nodes that don't have any immediate children because they are now
constructed lazily instead of the order in which they are listed in
!llvm.dbg.cu. After this commit we still don't guarantee that order,
but we do guarantee that empty skeletons come last.

Shipping versions of LLDB are very sensitive to the ordering of
CUs. I'll track a fix for LLDB to be more permissive separately.
This fixes a test failure in the LLDB testsuite.

rdar://problem/33357252

llvm-svn: 309154
2017-07-26 18:48:32 +00:00
Zvi Rackover 092f199188 DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offset
Summary:
Adding support for combining power2-strided build_vector's where the
first build_vectori's operand is extracted from a non-zero index.

Example:

 v4i32 build_vector((extract_elt V, 1),
                    (extract_elt V, 3),
                    (extract_elt V, 5),
                    (extract_elt V, 7))
 -->
 v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64)

Reviewers: delena, RKSimon, guyblank

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35700

llvm-svn: 309108
2017-07-26 12:57:03 +00:00
Adrian Prantl be66271f04 Debug Info: Support fragmented variables in the MMI side table
This reapplies commit r309034 with a bugfix+test for inlined variables.

llvm-svn: 309057
2017-07-25 23:32:59 +00:00
Adrian Prantl b6d5faf2ea Revert "Debug Info: Support fragmented variables in the MMI side table"
This reverts commit r309034 because of a sanitizer issue.

llvm-svn: 309035
2017-07-25 21:50:45 +00:00
Adrian Prantl 3d1ab0cd1e Debug Info: Support fragmented variables in the MMI side table
<rdar://problem/17816343>

llvm-svn: 309034
2017-07-25 21:29:22 +00:00
Simon Pilgrim 6d59933175 [DAG] Move DAGCombiner::GetDemandedBits to SelectionDAG
This patch moves the DAGCombiner::GetDemandedBits function to SelectionDAG::GetDemandedBits as a first step towards making it easier for targets to get to the source of any demanded bits without the limitations of SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D35841

llvm-svn: 308983
2017-07-25 16:36:44 +00:00
Francois Pichet 82bf3de606 Fix endianness bug in DAGCombiner::visitTRUNCATE and visitEXTRACT_VECTOR_ELT
Summary:
Do not assume little endian architecture in DAGCombiner::visitTRUNCATE and DAGCombiner::visitEXTRACT_VECTOR_ELT.
PR33682

Reviewers: hfinkel, sdardis, RKSimon

Reviewed By: sdardis, RKSimon

Subscribers: uabelho, RKSimon, sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D34990

llvm-svn: 308960
2017-07-25 09:40:35 +00:00
Matt Arsenault 5fbc87021e RA: Replace asserts related to empty live intervals
These don't exactly assert the same thing anymore, and
allow empty live intervals with non-empty uses.

Removed in r308808 and r308813.

llvm-svn: 308906
2017-07-24 18:07:55 +00:00
Benjamin Kramer fc638c11bb [CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.
This avoids excessive compile time. The case I'm looking at is
Function.cpp from an old version of LLVM that still had the giant memcmp
string matcher in it. Before r308322 this compiled in about 2 minutes,
after it, clang takes infinite* time to compile it. With this patch
we're at 5 min, which is still bad but this is a pathological case.

The cut off at 20 uses was chosen by looking at other cut-offs in LLVM
for user scanning. It's probably too high, but does the job and is very
unlikely to regress anything.

Fixes PR33900.

* I'm impatient and aborted after 15 minutes, on the bug report it was
  killed after 2h.

llvm-svn: 308891
2017-07-24 16:18:09 +00:00
Reid Kleckner 898ddf61c0 [codeview] Emit 'D' as the cv source language for D code
This matches DMD:
522263965c/src/ddmd/backend/cv8.c (L199)

Fixes PR33899.

llvm-svn: 308890
2017-07-24 16:16:42 +00:00
Reid Kleckner 7f6b2534fb Format some case labels and shrink an anonymous namespace NFC
llvm-svn: 308889
2017-07-24 16:16:17 +00:00
Petr Hosek 710479cede [CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos optimization
Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D35748

llvm-svn: 308854
2017-07-23 22:30:00 +00:00
Nirav Dave 4e6dcf73f9 [DAG] Fix typo preventing some stores merges to truncated stores.
Check the actual memory type stored and not the extended value size
when considering if truncated store merge is worthwhile.

Reviewers: efriedma, RKSimon, spatel, jyknight

Reviewed By: efriedma

Subscribers: llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D35623

llvm-svn: 308833
2017-07-23 02:06:28 +00:00