Commit Graph

23755 Commits

Author SHA1 Message Date
Sam Clegg e1694f9bf8 [WebAssembly] section kind can be code
Currently, when creating a named section, the Wasm
frontend forces it to use `SectionKind::Data`, whereas
in fact C++ does generate code sections with custom
names.

Patch by Nicholas Wilson

Differential Revision: https://reviews.llvm.org/D40906

llvm-svn: 320002
2017-12-07 02:55:51 +00:00
Florian Hahn 001c3dd202 [MachineCombiner] Add up latencies of all instructions in new pattern.
Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instructions, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.

Note that we may be slightly more precise than just summing up
all latencies. For example, consider a pattern like

    r1 = INS1 ..
    r2 = INS2 ..
    r3 = INS3 r1, r2

I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.

Reviewers: Gerolf, sebpop, spop, fhahn

Reviewed By: fhahn

Subscribers: evandro, llvm-commits

Differential Revision: https://reviews.llvm.org/D40307

llvm-svn: 319951
2017-12-06 20:27:33 +00:00
Nirav Dave 7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Francis Visoiu Mistrih c2641d632b [CodeGen] Fix formatting error from r319885
llvm-svn: 319886
2017-12-06 11:57:53 +00:00
Francis Visoiu Mistrih 95a0591545 [CodeGen] Better handling of detached MachineOperands
Basically use getMFIfAvailable to check if we can crawl up to the
function.

llvm-svn: 319885
2017-12-06 11:55:42 +00:00
Mikael Holmen 898eb34b49 [[Machine]Dominators] Improved printout when verifyDomTree fails [NFC]
Include the function name in the printout.

llvm-svn: 319882
2017-12-06 09:27:48 +00:00
Vlad Tsyrklevich 0b40f21134 Revert "[DAGCombine] Move AND nodes to multiple load leaves"
This reverts commit r319773. It was causing some buildbots to hang, e.g.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589

llvm-svn: 319867
2017-12-06 01:16:08 +00:00
Craig Topper 0328d0083e [SelectionDAG] Don't promote the condition operand of VSELECT when promoting the result.
The condition operand should be promoted during operand promotion.

llvm-svn: 319853
2017-12-05 23:08:32 +00:00
Craig Topper dfd90802af [SelectionDAG] Don't promote mask operand when widening mstore and mscatter.
If the mask needs to be promoted that should occur by the legalizer detecting the mask operand needs to be promoted not as a side effect of another action.

llvm-svn: 319852
2017-12-05 23:08:30 +00:00
Craig Topper ddc6bba0ee [SelectionDAG] Don't promote mask when splitting mstore.
If the mask needs to be promoted it should be handled by operand promotion after the result is legalized.

llvm-svn: 319851
2017-12-05 23:08:28 +00:00
Craig Topper 2e684593b7 [SelectionDAG] Don't promote mask operands of MGATHER and MLOAD to setcc result type while widening the result. Just widen the mask.
The mask will be promoted if necessary when operands are promoted. It's possible the mask type is legal, but the setcc result type is a different. We shouldn't promote to the setcc result type unless the mask needs to be promoted.

llvm-svn: 319850
2017-12-05 23:08:27 +00:00
Craig Topper 57440a6f65 [SelectionDAG] Don't call GetWidenedVector for mask operands of MLOAD/MSTORE.
GetWidenedVector does't guarantee the widened elements are zero which would break the intended behavior of the operation.

llvm-svn: 319849
2017-12-05 23:08:25 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Craig Topper 8adcbe8c9f [SelectionDAG] Remove the code that handles SETCC with a scalar result type from vector widening.
There's no such thing as a setcc with vector operands and scalar result. And if we're trying to widen the result we would have to already be looking at a vector result type.

So this patch renames the VSETCC function as the SETCC function and delete the original SETCC function.

llvm-svn: 319799
2017-12-05 17:37:19 +00:00
Craig Topper 558cc48b44 [SelectionDAG] Remove unused method declaration.
The method implementation was removed in r318982.

llvm-svn: 319798
2017-12-05 17:37:17 +00:00
Sam Parker 0a436a9d62 [DAGCombine] Move AND nodes to multiple load leaves
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.

Differential Revision: https://reviews.llvm.org/D39604

llvm-svn: 319773
2017-12-05 15:13:47 +00:00
Bjorn Pettersson 823b299fbc [DAGCombine] Handle big endian correctly in CombineConsecutiveLoads
Summary:
Found out, at code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.

A BUILD_PAIR is always having the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big endian targets, we should check
if the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.

Normally this bug only resulted in missed oppurtunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.

Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
  t76: i64,ch = load<LD8[FixedStack-9]
instead of
  t37: i32,ch = load<LD4[FixedStack-10]>
  t35: i32,ch = load<LD4[FixedStack-9]>
  t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transfomation is correct now.

Reviewers: niravd, hfinkel

Reviewed By: niravd

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D40444

llvm-svn: 319771
2017-12-05 14:50:05 +00:00
Sam Parker 8b73630c32 [DAGCombine] isLegalNarrowLoad function (NFC)
Pull the checks upon the load out from ReduceLoadWidth into their own
function.

Differential Revision: https://reviews.llvm.org/D40833

llvm-svn: 319766
2017-12-05 14:03:51 +00:00
Jonas Paulsson 86c40db49d [Regalloc] Generate and store multiple regalloc hints.
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.

NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.

This should be an improvement generally for any target!

The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.

The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.

Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128

llvm-svn: 319754
2017-12-05 10:52:24 +00:00
Craig Topper 98495291a7 [SelectionDAG] Use WidenTargetBoolean in WidenVecRes_MLOAD and WidenVecOp_MSTORE instead of implementing it manually and incorrectly.
The CONCAT_VECTORS operand get its type from getSetCCResultType, but if the mask type and the setcc have different scalar sizes this creates an illegal CONCAT_VECTORS operation. The concat type should be 2x the mask type, and then an extend should be added if needed.

llvm-svn: 319744
2017-12-05 08:15:03 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Matthias Braun 7afbfd0f24 MachineFrameInfo: Cleanup some parameter naming inconsistencies; NFC
Consistently use the same parameter names as the names of the affected
fields. This avoids some unintuitive abbreviations like `isSS`.

llvm-svn: 319722
2017-12-05 01:18:15 +00:00
Matthias Braun 62378bb5ab TwoAddressInstructionPass: Trigger -O0 behavior on optnone
While we cannot skip the whole TwoAddressInstructionPass even for -O0
there are some parts of the pass that are currently skipped at -O0 but
not for optnone. Changing this as there is no reason to have those two
hit different code paths here.

llvm-svn: 319721
2017-12-05 00:56:14 +00:00
Hans Wennborg 361d4392cf Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Hans Wennborg e117129ef7 DAG: Follow-up to r319692 check the truncates inputs have the same type
MatchRotate assumes the types of the types of LHS and RHS are equal,
which is always the case then they come from an OR node, but here
we're getting them from two different TRUNC nodes, so we have to check
the types.

llvm-svn: 319695
2017-12-04 20:48:50 +00:00
Hans Wennborg 7e61f24962 DAG: Match truncated rotation (PR35487)
If the truncation has been pushed past the or-node, look through it and
truncate afterwards.

Differential revision: https://reviews.llvm.org/D40792

llvm-svn: 319692
2017-12-04 20:39:57 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Hiroshi Yamauchi 9364fa3434 Move splitIndirectCriticalEdges() to BasicBlockUtils.h.
Summary:
Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so
that it can be called from other places.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40750

llvm-svn: 319689
2017-12-04 20:36:01 +00:00
Matthias Braun 7eae251bae MachineVerifier: undef phi arg doesn't need to be live-out from predecessor
Differential Revision: https://reviews.llvm.org/D40756

llvm-svn: 319674
2017-12-04 18:57:48 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Jonas Paulsson e86327f290 [TwoAddressInstructionPass] Bugfix in handling of sunk instructions.
An instruction returned by TII->convertToThreeAddress() may contain a %noreg
(undef) operand, which is not expected by tryInstructionTransform(). So if
this MI is sunk to a lower point in MBB, it must be skipped when later
encountered.

A new set SunkInstrs is used for this purpose.

Note: there is no test supplied here, as this was triggered on SystemZ while
working on a review of instruction flags. A test case for this bugfix will be
included in the upcoming SystemZ commit.

Review: Quentin Colombet
https://reviews.llvm.org/D40711

llvm-svn: 319646
2017-12-04 10:03:14 +00:00
Sam Parker 1e26d986aa [DAGCombine] Remove isAndLoadExtLoad arguments
Both LoadedVT and NarrowLoad are passed as references and neither
of them are used by any of its callers.

Differential Revision: https://reviews.llvm.org/D40713

llvm-svn: 319645
2017-12-04 09:48:26 +00:00
Craig Topper 67217d7eb4 [SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a non-splat constant shift amount.
If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result. There's probably a lot more that we can do here, but this
fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero.

llvm-svn: 319639
2017-12-04 05:38:42 +00:00
Yaxun Liu 30e4608cca CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space
SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is
not true for amdgcn---amdgiz target.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40255

llvm-svn: 319630
2017-12-03 03:31:45 +00:00
Craig Topper f3470e1ed4 [SelectionDAG] Use the inlined APInt shift methods since we've already bounds checked the shift.
The version that takes APInt is out of line. The 'unsigned' version optimizes for the common case of single word APInts.

llvm-svn: 319628
2017-12-03 03:07:09 +00:00
Yaxun Liu 494770403a CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT
Two issues found when doing codegen for splitting vector with non-zero alloca addr space:

DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating
SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to
infer the correct pointer info, which ends up with a dummy pointer info for the target to lower
store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent MachinePointerInfo which is known in alloca address space but without other information.

TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for
multiplication of index and then add it to the pointer. However the pointer may be in an addr
space which has different size than addr space 0. The fix is to use the pointer value type for
index multiplication.

Differential Revision: https://reviews.llvm.org/D39758

llvm-svn: 319622
2017-12-02 22:13:22 +00:00
Matt Morehouse 9e658c974b Revert "[X86] Improvement in CodeGen instruction selection for LEAs."
This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591
2017-12-01 22:20:26 +00:00
Jessica Paquette 52df8015c5 [MachineOutliner] NFC: Throw out self-intersections on candidates early
Currently, the outliner considers candidates that intersect with themselves in
the candidate pruning step. That is, candidates of the form "AA" in ranges like
"AAAAAA". In that range, it looks like there are 5 instances of "AA" that could
possibly be outlined, and that's considered in the benefit calculation.

However, only at most 3 instances of "AA" could ever be outlined in "AAAAAA".
Thus, it's possible to pass through "AA" to the candidate selection step even
though it's *never* the case that "AA" could be outlined. This makes it so that
when we find candidates, we consider only non-overlapping occurrences of that
candidate.

llvm-svn: 319588
2017-12-01 21:56:56 +00:00
Nirav Dave 3e76e1e89e [DAG][ARM] Revert "Reenable post-legalize store merge"
due to failures in AArch and ARM code gen.

llvm-svn: 319587
2017-12-01 21:55:47 +00:00
Adam Nemet 9303f62255 [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

Re-commit after fixing the failing testcase in rL319576, rL319577 and
rL319578.

llvm-svn: 319581
2017-12-01 20:41:38 +00:00
Eli Friedman b34a8198a9 [DAGCombine] Simplify ISD::AND handling in ReduceLoadWidth
Followup to D39595. Removes a bunch of redundant checks.

Differential Revision: https://reviews.llvm.org/D40667

llvm-svn: 319573
2017-12-01 19:33:56 +00:00
Adam Nemet 57783730fd Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness"
This reverts commit r319556.

Something is not working with this when used with sample-based profiling.
Investigating...

llvm-svn: 319562
2017-12-01 18:12:29 +00:00
Adam Nemet 8d1fc2b65b [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

llvm-svn: 319556
2017-12-01 17:02:04 +00:00
Nirav Dave eb2b24fded [ARM][DAG] Reenable post-legalize store merge
Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case.

Reviewers: eastig, efriedma

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319547
2017-12-01 14:49:26 +00:00
Jatin Bhateja 328199ec26 [X86] Improvement in CodeGen instruction selection for LEAs.
Summary:
1/  Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
     accommodate similar operand appearing in the DAG  e.g.
                 T1 = A + B
                 T2 = T1 + 10
                 T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10

2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
     then complex LEAs (having 3 operands) could be factored out  e.g.
                 leal 1(%rax,%rcx,1), %rdx
                 leal 1(%rax,%rcx,2), %rcx
     will be factored as following
                 leal 1(%rax,%rcx,1), %rdx
                 leal (%rdx,%rcx)   , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0)  --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 319543
2017-12-01 14:07:38 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Craig Topper c261213abc [X86][SelectionDAG] Make sure we explicitly sign extend the index when type promoting the index of scatter and gather.
Type promotion makes no guarantee about the contents of the promoted bits. Since the gather/scatter instruction will use the bits to calculate addresses, we need to ensure they aren't garbage.

llvm-svn: 319520
2017-12-01 06:02:00 +00:00
Zachary Turner 8065f0b975 Mark all library options as hidden.
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.

This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.

Differential Revision: https://reviews.llvm.org/D40674

llvm-svn: 319505
2017-12-01 00:53:10 +00:00
Reid Kleckner ba4014e9dc XOR the frame pointer with the stack cookie when protecting the stack
Summary: This strengthens the guard and matches MSVC.

Reviewers: hans, etienneb

Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319490
2017-11-30 22:41:21 +00:00
Daniel Sanders aef1dfc690 [aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_*
G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND.

G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an
external comparison.

Note that IRTranslator doesn't generate these instructions yet.

llvm-svn: 319466
2017-11-30 20:11:42 +00:00
Amara Emerson d78d65c2a4 [GlobalISel][IRTranslator] Fix crash during translation of zero sized loads/stores/args/returns.
This fixes PR35358.

rdar://35619533

Differential Revision: https://reviews.llvm.org/D40604

llvm-svn: 319465
2017-11-30 20:06:02 +00:00
Zachary Turner ca6dbf1440 Split TypeTableBuilder into two classes.
llvm-svn: 319456
2017-11-30 18:39:50 +00:00
Francis Visoiu Mistrih c71cced0aa [CodeGen] Always use `printReg` to print registers in both MIR and debug
output

As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.

Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.

Differential Revision: https://reviews.llvm.org/D40421

llvm-svn: 319445
2017-11-30 16:12:24 +00:00
Sean Eveson a6bcd53d52 [MC] Function stack size section.
Re applying after fixing issues in the diff, sorry for any painful conflicts/merges!

Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319430
2017-11-30 13:05:14 +00:00
Sean Eveson 661e4fbf83 Revert r319423: [MC] Function stack size section.
I messed up the diff.

llvm-svn: 319429
2017-11-30 12:43:25 +00:00
Francis Visoiu Mistrih 93ef145862 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

llvm-svn: 319427
2017-11-30 12:12:19 +00:00
Sean Eveson f77b4d2f38 [MC] Function stack size section.
Summary:
Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

I wasn't sure who to put as reviewers, so please add/remove people as appropriate.

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319423
2017-11-30 12:01:16 +00:00
Sam Parker 4bd776e001 [DAGCombine] Refactor ReduceLoadWidth
visitAND attempts to narrow the width of extending loads that are
then masked off. ReduceLoadWidth already exists for a similar purpose
and handles shifts, so I've moved the code to handle AND nodes there.

Differential Revision: https://reviews.llvm.org/D39595

llvm-svn: 319421
2017-11-30 11:49:11 +00:00
Serge Guelton 24386867b8 Support generic lowering of vector bswap
llvm-svn: 319419
2017-11-30 11:06:22 +00:00
Craig Topper cf461a0a32 [SelectionDAG][X86] Teach promotion legalization for fp_to_sint/fp_to_uint to insert an assertsext/assertzext based on the original type
If we put in an assertsext/zext here, we're able to generate better truncate code using pack on pre-avx512 targets.

Similar is already done during type legalization. This is the equivalent for op legalization

Differential Revision: https://reviews.llvm.org/D40591

llvm-svn: 319368
2017-11-29 22:15:43 +00:00
Serguei Katkov d4df744434 [CGP] Enable complex addr mode
Enable complex addr modes after two critical fixes: rL319109 and rL319292

llvm-svn: 319302
2017-11-29 09:48:50 +00:00
Serguei Katkov 5036459ae3 [CGP] Fix common type handling in optimizeMemoryInst
If common type is different we should bail out due to we will not be
able to create a select or Phi of these values.

Basically it is done in ExtAddrMode::compare however it does not work
if we handle the null first and then two values of different types.
so add a check in initializeMap as well. The check in ExtAddrMode::compare
is used as earlier bail out.

Reviewers: reames, john.brawn
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40479

llvm-svn: 319292
2017-11-29 05:51:26 +00:00
Matt Arsenault b655fa9ce2 DAG: Add nuw when splitting loads and stores
The object can't straddle the address space
wrap around, so I think it's OK to assume any
offsets added to the base object pointer can't
overflow. Similar logic already appears to be
applied in SelectionDAGBuilder when lowering
aggregate returns.

llvm-svn: 319272
2017-11-29 01:25:12 +00:00
Craig Topper 88ffb5d4d5 [X86] Mark ISD::FP_TO_UINT v16i8/v16i16 as Promote under AVX512 instead of legal. Fix infinite loop in op legalization when promotion requires 2 steps.
Previously we had an isel pattern to add the truncate. Instead use Promote to add the truncate to the DAG before isel.

The Promote legalization code had to be updated to prevent an infinite loop if promotion took multiple steps because it wasn't remembering the previously tried value.

llvm-svn: 319259
2017-11-28 23:56:02 +00:00
Mandeep Singh Grang 230b0a1477 [SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering
Summary:
Recommitting this with the correct sorting predicate. The Low field of Clusters is a ConstantInt and
cannot be directly compared. So we needed to invoke slt (signed less than) to compare correctly.

This fixes failures in the following tests uncovered by D39245:

LLVM :: CodeGen/ARM/ifcvt3.ll
LLVM :: CodeGen/ARM/switch-minsize.ll
LLVM :: CodeGen/X86/switch.ll
LLVM :: CodeGen/X86/switch-bt.ll
LLVM :: CodeGen/X86/switch-density.ll

Reviewers: hans, fhahn

Reviewed By: hans

Subscribers: aemerson, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40541

llvm-svn: 319210
2017-11-28 19:55:54 +00:00
Francis Visoiu Mistrih 3aa8eaa951 [CodeGen] Fix doxygen \file comment style
llvm-svn: 319207
2017-11-28 19:23:39 +00:00
Francis Visoiu Mistrih d4b340b460 [CodeGen] Fix doxygen
llvm-svn: 319206
2017-11-28 19:15:46 +00:00
Daniel Sanders 17d277b734 [mir] Print/Parse both MOLoad and MOStore when they occur together.
Summary:
They're not always mutually exclusive. read-modify-write atomics are both
at the same time. One example of this is the SWP instructions on AArch64.
Another example is GlobalISel's G_ATOMICRMW_* generic instructions which
will be added in a later patch.

Reviewers: arphaman, aemerson

Reviewed By: aemerson

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D40157

llvm-svn: 319202
2017-11-28 18:57:02 +00:00
Zachary Turner 6900de1dfb [CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder.
The motivation behind this patch is that future directions require us to
be able to compute the hash value of records independently of actually
using them for de-duplication.

The current structure of TypeSerializer / TypeTableBuilder being a
single entry point that takes an unserialized type record, and then
hashes and de-duplicates it is not flexible enough to allow this.

At the same time, the existing TypeSerializer is already extremely
complex for this very reason -- it tries to be too many things. In
addition to serializing, hashing, and de-duplicating, ti also supports
splitting up field list records and adding continuations. All of this
functionality crammed into this one class makes it very complicated to
work with and hard to maintain.

To solve all of these problems, I've re-written everything from scratch
and split the functionality into separate pieces that can easily be
reused. The end result is that one class TypeSerializer is turned into 3
new classes SimpleTypeSerializer, ContinuationRecordBuilder, and
TypeTableBuilder, each of which in isolation is simple and
straightforward.

A quick summary of these new classes and their responsibilities are:

- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of
  bytes. Does not do any hashing. Every time you call it, it will
  re-serialize and return bytes again. The same instance can be re-used
  over and over to avoid re-allocations, and in exchange for this
  optimization the bytes returned by the serializer only live until the
  caller attempts to serialize a new record.

- ContinuationRecordBuilder : Turns a FieldList-like record into a series
  of fragments. Does not do any hashing. Like SimpleTypeSerializer,
  returns references to privately owned bytes, so the storage is
  invalidated as soon as the caller tries to re-use the instance. Works
  equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a
  long-standing theoretical limitation of the previous implementation.

- TypeTableBuilder : Accepts sequences of bytes that the user has already
  serialized, and inserts them by de-duplicating with a hash table. For
  the sake of convenience and efficiency, this class internally stores a
  SimpleTypeSerializer so that it can accept unserialized records. The
  same is not true of ContinuationRecordBuilder. The user is required to
  create their own instance of ContinuationRecordBuilder.

Differential Revision: https://reviews.llvm.org/D40518

llvm-svn: 319198
2017-11-28 18:33:17 +00:00
Francis Visoiu Mistrih aa739695a4 [CodeGen] Separate MachineOperand implementation from MachineInstr
Move the implementation to its own file.

Differential Revision: https://reviews.llvm.org/D40419

llvm-svn: 319194
2017-11-28 17:58:43 +00:00
Francis Visoiu Mistrih 946e394e33 [CodeGen] Cleanup MachineOperand
* clang-format
* move doxygen from the implementation to headers
* remove duplicate doxygen

llvm-svn: 319193
2017-11-28 17:58:38 +00:00
Francis Visoiu Mistrih 9d7bb0cb40 [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Matt Arsenault e123aba94e DAG: Legalize truncstores to illegal int types
Truncate to a legal int type, and produce a new
truncstore from a narrower type.

llvm-svn: 319185
2017-11-28 17:11:30 +00:00
Jonas Paulsson f0ff20f1f0 Use getStoreSize() in various places instead of 'BitSize >> 3'.
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).

Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.

There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.

Review: Björn Petterson
https://reviews.llvm.org/D40339

llvm-svn: 319173
2017-11-28 14:44:32 +00:00
Francis Visoiu Mistrih 9d419d3b0c [CodeGen] Rename functions PrintReg* to printReg*
LLVM Coding Standards:
  Function names should be verb phrases (as they represent actions), and
  command-like function should be imperative. The name should be camel
  case, and start with a lower case letter (e.g. openFile() or isFoo()).

Differential Revision: https://reviews.llvm.org/D40416

llvm-svn: 319168
2017-11-28 12:42:37 +00:00
Martin Storsjo 04b68446eb [COFF] Implement constructor priorities
The priorities in the section name suffixes are zero padded,
allowing the linker to just do a lexical sort.

Add zero padding for .ctors sections in ELF as well.

Differential Revision: https://reviews.llvm.org/D40407

llvm-svn: 319150
2017-11-28 08:07:18 +00:00
Simon Dardis 3aeb1a5404 [DAGCombine] Disable finding better chains for stores at O0
Unoptimized IR can have linear sequences of stores to an array, where the
initial GEP for the first store is formed from the pointer to the array, and the
GEP for each store after the first is formed from the previous GEP with some
offset in an inductive fashion.

The (large) resulting DAG when analyzed by DAGCombine undergoes an excessive
number of combines as each store node is examined every time its' offset node
is combined with any child of the offset. One of the transformations is
findBetterNeighborChains which assists MergeConsecutiveStores. The former
relies on repeated chain walking to do its' work, however MergeConsecutiveStores
is disabled at O0 which makes the transformation redundant.

Any optimization level other than O0 would invoke InstCombine which would
resolve the chain of GEPs into flat base + offset GEP for each store which
does not exhibit the repeated examination of each store to the array.

Disabling this optimization fixes an excessive compile time issue (30~ minutes
for the test case provided) at O0.

Reviewers: niravd, craig.topper, t.p.northover

Differential Revision: https://reviews.llvm.org/D40193

llvm-svn: 319142
2017-11-28 04:07:59 +00:00
Matthias Braun eca985847c MachineVerifier: Improve register operand checks
This fixes cases where we wouldn't perform various register operand
checks just because we didn't happen to have a definition in the
MCInstrDesc. This changes the code to only skip the tests that actually
depend on the MCInstrDesc definition.

This makes the machine verifier spot the problem from
https://llvm.org/PR33071 after the pass that actually caused it.

llvm-svn: 319141
2017-11-28 03:54:20 +00:00
Matthias Braun a6d5374ee6 MachineVerifier: Improve PHI operand checking
Additional checks for phi operands:
- first operand should be a virtual register def. It should not be
  tied, implicit, internalread, earlyclobber or a read.
- The other operands should be register/mbb operands next to each other
- The register operands should not be implicit, internalread,
  earlyclobber, debug or tied.
- We can perform most of the PHI checks even for unreachable blocks.

llvm-svn: 319140
2017-11-28 03:54:19 +00:00
Craig Topper dbd4a7fecc [DAGCombiner] Don't combine aext(setcc) if the setcc is already using the target's preferred result type.
With AVX512 vXi1 types are legal so we shouldn't be extending them.

This change is similar to existing code in the zext(setcc) combine.

llvm-svn: 319120
2017-11-27 23:51:40 +00:00
Craig Topper 57c02d18b9 [DAGCombiner] Use EVT::changeVectorElementTypeToInteger() instead of implementing manually.
llvm-svn: 319119
2017-11-27 23:51:31 +00:00
Craig Topper 820ce04377 [SelectionDAG] Add a debug message when vector_shuffle nodes are created.
We print a debug message when most nodes are created, but getVectorShuffle was missing.

llvm-svn: 319085
2017-11-27 19:54:57 +00:00
John Brawn 4b476488ba [CGP] Fix handling of null pointer values in optimizeMemoryInst
The current way that trivial addressing modes are detected incorrectly thinks
that null pointers are non-trivial, leading to an infinite loop where we keep
duplicating the same select. Fix this by aware of null when deciding if an
addressing mode is trivial.

Differential Revision: https://reviews.llvm.org/D40447

llvm-svn: 319019
2017-11-27 11:29:15 +00:00
Craig Topper 400c32ecb9 [SelectionDAG] Teach SplitVecRes_SETCC to call GetSplitVector if the operands have already been split.
llvm-svn: 319010
2017-11-27 05:52:54 +00:00
Craig Topper 968491b4e7 [SelectionDAG] Fix function name in comment. NFC
llvm-svn: 319009
2017-11-27 05:52:52 +00:00
Craig Topper e7426556c1 [SelectionDAG] Remove some dead code from vector scalaring
Summary:
Currently ScalarizeVecRes_SETCC checks for the result type being a vector and jumps to ScalarizeVecRes_VSETCC. But if we're scalarizing a vector result, aren't we guaranteed to be looking at a vector type?

This patch deletes the current ScalarizeVecRes_SETCC and renames  ScalarizeVecRes_VSETCC to ScalarizeVecRes_SETCC.

Reviewers: RKSimon, arsenm, eladcohen, zvi

Reviewed By: RKSimon

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D40452

llvm-svn: 318982
2017-11-25 17:59:00 +00:00
Simon Dardis 230f453574 [CodeGenPrepare] Check that erased sunken address are not reused
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.

Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.

This partially resolves PR35209.

Thanks to Alexander Richardson for reporting the issue!

This fixed version relands r318032 which was reverted in r318049 due to
sanitizer buildbot failures.

Reviewers: john.brawn

Differential Revision: https://reviews.llvm.org/D39841

llvm-svn: 318956
2017-11-24 16:45:28 +00:00
Benjamin Kramer 51ebcaaf25 Make helpers static. NFC.
llvm-svn: 318953
2017-11-24 14:55:41 +00:00
John Brawn 70cdb5b391 [CGP] Make optimizeMemoryInst able to combine more kinds of ExtAddrMode fields
This patch extends the recent work in optimizeMemoryInst to make it able to
combine more ExtAddrMode fields than just the BaseReg.

This fixes some benchmark regressions introduced by r309397, where GVN PRE is
hoisting a getelementptr such that it can no longer be combined into the
addressing mode of the load or store that uses it.

Differential Revision: https://reviews.llvm.org/D38133

llvm-svn: 318949
2017-11-24 14:10:45 +00:00
Diana Picus c01f7f131b [ARM GlobalISel] Support G_FDIV for s32 and s64
TableGen already generates code for selecting a G_FDIV, so we only need
to add a test.

For the legalizer and reg bank select, we do the same thing as for the
other floating point binary operations: either mark as legal if we have
a FP unit or lower to a libcall, and map to the floating point
registers.

llvm-svn: 318915
2017-11-23 13:26:07 +00:00
Diana Picus 9faa09b21e [ARM GlobalISel] Support G_FMUL for s32 and s64
TableGen already generates code for selecting a G_FMUL, so we only need
to add a test for that part.

For the legalizer and reg bank select, we do the same thing as the other
floating point binary operators: either mark as legal if we have a FP
unit or lower to a libcall, and map to the floating point registers.

llvm-svn: 318910
2017-11-23 12:44:20 +00:00
Yaxun Liu 6aaae46f93 [NFC] CodeGen: Handle shift amount type in DAGTypeLegalizer::SplitInteger
This patch reverts change to X86TargetLowering::getScalarShiftAmountTy in
rL318727 and move the logic to DAGTypeLegalizer::SplitInteger.

The reason is that getScalarShiftAmountTy returns a shift amount type that
is suitable for common use cases in CodeGen. DAGTypeLegalizer::SplitInteger
is a rare situation which requires a shift amount type larger than what
getScalarShiftAmountTy. In this case, it is more reasonable to do special
handling of shift amount type in DAGTypeLegalizer::SplitInteger only. If
similar situations arises the logic may be moved to a separate function.

Differential Revision: https://reviews.llvm.org/D40320

llvm-svn: 318890
2017-11-23 03:08:51 +00:00
Jonas Paulsson 181e260e32 [DAGCombiner] Bugfix in isAlias().
Since i1 is a legal type, this:

  NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;

is wrong and should be instead

  NumBytes = Op0->getMemoryVT().getStoreSize();

There seems to be more places where this should be fixed outside DAGCombiner.

Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366

llvm-svn: 318824
2017-11-22 08:58:30 +00:00
Craig Topper fb0d4cd48c [SelectionDAG] Add a isel matcher op to check the type of node results other than result 0.
I plan to use this to check the type of the mask result of masked gathers in the X86 backend.

llvm-svn: 318820
2017-11-22 07:11:01 +00:00
Serguei Katkov ac17aadf41 Revert "[CGP] Enable complex addr mode (2nd attempt)"
Revert the patch rl318728 causing buildbot hangs-ups.

llvm-svn: 318731
2017-11-21 06:03:43 +00:00
Serguei Katkov fc1ff29966 [CGP] Enable complex addr mode (2nd attempt)
2nd attempt to enable complex addr modes after
fix of the crash by rL318638.

llvm-svn: 318728
2017-11-21 05:31:47 +00:00
Yaxun Liu 3cea36f03e [AMDGPU] Fix DAGTypeLegalizer::SplitInteger for shift amount type
DAGTypeLegalizer::SplitInteger uses default pointer size as shift amount constant type,
which causes less performant ISA in amdgcn---amdgiz target since the default pointer
type is i64 whereas the desired shift amount type is i32.

This patch fixes that by using TLI.getScalarShiftAmountTy in DAGTypeLegalizer::SplitInteger.

The X86 change is necessary since splitting i512 requires shifting amount of 256, which
cannot be held by i8.

Differential Revision: https://reviews.llvm.org/D40148

llvm-svn: 318727
2017-11-21 02:29:54 +00:00
Craig Topper 67eb30ab60 [SelectionDAG] When promoting the result of a VSELECT, make sure we promote the condition to the SetCC type for the final result type not the original type.
Normally this would be cleaned up by promoting the condition operand next. But in the attached case we promoted the result from v2i48 to v2i64 and the condition from v2i1 to v2i48. Then we tried to "promote" the v2i48 condition back to v2i1 because that's what the SetCC result type for v2i64 is on X86 with VLX. But promote is either a NOP or SIGN_EXTEND and this would need a truncation.

With the change here we now get the SetCC type of v2i1 when we're handling the result promotion and the operand no longer needs to be promoted itself.

Fixes PR35272.

llvm-svn: 318706
2017-11-20 23:08:50 +00:00
Mandeep Singh Grang f953e4b881 Revert "[SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering"
This broke the bots. Reverting this until I can fix the failures.

This reverts commit 5a3db2856d12a3c4b400f487d39f8f05989e79f0.

llvm-svn: 318686
2017-11-20 19:17:11 +00:00
Paul Robinson 746edea0ae Revert "Fix out-of-order stepping behavior in programs with sunk instructions."
This reverts commit 30419e150cd940893a13b345e85f96053850208f.
aka r318679.  It caused "sanitizer-windows" bot to fail.

llvm-svn: 318684
2017-11-20 19:07:52 +00:00
Mandeep Singh Grang dc9de50902 [SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering
Summary:
This fixes failures in the following tests uncovered by D39245:
        LLVM :: CodeGen/ARM/ifcvt3.ll
        LLVM :: CodeGen/ARM/switch-minsize.ll
        LLVM :: CodeGen/X86/switch.ll

Reviewers: hans, efriedma

Reviewed By: hans

Subscribers: fhahn, aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39995

llvm-svn: 318680
2017-11-20 18:46:11 +00:00
Paul Robinson f0b02965e4 Fix out-of-order stepping behavior in programs with sunk instructions.
MachineSink attempts to place instructions near the basic blocks where
they are needed.  Once an instruction has been sunk, its location
relative to other instructions is no longer consistent with the
original source code. In order to ensure correct single-stepping and
profiling, the debug location for sunk instructions is either merged
with the insertion point or erased if the target successor block is
empty.

Patch by Matthew Voss!

Differential Revision: https://reviews.llvm.org/D39933

llvm-svn: 318679
2017-11-20 18:42:17 +00:00
Tony Jiang f75f4d6573 [MachineCSE] Add new callback for is caller preserved or constant physregs
The instructions addis,addi, bl are used to calculate the address of TLS thread
local variables. These TLS access code sequences are generated repeatedly every
time the thread local variable is accessed. By communicating to Machine CSE that
X2 is guaranteed to have the same value within the same function call (so called
Caller Preserved Physical Register), the redundant TLS access code sequences are
cleaned up.

Differential Revision: https://reviews.llvm.org/D39173

llvm-svn: 318661
2017-11-20 16:55:07 +00:00
Serguei Katkov 505359f705 [CGP] Fix the crash caused by enable of complex addr mode
We must collect all AddModes even if they are the same.
This is due to Original value is different but we need all original
values collected as they are used as anchors in common phi finding.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40166

llvm-svn: 318638
2017-11-20 05:42:36 +00:00
Quentin Colombet fe538f7145 [RegisterBankInfo] Relax the assert of having matching type sizes on default mappings
Instead of asserting that the type sizes are exactly equal, we check
that the new size is big enough to contain the original type.
We have to relax this constrain because, right now, we sometimes
specify that things that are smaller than a storage type are legal
instead of widening everything to the size of a storage type.
E.g., we say that G_AND s16 is legal and we map that on GPR32.

This is something we may revisit in the future (either by changing
the legalization process or keeping track separately of the storage
size and the size of the type), but let us reflect the reality of
the situation for now.

llvm-svn: 318587
2017-11-18 04:28:58 +00:00
Justin Bogner 6c7663fd54 MIRParser: Avoid reading uninitialized memory on generic vregs
If a vreg's bank is specified in the registers block and one of its
defs or uses also specifies the bank, we end up checking that the
RegBank is equal to diagnose conflicting banks. The problem comes up
for generic vregs, where we weren't fully initializing the VRegInfo
when parsing the registers block, so we'd end up comparing a null
pointer to uninitialized memory.

This fixes a non-deterministic failure when round tripping through MIR
with generic vregs.

llvm-svn: 318543
2017-11-17 18:51:20 +00:00
Craig Topper e9a6456ab3 [SelectionDAG] Allow custom vector widening through ReplaceNodeResults to handle nodes with chain outputs.
Previously we were assuming all results were vectors and calling SetWidenedVector, but if its a chain result we should just replace uses instead.

This fixes an error found by expensive checks after r318368.

llvm-svn: 318509
2017-11-17 07:03:57 +00:00
Vedant Kumar 4d7f2b02d6 [SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC (reapply)
TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).

Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.

This was reverted in r318455 because some newly introduced asserts,
which I thought were NFC, were firing. I filed PR35338. For now I've
weakened the asserts.

Testing: check-llvm, check-clang, and a stage2 Rel+Deb build of clang

Differential Revision: https://reviews.llvm.org/D40104

llvm-svn: 318498
2017-11-17 01:48:33 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Craig Topper 242374e219 [X86] Don't remove sign extend of gather/scatter indices during SelectionDAGBuilder.
The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal.

It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64.

I'll work on trying to recover some of the test cases by removing sign extends in the backend when its safe to do so with an understanding of the current legalizer capabilities.

This should fix PR30690.

llvm-svn: 318466
2017-11-16 23:08:57 +00:00
Vedant Kumar 53418797fd Revert "[SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC."
This reverts commit r318448. It looks like some of the asserts need to
be weakened.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/16296

llvm-svn: 318455
2017-11-16 21:08:51 +00:00
Craig Topper b6b61dfb15 [DAGCombiner] Use cast instead of an unchecked dyn_cast.
llvm-svn: 318450
2017-11-16 20:23:12 +00:00
Vedant Kumar 494814d52a [SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC.
TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).

Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.

Differential Revision: https://reviews.llvm.org/D40104

llvm-svn: 318448
2017-11-16 19:50:24 +00:00
Yaxun Liu 0844ff2aa7 Fix pointer EVT in SelectionDAGBuilder::visitAlloca
SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is
incorrect for triple amdgcn---amdgiz and causes isel failure.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40095

llvm-svn: 318392
2017-11-16 12:22:19 +00:00
Sam Parker 43fa5911a1 [DAGCombine] Enable more srl -> load combines
Change the calculation for the desired ValueType for non-sign
extending loads, as in those cases we don't care about the
higher bits. This creates a smaller ExtVT and allows for such
combinations as:
(srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1])

Differential Revision: https://reviews.llvm.org/D40034

llvm-svn: 318390
2017-11-16 11:28:26 +00:00
Benjamin Kramer bd20e9755f Assert correct removal of SUnit in LatencyPriorityQueue
The LatencyPriorityQueue doesn't currently check whether the SU being removed really exists in the Queue.
This method fails quietly when SU is not found and removes the last element from the Queue, leading to unexpected behavior.

Unfortunately, this only occurs on our custom target, with the custom scheduler. In our case, when remove() is invoked, it removes the wrong SU at the end of the Queue, which is only discovered later when VerifyScheduledDAG() is invoked and finds that some nodes were not scheduled at all.

As this is only reproducible with a lot of proprietary code, I'm hopeful this assert is straightforward enough to not necessitate a test.

Patch by Ondrej Glasnak!

Differential Revision: https://reviews.llvm.org/D40084

llvm-svn: 318387
2017-11-16 10:18:07 +00:00
Mikael Holmen 56e4abc2cf [MachineRegisterInfo] Avoid having dbg.values affect code generation
Summary:
Use use_nodbg_empty() rather than use_empty() in
MachineRegisterInfo::EmitLiveInCopies() when determining if a livein
register has any uses or not. Otherwise a single dbg.value can make us
generate different code, meaning -g would affect code generation.

Found when compiling code for my out-of-tree target. Unfortunately I
haven't been able to reproduce the problem on X86 or any of the other
in-tree targets that I tried, so no test case.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39044

llvm-svn: 318382
2017-11-16 07:01:23 +00:00
Craig Topper 36e8d66e1a [SelectionDAG] Use report_fatal_error instead of llvm_unreachable in some code that can be reached if targets don't configure things correctly.
For example, this is currently reachable by X86 if you use a masked store intrinsic with a v1iX type.

Using a fatal error seems like a better user experience if someone were to encounter this on a release build. There are several other similar places that have been converted from unreachable to fatal error previously.

llvm-svn: 318379
2017-11-16 06:02:03 +00:00
Yaxun Liu 4d9a4d7ac8 Fix APInt bit size in processDbgDeclares
processDbgDeclares assumes pointer size is the same for different addr spaces.
It uses pointer size for addr space 0 for all pointers, which causes assertion
in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since
pointer in addr space 5 has different size than in addr space 0.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40085

llvm-svn: 318370
2017-11-16 02:54:49 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Rong Xu e4572c6b73 [CodeGen] Fix the branch probability assertion in r318202
Due to integer precision, we might have numerator greater than denominator in
the branch probability scaling. Add a check to prevent this from happening.

llvm-svn: 318353
2017-11-16 00:14:05 +00:00
Aditya Nandakumar 954eea074b [GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into GlobalISel/Utils for use elsewhere
llvm-svn: 318350
2017-11-15 23:45:04 +00:00
Jonas Devlieghere 294e689509 [DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs.
In constructAbstractSubprogramScopeDIE there can be a potential mismatch
between `this` and the CU of ContextDIE when a scope is shared between
two DISubprograms belonging to a different CU. In that case, `this` is
the CU that was specified in the IR, but the CU of ContextDIE is that of
the first subprogram that was emitted. This patch fixes the mismatch by
looking up the CU of ContextDIE, and switching to use that.

This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212)

Patch by Philip Craig!

Differential revision: https://reviews.llvm.org/D39981

llvm-svn: 318289
2017-11-15 10:57:05 +00:00
Fangrui Song e73534464d NFC Remove default argument of DataLayout::getPointerABIAlignment
Differential Revision: https://reviews.llvm.org/D40005

llvm-svn: 318272
2017-11-15 06:17:32 +00:00
Aditya Nandakumar e6201c8724 [GISel]: Rework legalization algorithm for better elimination of
artifacts along with DCE

Legalization Artifacts are all those insts that are there to make the
type system happy. Currently, the target needs to say all combinations
of extends and truncs are legal and there's no way of verifying that
post legalization, we only have *truly* legal instructions. This patch
changes roughly the legalization algorithm to process all illegal insts
at one go, and then process all truncs/extends that were added to
satisfy the type constraints separately trying to combine trivial cases
until they converge. This has the added benefit that, the target
legalizerinfo can only say which truncs and extends are okay and the
artifact combiner would combine away other exts and truncs.

Updated legalization algorithm to roughly the following pseudo code.

WorkList Insts, Artifacts;
collect_all_insts_and_artifacts(Insts, Artifacts);

do {
  for (Inst in Insts)
         legalizeInstrStep(Inst, Insts, Artifacts);
  for (Artifact in Artifacts)
         tryCombineArtifact(Artifact, Insts, Artifacts);
} while(!Insts.empty());

Also, wrote a simple wrapper equivalent to SetVector, except for
erasing, it avoids moving all elements over by one and instead just
nulls them out.

llvm-svn: 318210
2017-11-14 22:42:19 +00:00
Rong Xu 3573d8da36 [CodeGen] Peel off the dominant case in switch statement in lowering
This patch peels off the top case in switch statement into a branch if the
probability exceeds a threshold. This will help the branch prediction and
avoids the extra compares when lowering into chain of branches.

Differential Revision: http://reviews.llvm.org/D39262

llvm-svn: 318202
2017-11-14 21:44:09 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Easwaran Raman 0d55b55bb6 [CodeGenPrepare] Disable div bypass when working set size is huge.
Summary:
Bypass of slow divs based on operand values is currently disabled for
-Os. Do the same when profile summary is available and the working set
size of the application is huge. This is similar to how loop peeling is
guarded by hasHugeWorkingSetSize. In the div bypass case, the generated
extra code (and the extra branch) tendss to outweigh the benefits of the
bypass. This results in noticeable performance improvement on an
internal application.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39992

llvm-svn: 318179
2017-11-14 19:31:51 +00:00
Yaxun Liu 0b2f73fd84 CodeGen: Fix TargetLowering::LowerCallTo for sret value type
TargetLowering::LowerCallTo assumes that sret value type corresponds to a
pointer in default address space, which is incorrect, since sret value type
should correspond to a pointer in alloca address space, which may not
be the default address space. This causes assertion for amdgcn target
in amdgiz environment.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39996

llvm-svn: 318167
2017-11-14 18:46:52 +00:00
Sam Clegg 999660761e [WebAssembly] Explicily disable comdat support for wasm output
For now at least.  We clearly need some kind of comdat or
linkonce_odr support for wasm but currently COMDAT is not
supported.

Disable COMDAT support in the same way we do the Mach-O.  This
also causes clang not to generated COMDATs.

Differential Revision: https://reviews.llvm.org/D39873

llvm-svn: 318123
2017-11-14 00:49:16 +00:00
Adrian Prantl 73d0e94e82 Fix an assertion in SelectionDAG::transferDbgValues()
when transferring debug info describing the lower bits of an extended SDNode.

rdar://problem/35504722

llvm-svn: 318086
2017-11-13 21:24:54 +00:00
Simon Dardis 8222160eb3 Revert "[CodeGenPrepare] Check that erased sunken address are not reused"
This reverts commit r318032. The test broke some sanitizer bots.

llvm-svn: 318049
2017-11-13 16:41:17 +00:00
Simon Dardis 8e2a5bd235 [CodeGenPrepare] Check that erased sunken address are not reused
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.

Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.

This partially resolves PR35209.

Thanks to Alexander Richardson for reporting the issue!

Reviewers: john.brawn

Differential Revision: https://reviews.llvm.org/D39841

llvm-svn: 318032
2017-11-13 11:47:21 +00:00
Matt Arsenault 88efb9ff8e MI: Print ranges on MMO
llvm-svn: 318020
2017-11-13 07:09:20 +00:00
Daniel Sanders 7e52367398 [globalisel][tablegen] Import signextload and zeroextload.
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
  (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
  (sext:i32 (load:i16 $ptr)<<unindexedload>>)

I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.

Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.

llvm-svn: 317971
2017-11-11 03:23:44 +00:00
Craig Topper bdb8db4589 [SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored.

llvm-svn: 317950
2017-11-10 23:36:56 +00:00
Craig Topper ffd48e3c27 [SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.

This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.

We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.

Differential Revision: https://reviews.llvm.org/D39911

llvm-svn: 317947
2017-11-10 22:50:50 +00:00
Amaury Sechet 3f0f650f49 [DAGcombine] Do not replace truncate node by itself when doing constant folding, this trigger needless extra rounds of combine for nothing. NFC
llvm-svn: 317926
2017-11-10 20:59:53 +00:00
Jatin Bhateja aaa5944ad4 [WebAssembly] Fix stack offsets of return values from call lowering.
Summary: Fixes PR35220

Reviewers: vadimcn, alexcrichton

Reviewed By: alexcrichton

Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D39866

llvm-svn: 317895
2017-11-10 16:26:04 +00:00
Alexander Timofeev 28da06778f [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

llvm-svn: 317884
2017-11-10 12:21:10 +00:00
Karl-Johan Karlsson bd5c522e4d [RegisterCoalescer] Move debug value after rematerialize trivial def
Summary:
The associated debug value is updated when the virtual source register
of a copy is completely eliminated and replaced with a rematerialize
value in the defed register of the copy. As the debug value now is
associated with another register it also need to be moved, otherwise
the debug value isn't valid.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: https://reviews.llvm.org/D38024

llvm-svn: 317880
2017-11-10 09:48:40 +00:00
Jonas Paulsson 4b017e682d [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
Adrian Prantl 1c8c544946 Preserve debug info when DAG-combinging (zext (truncate x)) -> (and x, mask).
rdar://problem/27139077

llvm-svn: 317825
2017-11-09 19:50:20 +00:00
Mandeep Singh Grang 8c60365d74 [GlobalMerge] Stable sort GlobalSets to fix non-deterministic sort order
Summary: This fixes failure in CodeGen/AArch64/global-merge-group-by-use.ll uncovered by D39245.

Reviewers: ab, asl

Reviewed By: ab

Subscribers: aemerson, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D39635

llvm-svn: 317817
2017-11-09 18:05:17 +00:00
Andrew V. Tischenko 3543f0a712 Add -print-schedule scheduling comments to inline asm.
Differential Revision: https://reviews.llvm.org/D39728

llvm-svn: 317782
2017-11-09 12:45:40 +00:00
Adrian Prantl a8e56458e6 Let replaceVTableHolder accept any type.
In Rust, a trait can be implemented for any type, and if a trait
object pointer is used for the type, then a virtual table will be
emitted for that trait/type combination.

We would like debuggers to be able to inspect trait objects, which
requires finding the concrete type associated with a given vtable.

This patch changes LLVM so that any type can be passed to
replaceVTableHolder. This allows the Rust compiler to emit the needed
debug info -- associating a vtable with the concrete type for which it
was emitted.

This is a DWARF extension: DWARF only specifies the meaning of
DW_AT_containing_type in one specific situation. This style of DWARF
extension is routine, though, and LLVM already has one such case for
DW_AT_containing_type.

Patch by Tom Tromey!

Differential Revision: https://reviews.llvm.org/D39503

llvm-svn: 317730
2017-11-08 22:04:43 +00:00
Dan Gohman 2c74fe977d Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Reid Kleckner 7adb2fdbba Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.

There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.

When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
  int a, c, d, e, f, g, h, i, j, k, l, m;
  void n(int o, int *b) {
    if (g)
      f = 0;
    for (; f < o; f++) {
      m = a;
      if (l > j * k > i)
        j = i = k = d;
      h = b[c] - e;
    }
  }

We get assembly that looks like this:
; BB#1:                                 ; %if.then
Lloh3:
	adrp	x9, _f@GOTPAGE
Lloh4:
	ldr	x9, [x9, _f@GOTPAGEOFF]
	mov	 w8, wzr
Lloh5:
	str		wzr, [x9]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.lt	LBB0_3
	b	LBB0_7
LBB0_2:                                 ; %entry.if.end_crit_edge
Lloh6:
	adrp	x8, _f@GOTPAGE
Lloh7:
	ldr	x8, [x8, _f@GOTPAGEOFF]
Lloh8:
	ldr		w8, [x8]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.ge	LBB0_7
LBB0_3:                                 ; %for.body.lr.ph

Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.

llvm-svn: 317726
2017-11-08 21:31:14 +00:00
Alex Bradbury fa18b9e73c Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred 
as 1. D37065 sets the previously inferred properties explicitly. This patch sets 
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has 
been updated so it still returns false for PHI.

Additionally, HexagonBitSimplify relied on a PHI node having the 
hasUnmodeledSideEffects property. This patch fixes that assumption.

Differential Revision: https://reviews.llvm.org/D37097

llvm-svn: 317721
2017-11-08 20:19:16 +00:00
Adrian Prantl 93faeecd8f Handle inlined variables in SelectionDAGBuilder::EmitFuncArgumentDbgValue().
In 2010 a commit with no testcase and no further explanation
explicitly disabled the handling of inlined variables in
EmitFuncArgumentDbgValue(). I don't think there is a good reason for
this any more and re-enabling this adds debug locations for variables
associated with an LLVM function argument in functions that are
inlined into the first basic block. The only downside of doing this is
that we may insert a DBG_VALUE before the inlined scope, but (1) this
could be filtered out later, and (2) LiveDebugValues will not
propagate it into subsequent basic blocks if they don't dominate the
variable's lexical scope, so this seems like a small price to pay.

rdar://problem/26228128

llvm-svn: 317702
2017-11-08 18:27:13 +00:00