Commit Graph

21646 Commits

Author SHA1 Message Date
David Majnemer d065e23dac [codeview] Return type indices for typedefs
Use the type index of the underlying type unless we have a typedef from
long to HRESULT; HRESULT typedefs are translated to T_HRESULT.

llvm-svn: 271494
2016-06-02 06:21:37 +00:00
Matt Arsenault 8f4d43a41f Make MachineCopyPropagation preserve CFG
This doesn't touch it as far as I can tell.

llvm-svn: 271445
2016-06-02 00:04:26 +00:00
Justin Bogner f807dce6da SDAG: Drop a redundant replace and move the dead node removal closer. NFC
llvm-svn: 271429
2016-06-01 20:55:26 +00:00
Michael Kuperstein 738ae45ce8 [DAG] Improve legalization of INSERT_SUBVECTOR
When the index is known to be constant 0, insert directly into the the low half,
instead of spilling, performing the insert in-memory, and reloading.

Differential Revision: http://reviews.llvm.org/D20763

llvm-svn: 271428
2016-06-01 20:49:35 +00:00
Than McIntosh 4ef761aa35 Better fix for PR27903.
Summary:
Re-enable lifetime-start-on-first-use for stack coloring,
but explicitly disable it for slots with more than one start
or end lifetime marker.

Bug: 27903

Reviewers: wmi, tejohnson, qcolombet, gbiv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20739

llvm-svn: 271412
2016-06-01 17:55:10 +00:00
Reid Kleckner 846edb6fdc Fix the NDEBUG build
llvm-svn: 271411
2016-06-01 17:31:24 +00:00
Reid Kleckner 5acacbb04f [codeview] Translate basic DITypes to CV type records
Summary:
This is meant to be the tiniest step towards DIType to CV type index
translation that I could come up with. Whenever translation fails, we use type
index zero, which is the unknown type.

Reviewers: aaboud, zturner

Subscribers: llvm-commits, amccarth

Differential Revision: http://reviews.llvm.org/D20840

llvm-svn: 271408
2016-06-01 17:05:51 +00:00
Peter Collingbourne b326986de0 DwarfDebug: Simplify. NFC.
llvm-svn: 271360
2016-06-01 02:58:40 +00:00
Petr Hosek faef3207de [MC] Rename EmitFill to emitFill
This is to match the overloaded variants as well as the new style.

Differential Revision: http://reviews.llvm.org/D20690

llvm-svn: 271359
2016-06-01 01:59:58 +00:00
Matt Arsenault 5d06439c54 DAGCombiner: Fix broken size check in isAlias
This should have been converting the size to bytes, but wasn't really.
These should probably all be using getStoreSize instead.

I haven't been able to come up with a meaningful testcase for this.
I can trigger it using combinations of struct loads and stores,
but can't observe a difference in non-broken testcases.

isAlias is only really used during store merging, so I'm not sure how
to get into the vector splitting situation the comment describes
since store merging is only done before type legalization.

llvm-svn: 271356
2016-06-01 01:00:36 +00:00
Matthias Braun f9acacaa92 CodeGen: Refactor renameDisconnectedComponents() as a pass
Refactor LiveIntervals::renameDisconnectedComponents() to be a pass.
Also change the name to "RenameIndependentSubregs":

- renameDisconnectedComponents() worked on a MachineFunction at a time
  so it is a natural candidate for a machine function pass.

- The algorithm is testable with a .mir test now.

- This also fixes a problem where the lazy renaming as part of the
  MachineScheduler introduced IMPLICIT_DEF instructions after the number
  of a nodes in a region were counted leading to a mismatch.

Differential Revision: http://reviews.llvm.org/D20507

llvm-svn: 271345
2016-05-31 22:38:06 +00:00
Ahmed Bougacha 96ef87e910 [CodeGen] Promote FMINNAN/FMAXNAN like other binops.
We think it's OK to generate half fminnan because it's legal for the
transform-to type (f32; r245196). However, PromoteFloatRes was missing
the case; simply promote like the other binops, including minnum.

llvm-svn: 271317
2016-05-31 18:50:25 +00:00
Ahmed Bougacha e4b3812ec2 [CodeGen] Don't mark FMINNUM/FMAXNUM Expand twice. NFC.
They're already in the all_valuetypes() loop above.

llvm-svn: 271316
2016-05-31 18:50:21 +00:00
Reid Kleckner fbdbe9e22b [codeview] Improve readability of type record assembly
Adds the method MCStreamer::EmitBinaryData, which is usually an alias
for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex
dump output like this:
        .byte   0x0e, 0x00, 0x08, 0x10
        .byte   0x03, 0x00, 0x00, 0x00
        .byte   0x00, 0x00, 0x00, 0x00
        .byte   0x00, 0x10, 0x00, 0x00

Also, when verbose asm comments are enabled, this patch prints the dump
output for each comment before its record, like this:
        # ArgList (0x1000) {
        #   TypeLeafKind: LF_ARGLIST (0x1201)
        #   NumArgs: 0
        #   Arguments [
        #   ]
        # }
        .byte   0x06, 0x00, 0x01, 0x12
        .byte   0x00, 0x00, 0x00, 0x00

This should make debugging easier and testing more convenient.

Reviewers: aaboud

Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits

Differential Revision: http://reviews.llvm.org/D20711

llvm-svn: 271313
2016-05-31 18:45:36 +00:00
Saleem Abdulrasool d2f705ddf9 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Rafael Espindola fd82f0501f Add RelaxELFRelocations to TargetOptions.h.
It will be used in clang.

llvm-svn: 271161
2016-05-29 01:57:20 +00:00
Andrew Kaylor 04f8e06696 Update the stack coloring pass to remove lifetime intrinsics in the optnone/opt-bisect skip case.
Differential Revision: http://reviews.llvm.org/D20453

llvm-svn: 271068
2016-05-27 22:56:49 +00:00
Matthias Braun 49cb6e909d MachineScheduler: Introduce ONLY1 reason to improve debug output
llvm-svn: 271058
2016-05-27 22:14:26 +00:00
Michael Kuperstein a75c77b127 [X86] Detect SAD patterns and emit psadbw instructions.
This recommits r267649 with a fix for PR27539.

Differential Revision: http://reviews.llvm.org/D20598

llvm-svn: 271033
2016-05-27 18:53:22 +00:00
Than McIntosh 4daf7f13b6 Disable lifetime-start-on-first-use analysis.
Summary:
Turn off lifetime-start-on-first-use enhancement for the moment
pending a fix for bug 27903.

Bug: 27903

Reviewers: tejohnson, wmi, qcolombet, gbiv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20731

llvm-svn: 271003
2016-05-27 15:27:51 +00:00
Benjamin Kramer 82de7d323d Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997
2016-05-27 14:27:24 +00:00
George Rimar c91e38c5eb Recommit 270977 - [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.
Fix: updated clang code which was not updated by mistake.

Original commit message:
[llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.

This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)

Difference that this patch supports both zlib and zlib-gnu styles.

-compress-debug-sections option now supports next values:

-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.

Differential revision: http://reviews.llvm.org/D20676

llvm-svn: 270987
2016-05-27 12:27:32 +00:00
Benjamin Kramer 3e9a5d3468 Apply clang-tidy's misc-static-assert where it makes sense.
Also fold conditions into assert(0) where it makes sense. No functional
change intended.

llvm-svn: 270982
2016-05-27 11:36:04 +00:00
George Rimar e79fc3efca Revert r270977 ([llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.)
It broke buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/13585/steps/build/logs/stdio

Initial commit message:
[llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.

This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)

Difference that this patch supports both zlib and zlib-gnu styles.

-compress-debug-sections option now supports next values:

-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.

Differential revision: http://reviews.llvm.org/D20676

llvm-svn: 270978
2016-05-27 10:06:16 +00:00
George Rimar 48dcd2b806 [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.
This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)

Difference that this patch supports both zlib and zlib-gnu styles.

-compress-debug-sections option now supports next values:

-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.

Differential revision: http://reviews.llvm.org/D20676

llvm-svn: 270977
2016-05-27 09:58:08 +00:00
Mitch Bodart 05aeeb5cf1 [CodeGen] Fix problem with X86 byte registers in CriticalAntiDepBreaker
CriticalAntiDepBreaker was not correctly tracking defs of the high X86 byte
registers, leading to incorrect use of a busy register to break an
antidependence.

Fixes pr27681, and its duplicates pr27580, pr27804.

Differential Revision: http://reviews.llvm.org/D20456

llvm-svn: 270935
2016-05-26 23:08:52 +00:00
Justin Bogner c04a76c176 SDAG: Use an Optional<> instead of a sigil value. NFC
This just makes it a bit more clear that we don't intend to use a
deleted node for anything here.

llvm-svn: 270931
2016-05-26 22:29:34 +00:00
Adrian Prantl 7509d54b21 PR26055: Speed up LiveDebugValues::transferDebugValue()
This patch builds upon r270776 and speeds up
LiveDebugValues::transferDebugValue() by adding an index that maps each
DebugVariable to its open VarLoc.

The transferDebugValue() function needs to close all open ranges for a
given DebugVariable. Iterating over the set bits of OpenRanges is
prohibitively slow in practice. I experimented with using the sorted map
of VarLocs in the UniqueVector to iterate only over the range of VarLocs
with a given DebugVariable, but the binary search turned out to be even
more expensive than just iterating over the set bits in OpenRanges.
Instead, this patch exploits the fact that there can only be one open
location for each DebugVariable and redundantly stores this location in a
DenseMap.

This patch brings the time spent in the LiveDebugValues pass down to an
almost neglectiable amount.

http://llvm.org/bugs/show_bug.cgi?id=26055
http://reviews.llvm.org/D20636
rdar://problem/24091200

llvm-svn: 270923
2016-05-26 21:42:47 +00:00
Krzysztof Parzyszek 143f684a79 Do not rename registers that do not start an independent live range
llvm-svn: 270885
2016-05-26 18:22:53 +00:00
Adrian Prantl aa9d6c3630 Undo a suboptimal clang-format decision. NFC
llvm-svn: 270861
2016-05-26 16:06:04 +00:00
Rafael Espindola a224de06bc Use shouldAssumeDSOLocal on AArch64.
This reduces code duplication and now AArch64 also handles PIE.

llvm-svn: 270844
2016-05-26 12:42:55 +00:00
Reid Kleckner 5d122f872d [codeview] Use comdats for debug info describing comdat functions
Summary:
This allows the linker to discard unused symbol information for comdat
functions that were discarded during the link. Before this change,
searching for the name of an inline function in the debugger would
return multiple results, one per symbol subsection in the object file.
After this change, there is only one result, the result for the function
chosen by the linker.

Reviewers: zturner, majnemer

Subscribers: aaboud, amccarth, llvm-commits

Differential Revision: http://reviews.llvm.org/D20642

llvm-svn: 270792
2016-05-25 23:16:12 +00:00
Adrian Prantl 00698731ed Work around an MSVC compiler issue in r270776.
llvm-svn: 270783
2016-05-25 22:37:29 +00:00
Adrian Prantl 6ee02c7fce PR26055: Speed up LiveDebugValues by replacing lists with bitvectors.
This patch modifies the LiveDebugValues pass to use more efficient set
data structures as outlined in PR26055. Both VarLocSet and VarLocList are
now SparseBitVectors which allows us to perform much faster bitvector
arithmetic on them.

The speedup can be in the order of minutes especially on ASANified code.

The change is not NFC in the assembler output because the inserted
DBG_VALUEs are now sorted by variable and location.

Many thanks to Daniel Berlin for helping design the improved algorithm and
reviewing the patch.

https://llvm.org/bugs/show_bug.cgi?id=26055
http://reviews.llvm.org/D20178
rdar://problem/24091200

llvm-svn: 270776
2016-05-25 22:21:12 +00:00
Chad Rosier dca7651d59 [MBB] Early exit to reduce indentation, per coding guidelines. NFC.
llvm-svn: 270773
2016-05-25 21:53:46 +00:00
Simon Pilgrim fdbc64beea Simplify std::all_of predicate (to one line) by using llvm::all_of. NFCI.
llvm-svn: 270749
2016-05-25 20:17:39 +00:00
Simon Pilgrim 0a6b95a60a Simplify std::all_of predicate (to one line) by using llvm::all_of. NFCI.
llvm-svn: 270747
2016-05-25 20:13:39 +00:00
Chad Rosier e5314a94eb [SelectionDAG] Add smarts for BSWAP in computeKnownBits.
llvm-svn: 270738
2016-05-25 17:52:38 +00:00
Hal Finkel 6f3387f434 [SDAG] Add a fallback multiplication expansion
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

llvm-svn: 270720
2016-05-25 16:50:22 +00:00
Chad Rosier a00df49dc5 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Matthias Braun 4c994ee42b ScheduleDAGInstrs: Fix memory corruption
We have to modify V2SU before inserting new elements into the
CurrentVRegDefs set because that may move V2SU in memory invalidating
the reference.

llvm-svn: 270644
2016-05-25 01:18:00 +00:00
Haicheng Wu 90a55651e6 [MBP] Factor out the optimizations on branch conditions and unanalyzable branches. NFCI.
The benefits of this patch are

-- We call AnalyzeBranch() to optimize unanalyzable branches, but the result of
   AnalyzeBranch() is not used. Now the result is useful.

-- Before the layout of all the MBBs is set, the result of AnalyzeBranch() is
   not correct and needs to be fixed before using it to optimize the branch
   conditions. Now this optimization is called after the layout, the code used
   to fix the result of AnalyzeBranch() is not needed.

-- The branch condition of the last block is not optimized before. Now it is
   optimized.

Differential Revision: http://reviews.llvm.org/D20177

llvm-svn: 270623
2016-05-24 22:16:14 +00:00
Matthias Braun fc4c8a1e46 LiveIntervalAnalysis: Fix handleMove() re-using the wrong value number
This fixes http://llvm.org/PR27856

llvm-svn: 270619
2016-05-24 21:54:01 +00:00
David Blaikie c53e18d93a DWARF: Omit DW_AT_APPLE attributes (except ObjC ones) when not targeting LLDB
These attributes aren't used by other debuggers (& may be confused with
other DWARF extensions) so they just waste space (about 1.5% on .dwo
file size on a random large program I tested).

We could remove the ObjC property ones too, but I figured they were
probably more necessary when trying to understand ObjC (I could be wrong
though) & so any debugger interested in working with ObjC would use
them, perhaps? (also, there are some legacy tests in Clang that test for
them - making it one of those annoying cross-project commits and/or
cleanup to refactor those tests)

llvm-svn: 270613
2016-05-24 21:19:28 +00:00
Than McIntosh 879ad8fa99 Rework/enhance stack coloring data flow analysis.
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.

Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827

Bug: 25776
llvm-svn: 270559
2016-05-24 13:23:44 +00:00
Justin Bogner 4a57bb5a3b PrologEpilogInserter: Avoid an infinite loop when MinCSFrameIndex == 0
Before r269750 we did the comparisons in this loop in signed ints so
that it DTRT when MinCSFrameIndex was 0. This was changed because it's
now possible for MinCSFrameIndex to be UINT_MAX, but that introduced a
bug when we were comparing `>= 0` - this is tautological in unsigned.

Rework the comparisons here to avoid issues with unsigned wrapping.

No test. I couldn't find a way to get any of the StackGrowsUp in-tree
targets to reach the code that sets MinCSFrameIndex.

llvm-svn: 270492
2016-05-23 21:40:52 +00:00
Reid Kleckner 2280f9325e Modify emitTypeInformation to use MemoryTypeTableBuilder, take 2
This effectively revers commit r270389 and re-lands r270106, but it's
almost a rewrite.

The behavior change in r270106 was that we could no longer assume that
each LF_FUNC_ID record got its own type index. This patch adds a map
from DINode* to TypeIndex, so we can stop making that assumption.

This change also emits padding bytes between type records similar to the
way MSVC does. The size of the type record includes the padding bytes.

llvm-svn: 270485
2016-05-23 20:23:46 +00:00
Wei Mi f3c8f532d2 InsertPointAnalysis: Move current live interval from being a class member
to query interfaces argument; NFC

Differential Revision: http://reviews.llvm.org/D20532

llvm-svn: 270481
2016-05-23 19:39:19 +00:00
Justin Lebar f6f4a2a972 Fix DEBUG logs in MachineLICM.
Summary:
MBBs don't necessarily have a name (in my experience, they almost never
do), in which case this logging is quite unhelpful.  The number seems to
work well.

Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20533

llvm-svn: 270477
2016-05-23 18:56:07 +00:00
Zachary Turner a78ecd1e6c [codeview] Refactor symbol records to use same pattern as types.
This will pave the way to introduce a full fledged symbol visitor
similar to how we have a type visitor, thus allowing the same
dumping code to be used in llvm-readobj and llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D20384
Reviewed By: rnk

llvm-svn: 270475
2016-05-23 18:49:06 +00:00
David Majnemer 6cd7c9185b Revert "Modify emitTypeInformation to use MemoryTypeTableBuilder"
This reverts commit r270106.  It results in certain function types
omitted in the output.

llvm-svn: 270389
2016-05-23 01:37:45 +00:00
Hal Finkel 7b1b3daf6e [LiveIntervalAnalysis] Don't dereference an end iterator in repairIntervalsInRange
This fixes a bug introduced in:

  r262115 - CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC

The iterator End here might == MBB->end(), and so we can't unconditionally
dereference it. This often goes unnoticed (I don't have a test case that always
crashes, and ASAN does not catch it either) because the function call arguments are
turned right back into iterators. MachineInstrBundleIterator's constructor,
however, does have an assert which might randomly fire.

llvm-svn: 270323
2016-05-21 16:03:50 +00:00
Quentin Colombet f2723a2a91 [RegBankSelect] Compute the repairing cost for copies.
Prior to this patch, we were using 1 for all the repairing costs.
Now, we use the information from the target to get this information.

llvm-svn: 270304
2016-05-21 01:43:25 +00:00
Matthias Braun 71f9564e7f LiveIntervalAnalysis: Rework constructMainRangeFromSubranges()
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
  differences between the main liverange and subranges because of hidden
  dead definitions. This case however cannot happen anymore with the
  DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
  values on merging control flow (the MachineVerifier missed most of
  these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
  LiveRangeCalc to better match the implementation/available helper
  functions.

This re-applies r269016. The fixes from r270290 and r270259 should avoid
the machine verifier problems this time.

llvm-svn: 270291
2016-05-20 23:14:56 +00:00
Matthias Braun e29b7689bd MachineVerifier: subregs so not require defs/valnos on every path
It is fine for subregister ranges to be undefined on some CFG paths as
we may have a "vregX:other_subreg<read-undef> =" def on that path. We
do not (and should not) have live segments for the subregister ranges.
The MachineVerifier should not complain about this.

This is a slight variant of http://llvm.org/PR27705

llvm-svn: 270290
2016-05-20 23:02:13 +00:00
Krzysztof Parzyszek ccf5ee0b8f Use report_fatal_error after all
Depending on the compiler used to build LLVM, llvm_unreachable can either
expand to a call to abort(), or to a __builtin_unreachable. The latter
does not have a predictable behavior at runtime.

llvm-svn: 270260
2016-05-20 19:46:42 +00:00
Matthias Braun 858d1df246 LiveIntervalAnalysis: Fix missing defs in renameDisconnectedComponents().
Fix renameDisconnectedComponents() creating vreg uses that can be
reached from function begin withouthaving a definition (or explicit
live-in). Fix this by inserting IMPLICIT_DEF instruction before
control-flow joins as necessary.

Removes an assert from MachineScheduler because we may now get
additional IMPLICIT_DEF when preparing the scheduling policy.

This fixes the underlying problem of http://llvm.org/PR27705

llvm-svn: 270259
2016-05-20 19:46:13 +00:00
Peter Collingbourne 5973bc8a82 CodeGen: Move the call to DwarfDebug::beginModule() out of the constructor.
This gives AsmPrinter a chance to initialize its DD field before
we call beginModule(), which is about to start using it.

Differential Revision: http://reviews.llvm.org/D20413

llvm-svn: 270258
2016-05-20 19:35:35 +00:00
Peter Collingbourne 96c9ae6a20 CodeGen: Do not require a MachineFunction just to create a DIEDwarfExpression.
We are about to start using DIEDwarfExpression to create global variable
DIEs, which happens before we generate code for functions.

Differential Revision: http://reviews.llvm.org/D20412

llvm-svn: 270257
2016-05-20 19:35:17 +00:00
Quentin Colombet 79fe1bea6b [RegBankSelect] Look for the best mapping in greedy mode.
The Fast mode takes the first mapping, the greedy mode loops over all
the possible mapping for an instruction and choose the cheaper one.
Test case will come with target specific code, since we currently do not
have instructions that have several mappings.

llvm-svn: 270249
2016-05-20 18:37:33 +00:00
Quentin Colombet 4f147a54a1 [RegBankSelect] Get rid of a now dead method: setSafeInsertPoint.
This is now encapsulated in the RepairingPlacement class.

llvm-svn: 270247
2016-05-20 18:17:16 +00:00
Quentin Colombet 6e80dbcde3 [RegBankSelect] Take advantage of a potential best cost information in
computeMapping.

Computing the cost of a mapping takes some time.
Since in Fast mode, the cost is irrelevant, just spare some cycles by not
computing it.
In Greedy mode, we need to choose the best cost, that means that when
the local cost gets more expensive than the best cost, we can stop
computing the repairing and cost for the current mapping.

llvm-svn: 270245
2016-05-20 18:00:46 +00:00
Quentin Colombet 25fcef73de [RegBankSelect] Use frequency and probability information to compute
more precise cost in Greedy mode.

In Fast mode the cost is irrelevant so do not bother requiring that
those passes get scheduled.

llvm-svn: 270244
2016-05-20 17:54:09 +00:00
Quentin Colombet a553012874 [RegBankSelect] Use the Fast mode for functions with the optnone attribute.
llvm-svn: 270242
2016-05-20 17:36:54 +00:00
Quentin Colombet 46df722eb0 [RegBankSelect] Specify different optimization mode for the pass.
The mode should be choose by the target when instantiating the pass.

llvm-svn: 270235
2016-05-20 16:55:35 +00:00
Krzysztof Parzyszek 64439ac775 Fix error reporting in register scavenger (lack of emergency spill slot)
- Do not store Twine objects.
- Remove report_fatal_error, since llvm_unreachable does terminate the
  program in release mode.

llvm-svn: 270233
2016-05-20 16:38:34 +00:00
Quentin Colombet f75c2bfc6b [RegBankSelect] Add a method to avoid splitting while repairing.
The previous choice of the insertion points for repairing was
straightfoward but may introduce some basic block or edge splitting. In
some situation this is something we can avoid.
For instance, when repairing a phi argument, instead of placing the
repairing on the related incoming edge, we may move it to the previous
block, before the terminators. This is only possible when the argument
is not defined by one of the terminator.

llvm-svn: 270232
2016-05-20 16:36:12 +00:00
Krzysztof Parzyszek ce6f3bdee4 Correction to r270219: fix detection of invalid frame index
llvm-svn: 270220
2016-05-20 14:34:03 +00:00
Krzysztof Parzyszek 70b1eee793 Skip entries with invalid indexes in the search loop in register scavenger
llvm-svn: 270219
2016-05-20 14:18:54 +00:00
Diana Picus 86f1f4ca77 Fix some comment typos in SelectionDAGBuilder. NFC
llvm-svn: 270190
2016-05-20 08:06:31 +00:00
Quentin Colombet d84d00baf1 [RegBankSelect] Refactor the code to split the repairing and mapping of
an instruction.

Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placement and
then, only apply the best one.

llvm-svn: 270168
2016-05-20 00:55:51 +00:00
Quentin Colombet 5565075418 [RegBankSelect] Add helper class for repairing code placement.
When assigning the register banks we may have to insert repairing code
to move already assigned values accross register banks.

Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
  where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
  operand plus the kind of action that is required to do the repairing.

This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.

llvm-svn: 270167
2016-05-20 00:49:10 +00:00
Quentin Colombet 0d77da4ef8 [RegBankSelect] Refactor assignmentMatch to avoid testing the current
register bank twice.

Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
requires to insert repair code.
We actually already have this information from the first check, so just
pass it along.

NFCI.

llvm-svn: 270166
2016-05-20 00:42:57 +00:00
Rafael Espindola 78d947b4f5 Fix pr27728.
Sorry for the lack testcase. There is one in the pr, but it depends on
std::sort and the .ll version is 110 lines, so I don't think it is
wort it.

The bug was that we were sorting after adding a terminator, and the
sorting algorithm could end up putting the terminator in the middle of
the List vector.

With that we would create a Spans map entry keyed on nullptr which would
then be added to CUs and fail in that sorting.

llvm-svn: 270165
2016-05-20 00:38:28 +00:00
Quentin Colombet cfd97b9386 [RegBankSelect] Introduce MappingCost helper class.
This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for few
instructions (e.g., phis and terminators), the cost may be non-local and
then, we need to account for the frequency of the involved basic blocks.

This will be used by the greedy mode I am working on.

llvm-svn: 270163
2016-05-20 00:35:26 +00:00
Rafael Espindola 0a78f8c463 clang-format. NFC.
llvm-svn: 270156
2016-05-19 23:17:37 +00:00
Quentin Colombet b926bdac4c Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.
Using Chandler's words from r265331:
This commit was greatly exacerbating PR17409 and effectively regressed
build time for lot of (very large) code when compiled with ASan or MSan.

PR17409 is fixed by r269249, so this is fine to reapply r263460.

Original commit message:
The bad behavior happens when we have a function with a long linear
chain of basic blocks, and have a live range spanning most of this
chain, but with very few uses.

Let say we have only 2 uses.

The Hopfield network is only seeded with two active blocks where the
uses are, and each iteration of the outer loop in
`RAGreedy::growRegion()` only adds two new nodes to the network due to
the completely linear shape of the CFG.  Meanwhile,
`SpillPlacer->iterate()` visits the whole set of discovered nodes, which
adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening
on the frontier where new nodes are being added. The internal nodes in
the network are not likely to be flip-flopping much, or they will at
least settle down very quickly. This means that while
`SpillPlacer->iterate()` is recomputing all the nodes in the network, it
is probably only the two frontier nodes that are changing their output.

Instead of recomputing the whole network on each iteration, we can
maintain a SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its
  neighbors are added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should
converge to a local minimum, but there is no guarantee that it will find
a global minimum. It is possible that updating nodes in a different
order will cause us to switch to a different local minimum. In other
words, this is not NFC, but although I saw a few runtime improvements
and regressions when I benchmarked this change, those were side effects
and actually the performance change is in the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his
feedbacks, guidance and time for the review.

llvm-svn: 270149
2016-05-19 22:40:37 +00:00
Matthew Simpson 476c0afc01 [ARM, AArch64] Match additional patterns to ldN instructions
When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.

Differential Revision: http://reviews.llvm.org/D20250

llvm-svn: 270142
2016-05-19 21:39:00 +00:00
Adrian McCarthy a972d6121e Modify emitTypeInformation to use MemoryTypeTableBuilder
A baby step toward translating DIType records to CodeView.

This does not (yet) combine the record length with the record data. I'm going back and forth trying to determine if that's a good idea.

llvm-svn: 270106
2016-05-19 20:12:56 +00:00
Matthew Simpson 330a125542 [ARM, AArch64] Properly initialize InterleavedAccessPass
InterleavedAccessPass is an IR-level pass, so this change will enable testing
it with opt. This is part of D20250.

llvm-svn: 270101
2016-05-19 20:08:32 +00:00
Mitch Bodart 6453501403 CodeGen: Move check of EnablePostRAScheduler to avoid disabling antidependency breaker
Previously, specifying -post-RA-scheduler=true had the side effect of
disabling the antidependency breaker, yielding different behavior than
if the post-RA-scheduler was enabled via the scheduling model.

Differential Revision: http://reviews.llvm.org/D20186

llvm-svn: 270077
2016-05-19 16:40:49 +00:00
Sanjay Patel f39f42d3fb [SelectionDAG] rename/move isKnownToBeAPowerOfTwo() from TargetLowering (NFC)
There are at least 2 places (DAGCombiner, X86ISelLowering) where this could be used instead
of ad-hoc and watered down code that is trying to match a power-of-2 pattern.

Differential Revision: http://reviews.llvm.org/D20439

llvm-svn: 270073
2016-05-19 15:53:52 +00:00
Peter Collingbourne fe12d0e3e5 CodeGen: Make the global-merge pass independently testable, and add a test.
llvm-svn: 270023
2016-05-19 04:38:56 +00:00
Sanjay Patel b2bcd95aab reduce indentation; NFCI
llvm-svn: 270007
2016-05-19 00:33:07 +00:00
Haicheng Wu c01919e796 [MBP] Remove a redundant skipFunction(). NFC.
skipFunction() is called twice.

Differential Revision: http://reviews.llvm.org/D20377

llvm-svn: 269994
2016-05-18 22:34:45 +00:00
Krzysztof Parzyszek 14a1c18448 When looking for a spill slot in reg scavenger, find one that matches RC
When looking for an available spill slot, the register scavenger would stop
after finding the first one with no register assigned to it. That slot may
have size and alignment that do not meet the requirements of the register
that is to be spilled. Instead, find an available slot that is the closest
in size and alignment to one that is needed to spill a register from RC.

Differential Revision: http://reviews.llvm.org/D20295

llvm-svn: 269969
2016-05-18 18:16:00 +00:00
Hans Wennborg 8eb336c14e Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.

llvm-svn: 269949
2016-05-18 16:10:17 +00:00
Zachary Turner 63a2846e84 [codeview] Some cleanup of Symbol Records.
* Reworks the CVSymbolTypes.def to work similarly to TypeRecords.def.
* Moves some enums from SymbolRecords.h to CodeView.h to maintain
  consistency with how we do type records.
* Generalize a few simple things like the record prefix
* Define the leaf enum and the kind enum similar to how we do with tyep
  records.

Differential Revision: http://reviews.llvm.org/D20342
Reviewed By: amccarth, rnk

llvm-svn: 269867
2016-05-17 23:50:21 +00:00
Paul Robinson 101772128a [DwarfDebug] Make tuning predicates private, should be used only in ctor.
llvm-svn: 269859
2016-05-17 22:53:20 +00:00
Adrian Prantl 6323ddf99c Debug Info: Introduce a DwarfDebug::UseDWARF2Bitfields flag
instead of having DwarfUnit query the debugger tuning options.

Follow-up commmit to r269827.
Thanks to Paul Robinson for pointing this out!

llvm-svn: 269840
2016-05-17 21:07:16 +00:00
Adrian Prantl f0a41089ff Debug Info: Don't emit bitfields in the DWARF4 format when tuning for GDB.
As discovered in PR27758, GDB does not fully support the DWARF 4 format.
This patch ensures we always emit bitfields in the DWARF 2 when tuning for GDB.

llvm-svn: 269827
2016-05-17 20:12:08 +00:00
Renato Golin 38ed8021c7 Fix an assert in SelectionDAGBuilder when processing inline asm
When processing inline asm that contains errors, make sure we can recover
gracefully by creating an UNDEF SDValue for the inline asm statement before
returning from SelectionDAGBuilder::visitInlineAsm. This is necessary for
consumers that don't exit on the first error that is emitted (e.g. clang)
and that would assert later on.

Fixes PR24071.

Patch by Diana Picus.

llvm-svn: 269811
2016-05-17 19:52:01 +00:00
Rafael Espindola 712f957cae Simplify handling of hidden stub.
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.

This means we can use a single set for all stubs in those platforms.

llvm-svn: 269776
2016-05-17 16:01:32 +00:00
Derek Schuff 1aaf87e91d Factor PrologEpilogInserter around spilling, frame finalization, and scavenging
PrologEpilogInserter has these 3 phases, which are related, but not
all of them are needed by all targets. This patch reorganizes PEI's
varous functions around those phases for more clear separation. It also
introduces a new TargetMachine hook, usesPhysRegsForPEI, which is true
for non-virtual targets. When it is true, all the phases operate as
before, and PEI requires the AllVRegsAllocated property on
MachineFunctions. Otherwise, CSR spilling and scavenging are skipped and
only prolog/epilog insertion/frame finalization is done.

Differential Revision: http://reviews.llvm.org/D18366

llvm-svn: 269750
2016-05-17 08:49:59 +00:00
Adrian Prantl 7aa34c8cbb Debug Info: Don't emit a DW_AT_data_member_location for DWARF bitfields.
The DWARF spec states that a member entry may have either a
DW_AT_data_member_location or a DW_AT_data_bit_offset, but not both.

This fixes a bug found in PR 27758.

llvm-svn: 269731
2016-05-17 02:37:53 +00:00
Easwaran Raman 01d98ba0b2 Remove .hot and .unlikely prefixes from function section names.
This code currently relies on static methods in ProfileSummary to determine whether a function is hot or unlikley. I am refactoring the ProfileSummary code and these methods will be removed. As discussed offline, the right way to re-introduce this is to add a pass to annotate functions with unlikely/hot hints and use the hints to determine the prefix here.

llvm-svn: 269726
2016-05-16 23:59:04 +00:00
Adrian Prantl e7d833defb Debug info: Don't emit a DW_AT_byte_size when emitting a DWARF4 bit field.
The DWARF spec clearly states that a bit field member should have either a
DW_AT_byte_size or a DW_AT_bit_size, but not both.
Also the DW_AT_byte_size is redundant with the size of the type of the member.

This fixes a bug found in PR 27758.

llvm-svn: 269714
2016-05-16 22:45:10 +00:00
Rafael Espindola e64619ce6e Fail early on unknown appending linkage variables.
In practice only a few well known appending linkage variables work.

Currently if codegen sees an unknown appending linkage variable it will
just print it as a regular global. That is wrong as the symbol in the
produced object file has different semantics as the one provided by the
appending linkage.

This just errors early instead of producing a broken .o.

llvm-svn: 269706
2016-05-16 21:14:24 +00:00
Matt Arsenault c31a9d0671 SelectionDAG: Select min/max when both are used
Allow two users of the condition if the other user
is also a min/max select. i.e.

%c = icmp slt i32 %x, %y
%min = select i1 %c, i32 %x, i32 %y
%max = select i1 %c, i32 %y, i32 %x

llvm-svn: 269699
2016-05-16 20:58:23 +00:00
Chad Rosier 1cb56a1850 Remove extra whitespace. NFC.
llvm-svn: 269685
2016-05-16 20:03:02 +00:00
Reid Kleckner 4525fbe22a [codeview] Align class and print names of types
Summary: This way we can get rid of one of the fields in the .def file.

Reviewers: llvm-commits

Subscribers: zturner

Differential Revision: http://reviews.llvm.org/D20251

llvm-svn: 269461
2016-05-13 19:37:07 +00:00
Jun Bum Lim be11bdc4b0 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes  fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

llvm-svn: 269456
2016-05-13 18:38:35 +00:00
Tom Stellard 740af6f3b0 Revert "LiveIntervalAnalysis: Rework constructMainRangeFromSubranges()"
This reverts commit r269016 and also the follow-up commit r269020.

This patch caused PR27705.

llvm-svn: 269344
2016-05-12 20:27:40 +00:00
Xinliang David Li b840bb8714 Fix option description /NFC
llvm-svn: 269307
2016-05-12 16:39:02 +00:00
Simon Pilgrim 89b89650f3 [SelectionDAG] Attempt to split BITREVERSE vector legalization into BSWAP and BITREVERSE stages
For BITREVERSE, bit shifting/masking every bit in a vector element is a very lengthy procedure.

If the input vector type is a whole multiple of bytes wide then we can split this into a BSWAP shuffle stage (to reverse at the byte level) and then a BITREVERSE stage applied to each byte. Most vector capable targets can efficiently BSWAP using shuffles resulting in a considerable reduction in instructions.

With this patch targets would only need to implement a target specific vXi8 BITREVERSE implementation to efficiently reverse most legal vector types.

Differential Revision: http://reviews.llvm.org/D19978

llvm-svn: 269290
2016-05-12 13:09:49 +00:00
Xinliang David Li f0ab6dfedc [Layout] Add a new option (NFC)
Currently cost based loop rotation algo can only be turned on with
two conditions: the function has real profile data, and -precise-rotation-cost
flag is turned on. This is not convenient for developers to experiment
when profile is not available. Add a new option to force the new
rotation algorithm -force-precise-rotation-cost

llvm-svn: 269266
2016-05-12 02:04:41 +00:00
Wei Mi 8c4136b0d8 Fix a bug when hoist spill to a BB with landingpad successor.
This is to fix the bug in https://llvm.org/bugs/show_bug.cgi?id=27612.

When spill is hoisted to a BB with landingpad successor, and if the VNI
of the spill reg lives into the landingpad successor, the spill should be
inserted before the call which may throw exception. InsertPointAnalysis
is used to compute the safe insert point.

http://reviews.llvm.org/D20027 is a preparing patch for this patch.

Differential Revision: http://reviews.llvm.org/D19884.

llvm-svn: 269249
2016-05-11 22:37:43 +00:00
Wei Mi 35ee9339a8 [NFC] Extract LastSplitPoint computation from SplitAnalysis to a new class
InsertPointAnalysis.

Because both split and spill hoisting want to use LastSplitPoint computation
result, extract the LastSplitPoint computation from SplitAnalysis class which
also contains a bunch of other analysises only related to split.

Differential Revision: http://reviews.llvm.org/D20027.

llvm-svn: 269248
2016-05-11 22:28:29 +00:00
Matthias Braun 30668dd802 MachineVerifier: Fix error reporting.
Do not use getVRegDef() to print "the definition" of a vreg. If there
are multiple or none the function will fail.

llvm-svn: 269239
2016-05-11 21:31:39 +00:00
Justin Bogner b3534c494f SDAG: Have SelectNodeTo replace uses if it CSE's instead of morphing a node
It's awkward to force callers of SelectNodeTo to figure out whether
the node was morphed or CSE'd. Update uses here instead of requiring
callers to (sometimes) do it.

llvm-svn: 269235
2016-05-11 21:00:33 +00:00
Rafael Espindola 83658d6e7a Return a StringRef from getSection.
This is similar to how getName is handled.

llvm-svn: 269218
2016-05-11 18:21:59 +00:00
Zachary Turner ae3882a19a Refactor CodeView type records to use common code.
Differential Revision: http://reviews.llvm.org/D20138
Reviewed By: rnk

llvm-svn: 269216
2016-05-11 17:47:35 +00:00
Sanjay Patel 87f6ed6f48 fix typos in comments; NFC
llvm-svn: 269206
2016-05-11 17:00:07 +00:00
Rafael Espindola 610a4e916e Merge two unreachable cases.
llvm-svn: 269189
2016-05-11 14:41:30 +00:00
Justin Bogner 1df01f0e31 SDAG: Make SelectCodeCommon return void
This means SelectCode unconditionally returns nullptr now. I'll follow
up with a change to make that return void as well, but it seems best
to keep that one very mechanical.

This is part of the work to have Select return void instead of an
SDNode *, which is in turn part of llvm.org/pr26808.

llvm-svn: 269136
2016-05-10 22:58:26 +00:00
Matthias Braun 8a5b46737a ScheduleDAGInstrs: Comment on why subreg defs are not seen as uses; NFC
Usually subregister definitions are consider uses of the remaining
lanes that did not get defined. Add a comment why the code in
ScheduleDAGInstrs does not add use dependencies regardless.

llvm-svn: 269107
2016-05-10 20:11:58 +00:00
Adrian Prantl 723ccd2790 Debug Info: Prevent DW_AT_abstract_origin from being emitted twice
for the same subprogram.

This fixes a bug where DW_AT_abstract_origin is being emitted twice for
the same subprogram if a function is both inlined and emitted in the same
translation unit, by restoring the pre-r266446 behavior.

http://reviews.llvm.org/D20072

llvm-svn: 269103
2016-05-10 19:38:51 +00:00
Mandeep Singh Grang e5a2f116d6 Fix PR26655: Bail out if all regs of an inst BUNDLE have the correct kill flag
Summary:
While setting kill flags on instructions inside a BUNDLE, we bail out as soon
as we set kill flag on a register.  But we are missing a check when all the
registers already have the correct kill flag set. We need to bail out in that
case as well.

This patch refactors the old code and simply makes use of the addRegisterKilled
function in MachineInstr.cpp in order to determine whether to set/remove kill
on an instruction.

Reviewers: apazos, t.p.northover, pete, MatzeB

Subscribers: MatzeB, davide, llvm-commits

Differential Revision: http://reviews.llvm.org/D17356

llvm-svn: 269092
2016-05-10 17:57:27 +00:00
Krzysztof Parzyszek a356bb7fa4 [ScheduleDAG] Make sure to process all def operands before any use operands
An example from Hexagon where things went wrong:
  %R0<def> = L2_loadrigp <ga:@fp04>      ; load function address
  J2_callr %R0<kill>, ..., %R0<imp-def>  ; call *R0, return value in R0

ScheduleDAGInstrs::buildSchedGraph would visit all instructions going
backwards, and in each instruction it would visit all operands in their
order on the operand list. In the case of this call, it visited the use
of R0 first, then removed it from the set Uses after it visited the def.
This caused the DAG to be missing the data dependence edge on R0 between
the load and the call.

Differential Revision: http://reviews.llvm.org/D20102

llvm-svn: 269076
2016-05-10 16:50:30 +00:00
Marcin Koscielnicki bbac890b53 [PR27599] [SystemZ] [SelectionDAG] Fix extension of atomic cmpxchg result.
Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign
extended result, or a zero extended result.  SystemZ takes a third
option by returning junk in the high bits (rotated contents of the other
bytes in the memory word).  In that case, don't use Assert*ext, and
zero-extend the result ourselves if a comparison is needed.

Differential Revision: http://reviews.llvm.org/D19800

llvm-svn: 269075
2016-05-10 16:49:04 +00:00
Jonas Paulsson 8e5b0c65cc [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.

In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.

Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861

llvm-svn: 269026
2016-05-10 08:09:37 +00:00
Matthias Braun 8d6e57b216 LiveIntervalAnalysis: Rework constructMainRangeFromSubranges()
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
  differences between the main liverange and subranges because of hidden
  dead definitions. This case however cannot happen anymore with the
  DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
  values on merging control flow (the MachineVerifier missed most of
  these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
  LiveRangeCalc to better match the implementation/available helper
  functions.

llvm-svn: 269016
2016-05-10 04:51:14 +00:00
Matthias Braun 9c7e4dea1f LiveInterval: Avoid unnecessary auto, add const; NFC
llvm-svn: 269015
2016-05-10 04:51:09 +00:00
Matthias Braun 0663b61e1a TargetPassConfig: Set PrintMachineCode even if addMachinePasses() does not run.
llvm-svn: 269013
2016-05-10 04:51:04 +00:00
Dan Gohman 0cfb5f852d [WebAssembly] Move register stackification and coloring to a late phase.
Move the register stackification and coloring passes to run very late, after
PEI, tail duplication, and most other passes. This means that all code emitted
and expanded by those passes is now exposed to these passes. This also
eliminates the need for prologue/epilogue code to be manually stackified,
which significantly simplifies the code.

This does require running LiveIntervals a second time. It's useful to think
of these late passes not as late optimization passes, but as a domain-specific
compression algorithm based on knowledge of liveness information. It's used to
compress the code after all conventional optimizations are complete, which is
why it uses LiveIntervals at a phase when actual optimization passes don't
typically need it.

Differential Revision: http://reviews.llvm.org/D20075

llvm-svn: 269012
2016-05-10 04:24:02 +00:00
Matthias Braun 31d19d43c7 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Matthias Braun d06896138c PrologEpilogInserter: Remove unnecessary dependency
llvm-svn: 269010
2016-05-10 03:21:47 +00:00
Matthias Braun 47cf918e20 LLVMTargetMachine: Add functions to create MIModuleInfo/MIFunction; NFC
Add convenience function to create MachineModuleInfo and
MachineFunctionAnalysis passes and add them to a pass manager.

Despite factoring out some shared code in
LiveIntervalTest/LLVMTargetMachine this will be used by my upcoming llc
change.

llvm-svn: 269002
2016-05-10 01:32:40 +00:00
Sanjay Patel c7b91e65d8 [CGP] avoid crashing from weightlessness
It's possible that we have branch weights with 0 values.
In that case, don't try to create an impossible BranchProbability.

llvm-svn: 268935
2016-05-09 17:31:55 +00:00
Sanjay Patel 91592568f9 [TargetLowering] make helper function for SetCC + and optimizations (NFC)
After looking at D19087 again, it occurred to me that we can do better. If we consolidate
the valueHasExactlyOneBitSet() transforms, we won't incur extra overhead from calling it a
2nd time, and we can shrink SimplifySetCC() a bit. No functional change intended.

Differential Revision: http://reviews.llvm.org/D20050

llvm-svn: 268932
2016-05-09 16:42:50 +00:00
Simon Pilgrim ed39d150f5 Fix unused variable warning.
llvm-svn: 268867
2016-05-07 20:19:59 +00:00
Simon Pilgrim b6f82c449a [SelectionDAG] Added bitreverse(bitreverse(v)) --> v
Added bitreverse creation testing

llvm-svn: 268865
2016-05-07 20:12:36 +00:00
Sanjay Patel c2751e7050 [x86, BMI] add TLI hook for 'andn' and use it to simplify comparisons
For the sake of minimalism, this patch is x86 only, but I think that at least
PPC, ARM, AArch64, and Sparc probably want to do this too.

We might want to generalize the hook and pattern recognition for a target like
PPC that has a full assortment of negated logic ops (orc, nand).

Note that http://reviews.llvm.org/D18842 will cause this transform to trigger
more often.

For reference, this relates to:
https://llvm.org/bugs/show_bug.cgi?id=27105
https://llvm.org/bugs/show_bug.cgi?id=27202
https://llvm.org/bugs/show_bug.cgi?id=27203
https://llvm.org/bugs/show_bug.cgi?id=27328

Differential Revision: http://reviews.llvm.org/D19087

llvm-svn: 268858
2016-05-07 15:03:40 +00:00
Matthias Braun 22152acf7b DetectDeadLanes: Increase precision when detecting undef inputs
In case of COPY-like instruction we may be able to deduce that a certain
input is unused, based on the used lanes of the register defined by the
instruction.
This even works accross otherwise incompatible copies (no need to have
compatible lanemasks, completely unused operands are still completely
unused). It even makes sense to redo the analysis in this case since we
gained information for a case we previously stopped at because of the
incompatible masks.

llvm-svn: 268815
2016-05-06 22:43:50 +00:00
Matthias Braun 8f429ead58 DetectDeadLanes: Cleanup, assert on some impossible cases.
llvm-svn: 268814
2016-05-06 22:43:46 +00:00
Matthias Braun 71474e8d22 LiveIntervalAnalysis: Fix handleMove() extending liverange for undef inputs
Fix handleMove() incorrectly extending liveranges when an undef input of
a vreg was moved past the (current) end of the liverange.

llvm-svn: 268805
2016-05-06 21:47:41 +00:00
Justin Bogner c45c960006 SDAG: Don't leave dangling dead nodes after SelectCodeCommon
Relying on the caller to clean up after we've replaced all uses of a
node won't work when we've migrated to the `void Select(...)` API.

llvm-svn: 268774
2016-05-06 18:42:16 +00:00
Ahmed Bougacha 16547c4e31 [CodeGen] Round [SU]INT_TO_FP result when promoting from f16.
If we don't, values that aren't precisely representable in f16 could
be used as-is in a promoted f32 operation, which would produce
incorrect results.

AArch64 had the correct behavior; add a focused test.

Fixes http://llvm.org/PR26871

llvm-svn: 268700
2016-05-06 00:58:00 +00:00
Justin Bogner b012699741 SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Justin Bogner 465886ece1 SDAG: Remove OPC_MarkGlueResults and associated logic. NFC
This opcode never happens in practice, and yet the logic we have in
place to handle it would be undefined behaviour if we ever executed
it. Remove it rather than trying to refactor code that's never
reached.

llvm-svn: 268692
2016-05-05 22:37:45 +00:00
Matthias Braun 0e881d61c1 MachineFunction: Add a const modifier to print() parameter
llvm-svn: 268657
2016-05-05 18:14:43 +00:00
Sanjay Patel c91351c2b7 clean up; NFCI
llvm-svn: 268564
2016-05-04 22:39:36 +00:00
Simon Pilgrim 1f5ad702f8 [SelectionDAG] BITREVERSE vector legalization of bit operations (REAPPLIED)
Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit.

Differential Revision: http://reviews.llvm.org/D19805

llvm-svn: 268561
2016-05-04 22:08:51 +00:00
Eric Christopher 75d661a280 Spelling and grammar corrections in comments.
llvm-svn: 268560
2016-05-04 21:45:36 +00:00
Simon Pilgrim 1a14f0d25c Revert r268504
llvm-svn: 268526
2016-05-04 17:49:14 +00:00
Simon Pilgrim b97c06210b [SelectionDAG] BITREVERSE vector legalization of bit operations
Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit.

Differential Revision: http://reviews.llvm.org/D19805

llvm-svn: 268504
2016-05-04 15:01:13 +00:00
Andrew Kaylor 50271f787e Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882

llvm-svn: 268457
2016-05-03 22:32:30 +00:00
Quentin Colombet 26dab3a485 [ImplicitNullChecks] Account for implicit-defs as well when updating the liveness.
The replaced load may have implicit-defs and those defs may be used
in the block of the original load. Make sure to update the liveness
accordingly.

This is a generalization of r267817.

llvm-svn: 268412
2016-05-03 18:09:06 +00:00
Craig Topper 3fc0e668ff [CodeGen] Add some space optimized forms of EmitNode and MorphNodeTo that implicitly indicate the number of result VTs. This shaves about 16K off the X86 matching table taking it down to about 470K.
Overall this reduces the llc binary size with all in-tree targets by about 40K.

llvm-svn: 268365
2016-05-03 05:54:13 +00:00
Matthias Braun d1aabb2813 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00
Matthias Braun 24f26e6d91 LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

llvm-svn: 268336
2016-05-03 00:08:46 +00:00
Reid Kleckner 97837b7b09 [MC] Create unique .pdata sections for every .text section
Summary:
This adds a unique ID to the COFF section uniquing map, similar to the
one we have for ELF.  The unique id is not currently exposed via the
assembler because we don't have a use case for it yet. Users generally
create .pdata with the .seh_* family of directives, and the assembler
internally needs to produce .pdata and .xdata sections corresponding to
the code section.

The association between .text sections and the assembler-created .xdata
and .pdata sections is maintained as an ID field of MCSectionCOFF. The
CFI-related sections are created with the given unique ID, so if more
code is added to the same text section, we can find and reuse the CFI
sections that were already created.

Reviewers: majnemer, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19376

llvm-svn: 268331
2016-05-02 23:22:18 +00:00
Quentin Colombet 776e6de516 [MachineBlockPlacement] Let the target optimize the branches at the end.
After the layout of the basic blocks is set, the target may be able to get rid
of unconditional branches to fallthrough blocks that the generic code does not
catch. This happens any time TargetInstrInfo::AnalyzeBranch is not able to
analyze all the branches involved in the terminators sequence, while still
understanding a few of them.

In such situation, AnalyzeBranch can directly modify the branches if it has been
instructed to do so.

This patch takes advantage of that.

llvm-svn: 268328
2016-05-02 22:58:59 +00:00
Quentin Colombet 4e1d389ac5 [X86] Model FAULTING_LOAD_OP as a terminator and branch.
This operation may branch to the handler block and we do not want it
to happen anywhere within the basic block.
Moreover, by marking it "terminator and branch" the machine verifier
does not wrongly assume (because of AnalyzeBranch not knowing better)
the branch is analyzable. Indeed, the target was seeing only the
unconditional branch and not the faulting load op and thought it was
a simple unconditional block.
The machine verifier was complaining because of that and moreover,
other optimizations could have done wrong transformation!

In the process, simplify the representation of the handler block in
the faulting load op. Now, we directly reference the handler block
instead of using a label. This has the benefits of:
1. MC knows how to issue a label for a BB, so leave that to it.
2. Accessing the target BB from its label is painful, whereas it is
   direct from a MBB operand.

Note: The 2 bytes offset in implicit-null-check.ll comes from the
fact the unconditional jumps are not removed anymore, as the whole
terminator sequence is not analyzable anymore.

Will fix it in a subsequence commit.

llvm-svn: 268327
2016-05-02 22:58:54 +00:00
Wolfgang Pieb 56aa4b0629 DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE.
Summary:
When SelectionDAG performs CSE it is possible that the context's source
location is different from that of the selected node. This can lead to
incorrect line number records. We update the debug location to the
one that occurs earlier in the instruction sequence.

This fixes PR21006.

Reviewers: echristo, sdmitrouk

Subscribers: jevinskie, asl, llvm-commits

Differential Revision: http://reviews.llvm.org/D12094

llvm-svn: 268323
2016-05-02 22:50:51 +00:00
NAKAMURA Takumi bc46f624cd ScheduleDAGInstrs.cpp: Don't peel the iterator when it points the end. This will fix the crash in r268143.
llvm-svn: 268257
2016-05-02 17:29:55 +00:00
Chad Rosier a306eeb252 Cleanup comments. NFC.
llvm-svn: 268233
2016-05-02 14:32:17 +00:00
Eric Christopher 94a9ee65c6 Fix grammar and correct comment - the debug information wasn't incorrect, rather suboptimal.
llvm-svn: 268211
2016-05-02 05:30:26 +00:00
Craig Topper e3c1e225d7 [CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables to optimize table size. Shaves about 12K off the X86 matcher table.
llvm-svn: 268209
2016-05-02 01:53:30 +00:00
Igor Breger 110af565c7 getelementptr instruction, support index vector of EVT.
Differential Revision: http://reviews.llvm.org/D19775

llvm-svn: 268195
2016-05-01 13:29:12 +00:00
Saleem Abdulrasool e0f0c0e247 CodeGen: convert to range based loops
Convert to using some range based loops, avoid unnecessary variables for
unchecked casts.  NFC.

llvm-svn: 268165
2016-04-30 18:15:34 +00:00
Amjad Aboud 72da9391f0 Reverting 268054 & 268063 as they caused PR27579.
llvm-svn: 268150
2016-04-30 01:44:07 +00:00
Haicheng Wu 4afe0425db [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.
Fix a FIXME.  Disable loop alignment if compiled with -Oz now.

llvm-svn: 268121
2016-04-29 22:01:10 +00:00
Matt Arsenault ab2232cf73 DAGCombiner: Reduce truncated shl width
llvm-svn: 268094
2016-04-29 19:53:16 +00:00
Simon Pilgrim 464f1f3bea Use SelectionDAG::getTargetConstant* helper functions. NFC.
Instead of SelectionDAG::getConstant directly to make it more obvious that we're creating target constants.

llvm-svn: 268074
2016-04-29 17:42:45 +00:00
Haicheng Wu e749ce53d4 [MBP] Split placement and alignment into two functions. NFC.
Cut and Paste.

llvm-svn: 268067
2016-04-29 17:06:44 +00:00
Amjad Aboud 293ee8bba1 Recommitted r264280 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26942 in r267004.

llvm-svn: 268054
2016-04-29 16:07:55 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Matthias Braun f3619b8212 RegisterPressure: Fix default lanemask for missing regunit intervals
In case of missing live intervals for a physical registers
getLanesWithProperty() would report 0 which was not a safe default in
all situations. Add a parameter to pass in a safe default.
No testcase because in-tree targets do not skip computing register unit
live intervals.

Also cleanup the getXXX() functions to not perform the
RequireLiveIntervals checks anymore so we do not even need to return
safe defaults.

llvm-svn: 267977
2016-04-29 02:44:54 +00:00
Matthias Braun 5e4ac856d6 RegisterPressure: Cannot produce dead (subregister) defs anymore
With the DetectDeadLanes pass in place we cannot run into situations
anymore where defs suddenly become dead.
Also add a missing check so we do not try to add an undef flag to a
physreg (found by visual inspection, no failing test).

llvm-svn: 267976
2016-04-29 02:44:48 +00:00
Matthias Braun f84547c6e0 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

This re-applies r260806 with LiveVariables manually added to PowerPC to
hopefully not break the stage 2 bots this time.

llvm-svn: 267954
2016-04-28 23:42:51 +00:00
Marcin Koscielnicki 3a592df3e4 [CodeGen] Remove extra ';'
Squashes a -Wpedantic warning.

llvm-svn: 267944
2016-04-28 21:49:46 +00:00
Matthias Braun e9631f166e LiveIntervalAnalysis: No need to deal with dead subregister defs anymore.
The DetectDeadLaneMask already ensures that we have no dead subregister
definitions making the special handling in LiveIntervalAnalysis
unnecessary. This reverts most of r248335.

llvm-svn: 267937
2016-04-28 20:35:26 +00:00
Krzysztof Parzyszek 7ea9a529aa Reset the TopRPTracker's position in ScheduleDAGMILive::initQueues
ScheduleDAGMI::initQueues changes the RegionBegin to the first non-debug
instruction. Since it does not track register pressure, it does not affect
any RP trackers. ScheduleDAGMILive inherits initQueues from ScheduleDAGMI,
and it does reset the TopTPTracker in its schedule method. Any derived,
target-specific scheduler will need to do it as well, but the TopRPTracker
is only exposed as a "const" object to derived classes. Without the ability
to modify the tracker directly, this leaves a derived scheduler with a
potential of having the TopRPTracker out-of-sync with the CurrentTop.

The symptom of the problem:
  void llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit *, bool):
  Assertion `TopRPTracker.getPos() == CurrentTop && "out of sync"' failed.

Differential Revision: http://reviews.llvm.org/D19438

llvm-svn: 267918
2016-04-28 19:17:44 +00:00
Adrian Prantl e5447574c8 Debug Info: Restore the pre-r240853 behavior for DWARF2 bitfields.
The DWARF2 specification of DW_AT_bit_offset is ambiguous for
little-endian machines, but by restoring to the old behavior
we match what debuggers expect and what other popular compilers
generate.

llvm-svn: 267896
2016-04-28 15:37:52 +00:00
Adrian Prantl f393d313ec Debug info: Support DWARF4 bitfields via DW_AT_data_bit_offset.
The DWARF2 specification of DW_AT_bit_offset was written from the perspective of
a big-endian machine with unclear semantics for other systems.  DWARF4
deprecated DW_AT_bit_offset and introduced a new attribute DW_AT_data_bit_offset
that simply counts the number of bits from the beginning of the containing
entity regardless of endianness.

After this patch LLVM emits DW_AT_bit_offset for DWARF 2 or 3 and
DW_AT_data_bit_offset when DWARF 4 or later is requested.

llvm-svn: 267895
2016-04-28 15:37:48 +00:00
Craig Topper 33772c5375 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Matthias Braun fbe85ae12e CodeGen: Add DetectDeadLanes pass.
The DetectDeadLanes pass performs a dataflow analysis of used/defined
subregister lanes across COPY instructions and instructions that will
get lowered to copies. It detects dead definitions and uses reading
undefined values which are obscured by COPY and subregister usage.

These dead definitions cause trouble in the register coalescer which
cannot deal with definitions suddenly becoming dead after coalescing
COPY instructions.

For now the pass only adds dead and undef flags to machine operands. It
should be possible to extend it in the future to remove the dead
instructions and redo the analysis for the affected virtual
registers.

Differential Revision: http://reviews.llvm.org/D18427

llvm-svn: 267851
2016-04-28 03:07:16 +00:00
Matthias Braun c9e759acff LiveIntervalAnalysis: Fix handleMove() using wrong value numbers
handleMove() was incorrectly swapping two value numbers. This was missed
before because the problem only occured when moving subregister definitions
and needed -verify-machineinstrs to be detected.

I cannot add a testcase as long as I cannot reapply r260905/r260806.

llvm-svn: 267840
2016-04-28 02:11:49 +00:00
Quentin Colombet 12b69919a2 [ImplicitNullChecks] Properly update the live-in of the block of the memory operation.
We basically replace:
HoistBB:
cond_br NullBB, NotNullBB

NullBB:
  ...

NotNullBB:
  <reg> = load

into
HoistBB
<reg> = load_faulting_op NullBB
uncond_br NotNullBB

NullBB:
  ...

NotNullBB: ## <reg> is now live-in of NotNullBB
  ...

This partially fixes the machine verifier error for
test/CodeGen/X86/implicit-null-check.ll, but it still fails because
of the implicit CFG structure.

llvm-svn: 267817
2016-04-27 23:26:40 +00:00
Than McIntosh a541320908 Fix build failure under NDEBUG.
llvm-svn: 267774
2016-04-27 20:07:02 +00:00
David Majnemer 0c80e2eac6 [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in it's basic
block.  Don't sink if the user is an EH pad.

This fixes PR27536.

llvm-svn: 267767
2016-04-27 19:36:38 +00:00
Than McIntosh 1b60168576 Refactor debugging code, NFC.
Summary:
Refactor debugging routines to reduce code duplication. Remove a couple
of #include's that were not needed. Don't require MachineDominator as a
prereq for this pass (not needed).

These changes split off from http://reviews.llvm.org/D18827.

Reviewers: wmi, gbiv, qcolombet

Subscribers: llvm-commits, davidxl, jevinskie

Differential Revision: http://reviews.llvm.org/D18992

llvm-svn: 267766
2016-04-27 19:26:25 +00:00
Gerolf Hoflehner 50426191d7 [DAGCombiner] Follow coding convention for function name (NFC)
llvm-svn: 267745
2016-04-27 17:27:16 +00:00
Nico Weber e69b9548b8 Revert r267649, it caused PR27539.
llvm-svn: 267723
2016-04-27 15:16:54 +00:00
Cong Hou 6f879d9eb1 Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched.
Differential revision: http://reviews.llvm.org/D14840

llvm-svn: 267649
2016-04-27 01:29:18 +00:00
Quentin Colombet ddad5aa152 [MachineInstrBundle] Actually set the PartialDeadDef flag only when the register
is defined!

The users were checking the proper thing (Defined + PartialDeadDef), but the
information may have been wrong for other use cases, so fix that.

llvm-svn: 267641
2016-04-27 00:16:29 +00:00
Quentin Colombet 08e79990a0 [MachineBasicBlock] Take advantage of the partially dead information.
Thanks to that information we wouldn't lie on a register being live whereas it
is not.

llvm-svn: 267622
2016-04-26 23:14:29 +00:00
Quentin Colombet 3f19245015 [MachineInstrBundle] Improvement the recognition of dead definitions.
Now, it is possible to know that partial definitions are dead definitions and
recognize that clobbered registers are also dead.

llvm-svn: 267621
2016-04-26 23:14:24 +00:00
Ahmed Bougacha 128f8732a5 [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Krzysztof Parzyszek 4773f647bd [Tail duplication] Handle source registers with subregisters
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.

Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.

Differential Revision: http://reviews.llvm.org/D19337

llvm-svn: 267583
2016-04-26 18:36:34 +00:00
Sanjay Patel d66607bd8c [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for 
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 
50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.

Differential Revision: http://reviews.llvm.org/D19488

llvm-svn: 267572
2016-04-26 17:11:17 +00:00
Sanjay Patel a31b0c0ece [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488

llvm-svn: 267504
2016-04-26 00:47:39 +00:00
Marcin Koscielnicki 1c1af6ef77 [PR27390] [CodeGen] Reject indexed loads in CombinerDAG.
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%typ = type { i32, i32 }

define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
  %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
  %1 = load i32, i32* %b, align 4
  %2 = ptrtoint i32* %b to i64
  %3 = and i64 %2, -35184372088833
  %4 = inttoptr i64 %3 to i32*
  %_msld = load i32, i32* %4, align 4
  %zzz = add i32 %1,  %_msld
  ret i32 %zzz
}

Fix this by checking ResNo.

I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them.  In fact, they might not be triggerable at all,
at least with current targets.  Still, better safe than sorry.

Differential Revision: http://reviews.llvm.org/D19202

llvm-svn: 267420
2016-04-25 15:43:44 +00:00
David Majnemer dd21523653 [WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH successors
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.

This resulted in split points which were insufficiently early if an
invoke was present.

This fixes PR27501.

N.B.  This removes getLandingPadSuccessor.

llvm-svn: 267412
2016-04-25 14:31:32 +00:00
Gerolf Hoflehner 01b3a6184a [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328
2016-04-24 05:14:01 +00:00
Craig Topper 36c133159a [CodeGen] Teach DAG combine to fold select_cc seteq X, 0, sizeof(X), ctlz_zero_undef(X) -> ctlz(X). InstCombine already does this for IR and X86 pattern matches this during isel.
A follow up commit will remove the X86 patterns to allow this to be tested.

llvm-svn: 267325
2016-04-24 04:38:32 +00:00
Duncan P. N. Exon Smith a59d3e5af8 DebugInfo: Remove MDString-based type references
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*.  It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.

Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType.  The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.

This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata.  Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.

The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":

  http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html

llvm-svn: 267296
2016-04-23 21:08:00 +00:00
Sanjay Patel dc88bd6e1f replace duplicated static functions for profile metadata access with BranchInst member function; NFCI
llvm-svn: 267295
2016-04-23 20:01:22 +00:00
Craig Topper 7e5fad66f3 [CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
llvm-svn: 267280
2016-04-23 05:20:47 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Peter Collingbourne 7dd8dbf486 Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223
2016-04-22 21:18:02 +00:00
Matt Arsenault 940d19a09c TLI: Only iterate over integer vector types
Instead of iterating over all vectors and skipping integers.

llvm-svn: 267220
2016-04-22 21:16:17 +00:00
Matt Arsenault 3b748d76f6 DAGCombiner: Relax alignment restriction when changing store type
If the target allows the alignment, this should be OK.

llvm-svn: 267217
2016-04-22 21:01:41 +00:00
Peter Collingbourne 265ebd7d70 CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Matt Arsenault 629d12de70 DAGCombiner: Relax alignment restriction when changing load type
If the target allows the alignment, this should still be OK.

llvm-svn: 267209
2016-04-22 20:21:36 +00:00
Matthias Braun 4f57377c68 MachineScheduler: Move code to initialize a Candidate out of tryCandidate(); NFC
llvm-svn: 267191
2016-04-22 19:10:15 +00:00
Matthias Braun 6493bc2b97 MachineScheduler: Limit the size of the ready list.
Avoid quadratic complexity in unusually large basic blocks by limiting
the size of the ready lists.

Differential Revision: http://reviews.llvm.org/D19349

llvm-svn: 267189
2016-04-22 19:09:17 +00:00
Tom Stellard 2339f6f5a3 PostRAHazardRecocgnizer: Fix unused-private-field warning
llvm-svn: 267160
2016-04-22 15:11:08 +00:00
Tom Stellard ee34680bb0 CodeGen: Add a stand-alone hazard recognizer pass
Summary:
This new pass allows targets to use the hazard recognizer without having
to also run one of the schedulers.  This is useful when compiling with
optimizations disabled for targets that still need noop hazards
to be handled correctly.

Reviewers: hfinkel, atrick

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18594

llvm-svn: 267156
2016-04-22 14:43:50 +00:00
Eric Liu 6be128e43d Fix -Wunused-variable in non-asserts build.
llvm-svn: 267128
2016-04-22 09:50:31 +00:00
Daniel Sanders 591c379563 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Nicolai Haehnle b0c9748709 AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.

Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.

This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19191

llvm-svn: 267102
2016-04-22 04:04:08 +00:00
Gerolf Hoflehner b32f11fc62 [MachineCombiner] Support for floating-point FMA on ARM64
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267098
2016-04-22 02:15:19 +00:00
David Blaikie f0f6c29cec Fix more -Wunused-variable in non-asserts build.
llvm-svn: 267077
2016-04-21 23:24:09 +00:00
David Blaikie 3d42a86f9d Fix some -Wunused-variable warnings in non-asserts builds.
llvm-svn: 267073
2016-04-21 22:53:33 +00:00
Derek Schuff 025191d42f Improve error message reporting for MachineFunctionProperties
When printing the properties required by a pass, only print the
properties that are set, and not those that are clear (only properties
that are set are verified, clear properties are "don't-care").

llvm-svn: 267070
2016-04-21 22:19:24 +00:00
Quentin Colombet 23341a84ca [MachineBasicBlock] Make the pass argument truly mandatory when
splitting edges.

MachineBasicBlock::SplitCriticalEdges will crash if a nullptr would have
been passed for the Pass argument. Do not allow that by turning this
argument into a reference.
The alternative would have been to make the Pass a truly optional
argument, but although this is easy to do, I was afraid users using it
like this would not be aware the livness information, dominator tree and
such would silently be broken.

llvm-svn: 267052
2016-04-21 21:01:13 +00:00
Quentin Colombet 77e1878954 [MachineBasicBlock] Refactor SplitCriticalEdge to expose a query API.
Introduce canSplitCriticalEdge, so that clients can now query whether or
not a critical edge can be split without actually needing to split it.
This may be useful when gathering information for cost models for
instance.

llvm-svn: 267046
2016-04-21 20:46:27 +00:00
Quentin Colombet c320fb4eae [RegisterBankInfo] Change the API for the verify methods.
Return bool instead of void so that it is natural to put the calls into
asserts.

llvm-svn: 267033
2016-04-21 18:34:43 +00:00
Matt Arsenault 7846d885ed LegalizeDAG: Move unaligned load/store expansion to TLI
When custom lowered, this is not called if the store is custom
lowered. Move it to be a utility function so targets can
easily expand unaligned accesses when custom lowering.

llvm-svn: 267029
2016-04-21 18:19:11 +00:00
Quentin Colombet 0e5ff58567 [RegisterBankInfo] Change the representation of the partial mappings.
Instead of holding a mask, hold two value: the start index and the
length of the mapping. This is a more compact representation, although
less powerful. That being said, arbitrary masks would not have worked
for the generic so do not allow them in the first place.

llvm-svn: 267025
2016-04-21 18:09:34 +00:00
Matt Arsenault 8d1052f55c DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component
If the extracted bits are restricted to the upper half or lower half,
this can be truncated.

llvm-svn: 267024
2016-04-21 18:03:06 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Amjad Aboud a5ba99140c Fixed Dwarf debug info emission to skip DILexicalBlockFile entries.
Before this fix, DILexicalBlockFile entries were skipped only in some cases and were not in other cases.

Differential Revision: http://reviews.llvm.org/D18724

llvm-svn: 267004
2016-04-21 16:58:49 +00:00
Craig Topper 52cb5ec36f [SelectionDAG] Teach LegalizeVectorOps to directly Expand CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to CTTZ/CTLZ directly if those ops are Legal/Custom instead of deferring it to LegalizeOps.
This is needed to support CTTZ/CTLZ Custom correctly since LegalizeOps would be too late to do the custom lowering.

llvm-svn: 266951
2016-04-21 04:43:57 +00:00
Matthias Braun b550b765bd MachineSched: Cleanup; NFC
llvm-svn: 266946
2016-04-21 01:54:13 +00:00
Mehdi Amini ea0b1e7c17 ScoreboardHazardRecognizer: unbreak TSAN by moving a static mutated variable to a member
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266837
2016-04-20 00:21:24 +00:00
Tim Shen a1d8bc5597 [PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
2016-04-19 20:14:52 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Sanjoy Das 4519ff73df Add a description for the PatchableFunction pass; NFC
llvm-svn: 266721
2016-04-19 06:25:02 +00:00
Sanjoy Das c0441c29df Introduce a "patchable-function" function attribute
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.

Reviewers: joker.eph, rnk, echristo, dberris

Subscribers: joker.eph, echristo, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19046

llvm-svn: 266715
2016-04-19 05:24:47 +00:00
Paul Robinson 43d1e45347 [DWARF] Force a linkage_name on an inlined subprogram's abstract origin.
When we suppress linkage names, for a non-inlined subprogram the name
can still be found in the object-file symbol table, because we have
the code address of the subprogram.  This is not necessarily the case
for an inlined subprogram, so we still want to emit the linkage name
in the DWARF.  Put this on the abstract-origin DIE because it's common
to all inlined instances.

Differential Revision: http://reviews.llvm.org/D18706

llvm-svn: 266692
2016-04-18 22:41:41 +00:00
JF Bastien bbb0aee66e NFC: unify clang / LLVM atomic ordering
This makes the C11 / C++11 *ABI* atomic ordering accessible from LLVM,
as discussed in http://reviews.llvm.org/D18200#inline-151433

This re-applies r266573 which I had reverted in r266576.

Original review: http://reviews.llvm.org/D18875

llvm-svn: 266640
2016-04-18 18:01:43 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
JF Bastien fb9871b495 Revert "NFC: unify clang / LLVM atomic ordering"
This reverts commit 537951f2f16d6a8542571c7722fcbae07d4e62c2.

Causes an assert in:
  test/Transforms/AtomicExpand/SPARC/libcalls.ll
  (Ordering2 != AtomicOrdering::NotAtomic && "expect atomic MO")

Bot:
  http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/21724/testReport/junit/LLVM/Transforms_AtomicExpand_SPARC/libcalls_ll/

I'm not getting this assert on my local debug build, but I'll revert
just to be sure.

llvm-svn: 266576
2016-04-17 21:29:01 +00:00
JF Bastien 6ef3aa2b7e NFC: unify clang / LLVM atomic ordering
Summary: This makes the C11 / C++11 *ABI* atomic ordering accessible from LLVM, as discussed in http://reviews.llvm.org/D18200#inline-151433

Reviewers: jyknight, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18875

llvm-svn: 266573
2016-04-17 21:00:57 +00:00
Davide Italiano caa1169653 [ParallelCG] SmallVector<char> -> SmallString.
llvm-svn: 266568
2016-04-17 19:38:57 +00:00
Rafael Espindola 3c1c9875b9 Keep only the splitCodegen version that takes a factory.
This makes it much easier to see that all created TargetMachines are
equivalent.

llvm-svn: 266564
2016-04-17 18:42:27 +00:00
Mehdi Amini 47b292d3fd Remove some unneeded headers and replace some headers with forward class declarations (NFC)
Differential Revision: http://reviews.llvm.org/D19154

Patch by Eugene Kosov <claprix@yandex.ru>

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266524
2016-04-16 07:51:28 +00:00
Mehdi Amini 59ae854503 Do not modify a cl::opt programmatically, global mutable state is evil.
Found by TSAN on ThinLTO.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266514
2016-04-16 04:58:30 +00:00
Richard Smith 2db6f2e508 Update and fix LLVM_ENABLE_MODULES:
1) We need to add this flag prior to adding any other, in case the user has
specified a -fmodule-cache-path= flag in their custom CXXFLAGS. Such a flag
causes -Werror builds to fail, and thus all config checks fail, until we add
the corresponding -fmodules flag. The modules selfhost bot does this, for
instance.

2) Delete module maps that were putting .cpp files into modules.

3) Enable -fmodules-local-submodule-visibility, to get proper module
visibility rules applied across submodules of the same module. Disable
-fmodules for C builds, since that flag is not available there.

llvm-svn: 266502
2016-04-16 00:48:58 +00:00
Wei Mi 963f2df4d2 Don't skip splitSeparateComponents in eliminateDeadDefs for HoistSpillHelper::hoistAllSpills.
Because HoistSpillHelper::hoistAllSpills is called in postOptimization, before the
patch we didn't want LiveRangeEdit::eliminateDeadDefs to call splitSeparateComponents
and generate unassigned new vregs. However, skipping splitSeparateComponents will make
verify-machineinstrs unhappy, so I remove the early return, and use
HoistSpillHelper::LRE_DidCloneVirtReg to assign physreg/stackslot for those new vregs.

In addition, some code reorganization to make class HoistSpillHelper privately inheriting
from LiveRangeEdit::Delegate possible. This is to be consistent with class RAGreedy and
class RegisterCoalescer.

Differential Revision: http://reviews.llvm.org/D19142

llvm-svn: 266489
2016-04-15 23:16:44 +00:00
Hans Wennborg 07f6d3a893 Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135)
After r245976, LLVM will skip the last bit test case if knows it will always be
true. However, we would still erroneously update PHI nodes with incoming values
from the MBB that would perform the final bit test, causing -verify-machineinstrs
to fail.

llvm-svn: 266479
2016-04-15 21:45:30 +00:00
Hans Wennborg c944c13dc1 SelectionDAGISel: rangeify a loop
llvm-svn: 266478
2016-04-15 21:45:09 +00:00
Davide Italiano 7950b12957 [ParallelCG] Add a new splitCodeGen() API which takes a TargetMachineFactory.
This is a recommit of r266390 with a fix that will allow tests to pass
(hopefully). Before we got a StringRef to M->getTargetTriple() and right
after we moved the Module so we were referencing a dangling object.

llvm-svn: 266456
2016-04-15 17:34:32 +00:00
Adrian Prantl 75819aedf6 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

llvm-svn: 266446
2016-04-15 15:57:41 +00:00
Jun Bum Lim 4c5bd58ebe [MachineScheduler]Add support for store clustering
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.

This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().

llvm-svn: 266437
2016-04-15 14:58:38 +00:00
Davide Italiano 2abf2e7c8c Revert "[LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory."
This reverts commits r266390 and r266396 as they broke some bots.

llvm-svn: 266408
2016-04-15 02:07:03 +00:00
Justin Lebar 7dba2e0d0c [ifcnv] Don't duplicate blocks that contain convergent instructions.
It's unsafe to duplicate blocks that contain convergent instructions
during ifcnv.  See the patch for details.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D17518

llvm-svn: 266404
2016-04-15 01:38:41 +00:00
Davide Italiano 3fdd27df03 [LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory.
This will be used in lld to avoid creating TargetMachine in two
different places. See D18999 for a more detailed discussion.

Differential Revision:  http://reviews.llvm.org/D19139

llvm-svn: 266390
2016-04-15 00:07:28 +00:00
Geoff Berry 6381713b37 [ScheduleDAGInstrs] Re-factor for based on review feedback. NFC.
Summary:
Re-factor some code to improve clarity and style based on review
comments from http://reviews.llvm.org/D18093.

Reviewers: MatzeB, mcrosier

Subscribers: MatzeB, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19128

llvm-svn: 266372
2016-04-14 21:31:07 +00:00
Reid Kleckner 28865809fe Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h
MachineInstr.h and MachineInstrBuilder.h are very popular headers,
widely included across all LLVM backends. It turns out that there only a
handful of TUs that actually care about DI operands on MachineInstrs.

After this change, touching DebugInfoMetadata.h and rebuilding llc only
needs 112 actions instead of 542.

llvm-svn: 266351
2016-04-14 18:29:59 +00:00
Tom Stellard b72a65ff53 [GlobalISel] Coding style and whitespace fixes
Reviewers: qcolombet

Subscribers: joker.eph, llvm-commits, vkalintiris

Differential Revision: http://reviews.llvm.org/D19119

llvm-svn: 266342
2016-04-14 17:23:33 +00:00
David Majnemer 0f26b0aeb4 [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN
The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only
one of the inputs is NaN: -NUM will return the non-NaN argument while
-NAN would return NaN.

It is desirable to lower to @llvm.{min,max}num to -NAN if they don't
have a native instruction for -NUM.  Notably, ARMv7 NEON's vmin has the
-NAN semantics.

N.B.  Of course, it is only safe to do this if the intrinsic call is
marked nnan.

llvm-svn: 266279
2016-04-14 07:13:24 +00:00
Matt Arsenault 9cd90712f0 AMDGPU: Implement canonicalize
Also add generic DAG node for it.

llvm-svn: 266272
2016-04-14 01:42:16 +00:00
Matthias Braun 46b0f03e12 TargetLowering: Factor out common code for tail call eligibility checking; NFC
llvm-svn: 266270
2016-04-14 01:10:42 +00:00
Nirav Dave 2477491a92 Cleanup Store Merging in UseAA case
This patch fixes a bug (PR26827) when using anti-aliasing in store
merging. This sets the chain users of the component stores to point to
the new store instead of the component stores chain parent.

Reviewers: jyknight

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18909

llvm-svn: 266217
2016-04-13 17:27:26 +00:00
Petar Jovanovic 644b8c1a5d Calculate __builtin_object_size when pointer depends on a condition
This patch fixes calculating of builtin_object_size if it depends on a
condition. Before this patch compiler did not know how to calculate the
object size when it finds a condition that cannot be eliminated.
This patch enables calculating of builtin_object_size even in case when
condition cannot be eliminated by choosing minimum or maximum value as a
result from condition. Choosing minimum or maximum value from condition
is based on the second argument of __builtin_object_size function.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438

llvm-svn: 266193
2016-04-13 12:25:25 +00:00
Wei Mi 9a16d655c7 Recommit r265547, and r265610,r265639,r265657 on top of it, plus
two fixes with one about error verify-regalloc reported, and
another about live range update of phi after rematerialization.

r265547:
Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Patches on top of r265547:
r265610 "Fix the compare-clang diff error introduced by r265547."
r265639 "Fix the sanitizer bootstrap error in r265547."
r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"

Differential Revision: http://reviews.llvm.org/D15302
Differential Revision: http://reviews.llvm.org/D18934
Differential Revision: http://reviews.llvm.org/D18935
Differential Revision: http://reviews.llvm.org/D18936

llvm-svn: 266162
2016-04-13 03:08:27 +00:00
Justin Bogner 263f314ba7 CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter
This state is no longer useful and not guaranteed to be valid in later
codegen passes. For example, see the added test, which would print a
savepoint of %bb.-1 without this change, and crashes with a
use-after-free error under ASan if you apply the recycling allocator
patch from llvm.org/PR26808.

llvm-svn: 266150
2016-04-12 23:21:53 +00:00
James Y Knight 7873fb9d73 Pre-fill LibcallRoutineNames with nullptr.
And rearrange InitLibcallNames slightly.

llvm-svn: 266142
2016-04-12 22:32:47 +00:00
James Y Knight 19f6cce4e3 Add __atomic_* lowering to AtomicExpandPass.
(Recommit of r266002, with r266011, r266016, and not accidentally
including an extra unused/uninitialized element in LibcallRoutineNames)

AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266115
2016-04-12 20:18:48 +00:00
Ahmed Bougacha 7ac86c47d2 [CodeGen] Remove constant-folding dead code. NFC.
This code was specific to vector operations with scalar operands:
all the opcodes in FoldValue (via FoldConstantArithmetic) can't
match those criteria.

Replace it with an assert if that ever changes: at that point,
we might need to add back a splat BUILD_VECTOR.

llvm-svn: 266100
2016-04-12 18:15:39 +00:00
Philip Reames 92d1f0cb6d Introduce an GCRelocateInst class [NFC]
Previously, we were using isGCRelocate predicates.  Using a subclass of IntrinsicInst is far more idiomatic.  The refactoring also enables a couple of minor simplifications and code sharing.

llvm-svn: 266098
2016-04-12 18:05:10 +00:00
Geoff Berry c0739d8305 [ScheduleDAGInstrs] Handle instructions with multiple MMOs
Summary:
In getUnderlyingObjectsForInstr(): Don't give up on instructions with
multiple MMOs, instead look through all the MMOs and if they all meet
the conservative criteria previously used for single MMO instructions,
then return all of the underlying objects derived from the MMOs.

The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid
the case where multiple underlying objects are present and are related
in such a way that successive iterations of the loop end up adding a
dependency from an instruction to itself.

Reviewers: atrick, hfinkel

Subscribers: MatzeB, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18093

llvm-svn: 266084
2016-04-12 15:50:19 +00:00
Rafael Espindola d41b54be11 This reverts commit r266002, r266011 and r266016.
They broke the msan bot.

Original message:

Add __atomic_* lowering to AtomicExpandPass.

AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266062
2016-04-12 12:30:25 +00:00
Quentin Colombet 777a7717ef [RegBankSelect] Teach the repairing code how to handle physical
registers.

llvm-svn: 266029
2016-04-12 00:38:51 +00:00
Quentin Colombet 5aacb1da00 [RegisterBankInfo] Do not provide a default mapping for non-reg of phi
operations.

llvm-svn: 266027
2016-04-12 00:30:14 +00:00
Quentin Colombet 904a2c7422 [RegBankSelect] Teach how to repair definitions.
Although repairing definitions is not mandatory for correctness (only
phis would be impacted because of the RPO traversal), not repairing
might go against the cost model. Therefore, just repair when it is
possible.

llvm-svn: 266025
2016-04-12 00:12:59 +00:00
Derek Schuff f7b2bce1f1 Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty
Use the MachineFunctionProperty mechanism to indicate whether the
liveness info is accurate instead of a bool flag on MRI.
Keeps the MRI accessor function for convenience. NFC

Differential Revision: http://reviews.llvm.org/D18767

llvm-svn: 266020
2016-04-11 23:32:13 +00:00
JF Bastien b3ac75f748 AtomicExpandPass: mark assert variable as used
Avoid -Wunused-variable

llvm-svn: 266016
2016-04-11 23:03:54 +00:00
James Y Knight 00db547f97 Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass)
It doesn't like implicitly calling the ArrayRef constructor with a
returned array -- it appears to decays the returned value to a pointer,
first, before trying to make an ArrayRef out of it.

llvm-svn: 266011
2016-04-11 22:52:42 +00:00
Justin Bogner 1faf01578e CodeGen: Fix a use-after-free in TailDuplication
The call to processPHI already erased MI from its parent, so MI isn't
even valid here, making the getParent() call a use-after-free in
addition to being redundant.

Found by ASan with the ArrayRecycler changes in llvm.org/pr26808.

llvm-svn: 266008
2016-04-11 22:37:13 +00:00
Evgeniy Stepanov f17120a85f [safestack] Add canary to unsafe stack frames
Add StackProtector to SafeStack. This adds limited protection against
data corruption in the caller frame. Current implementation treats
all stack protector levels as -fstack-protector-all.

llvm-svn: 266004
2016-04-11 22:27:48 +00:00
James Y Knight b91d38c5fe Add __atomic_* lowering to AtomicExpandPass.
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266002
2016-04-11 22:22:33 +00:00
Simon Pilgrim 82e54871d0 [DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs
xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well.

The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles.

I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result.

Differential Revision: http://reviews.llvm.org/D18944

llvm-svn: 265998
2016-04-11 21:10:33 +00:00
Hans Wennborg e9134897f4 Fix a couple of redundant conditional expressions (PR27283, PR28282)
llvm-svn: 265987
2016-04-11 20:35:01 +00:00
Sanjay Patel 892f167aa5 use range-loops; NFCI
llvm-svn: 265985
2016-04-11 20:13:44 +00:00
Reid Kleckner b6800b3052 Combine redundant stack realignment booleans in MachineFrameInfo
MachineFrameInfo does not need to be able to distinguish between the
user asking us not to realign the stack and the target telling us it
doesn't support stack realignment. Either way, fixed stack objects have
their alignment clamped.

llvm-svn: 265971
2016-04-11 17:54:03 +00:00
Tom Stellard 52686e4182 TargetRegisterInfo: Add getRegAsmName()
Summary:
The motivation for this new function is to move an invalid assumption
about the relationship between the names of register definitions in
tablegen files and their assembly names into TargetRegisterInfo, so that
we can begin working on fixing this assumption.

The current problem is that if you have a register definition in
TableGen like:

def MYReg0 : Register<"r0", 0>;

The function TargetLowering::getRegForInlineAsmConstraint() derives the
assembly name from the tablegen name: "MyReg0" rather than the given
assembly name "r0".  This is working, because on most targets the
tablegen name and the assembly names are case insensitive matches for
each other (e.g. def EAX : X86Reg<"eax", ...>

getRegAsmName() will allow targets to override this default assumption and
return the correct assembly name.

Reviewers: echristo, hfinkel

Subscribers: SamWot, echristo, hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D15614

llvm-svn: 265955
2016-04-11 16:21:12 +00:00
Charles Davis 2f65f35c27 [CodeGen] Don't assume that fixed stack objects are aligned in a stack-realigned function.
Summary:
After we make the adjustment, we can assume that for local allocas, but
not for stack parameters, the return address, or any other fixed stack
object (which has a negative offset and therefore lies prior to the
adjusted SP).

Fixes PR26662.

Reviewers: hfinkel, qcolombet, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D18471

llvm-svn: 265886
2016-04-09 23:34:42 +00:00
Adrian Prantl 3891e9e859 Drop debug info for DISubprograms that are not referenced by anything
This patch drops the debug info for all DISubprograms that are
(a) not attached to an llvm::Function and
(b) not indirectly reachable via inline scopes from any surviving Function and
(c) not reachable from a type (i.e.: member functions).

Background: I'm currently working on a patch to reverse the pointers
between DICompileUnit and DISubprogram (for more info check Duncan's RFC
on lazy-loading of debug info metadata
http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html).
The idea is to remove the list of subprograms from DICompileUnit and
instead point to the owning compile unit from each DISubprogram.
After doing this all DISubprograms fulfilling the above criteria will be
implicitly dropped unless we go through an extra effort to preserve them.

http://reviews.llvm.org/D18477
<rdar://problem/25256815>

llvm-svn: 265876
2016-04-09 18:10:22 +00:00
Sanjay Patel 4abae4e0fa [x86] use BMI 'andn' for logic + compare ops
With BMI, we can use 'andn' to save an instruction when the result is only used in a compare.
This is related to one of the potential sequences to check 'isfinite' in:
https://llvm.org/bugs/show_bug.cgi?id=27164

Differential Revision: http://reviews.llvm.org/D18910

llvm-svn: 265875
2016-04-09 16:02:52 +00:00
Adrian Prantl 5992a72b4d Support the Nodebug emission kind for DICompileUnits.
Sample-based profiling and optimization remarks currently remove
DICompileUnits from llvm.dbg.cu to suppress the emission of debug info
from them. This is somewhat of a hack and only borderline legal IR.

This patch uses the recently introduced NoDebug emission kind in
DICompileUnit to achieve the same result without breaking the Verifier.
A nice side-effect of this change is that it is now possible to combine
NoDebug and regular compile units under LTO.

http://reviews.llvm.org/D18808
<rdar://problem/25427165>

llvm-svn: 265861
2016-04-08 22:43:03 +00:00
Tim Shen 0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
Kyle Butt 3232dbbf02 Codegen: Factor tail duplication into a utility class. NFC
This is in preparation for tail duplication during block placement. See D18226.
This needs to be a utility class for 2 reasons. No passes may run after block
placement, and also, tail-duplication affects subsequent layout decisions, so
it must be interleaved with placement, and can't be separated out into its own
pass. The original pass is still useful, and now runs by delegating to the
utility class.

llvm-svn: 265842
2016-04-08 20:35:01 +00:00
Nirav Dave 66f485f4e2 Fix Load Control Dependence in MemCpy Generation
In Memcpy lowering we had missed a dependence from the load of the
operation to successor operations. This causes us to potentially
construct an in initial DAG with a memory dependence not fully
represented in the chain sub-DAG but rather require looking at the
entire DAG breaking alias analysis by allowing incorrect repositioning
of memory operations.

To work around this, r200033 changed DAGCombiner::GatherAllAliases to be
conservative if any possible issues to happen. Unfortunately this check
forbade many non-problematic situations as well. For example, it's
common for incoming argument lowering to add a non-aliasing load hanging
off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would
find that other (unvisited) use of the EntryNode chain, and just give up
entirely. Furthermore, the check was incomplete: it would not actually
detect all such potentially problematic DAG constructions, because
GatherAllAliases did not guarantee to visit all chain nodes going up to
the root EntryNode. This is in general fine -- giving up early will just
miss a potential optimization, not generate incorrect results. But, for
this non-chain dependency detection code, it's possible that you could
have a load attached to a higher-up chain node than any which were
visited. If that load aliases your store, but the only dependency is
through the value operand of a non-aliasing store, it would've been
missed by this code, and potentially reordered.

With the dependence added, this check can be removed and Alias Analysis
can be much more aggressive. This fixes code quality regression in the
Consecutive Store Merge cleanup (D14834).

Test Change:

ppc64-align-long-double.ll now may see multiple serializations
of its stores

Differential Revision: http://reviews.llvm.org/D18062

llvm-svn: 265836
2016-04-08 19:44:40 +00:00
Quentin Colombet ab8c21f72b [RegBankSelect] Use reverse post order traversal.
When assigning the register banks of an instruction, it is best to know
all the constraints of the input to have a good idea of how this will
impact the cost of the whole function.

llvm-svn: 265812
2016-04-08 17:19:10 +00:00
Quentin Colombet 88805c1917 [RegisterBankInfo] Change the implementation for the default mapping.
Do not give that much importance to the current register bank of an
operand. This is likely just a side effect of the current execution and
it is properly wise to prefer a register bank that can be extracted from
the information available statically (like encoding constraints and
type).

llvm-svn: 265810
2016-04-08 16:59:50 +00:00
Quentin Colombet 6d6d6af226 [RegBankSelect] Improve debug output.
Add verbose information when checking if the current and the desired
register banks match.
Detail what happens when we assign a register bank.

llvm-svn: 265804
2016-04-08 16:48:16 +00:00
Quentin Colombet 876ddf8107 [MIR] Teach the parser how to deal with register banks.
llvm-svn: 265802
2016-04-08 16:40:43 +00:00
Quentin Colombet c1c94bc2ca [MachineVerifier] Teach how to check some of the properties of generic
virtual registers.

Generic virtual registers:
- May not have a register class
- May not have a register bank
- If they do not have a register class they must have a size
- If they have a register bank, the size of the register bank must be
  greater or equal to the size of the virtual register (basically check
  that the virtual register will fit into that register class)

llvm-svn: 265798
2016-04-08 16:35:22 +00:00
Quentin Colombet fab1cfe673 [MIR] Teach the mir printer how to print the register bank.
For now, we put the register bank in the Class field since a register
may only have one of those at a given time. The downside of that
representation is that if a register class and a register bank have the
same name, we will not be able to distinguish them.

llvm-svn: 265796
2016-04-08 16:26:22 +00:00
Hans Wennborg 5a7723c7a2 Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug happened"
It caused PR27275: "ARM: Bad machine code: Using an undefined physical register"

Also reverting the following commits that were landed on top:
r265610 "Fix the compare-clang diff error introduced by r265547."
r265639 "Fix the sanitizer bootstrap error in r265547."
r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"

llvm-svn: 265790
2016-04-08 15:17:43 +00:00
Chuang-Yu Cheng 98c1894755 CXX_FAST_TLS calling convention: performance improvement for PPC64
This is the same change on PPC64 as r255821 on AArch64. I have even borrowed
his commit message.

The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given machine function and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

Author: Tom Jablin (tjablin)
Reviewers: hfinkel kbarton cycheng

http://reviews.llvm.org/D17533

llvm-svn: 265781
2016-04-08 12:04:32 +00:00
Craig Topper 00230805f2 Use std::fill to simplify some code. NFC
llvm-svn: 265771
2016-04-08 07:10:46 +00:00
Quentin Colombet e57546de40 [TargetRegisterInfo] Re-apply r265734.
Original commit message:
[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator.

llvm-svn: 265764
2016-04-08 00:51:00 +00:00
Adrian Prantl 3e9c88753b DwarfDebug: Support floating point constants in location lists.
This patch closes a gap in the DWARF backend that caused LLVM to drop
debug info for floating point variables that were constant for part of
their scope. Floating point constants are emitted as one or more
DW_OP_constu joined via DW_OP_piece.

This fixes a regression caught by the LLDB testsuite that I introduced
in r262247 when we stopped blindly expanding the range of singular
DBG_VALUEs to span the entire scope and started to emit location lists
with accurate ranges instead.

Also deletes a now-impossible testcase (debug-loc-empty-entries).

<rdar://problem/25448338>

llvm-svn: 265760
2016-04-08 00:38:37 +00:00
Quentin Colombet e0a7ffa6cb Revert "[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator."
This reverts commit r265734.
Looks like ASan is not happy about it.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/11741

Looking.

llvm-svn: 265755
2016-04-08 00:03:51 +00:00
Quentin Colombet dcf5cf6a29 [RegisterBankInfo] Make the debug output more compact.
Print the mask of the partial mapping as an hexadecimal instead of a
binary value.

llvm-svn: 265754
2016-04-08 00:03:49 +00:00
Quentin Colombet e16f561d91 [RegBankSelect] Add a few debug statements.
llvm-svn: 265749
2016-04-07 23:53:55 +00:00
Quentin Colombet 9a2ae85e67 [RegisterBankInfo] Add print and dump method to the InstructionMapping
helper class.

llvm-svn: 265747
2016-04-07 23:31:58 +00:00
Quentin Colombet e087c9fc12 [RegisterBankInfo] Add print and dump method to the ValueMapping helper
class.

llvm-svn: 265746
2016-04-07 23:25:43 +00:00
Quentin Colombet 03c419628e [MachineInstr] Teach the print method about RegisterBank.
Properly print either the register class or the register bank or a
virtual register.
Get rid of a few ifdefs in the process.

llvm-svn: 265745
2016-04-07 23:18:11 +00:00
Quentin Colombet ac40034e06 [RegisterBankInfo] Strengthen getInstrMappingImpl.
Teach the target independent code how to take advantage of type
information to get the mapping of an instruction.

llvm-svn: 265739
2016-04-07 22:52:49 +00:00
Quentin Colombet e918006a87 [RegisterBankInfo] Add a way to record what register bank covers a
specific type.

This will be used to find the default mapping of the instruction.
Also, this information is recorded, instead of computed, because it is
expensive from a type to know which register bank maps it.
Indeed, we need to iterate through all the register classes of all the
register banks to find the one that maps the given type.

llvm-svn: 265736
2016-04-07 22:45:42 +00:00
Quentin Colombet c8d612f6fd [RegisterBankInfo] Introduce getRegBankFromConstraints as an helper
method.

NFC.

The refactoring intends to make the code more readable and expose
more features to potential derived classes.

llvm-svn: 265735
2016-04-07 22:35:03 +00:00
Quentin Colombet 2445dc1916 [TargetRegisterInfo] Refactor the code to use BitMaskClassIterator.
llvm-svn: 265734
2016-04-07 22:16:56 +00:00
Quentin Colombet cf477ffc58 [RegisterBankInfo] Refactor the code to use BitMaskClassIterator.
llvm-svn: 265733
2016-04-07 22:08:56 +00:00
Quentin Colombet aac71a4a0e [RegBankSelect] Reuse RegisterBankInfo logic to get to the register bank
from a register.
On top of duplicating the logic, it was buggy! It would assert on
physical registers, since MachineRegisterInfo does not have any
information regarding register classes/banks for them.

llvm-svn: 265727
2016-04-07 21:32:23 +00:00
Amaury Sechet c53ad4f3b2 Do not select EhPad BB in MachineBlockPlacement when there is regular BB to schedule
Summary:
EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate.

EHPad are scheduled in reverse probability order in order to have them flow into each others naturally.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17625

llvm-svn: 265726
2016-04-07 21:29:39 +00:00
Quentin Colombet d4131814b3 [GlobalISel] Add RegBankSelect hooks into the pass pipeline.
Now, RegBankSelect will happen after the IRTranslation and the target
may optionally add additional passes in between.

llvm-svn: 265716
2016-04-07 20:27:33 +00:00
Quentin Colombet 40ad573d2c [RegBankSelect] Initial implementation for non-optimized output.
The pass walk through the machine function and assign the register banks
using the default mapping. In other words, there is no attempt to reduce
cross register copies.

llvm-svn: 265707
2016-04-07 18:19:27 +00:00
Quentin Colombet fe1ee4f9be [RegisterBankInfo] Provide a target independent helper function to guess
the mapping of an instruction on register bank.

For most instructions, it is possible to guess the mapping of the
instruciton by using the encoding constraints.
It remains instructions without encoding constraints.
For copy-like instructions, we try to propagate the information we get
from the other operands. Otherwise, the target has to give this
information.

llvm-svn: 265703
2016-04-07 18:01:19 +00:00
Quentin Colombet ee366eff44 [RegisterBankInfo] Change the signature of getSizeInBits to factor out
the access to MRI and TRI.

llvm-svn: 265701
2016-04-07 17:44:54 +00:00
Quentin Colombet 5b7ba5092c [RegisterBankInfo] Provide a default constructor for InstructionMapping
helper class.

The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.

llvm-svn: 265699
2016-04-07 17:30:18 +00:00
Quentin Colombet c33085f2c6 [MachineRegisterInfo] Track register bank for virtual registers.
A virtual register may have either a register bank or a register class.
This is represented by a PointerUnion between the related classes.

Typically, a virtual register went through the following states
regarding register class and register bank:

1. Creation: None is set. Virtual registers are fully generic.
2. Register bank assignment: Register bank is set. Virtual registers
live into a register bank, but we do not know the constraints they need
to fulfil.
3. Instruction selection: Register class is set. Virtual registers are
bound by encoding constraints.

To map these states to GlobalISel, the IRTranslator implements #1,
RegBankSelect #2, and Select #3.

llvm-svn: 265696
2016-04-07 17:20:29 +00:00
Quentin Colombet d21115876c [RegisterBank] Rename RegisterBank::contains into RegisterBank::covers.
llvm-svn: 265695
2016-04-07 17:09:39 +00:00
Dmitry Polukhin a1feff7024 [GCC] Attribute ifunc support in llvm
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. Patch
for Clang http://reviews.llvm.org/D15524

Differential Revision: http://reviews.llvm.org/D15525

llvm-svn: 265667
2016-04-07 12:32:19 +00:00
NAKAMURA Takumi e546211492 InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]
llvm-svn: 265657
2016-04-07 11:30:06 +00:00
Amaury Sechet 33c161c02f [BlockPlacement] Remove an unnecessary continue
NFC.

llvm-svn: 265643
2016-04-07 06:35:00 +00:00
Amaury Sechet 9ee4ddd710 [MBP] Remove an unused function parameter
NFC.

llvm-svn: 265642
2016-04-07 06:34:47 +00:00
Wei Mi 979e9756ec Fix the sanitizer bootstrap error in r265547.
The iterators of SmallPtrSet SpillsInSubTreeMap[Child].first may be
invalidated when SpillsInSubTreeMap grows. Rearrange the code to
ensure the grow of SpillsInSubTreeMap only happens before getting
the iterators of the SmallPtrSet.

llvm-svn: 265639
2016-04-07 05:27:17 +00:00
Amaury Sechet 41474a52e7 Revert "[BlockPlacement] Remove an unnecessary continue" and "[MBP] Remove an unused function parameter"
llvm-svn: 265638
2016-04-07 04:28:40 +00:00
Duncan P. N. Exon Smith da68cbc4ad IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC
Clarify what this RemapFlag actually means.

  - Change the flag name to match its intended behaviour.
  - Clearly document that it's not supposed to affect globals.
  - Add a host of FIXMEs to indicate how to fix the behaviour to match
    the intent of the flag.

RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock.  Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).

When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.

This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.

llvm-svn: 265628
2016-04-07 00:26:43 +00:00
Quentin Colombet 4359784c1b [RegisterBankInfo] Implement a target independent version of
getInstrMapping.

This implementation requires that the target implemented
getRegBankFromRegClass.
Indeed, the implementation uses the register classes for the encoding
constraints for the instructions to deduce the mapping of a value.

llvm-svn: 265624
2016-04-07 00:07:50 +00:00
Quentin Colombet 8c0d66bc54 [RegisterBankInfo] Add an helper function to get the size of a register.
The previous method to get the size was too simple and could fail for
physical registers.

llvm-svn: 265620
2016-04-06 23:59:53 +00:00
Wei Mi 284fa0bd71 Fix the compare-clang diff error introduced by r265547.
Use MapVector instead of DenseMap for MergeableSpillsMap so it will be
iterated in determined order.

llvm-svn: 265610
2016-04-06 22:31:17 +00:00
Quentin Colombet c916204a81 [RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank.
This will be used by the register bank select pass to assign register banks
for generic virtual registers.

This was originally committed as r265573 but broke at least one windows bot.
The problem with the windows bot was that it was using a copy constructor for
the InstructionMappings class and could not synthesize it. Actually, the fact
that this class is not copy constructable is expected and the compiler should
use the move assignment constructor. Marking the problematic assignment
explicitly as using the move constructor has its own problems.

Indeed, with recent clang we get a warning that we may prevent the elision of
the copy by the compiler. A proper fix for both compilers would be to change the
API of getPossibleInstrMapping to take a InstructionMappings as input/output
parameter. This does not feel natural and since GISel is not used on windows
yet, I chose to workaround the problem by not compiling the problematic code on
windows.

llvm-svn: 265604
2016-04-06 21:37:22 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Haicheng Wu 1951cf24a7 [MBP] Remove an unused function parameter
NFC.

llvm-svn: 265596
2016-04-06 20:38:20 +00:00
Quentin Colombet fb000583aa Revert "[RegisterBankInfo] Add methods to get the possible mapping of an
instruction on a register bank. This will be used by the register bank select
pass to assign register banks for generic virtual registers." and the follow-on
commits while I find out a way to fix the win7 bot:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19882

This reverts commit r265578, r265581, r265584, and r265585.

llvm-svn: 265587
2016-04-06 19:04:58 +00:00
Evgeniy Stepanov 268826a287 [gold] Save bitcode for module partitions (save-temps + split codegen).
llvm-svn: 265583
2016-04-06 18:32:13 +00:00
Quentin Colombet df4aee09f8 [RegisterBankInfo] Provide a default constructor for InstructionMapping
helper class.

The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.

llvm-svn: 265581
2016-04-06 18:24:34 +00:00
Quentin Colombet bb756dbf39 [RegisterBankInfo] Add an helper function to get the size of a register.
The previous method to get the size was too simple and could fail for
physical registers.

llvm-svn: 265578
2016-04-06 18:04:35 +00:00
Quentin Colombet 9af77135e5 [RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank.
This will be used by the register bank select pass to assign register banks
for generic virtual registers.

llvm-svn: 265573
2016-04-06 17:45:40 +00:00
Quentin Colombet 4812c91f56 [RegisterBankInfo] Implement the verify method of the InstructionMapping helper class.
This checks that all the register operands get a proper mapping.

llvm-svn: 265563
2016-04-06 17:01:43 +00:00
Quentin Colombet 3768f7005d [RegisterBankInfo] Implement the verify method for the ValueMapping helper class.
The method checks that the value is fully defined accross the different partial
mappings and that the partial mappings are compatible between each other.

llvm-svn: 265556
2016-04-06 16:40:23 +00:00
Quentin Colombet 2423fc419c [RegisterBankInfo] Add a verify method for the PartialMapping helper class.
This verifies that the PartialMapping can be accomadated into the related
register bank.

llvm-svn: 265555
2016-04-06 16:33:26 +00:00
Quentin Colombet 89c33caee3 [RegisterBankInfo] Add a couple of helper classes for the future cost model.
llvm-svn: 265553
2016-04-06 16:27:01 +00:00
Quentin Colombet 911181882e [RegisterBankInfo] Inline the destructor to avoid link-time error when GlobalISel is not built.
llvm-svn: 265548
2016-04-06 15:47:17 +00:00
Wei Mi 18293bef4e Recommit r265309 after fixed an invalid memory reference bug happened
when DenseMap growed and moved memory. I verified it fixed the bootstrap
problem on x86_64-linux-gnu but I cannot verify whether it fixes
the bootstrap error on clang-ppc64be-linux. I will watch the build-bot
result closely.

Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Differential Revision: http://reviews.llvm.org/D15302

llvm-svn: 265547
2016-04-06 15:41:07 +00:00
Matthias Braun 7dc03f060e RegisterScavenger: Take a reference as enterBasicBlock() argument.
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.

llvm-svn: 265511
2016-04-06 02:47:09 +00:00
Matthias Braun 3bb0fcc118 LivePhysRegs: Remove redundant check
llvm-svn: 265509
2016-04-06 02:46:04 +00:00
Sanjoy Das 65a60670e8 Lower @llvm.experimental.deoptimize as a noreturn call
While preserving the return value for @llvm.experimental.deoptimize at
the IR level is useful during mid-level optimization, doing so at the
machine instruction level requires generating some extra code and a
return that is non-ideal.  This change has LLVM lower

```
  %val = call @llvm.experimental.deoptimize
  ret %val
```

to effectively

```
  call @__llvm_deoptimize()
  unreachable
```

instead.

llvm-svn: 265502
2016-04-06 01:33:49 +00:00
Quentin Colombet 06bdd3c914 [RegisterBankInfo] Simplify the API for build a register bank.
As part of the TRI argument of addRegBankCoverage we already have access to
the TargetRegisterClass through the ID of that register class.
Therefore, there is no point in needing a TargetRegisterClass instance,
the ID is enough to get to it.

llvm-svn: 265487
2016-04-05 23:26:39 +00:00
Evgeniy Stepanov dde29e2799 Faster stack-protector for Android/AArch64.
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.

llvm-svn: 265481
2016-04-05 22:41:50 +00:00
Quentin Colombet 64bba01a63 [RegisterBank] Implement the verify method to check for the obvious mistakes.
llvm-svn: 265479
2016-04-05 22:34:01 +00:00
Quentin Colombet 0195826998 [RegisterBankInfo] Add debug print to check how the initialization is going.
llvm-svn: 265475
2016-04-05 21:47:56 +00:00
Quentin Colombet c94fbee9f6 [RegisterBank] Add printable capabilities for future debugging.
llvm-svn: 265473
2016-04-05 21:40:43 +00:00
Quentin Colombet 85689d934a [RegisterBankInfo] Make addRegBankCoverage more capable to ease
targeting jobs.
Now, addRegBankCoverage also adds the subreg-classes not just the
sub-classes of the given register class.

llvm-svn: 265469
2016-04-05 21:20:12 +00:00
Quentin Colombet d347d695c2 [RegisterBankInfo] Implement the methods to create register banks.
llvm-svn: 265464
2016-04-05 21:06:15 +00:00
Quentin Colombet c4db2ad5b8 [RegisterBank] Provide a way to check if a register bank is valid.
Change the default constructor to create invalid object.
The target will have to properly initialize the register banks before
using them.

llvm-svn: 265460
2016-04-05 20:48:32 +00:00
Quentin Colombet b235d32e74 [GlobalISel] Add the RegisterBankInfo class for the handling of register banks.
llvm-svn: 265449
2016-04-05 20:02:47 +00:00
Quentin Colombet bdc3b4d523 [GlobalISel] Add a class, RegisterBank, to represent register banks.
llvm-svn: 265445
2016-04-05 19:54:44 +00:00
Quentin Colombet 8e8e85c19f [GlobalISel] Add the skeleton of the RegBankSelect pass.
This pass is reponsible for assigning the generic virtual registers to register
banks.

llvm-svn: 265440
2016-04-05 19:06:01 +00:00
Manman Ren e221a870d3 Swift Calling Convention: swifterror target-independent change.
At IR level, the swifterror argument is an input argument with type
ErrorObject**. For targets that support swifterror, we want to optimize it
to behave as an inout value with type ErrorObject*; it will be passed in a
fixed physical register.

The main idea is to track the virtual registers for each swifterror value. We
define swifterror values as AllocaInsts with swifterror attribute or a function
argument with swifterror attribute.

In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before
handling the basic blocks.

When iterating over all basic blocks in RPO, before actually visiting the basic
block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when
there are multiple predecessors or to simply propagate them. There, we create a
virtual register for each swifterror value in the entry block. For predecessors
that are not yet visited, we create virtual registers to hold the swifterror
values at the end of the predecessor. The assignments are saved in
SwiftErrorWorklist and will be materialized at the end of visiting the basic
block.

When visiting a load from a swifterror value, we copy from the current virtual
register assignment. When visiting a store to a swifterror value, we create a
virtual register to hold the swifterror value and update SwiftErrorMap to
track the current virtual register assignment.

Differential Revision: http://reviews.llvm.org/D18108

llvm-svn: 265433
2016-04-05 18:13:16 +00:00
Haicheng Wu 3618fa786f [BlockPlacement] Remove an unnecessary continue
NFC.

llvm-svn: 265407
2016-04-05 15:37:08 +00:00
Chuang-Yu Cheng d3fb38cae5 Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge
Presently, CodeGenPrepare deletes all nearly empty (only phi and branch)
basic blocks. This pass can delete loop preheaders which frequently creates
critical edges. A preheader can be a convenient place to spill registers to
the stack. If the entrance to a loop body is a critical edge, then spills
may occur in the loop body rather than immediately before it. This patch
protects loop preheaders from deletion in CodeGenPrepare even if they are
nearly empty.

Since the patch alters the CFG, it affects a large number of test cases.
In most cases, the changes are merely cosmetic (basic blocks have different
names or instruction orders change slightly). I am somewhat concerned about
the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not
deleted, then the MIPS backend does not take advantage of a branch delay
slot. Consequently, I would like some close review by a MIPS expert.

The patch also partially subsumes D16893 from George Burgess IV. George
correctly notes that CodeGenPrepare does not actually preserve the dominator
tree. I think the dominator tree was usually not valid when CodeGenPrepare
ran, but I am using LoopInfo to mark preheaders, so the dominator tree is
now always valid before CodeGenPrepare.

Author: Tom Jablin (tjablin)
Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng

http://reviews.llvm.org/D16984

llvm-svn: 265397
2016-04-05 14:06:20 +00:00
Dmitry Polukhin a3d5b0b218 [IFUNC] Use GlobalIndirectSymbol when aliases and ifuncs have something similar
Second part extracted from http://reviews.llvm.org/D15525

Use GlobalIndirectSymbol in all cases when aliases and ifuncs have
something in common.

Differential Revision: http://reviews.llvm.org/D18754

llvm-svn: 265382
2016-04-05 08:47:51 +00:00
Sanjay Patel 769b5fd546 fix typos; NFC
llvm-svn: 265356
2016-04-04 22:45:56 +00:00
Justin Bogner 35c6903f22 Revert "CodeGen: Remove dead code in TailDuplicate"
It seems this is reachable after all. It hit on 7zip-benchmark in lnt
on ppc64:

  http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/2317

This reverts r265347.

llvm-svn: 265352
2016-04-04 21:41:54 +00:00
Matthias Braun 7511abd5c1 MachineScheduler: Ignore COPYs with undef/dead op in CopyConstrain mutation.
There is no problem with the code today, but the fix will avoid a crash
in test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll once the
DetectDeadLanes pass is added.

llvm-svn: 265351
2016-04-04 21:23:46 +00:00
Justin Bogner 9ab8131a57 CodeGen: Remove dead code in TailDuplicate
I noticed that this isn't covered by our existing tests and spent some
time trying to come up with an example it actually hits. I tried hand
rolling something based on the explanation in the comment, but couldn't
get anything that didn't abort tail duplication earlier for one reason
or another.

Then, I tried cranking tail-dup-size cranked up so this would fire
more and ran a bootstrap of clang and the nightly test suite - those
don't hit this either.

This reverts r132816 and replaces it with an assert.

llvm-svn: 265347
2016-04-04 21:11:40 +00:00
Chandler Carruth 613eec8210 Revert r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.
That commit looks wonderful and awesome. Sadly, it greatly exacerbates
PR17409 and effectively regresses build time for a lot of (very large)
code when compiled with ASan or MSan.

We thought this could be fixed forward by landing D15302 which at last
fixes that PR, but some issues were discovered and it looks like that
got reverted, so reverting this as well temporarily. As soon as the fix
for PR17409 lands and sticks, we should re-land this patch as it won't
trigger more significant test cases hitting that bug.

Many thanks to Quentin and Wei here as they're doing all the awesome
hard work!!!

llvm-svn: 265331
2016-04-04 18:57:50 +00:00
Matthias Braun 870c34f0cf ARM, AArch64, X86: Check preserved registers for tail calls.
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.

This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.

Related to rdar://24207743

Differential Revision: http://reviews.llvm.org/D18680

llvm-svn: 265329
2016-04-04 18:56:13 +00:00
Derek Schuff 73900c6876 Replace MachineRegisterInfo::isSSA() with a MachineFunctionProperty
Use the MachineFunctionProperty mechanism to indicate whether a MachineFunction
is in SSA form instead of a custom method on MachineRegisterInfo. NFC

Differential Revision: http://reviews.llvm.org/D18574

llvm-svn: 265318
2016-04-04 18:03:29 +00:00
Wei Mi fb5252cac1 Revert r265309 and r265312 because they caused some errors I need to investigate.
llvm-svn: 265317
2016-04-04 17:45:03 +00:00
Derek Schuff 1dbf7a571f Add MachineFunctionProperty checks for AllVRegsAllocated for target passes
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.

Reviewers: qcolombet

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18525

llvm-svn: 265313
2016-04-04 17:09:25 +00:00
Wei Mi cdaf1df657 Fix unused var warning caused by r265309.
llvm-svn: 265312
2016-04-04 17:03:58 +00:00
Wei Mi ffbc9c7f3b Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Differential Revision: http://reviews.llvm.org/D15302

llvm-svn: 265309
2016-04-04 16:42:40 +00:00
Peter Zotov 8efe38a1e2 [CodeGenPrepare] Fix r265264 (again).
Don't require TLI for SinkCmpExpression, like it wasn't before
r265264.

llvm-svn: 265271
2016-04-03 19:32:13 +00:00
Peter Zotov f87e550e89 [CodeGenPrepare] Fix r265264.
The case where there was no TargetLowering was not handled,
leading to null pointer dereferences.

llvm-svn: 265265
2016-04-03 17:11:53 +00:00
Peter Zotov 0b6d7bc682 [CodeGenPrepare] Avoid sinking soft-FP comparisons
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.

A common pattern that produces floating point comparisons uniform
over a loop is an explicit check for division by zero. If the divisor
is hoisted out of the loop, the comparison can also be, but hoisting
the function that unwinds is never legal, since it may cause side
effects in the loop body prior to the unwinding to not be executed.

Differential Revision: http://reviews.llvm.org/D18744

llvm-svn: 265264
2016-04-03 16:36:17 +00:00
Manman Ren 9bfd0d03e9 Swift Calling Convention: add swifterror attribute.
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.

This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.

Differential Revision: http://reviews.llvm.org/D18092

llvm-svn: 265189
2016-04-01 21:41:15 +00:00
Sanjoy Das 9d41a8f269 Don't use an i64 return type with webkit_jscc
Re-enable an assertion enabled by Justin Lebar in rL265092.  rL265092
was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc
does not like non-i64 return types.  Change the test case to not do
that.

llvm-svn: 265099
2016-04-01 02:51:21 +00:00
Chuang-Yu Cheng 35c6181982 Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker
Previously, HandleLastUse would delete RegRef information for sub-registers
if they were dead even if their corresponding super-register were still live.

If the super-register were later renamed, then the definitions of the
sub-register would not be updated appropriately. This patch alters the
behavior so that RegInfo information for sub-registers is only deleted when
the sub-register and super-register are both dead.

This resolves PR26775. This is the mirror image of Hal's r227311 commit.

Author: Tom Jablin (tjablin)
Reviewers: kbarton uweigand nemanjai hfinkel

http://reviews.llvm.org/D18448

llvm-svn: 265097
2016-04-01 02:05:29 +00:00
Justin Lebar 98981e5573 Revert "Protect some assertions with NDEBUG rather than DEBUG()."
This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll.

llvm-svn: 265093
2016-04-01 01:23:23 +00:00
Justin Lebar c814e8e4ab Protect some assertions with NDEBUG rather than DEBUG().
DEBUG() only runs if you pass -debug, but these assertions are generally
useful.

llvm-svn: 265092
2016-04-01 01:09:12 +00:00
Justin Lebar acc47105f8 [ifcnv] Add brief comment explaining what ifcnv is.
llvm-svn: 265088
2016-04-01 01:09:03 +00:00
Adrian Prantl b939a25707 Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.

http://reviews.llvm.org/D18612
<rdar://problem/25427165>

llvm-svn: 265077
2016-03-31 23:56:58 +00:00
Tim Shen 800ed436e5 [AsmPrinter] Print aliases in topological order
Print aliases in topological order, that is, for any alias a = b,
b must be printed before a. This is because on some targets (e.g. PowerPC)
linker expects aliases in such an order to generate correct TOC information.

GCC also prints aliases in topological order.

llvm-svn: 265064
2016-03-31 22:08:19 +00:00
Chandler Carruth b472856a73 Fix PR26940 where compiles times regressed massively.
Patch by Jonas Paulsson. Original description:
Bugfix in buildSchedGraph() to make -dag-maps-huge-region work properly

I found that the reduction of the maps did in fact never happen in this
test case. This was because *all* the stores / loads were made with
addresses from arguments and they thus became "unknown" stores / loads.
Fixed by removing continue statements and making sure that the test for
reduction always takes place.

Differential Revision: http://reviews.llvm.org/D18673

llvm-svn: 265063
2016-03-31 21:55:58 +00:00
Sanjay Patel 4d71160d5d fix typo; NFC
llvm-svn: 265054
2016-03-31 21:00:48 +00:00
Hans Wennborg e1a2e90ffa Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
Stephan Bergmann 480de227f6 Don't use potentially invalidated iterator
If the lhs is evaluated before the rhs, FuncletI's operator-> can trigger the

  assert(isHandleInSync() && "invalid iterator access!");

at include/llvm/ADT/DenseMap.h:1061.  (Happens e.g. when compiled with GCC 6.)

Differential Revision: http://reviews.llvm.org/D18440

llvm-svn: 265024
2016-03-31 15:42:01 +00:00
Nirav Dave 83ce54aac2 Prevent X86ISelLowering from merging volatile loads
Change isConsecutiveLoads to check that loads are non-volatile as this
is a requirement for any load merges. Propagate change to two callers.

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18546

llvm-svn: 265013
2016-03-31 13:40:55 +00:00
Matthias Braun 8d41436004 CodeGen: Factor out code for tail call result compatibility check; NFC
llvm-svn: 264959
2016-03-30 22:46:04 +00:00
Matt Arsenault 46ba31650e LegalizeDAG: Don't replace vector store with integer if not legal
For the same reason as the corresponding load change.

Note that ExpandStore is completely broken for non-byte sized element
vector stores, but preserve the current broken behavior which has tests
for it. The behavior should be the same, but now introduces a new typed
store that is incorrectly split later rather than doing it directly.

llvm-svn: 264928
2016-03-30 21:15:18 +00:00
Matt Arsenault a4b1b6ea05 LegalizeDAG: Don't replace vector load with integer unless legal
On AMDGPU we want to be able to promote i64/f64 loads to v2i32.
If the access is unaligned, this would conclude that since i64 is legal,
it would convert it back to i64 and there is an endless legalization
loop.

Extract the logic for scalarizing the load into a new TargetLowering
function, where this can also replace the custom function AMDGPU
has for this.

llvm-svn: 264927
2016-03-30 21:15:10 +00:00
Fiona Glaser 44a2f7a298 MachineSink: make shouldSink a TII target hook
Some targets may disagree on what they want sunk or not sunk,
so make this a target hook instead of hardcoded.

llvm-svn: 264799
2016-03-29 22:44:57 +00:00
Derek Schuff 07636cd5e7 Add a print method to MachineFunctionProperties for better error messages
This makes check failures much easier to understand.
Make it empty (but leave it in the class) for NDEBUG builds.

Differential Revision: http://reviews.llvm.org/D18529

llvm-svn: 264780
2016-03-29 20:28:20 +00:00
Matthias Braun 72a58c3e28 MachineVerifier: On dead-def live segments, check that corresponding machine operand has a dead flag
llvm-svn: 264769
2016-03-29 19:07:43 +00:00
Matthias Braun 1c20c8280a LiveVariables: Fix typo and shorten comment
llvm-svn: 264768
2016-03-29 19:07:40 +00:00
Nirav Dave 2aab7f4358 Add support for no-jump-tables
Add function soft attribute to the generation of Jump Tables in CodeGen
as initial step towards clang support of gcc's no-jump-table support

Reviewers: hans, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18321

llvm-svn: 264756
2016-03-29 17:46:23 +00:00
Derek Schuff 42666eeea2 Add MachineVerifier check for AllVRegsAllocated MachineFunctionProperty
Summary:
Check that any function that has the property set is free of virtual
register operands.

Also, it is actually VirtRegMap (and not the register allocators) that
acutally remove the VReg operands (except for RegAllocFast).

Reviewers: qcolombet

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: http://reviews.llvm.org/D18535

llvm-svn: 264755
2016-03-29 17:40:22 +00:00
Manman Ren f46262e0b7 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866

llvm-svn: 264754
2016-03-29 17:37:21 +00:00
Matthias Braun f54530ef00 RegisterPressure: Simplify liveness tracking when lanemasks are not checked.
Split RegisterOperands code that collects defs/uses into a variant with
and without lanemask tracking. This is a bit of code duplication, but
there are enough subtle differences between the two variants that this
seems cleaner (and potentially faster).

This also fixes a problem where lanes where tracked even though
TrackLaneMasks was false. This is part of the fix for
http://llvm.org/PR27106. I will commit the testcase when it is
completely fixed.

llvm-svn: 264696
2016-03-29 03:54:22 +00:00
Matthias Braun 82cff88691 LiveVariables: Do not remove dead flags from vreg operands
Also add a FIXME comment on why Mips RDDSP causes bogus dead flags to be
added which LiveVariables cleans up by accident.

llvm-svn: 264695
2016-03-29 03:08:18 +00:00
Kyle Butt 5e241b11ed [Codegen] Decrease minimum jump table density.
Minimum density for both optsize and non optsize are now options
-sparse-jump-table-density (default 10) for non optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at google
at the cost of a small codesize increase. For code compiled with -Os,
the old behavior continues

llvm-svn: 264689
2016-03-29 00:23:41 +00:00
Matthias Braun b74eb41d58 MIRParser: Add %subreg.xxx syntax for subregister index operands
Differential Revision: http://reviews.llvm.org/D18279

llvm-svn: 264608
2016-03-28 18:18:46 +00:00
Derek Schuff ad154c837e Introduce MachineFunctionProperties and the AllVRegsAllocated property
MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the delcarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

llvm-svn: 264593
2016-03-28 17:05:30 +00:00
James Y Knight 01f2ca5612 NFC: skip FenceInst up-front in AtomicExpandPass.
llvm-svn: 264583
2016-03-28 15:05:30 +00:00
JF Bastien a874d1a40d Revert "NFC: static_assert instead of comment"
This reverts commit fa36fcff16c7d4f78204d6296bf96c3558a4a672.

Causes the following Windows failure:

  C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\lib\CodeGen\MachineInstr.cpp(762):
  error C2338: must be trivially copyable to memmove

llvm-svn: 264516
2016-03-26 18:20:02 +00:00
JF Bastien d4ff3360ae NFC: static_assert instead of comment
Summary: isPodLike is as close as we have for is_trivially_copyable.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18483

llvm-svn: 264515
2016-03-26 18:14:27 +00:00
Junmo Park a26e93bcec Minor code cleanup. NFC.
llvm-svn: 264505
2016-03-26 06:04:55 +00:00
Jun Bum Lim 36c53fe147 [MachineCopyPropagation] Expose more dead copies across instructions with regmasks
When encountering instructions with regmasks, instead of cleaning up all the
elements in MaybeDeadCopies map, remove only the instructions erased. By keeping
more instruction in MaybeDeadCopies, this change will expose more dead copies
across instructions with regmasks.

llvm-svn: 264462
2016-03-25 21:15:35 +00:00
Nirav Dave fa250cad37 Prevent construction of cycle in DAG store merge
When merging stores in DAGCombiner, add check to ensure that no
dependenices exist that would cause the construction of a cycle in our
DAG.  This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.

This test is only done in cases where Antialias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.

Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight

Subscribers: llvm-commits, tberghammer, danalbert, srhines

Differential Revision: http://reviews.llvm.org/D18336

llvm-svn: 264461
2016-03-25 21:06:30 +00:00
Justin Bogner ec0e7d2582 CodeGen: Don't iterate over operands after we've erased an MI
This fixes a use-after-free introduced 3 years ago, in r182872 ;)

The code more or less worked because the memory that CopyMI was
pointing to happened to still be valid, but lots of tests would crash
if you ran under ASAN with the recycling allocator changes from
llvm.org/PR26808

llvm-svn: 264455
2016-03-25 20:03:28 +00:00
Justin Bogner ec5ea36891 CodeGen: Fix a use-after-free in TII
Found by ASAN with the recycling allocator changes from PR26808.

llvm-svn: 264443
2016-03-25 18:38:48 +00:00
Reid Kleckner f6f04f8fc8 Consider regmasks when computing register-based DBG_VALUE live ranges
Now register parameters that aren't saved to the stack or CSRs are
considered dead after the first call. Previously the debugger would show
whatever was in the register.

Fixes PR26589

Reviewers: aprantl

Differential Revision: http://reviews.llvm.org/D17211

llvm-svn: 264429
2016-03-25 17:54:46 +00:00
Manman Ren 9dd8c14674 CXX TLS: collect return blocks after SelectAllBasicBlocks.
It is incorrect to get the corresponding MBB for a ReturnInst before
SelectAllBasicBlocks since SelectAllBasicBlocks can change the
correspondence between a ReturnInst and the MBB it is in.

PR27062

llvm-svn: 264358
2016-03-24 23:21:29 +00:00
Sanjoy Das fd3eaa8c5c Reduce code duplication by extracting out a helper function; NFC
llvm-svn: 264355
2016-03-24 22:51:49 +00:00
Sanjoy Das 731c67fed2 Lower varargs correctly in deopt bundle lowering
Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because
populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg.

llvm-svn: 264354
2016-03-24 22:37:52 +00:00
Matthias Braun ae81c29352 LiveInterval: Fix Distribute() failing on liveranges with unused VNInfos
This fixes http://llvm.org/PR26991

llvm-svn: 264345
2016-03-24 21:41:38 +00:00
Reid Kleckner 01bc66a8ce Revert "Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit)."
This reverts commit r264280.

This broke building Chromium for iOS. We'll upload a reproducer to the
PR soon.

llvm-svn: 264334
2016-03-24 20:38:49 +00:00
Sanjoy Das df9ae70f49 Add lowering support for llvm.experimental.deoptimize
Summary:
Only adds support for "naked" calls to llvm.experimental.deoptimize.
Support for round-tripping through RewriteStatepointsForGC will come
as a separate patch (should be simpler than this one).

Reviewers: reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18429

llvm-svn: 264329
2016-03-24 20:23:29 +00:00
Sanjoy Das c0c59fe14e [Statepoints] Fix yet another issue around gc pointer uniqueing
Given that StatepointLowering now uniques derived pointers before
putting them in the per-statepoint spill map, we may end up with missing
entries for derived pointers when we visit a gc.relocate on a pointer
that was de-duplicated away.

Fix this by keeping two maps, one mapping gc pointers to their
de-duplicated values, and one mapping a de-duplicated value to the slot
it is spilled in.

llvm-svn: 264320
2016-03-24 18:57:39 +00:00
Sanjoy Das 42f91a9959 Minor cosmestic changes (NFC)
- Reflow comments
 - Rename function

llvm-svn: 264319
2016-03-24 18:57:31 +00:00
David Blaikie 0b214e4a2a [debuginfo] Include dwo_name in the split unit to improve dwp diagnostics
When multiple DWP files are merged together and duplicate DWO IDs are
found it's currently difficult to give an actionable error message - the
DW_AT_name of the CU could be provided, but might be identical (if the
same source file is built into two different configurations), which
doesn't help the user identify the problem.

When no intermediate DWP files are generated, the path to the two DWO
files could be provided - but is lost once the DWOs are merged into a
DWP.

So, include the name of the DWO (dwo_name) in the split file so that
collissions involving a source CU from a DWP can be better diagnosed.

(improvements to llvm-dwp using this to come shortly)

llvm-svn: 264316
2016-03-24 18:37:08 +00:00
Tim Northover 4498eff9bb CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS.
If the operation's type has been promoted during type legalization, we
need to account for the fact that the high bits of the comparison
operand are likely unspecified.

The LHS is usually zero-extended, but MIPS sign extends it, so we have
to be slightly careful.

Patch by Simon Dardis.

llvm-svn: 264296
2016-03-24 15:38:38 +00:00
Pirama Arumuga Nainar dc45aef2d8 Remove unsafe AssertZext after promoting result of FP_TO_FP16
Summary:
Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32
instruction, do not guarantee that the top 16 bits are zeroed out.
Remove the unsafe AssertZext and add tests to exercise this.

Reviewers: jmolloy, sbaranga, kristof.beyls, aadg

Subscribers: llvm-commits, srhines, aemerson

Differential Revision: http://reviews.llvm.org/D18426

llvm-svn: 264285
2016-03-24 14:06:03 +00:00
Amjad Aboud 6ff7e10052 Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26942 (the fix is included in this commit).

Differential Revision: http://reviews.llvm.org/D18350

llvm-svn: 264280
2016-03-24 13:30:16 +00:00
Cong Hou 94710840fb Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Currently, AnalyzeBranch() fails non-equality comparison between floating points
on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this
function can modify the branch by reversing the conditional jump and removing
unconditional jump if there is a proper fall-through. However, in the case of
non-equality comparison between floating points, this can turn the branch
"unanalyzable". Consider the following case:

jne.BB1
jp.BB1
jmp.BB2
.BB1:
...
.BB2:
...

AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be
removed:

jne.BB1
jnp.BB2
.BB1:
...
.BB2:
...

However, AnalyzeBranch() cannot analyze this branch anymore as there are two
conditional jumps with different targets. This may disable some optimizations
like block-placement: in this case the fall-through behavior is enforced even if
the fall-through block is very cold, which is suboptimal.

Actually this optimization is also done in block-placement pass, which means we
can remove this optimization from AnalyzeBranch(). However, currently
X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined
negation conditions for them.

In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP
and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them.
Here only the second conditional jump is reversed. This is valid as we only need
them to do this "unconditional jump removal" optimization.


Differential Revision: http://reviews.llvm.org/D11393

llvm-svn: 264199
2016-03-23 21:45:37 +00:00
Justin Bogner c35c10593b SelectionDAG: Remove a tautological dyn_cast. NFC
Index is already a StoreSDNode, so this dyn_cast doesn't do anything.

llvm-svn: 264177
2016-03-23 18:15:33 +00:00
Sanjoy Das a5b2972977 Remove stale comment
llvm-svn: 264131
2016-03-23 02:28:35 +00:00
Sanjoy Das ac53dc7520 [StatepointLowering] Don't do two DenseMap lookups; nfci
llvm-svn: 264130
2016-03-23 02:24:15 +00:00
Sanjoy Das 7edbef316b [StatepointLowering] Minor NFC cleanups
- Use auto
 - Name variables in LLVM style
 - Use llvm::find instead of std::find
 - Blank lines between declarations

llvm-svn: 264129
2016-03-23 02:24:13 +00:00
Sanjoy Das 4cd746ebe0 [StatepointLowering] Minor nfc refactoring
Now that StatepointLoweringInfo represents base pointers, derived
pointers and gc relocates as SmallVectors and not ArrayRefs, we no
longer need to allocate "backing storage" on stack in LowerStatepoint.
So elide the backing storage, and inline the trivial body of
getIncomingStatepointGCValues.

llvm-svn: 264128
2016-03-23 02:24:10 +00:00
Sanjoy Das e58ca59cf4 [StatepointLowering] Schedule gc relocates before uniqueing them
Otherwise we can see an "unexpected" gc.relocate that we uniqued away.

llvm-svn: 264127
2016-03-23 02:24:07 +00:00
George Burgess IV d4febd1612 Keep CodeGenPrepare from preserving the domtree.
CGP modifies the domtree in some cases, so saying that it preserves the
domtree is a lie. We'll be able to selectively preserve it with the new
pass manager.

Differential Revision: http://reviews.llvm.org/D16893

llvm-svn: 264099
2016-03-22 21:25:08 +00:00
Simon Pilgrim c6f5fe3d69 [SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type
Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected.

llvm-svn: 264085
2016-03-22 19:59:53 +00:00
Tim Northover b49a8a9dbb CodeGen: check return types match when emitting tail call to builtin.
We were just completely ignoring the types when determining whether we could
safely emit a libcall as a tail call. This is clearly wrong.

Theoretically, we could dig deeper looking for incidental matches (much like
the generic code in Analysis.cpp does), but it's probably not worth it for the
few libcalls that exist.

llvm-svn: 264084
2016-03-22 19:14:38 +00:00
Sanjoy Das eb5037cadc Allow lowering call sites with both funclets and deopt state
Lowering funclets is a no-op, so we can just go ahead and lower the
deopt state.

llvm-svn: 264078
2016-03-22 18:10:39 +00:00
Sanjoy Das 6b535630a1 Add a hasOperandBundlesOtherThan helper, and use it; NFC
llvm-svn: 264072
2016-03-22 17:51:25 +00:00
Simon Pilgrim 25fb4177fb [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Reapplied with a fix for PR26953 (missing vector widening legalization).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 264062
2016-03-22 16:22:08 +00:00
Sanjoy Das 38bfc22161 Add "first class" lowering for deopt operand bundles
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or sequence of nop, with an associated __llvm_stackmaps entry0.
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.

Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin

Subscribers: sanjoy, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D18257

llvm-svn: 264015
2016-03-22 00:59:13 +00:00
Silviu Baranga 46030585b3 [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes
Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:

  (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)

DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.

This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.

Reviewers: t.p.northover, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18247

llvm-svn: 263935
2016-03-21 11:43:46 +00:00
Craig Topper ea87eae4ca Suppress a -Wunused-variable warning in release builds.
llvm-svn: 263892
2016-03-20 01:17:54 +00:00
Saleem Abdulrasool 2854666263 CodeGen: use range based for loop
Convert a loop to use a range based style loop.  NFC.

llvm-svn: 263884
2016-03-19 16:35:32 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Matthias Braun 0d208fc9f6 MILexer: Add ErrorCallbackType typedef; NFC
llvm-svn: 263829
2016-03-18 20:41:11 +00:00
Reid Kleckner fbd7787d7e [codeview] Only emit function ids for inlined functions
We aren't referencing any other kind of function currently.
Should save a bit on our debug info size.

llvm-svn: 263817
2016-03-18 18:54:32 +00:00
Peter Collingbourne a1f8625662 DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions.
A virtual index of -1u indicates that the subprogram's virtual index is
unrepresentable (for example, when using the relative vtable ABI), so do
not emit a DW_AT_vtable_elem_location attribute for it.

Differential Revision: http://reviews.llvm.org/D18236

llvm-svn: 263765
2016-03-17 23:58:03 +00:00
Sanjoy Das 3a02019fbc [SelectionDAG] Remove visitStatepoint; NFC
This way we have a single entry point into StatepointLowering.  The
method was a direct dispatch to LowerStatepoint anyway.

llvm-svn: 263682
2016-03-17 00:47:14 +00:00
Sanjoy Das 43e33d61c6 Fix indentation; NFC
llvm-svn: 263672
2016-03-16 23:11:21 +00:00
Sanjoy Das 70697ff74d Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC
Summary:
This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).

This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint`
helper function that is able to lower a "statepoint like thing", and
uses it to lower `gc.statepoint` calls.  This is an NFC now, but in a
later change we will use `LowerAsStatepoint` to directly lower calls and
invokes with operand bundles without going through an intermediate
`gc.statepoint` IR representation.

FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level `gc.statepoint`.

Reviewers: reames, pgavlin, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18106

llvm-svn: 263671
2016-03-16 23:08:00 +00:00
James Y Knight f44fc5219f Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Sanjoy Das 19c6159833 [SelectionDAG] Extract out populateCallLoweringInfo; NFC
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands.  The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106

llvm-svn: 263663
2016-03-16 20:49:31 +00:00
Simon Pilgrim b5a20f0fec Removed trailing whitespace
llvm-svn: 263650
2016-03-16 18:37:44 +00:00
Lang Hames 1b640e05ba [MachO] Add MachO alt-entry directive support.
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:

safe_foo:
  // check preconditions for foo
.alt_entry fast_foo
fast_foo:
  // body of foo, can assume preconditions.

The .alt_entry flag is also implicitly set on assembly aliases of the form:

a = b + C

where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.

llvm-svn: 263521
2016-03-15 01:43:05 +00:00
Sanjoy Das c11460e051 [StatepointLowering] Move an assertion; NFCI
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.

llvm-svn: 263516
2016-03-15 01:16:31 +00:00
Eric Christopher da8b3f1914 Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.

This reverts commit r263303.

llvm-svn: 263512
2016-03-14 23:59:57 +00:00
Amaury Sechet eae09c2c2a Factor out MachineBlockPlacement::fillWorkLists. NFC
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18077

llvm-svn: 263495
2016-03-14 21:24:11 +00:00
Quentin Colombet 40ce25b68b [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Benjamin Kramer 1082fa66a5 Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379."
This reverts commit r263424. Breaks self-host.

llvm-svn: 263437
2016-03-14 14:58:36 +00:00
Amjad Aboud ab0378b16c Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26715 at r263379.

llvm-svn: 263424
2016-03-14 12:03:20 +00:00
David Majnemer b9456a5eb3 [CodeView] Consistently handle overly large symbol names
Overly large symbol names weren't correctly handled for leaf function
records.

llvm-svn: 263408
2016-03-14 05:15:09 +00:00
David Majnemer 1256125fb7 [CodeView] Truncate display names
Fundamentally, the length of a variable or function name is bound by the
maximum size of a record: 0xffff.  However, the name doesn't live in a
vacuum; other data is associated with the name, lowering the bound
further.

We would naively attempt to emit the name, causing us to assert because
the record would no-longer fit in 16-bits.  Instead, truncate the name
but preserve as much as we can.

While I have tested this locally, I've decided to not commit it due to
the test's size.

N.B.  While this behavior is undesirable, it is better than MSVC's
behavior.  They seem to truncate to ~4000 characters.

llvm-svn: 263378
2016-03-13 10:53:30 +00:00
Sanjoy Das ecf96c9516 Make gc relocates more strongly typed; NFC
Don't use a `Value *` where we can use a stronger `GCRelocateInst *`
type.

llvm-svn: 263327
2016-03-12 02:54:27 +00:00
Simon Pilgrim 33d57c7547 [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 263303
2016-03-11 22:18:05 +00:00
Quentin Colombet dd4b137364 [IRTranslator] Translate unconditional branches.
llvm-svn: 263265
2016-03-11 17:28:03 +00:00
Quentin Colombet f9b4934d1d [MachineIRBuilder] Rework buildInstr API to maximize code reuse.
llvm-svn: 263264
2016-03-11 17:27:58 +00:00
Quentin Colombet e225e2541b [IRTranslator] Update getOrCreateVReg API to use references.
A value that we want to keep in a virtual register cannot be null.
Reflect that in the API.

llvm-svn: 263263
2016-03-11 17:27:54 +00:00
Quentin Colombet 000b580b13 [MachineIRBuilder] Rename the setter of MF for consistency with the getter.
llvm-svn: 263262
2016-03-11 17:27:51 +00:00
Quentin Colombet 91ebd71e26 [MachineIRBuilder] Rename the setter for MBB for consistency with the getter.
llvm-svn: 263261
2016-03-11 17:27:47 +00:00
Quentin Colombet 53237a9e64 [IRTranslator] Update getOrCreateBB API to use references.
A null basic block is invalid, so just pass a reference.

llvm-svn: 263260
2016-03-11 17:27:43 +00:00
Chad Rosier ac216fd9d5 [misched] Fix a truncation issue from r263021.
The truncation was causing the sorting algorithm to behave oddly when comparing
positive and negative offsets.  Fortunately, this doesn't currently happen in
practice and was exposed by a WIP.  Thus, I can't test this change now, but the
follow on patch will.

llvm-svn: 263255
2016-03-11 16:54:07 +00:00
Junmo Park 6098cbbd2c Minor code cleanups. NFC.
llvm-svn: 263200
2016-03-11 07:05:32 +00:00
Junmo Park 4ba6cf69e4 Minor code cleanup. NFC.
llvm-svn: 263196
2016-03-11 05:07:07 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Marianne Mailhot-Sarrasin eddc5b130e Test commit access
llvm-svn: 263165
2016-03-10 21:54:25 +00:00
Simon Pilgrim 61eb49e437 [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 263159
2016-03-10 20:40:26 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames ac115ed72f [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path.

This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652

llvm-svn: 263074
2016-03-09 23:13:12 +00:00
Tom Stellard 9f2e00de7b SelectionDAG: Fix a crash on inline asm when output register supports multiple types
Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different, but had the same size.

Reviewers: arsenm, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17940

llvm-svn: 263022
2016-03-09 16:02:52 +00:00
Chad Rosier c27a18f39f [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

llvm-svn: 263021
2016-03-09 16:00:35 +00:00
Krzysztof Parzyszek cd99e364e3 Invoke DAG postprocessing in the post-RA scheduler
This was inadvertently omitted from r262774, which added the mutation
interface.

llvm-svn: 262939
2016-03-08 16:54:20 +00:00
Hans Wennborg e00b6e7249 Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG"
This caused PR26870.

llvm-svn: 262935
2016-03-08 16:21:41 +00:00
Krzysztof Parzyszek 1a1d78b86f Add DAG mutation interface to the DFA packetizer
llvm-svn: 262930
2016-03-08 15:33:51 +00:00
Justin Bogner 671febc0f7 Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler"
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.

Original Message:

  Currently some SDNode operands are malloc'd, some are stored inline in
  subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
  This scheme is complex, inconsistent, and makes refactoring SDNodes
  fairly difficult.

  Instead, we can allocate all of the operands using an ArrayRecycler
  that wraps a BumpPtrAllocator. This keeps the cache locality when
  iterating operands, improves locality when iterating SDNodes without
  looking at operands, and vastly simplifies the ownership semantics.

  It also means we stop overallocating SDNodes by 2-3x and will make it
  simpler to fix the rampant undefined behaviour we have in how we
  mutate SDNodes from one kind to another (See llvm.org/pr26808).

  This is NFC other than the changes in memory behaviour, and I ran some
  LNT tests to make sure this didn't hurt compile time. Not many tests
  changed: there were a couple of 1-2% regressions reported, but there
  were more improvements (of up to 4%) than regressions.

llvm-svn: 262902
2016-03-08 03:14:29 +00:00
Quentin Colombet 5e63e78ca9 [MIR] Change the token name for '<' and '>' to be consitent with the LLVM IR parser.
Thanks to Ahmed Bougacha for noticing!

llvm-svn: 262899
2016-03-08 02:00:43 +00:00
Quentin Colombet 39293d3aaa [GlobalISel] Introduce initializer method to support start/stop-after features.
llvm-svn: 262896
2016-03-08 01:38:55 +00:00
Quentin Colombet 050b211820 [MIR] Teach the parser/printer that generic virtual registers do not need a register class.
llvm-svn: 262893
2016-03-08 01:17:03 +00:00
Justin Bogner 7e6f09c28f Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler"
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.

This reverts r262886.

llvm-svn: 262892
2016-03-08 01:07:03 +00:00
Quentin Colombet 287c6bb571 [MIR] Teach the parser how to parse complex types of generic machine instructions.
By complex types, I mean aggregate or vector types.

llvm-svn: 262890
2016-03-08 00:57:31 +00:00
Justin Bogner 6543a9385f SelectionDAG: Store SDNode operands in an ArrayRecycler
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.

Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.

It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).

This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.

llvm-svn: 262886
2016-03-08 00:39:51 +00:00
Quentin Colombet d655483944 [MIR] Teach the printer how to print complex types for generic machine instructions.
Before this change, we would get the type definition in the middle
of the instruction.
E.g., %0(48) = G_ADD %struct_alias = type { i32, i16 } %edi, %edi

Now, we have just the expected type name:
%0(48) = G_ADD %struct_alias %edi, %edi

llvm-svn: 262885
2016-03-08 00:38:01 +00:00
Quentin Colombet 12350a8e13 [MIR] Print the type of generic machine instructions.
llvm-svn: 262880
2016-03-08 00:29:15 +00:00
Quentin Colombet 851996778f [MIR] Teach the mir parser about types on generic machine instructions.
llvm-svn: 262879
2016-03-08 00:20:48 +00:00
Quentin Colombet 41bea872dd [MachineInstr] Get rid of some GlobalISel ifdefs.
Now the type API is always available, but when global-isel is not
built the implementation does nothing.

Note: The implementation free of ifdefs is WIP and tracked here in PR26576.
llvm-svn: 262873
2016-03-07 22:47:23 +00:00
Quentin Colombet 4e14a497a3 [MIR] Teach the MIPrinter about size for generic virtual registers.
llvm-svn: 262867
2016-03-07 21:57:52 +00:00
Quentin Colombet 2a831fb826 [MIR] Teach the parser how to handle the size of generic virtual registers.
llvm-svn: 262862
2016-03-07 21:48:43 +00:00
Quentin Colombet 1bd7504ef3 [MachineRegisterInfo] Add a method to set the size of a virtual register a posteriori.
This is required for mir testing.

llvm-svn: 262861
2016-03-07 21:41:39 +00:00
Quentin Colombet 70a9670d80 [MachineRegisterInfo] Get rid of the global-isel ifdefs.
One additional pointer is not a big deal size-wise and it makes
the code much nicer!

llvm-svn: 262856
2016-03-07 21:22:09 +00:00
Matt Arsenault ceb2c06cbd DAGCombiner: Check legality before creating extract_vector_elt
Problem not hit by any in tree target.

llvm-svn: 262852
2016-03-07 21:10:09 +00:00
Craig Topper 267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Krzysztof Parzyszek 5c61d11a6d Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Matthias Braun 4797ec95e4 RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun 8de09aa0c5 RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun 2cbfd9fff5 RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Michael Kuperstein b89f0fa2a2 [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Teresa Johnson d84c7decb6 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Benjamin Kramer 4dbf3371bb Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Simon Pilgrim 91dd0a796c [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 262599
2016-03-03 09:43:28 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Junmo Park 6ba96fb431 [BranchFolding] Change function name related with merging MMOs. NFC
Summary:
Removing MMOs is not our prefer behavior any more.

Reviewers: mcrosier, reames
   
Differential Revision: http://reviews.llvm.org/D17668

llvm-svn: 262580
2016-03-03 03:57:20 +00:00
Philip Reames ae27b2380f [MBP] Renaming a confusing variable and add clarifying comments
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
2016-03-03 00:58:43 +00:00
Philip Reames 23d933982a [MBP] Avoid placing random blocks between loop preheader and header
If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!.

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547
2016-03-03 00:01:42 +00:00
David Majnemer 1ef654024f [X86] Don't give catch objects a displacement of zero
Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up wit a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546
2016-03-03 00:01:25 +00:00
Philip Reames 02e1132afb [MBP] Remove overly verbose debug output
llvm-svn: 262531
2016-03-02 22:40:51 +00:00
Philip Reames b9688f4382 [MBP] Adjust debug output to be more focused and approachable
llvm-svn: 262522
2016-03-02 21:45:13 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Justin Bogner b2ecee9c31 SelectionDAG: Use correctly sized allocation functions for SDNodes
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.

Part of llvm.org/PR26808.

llvm-svn: 262500
2016-03-02 19:01:11 +00:00
Matt Arsenault 7d0a77b979 DAGCombiner: Make sure an integer is being truncated
llvm-svn: 262446
2016-03-02 01:36:51 +00:00
Matt Arsenault b36d462fac DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

llvm-svn: 262397
2016-03-01 21:31:53 +00:00
Vasileios Kalintiris 36901dd1c3 Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

llvm-svn: 262387
2016-03-01 20:25:43 +00:00
Justin Lebar b5ca00a58d [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Matt Arsenault 03dac8d8e4 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

llvm-svn: 262358
2016-03-01 18:01:37 +00:00
Rafael Espindola 5cd721ae12 Refactor duplicated code for linking with pthread.
llvm-svn: 262344
2016-03-01 15:54:40 +00:00
Vasileios Kalintiris 3a8f7f9e31 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

llvm-svn: 262316
2016-03-01 10:08:01 +00:00
Matt Arsenault a67c4916cf LegalizeDAG: Use correct ptr type when expanding unaligned load/store
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.

llvm-svn: 262299
2016-03-01 05:13:35 +00:00
David Majnemer cb305dea1c [WinEH] Allocate the registration node before the catch objects
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets.  A special sentinel value, 0, is used to
indicate that no catch object should be initialized.

This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.

To handle this, allocate the registration node prior to any other frame
object.  This will ensure that catch objects will not be allocated
before the registration node.

This fixes PR26757.

Differential Revision: http://reviews.llvm.org/D17689

llvm-svn: 262294
2016-03-01 04:30:16 +00:00
Adrian Prantl dba58fbdd9 Improve the debug output of DwarfDebug::buildLocationList().
llvm-svn: 262265
2016-02-29 22:28:22 +00:00
Adrian Prantl fb2add2be1 Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations.
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.

This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).

<rdar://problem/24611008>

llvm-svn: 262247
2016-02-29 19:49:46 +00:00
Adrian Prantl 693e8de0fa fix typo in comment
llvm-svn: 262236
2016-02-29 17:06:46 +00:00
Duncan P. N. Exon Smith ebcce78f65 CodeGen: Remove an iterator => pointer conversion, NFC
Part of PR26753.

llvm-svn: 262154
2016-02-27 20:27:44 +00:00
Duncan P. N. Exon Smith d6ebd07b8d CodeGen: Use MachineInstr& in InlineSpiller::rematerializeFor()
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead.  This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.

llvm-svn: 262152
2016-02-27 20:23:14 +00:00
Duncan P. N. Exon Smith be8f8c4478 CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFC
These parameters aren't expected to be null, so take them by reference.

llvm-svn: 262151
2016-02-27 20:14:29 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Matt Arsenault 982224cfb8 DAGCombiner: Don't unnecessarily swap operands in ReassociateOps
In the case where op = add, y = base_ptr, and x = offset, this
transform:

(op y, (op x, c1)) -> (op (op x, y), c1)

breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.

This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.

llvm-svn: 262148
2016-02-27 19:57:45 +00:00
Duncan P. N. Exon Smith d3a7467221 CodeGen: Use MachineInstr& in HashMachineInstr, NFC
Also update HashEndOfMBB to take MachineBasicBlock&.

llvm-svn: 262146
2016-02-27 19:48:01 +00:00
Duncan P. N. Exon Smith 5e6e8c7a0a CodeGen: Use MachineInstr& in AntiDepBreaker API, NFC
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null.  No
functionality change intended.  Looking toward PR26753.

llvm-svn: 262145
2016-02-27 19:33:37 +00:00
Duncan P. N. Exon Smith bd529fbb4a CodeGen: Assert valid MI in AntiDepBreaker::UpdateDbgValue
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null.  At an explicit assert.

llvm-svn: 262144
2016-02-27 19:23:34 +00:00
Duncan P. N. Exon Smith 5702287809 CodeGen: Update DFAPacketizer API to take MachineInstr&, NFC
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*.  In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator.  Besides cleaning up the API, this is in
search of PR26753.

llvm-svn: 262142
2016-02-27 19:09:00 +00:00
Duncan P. N. Exon Smith f9ab416d70 WIP: CodeGen: Use MachineInstr& in MachineInstrBundle.h, NFC
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null.  Besides
being a nice cleanup, this is tacking toward a fix for PR26753.

llvm-svn: 262141
2016-02-27 17:05:33 +00:00
Matt Arsenault 360d244d5b DAGCombiner: Relax sqrt NaN folding check
This is OK for +0 since compares to +/-0 give the same result.

llvm-svn: 262125
2016-02-27 09:38:05 +00:00
Duncan P. N. Exon Smith 3ac9cc6156 CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals.  The MachineInstrs here are
never null, so this cleans up the API a bit.  It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).

At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.

llvm-svn: 262115
2016-02-27 06:40:41 +00:00
Junmo Park 272a2bc365 Minor code cleanup. NFC.
llvm-svn: 262096
2016-02-27 01:10:43 +00:00
Cong Hou e0eb8bfe37 Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64.
llvm-svn: 262091
2016-02-26 23:25:30 +00:00
Amaury Sechet b2055c53ba Fix warning in DwarfCFIException. NFC
llvm-svn: 262061
2016-02-26 20:49:07 +00:00
Amaury Sechet 7067ad3c27 Extract the method to begin and end a fragment in AsmPrinterHandler in their own method. NFC
Summary: This is extracted from D17555

Reviewers: davidxl, reames, sanjoy, MatzeB, pete

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17580

llvm-svn: 262058
2016-02-26 20:30:37 +00:00
Quentin Colombet 87e23e5733 [GlobalISel] Fix a ranlib warning about empty TOC.
Fixes PR26733

llvm-svn: 262057
2016-02-26 20:05:02 +00:00
Reid Kleckner 70c9bc71d4 [WinEH] Fix funclet return block clobber mask placement
MBB slot index intervals are half open, not closed. getMBBEndIndex()
returns the slot index of the start of the next block in layout order.
Placing a register mask there is incorrect if the successor of the
funclet return is not laid out after the return. Clang generates IR for
catch bodies before generating the following normal code, so we never
noticed this issue until the D frontend authors filed a bug about it.

Instead, we can put the clobber mask on the last instruction of the
funclet return block. We still aren't using a register mask operand on
the CATCHRET instruction because it would cause PEI to spill all CSRs,
including XMM regs, in the prologue.

Fixes PR26679.

llvm-svn: 262035
2016-02-26 16:53:19 +00:00
Matthias Braun 9dcd65f478 MachineCopyPropagation: Catch copies of the form A<-B;A<-B
Differential Revision: http://reviews.llvm.org/D17475

llvm-svn: 261966
2016-02-26 03:18:55 +00:00
Matthias Braun e39ff70685 MachineCopyPropagation: Keep scanning through instructions with regmasks
This also simplifies the code by removing the overly conservative
NoInterveningSideEffect() function. This function checked:
- That the two copies belong to the same block: We only process one
  block at a time and clear our maps in between it is impossible to find a
  copy from a different block.
- There is no terminator between the two copy instructions: This is not
  allowed anyway (the MachineVerifier would complain)
- Does not have instructions with hasUnmodeledSideEffects() or isCall()
  set: Even for those instructuction we must have all clobbers/defs of
  registers explicit as an operand. If the register is explicitely
  clobbered we would never come to the point of checking for
  NoInterveningSideEffect() anyway.

(I also checked this with a temporary build of the test-suite with all
 potentially failing conditions in NoInterveningSideEffect() turned into
 asserts)

Differential Revision: http://reviews.llvm.org/D17474

llvm-svn: 261965
2016-02-26 03:18:50 +00:00
Junmo Park 820e392601 Minor code cleanups. NFC.
llvm-svn: 261955
2016-02-26 02:07:36 +00:00
David Majnemer 08dd52dc75 [WinEH] Don't remove unannotated inline-asm calls
Inline-asm calls aren't annotated with funclet bundle operands because
they don't throw and cannot be inlined through.  We shouldn't require
them to bear an funclet bundle operand.

llvm-svn: 261942
2016-02-26 00:04:25 +00:00
Hongbin Zheng 751337faa7 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261903
2016-02-25 17:54:15 +00:00
Hongbin Zheng 3f97840721 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261902
2016-02-25 17:54:07 +00:00
Hongbin Zheng 66b19fbc4e Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC"
This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018.

llvm-svn: 261891
2016-02-25 16:45:53 +00:00
Hongbin Zheng ad782ce3f7 Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC"
This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2.

llvm-svn: 261890
2016-02-25 16:45:46 +00:00
Hongbin Zheng 237197ba63 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261883
2016-02-25 16:33:15 +00:00
Hongbin Zheng a0273a04f5 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261882
2016-02-25 16:33:06 +00:00
Junmo Park 161dc1c605 [CodeGenPrepare] Remove load-based heuristic
Summary:
Both the hardware and LLVM have changed since 2012.
Now, load-based heuristic don't show big differences any more on OoO cores.

There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5).

Reviewers: spatel, zansari
    
Differential Revision: http://reviews.llvm.org/D16836

llvm-svn: 261809
2016-02-25 00:23:27 +00:00
Cong Hou 4ce0280a41 Detecte vector reduction operations just before instruction selection.
(This is the second attemp to commit this patch, after fixing pr26652 & pr26653).

This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261804
2016-02-24 23:40:36 +00:00
Matthias Braun aca625a4fe MachineInstr: Respect register aliases in clearRegiserKills()
This fixes bugs in copy elimination code in llvm. It slightly changes the
semantics of clearRegisterKills(). This is appropriate because:
- Users in lib/CodeGen/MachineCopyPropagation.cpp and
  lib/Target/AArch64RedundantCopyElimination.cpp and
  lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it
  (see included testcase).
- All other users in llvm are unaffected (they pass TRI==nullptr)
- (Kill flags are optional anyway so removing too many shouldn't hurt.)

Differential Revision: http://reviews.llvm.org/D17554

llvm-svn: 261763
2016-02-24 19:21:48 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Hans Wennborg d3661cd140 Revert r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
This and the corresponding Clang change caused PR26715.

llvm-svn: 261671
2016-02-23 19:17:03 +00:00
Amjad Aboud fc8f296782 Supporting all entities declared in lexical scope in LLVM debug info.
Differential Revision: http://reviews.llvm.org/D15976

llvm-svn: 261633
2016-02-23 13:36:51 +00:00
David Majnemer 17525aba8a [WinEH] Visit 'unwind to caller' catchswitches nested in catchswitches
We had the right logic for the nested cleanuppad case but omitted it for
catchswitches.

llvm-svn: 261615
2016-02-23 07:18:15 +00:00
Dehao Chen f84b630044 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17532

llvm-svn: 261607
2016-02-23 03:39:24 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith b3613fce19 Revert "Add prefix based function layout when profile is available."
This reverts commit r261582, since this bot has been broken for four
hours:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/

llvm-svn: 261604
2016-02-23 02:28:40 +00:00
Dehao Chen 25527b6c3f Include ProfileData as CodeGen's required library.
Summary: Fixing buildbot failure introduced by http://reviews.llvm.org/D17460

Reviewers: davidxl, hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17524

llvm-svn: 261588
2016-02-22 22:54:14 +00:00
David Majnemer 964b70d559 [X86] Create mergeable constant pool entries for AVX
We supported creating mergeable constant pool entries for smaller
constants but not for 32-byte AVX constants.

llvm-svn: 261584
2016-02-22 22:23:11 +00:00
Dehao Chen c5f76f7347 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: eraman, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17460

llvm-svn: 261582
2016-02-22 22:14:14 +00:00
Matt Arsenault 0c6bd7b0d3 SelectionDAG: Use correct addrspace when lowering memcpy
This was causing assertions later from using the wrong pointer
size with LDS operations. getOptimalMemOpType should also have
address space arguments later.

This avoids assertions in existing tests exposed by
a future commit.

llvm-svn: 261580
2016-02-22 22:01:42 +00:00
Tim Northover d32f8e60bf ARM: sink atomic release barrier as far as possible into cmpxchg.
DMB instructions can be expensive, so it's best to avoid them if possible. In
atomicrmw operations there will always be an attempted store so a release
barrier is always needed, but in the cmpxchg case we can delay the DMB until we
know we'll definitely try to perform a store (and so need release semantics).

In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX
instructions to skip the barrier on subsequent iterations. The basic outline
becomes:

        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        bne Ldone
        dmb
    Lloop:
        strex rRes, rNew, [rAddr]
        cbz rRes Ldone
        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        beq Lloop
    Ldone:

So we'll skip this version for strong operations in "minsize" functions.

llvm-svn: 261568
2016-02-22 20:55:50 +00:00
Duncan P. N. Exon Smith c5b668deb8 Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Justin Lebar 46123a8891 Revert "[ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv."
This reverts r261543.  Accidental commit (not LGTM'ed).

llvm-svn: 261547
2016-02-22 18:17:27 +00:00
Justin Lebar f62b165a04 [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv.
Summary:
Also add a comment briefly explaining what ifcnv is.

No functional changes.

Reviewers: resistor

Subscribers: echristo, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17430

llvm-svn: 261543
2016-02-22 17:51:30 +00:00
Justin Lebar 3a7bc57e63 [ifcnv] Use unique_ptr in IfConversion. NFC
Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17466

llvm-svn: 261541
2016-02-22 17:51:28 +00:00
Justin Lebar 5b82c9ba31 Don't tail-duplicate blocks that contain convergent instructions.
Summary:
Convergent instrs shouldn't be made control-dependent on other values,
but this is basically the whole point of tail duplication.  So just bail
if we see a convergent instruction.

Reviewers: iteratee

Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits

Differential Revision: http://reviews.llvm.org/D17320

llvm-svn: 261540
2016-02-22 17:50:52 +00:00
Duncan P. N. Exon Smith e59c8af705 Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261510, effectively reapplying r261509.  The
original commit missed a caller in AArch64ConditionalCompares.

Original commit message:

Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261511
2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith 0cc90a9147 Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261509.  I'm not sure how this compiled locally,
but something was out of whack.

llvm-svn: 261510
2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith 83d3476fd2 CodeGen: Use references in MachineTraceMetrics::Trace, NFC
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261509
2016-02-22 03:07:49 +00:00
Duncan P. N. Exon Smith 395bd9cd63 CodeGen: Explicitly convert from iterator to pointer, NFC
llvm-svn: 261508
2016-02-22 02:53:42 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith a848c47130 ADT: Stop using getNodePtrUnchecked on end() iterators
Stop using `getNodePtrUnchecked()` when building IR.  Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.

This should have no functionality change for now, but is a path towards
removing some more UB from ilist.

llvm-svn: 261495
2016-02-21 19:52:15 +00:00
Duncan P. N. Exon Smith 7b269642d2 CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway.  This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over.  But having this pointer available in
ilist_iterator depends on UB in the first place.

Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.

There should be no functionality change here.

llvm-svn: 261493
2016-02-21 19:37:45 +00:00
David Majnemer a3ea407d48 [X86] Use the correct alignment for COMDAT constant pool entries
COFF doesn't have sections with mergeable contents.  Instead, each
constant pool entry ends up in a COMDAT section.  The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections.  You just get whatever alignment was on the section.

If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.

Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.

This fixes PR26680.

llvm-svn: 261462
2016-02-21 01:30:30 +00:00
Dan Gohman d1c5a3aa21 Don't scan for SSA register operands to update when not in SSA form.
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.

llvm-svn: 261450
2016-02-20 21:28:18 +00:00
Simon Pilgrim c5199aae82 [DAGCombiner] Use getBitcast helper when possible. NFCI.
llvm-svn: 261437
2016-02-20 15:05:29 +00:00
Matthias Braun c65e904be8 MachineCopyPropagation: Introduce Reg2MIMap typedef; NFC
llvm-svn: 261408
2016-02-20 03:56:41 +00:00
Matthias Braun bd18d751de MachineCopyPropagation: Move variables from function to pass
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.

llvm-svn: 261407
2016-02-20 03:56:39 +00:00
Matthias Braun 273575dcbe MachineCopyPropagation: Use ranged for, cleanup; NFC
llvm-svn: 261406
2016-02-20 03:56:36 +00:00
Matthias Braun 57b5f11aa7 MachineCopyPropagation: Use assert() instead of if{report_error()} for 'impossible' condition
llvm-svn: 261405
2016-02-20 03:56:33 +00:00
Quentin Colombet e611698e84 [RegAllocFast] Properly track the physical register definitions on calls.
PR26485

llvm-svn: 261384
2016-02-20 00:32:29 +00:00
Sanjoy Das ffb7bd11f7 [StatepointLowering] Minor non-semantic cleanups
Use auto, bring file up to coding standards etc.

llvm-svn: 261358
2016-02-19 19:37:07 +00:00
Sanjoy Das f6fee29ceb [StatepointLowering] Update StatepointMaxSlotsRequired correctly
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct.  Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().

llvm-svn: 261348
2016-02-19 18:15:56 +00:00
Sanjoy Das e8019df552 [StatepointLowering] Fix a mistake in rL261336
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots.  Weirdly, the tests
I added in rL261336 didn't catch this.

llvm-svn: 261347
2016-02-19 18:15:53 +00:00
Sanjoy Das 171313c69a [StatepointLowering] Change AllocatedStackSlots to use SmallBitVector
NFCI.  They key motivation here is that I'd like to use
SmallBitVector::all() in a later change.  Also, using a bit vector here
seemed better in general.

The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots.  As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.

Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector like thing.

llvm-svn: 261337
2016-02-19 17:15:26 +00:00
Sanjoy Das d2db73ba59 [StatepointLowering] Fix bug in allocateStackSlot
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot.  This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.

While this change fixes the bug pointed out, it has two performance
caveats:

 - It matches spill slot and spillee size exactly, while in theory we
   can spill, e.g., an 8 byte pointer into a 16 byte slot.  This is
   slightly complicated to fix since in the stackmaps section, we report
   the size of the spill slot as the size of the "indirect value"; and
   if they're no longer equivalent, we'll have to keep track of the
   (indirect) value size separately from the stack slot size.

 - It will "spuriously run out" of reusable slots, since we now have an
   second check in the search loop in addition to the availablity
   check (e.g. you had two free scalar slots, and you first ask for a
   vector slot followed by a scalar slot).  I'll fix this in a later
   commit.

llvm-svn: 261336
2016-02-19 17:15:22 +00:00
Sanjoy Das 7b2e91fb59 [StatepointLowering] Clean up allocateStackSlot
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward.  I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.

llvm-svn: 261335
2016-02-19 17:15:17 +00:00
David Majnemer 693f13156e Shuffle header file as per the Coding Standards
llvm-svn: 261308
2016-02-19 04:46:48 +00:00
David Majnemer b61fd7fc6d [SjLjEHPrepare] Simplify/cleanup code
No functional change is intended.

llvm-svn: 261307
2016-02-19 04:46:06 +00:00
Matthias Braun 848e79c578 LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs
llvm-svn: 261306
2016-02-19 04:44:19 +00:00
David Majnemer bd1b8c0889 [SjLjEHPrepare] Don't grab pointers to functions in doInitialization
Certain optimization passes (like globaldce) can prune function
declaration that SjLjEHPrepare assumed would exit when it'd
runOnFunction.

This fixes PR26669.

llvm-svn: 261303
2016-02-19 03:13:40 +00:00
Justin Lebar c75d566f56 When printing MIR, output to errs() rather than outs().
Summary:
Without this, this command

  $ llvm-run llc -stop-after machine-cp -o - <( echo '' )

outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().

Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.

Reviewers: jroelofs, tstellarAMD

Subscribers: jholewinski, qcolombet, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17422

llvm-svn: 261286
2016-02-19 00:18:46 +00:00
Philip Reames 1960cfd323 [IR] Extend cmpxchg to allow pointer type operands
Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it make capture tracking overly conservative)
2) It pushes work onto the frontend authors for no real gain

This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes.

Differential Revision: http://reviews.llvm.org/D17413

llvm-svn: 261281
2016-02-19 00:06:41 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Philip Reames 367fdd990c Restrict scope of variables [NFC]
llvm-svn: 261250
2016-02-18 19:45:31 +00:00
Benjamin Kramer 3a16e2a26a Make header self-contained. NFC.
llvm-svn: 261234
2016-02-18 18:02:48 +00:00
Xinliang David Li 1153f194bd Stop creating covmap as note section on ELF
covmap needs to created as non allocatable, but not with
SHT_NOTE. The latter was needed to workaround a problem
of BFD linker with gc, which is no longer needed. (A more
proper longer term fix requires changing FE driver to force
referencing the section using linker script).

Differential Revision: http://reviews.llvm.org/D17309

llvm-svn: 261228
2016-02-18 17:20:22 +00:00
Matthias Braun ac697c5d8e Revert "LiveIntervalAnalysis: Remove LiveVariables requirement" and LiveIntervalTest
The commit breaks stage2 compilation on PowerPC. Reverting for now while
this is analyzed. I also have to revert the LiveIntervalTest for now as
that depends on this commit.

Revert "LiveIntervalAnalysis: Remove LiveVariables requirement"
This reverts commit r260806.
Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning."
This reverts commit r260931.
Revert "Fix typo in LiveIntervalTest"
This reverts commit r260907.
Revert "Add unittest for LiveIntervalAnalysis::handleMove()"
This reverts commit r260905.

llvm-svn: 261189
2016-02-18 05:21:43 +00:00
Adrian Prantl 3b89e6634c DwarfDebug: Don't drop the DIExpression just because a variable is
described by an immediate.

Found via http://reviews.llvm.org/D16867
Thanks to Paul Robinson for pointing this out.

<rdar://problem/24456528>

llvm-svn: 261168
2016-02-17 22:20:08 +00:00
Adrian Prantl 6f4746b11a DbgVariable: Add an accessor for the common case of a single expression
belonging to a single DBG_VALUE instruction.

NFC

llvm-svn: 261167
2016-02-17 22:19:59 +00:00
Nico Weber e6154ffbe0 Revert r261070, it caused PR26652 / PR26653.
llvm-svn: 261127
2016-02-17 18:47:29 +00:00
Cong Hou bbd4e3b400 Detecte vector reduction operations just before instruction selection.
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261070
2016-02-17 06:37:04 +00:00
Reid Kleckner 9a593ee7d2 [codeview] Bail on a DBG_VALUE register operand with no register
This apparently comes up when the register allocator decides that a
variable will become undef along a certain path.

Also improve the error message we emit when we can't map from LLVM
register number to CV register number.

llvm-svn: 261016
2016-02-16 21:49:26 +00:00
Reid Kleckner 6e0d5f573c [codeview] Fix assertion on non-memory, non-register DBG_VALUE instructions
Eventually we should find a way to describe constant variables, but it
is not obvious how to do this at the moment.

llvm-svn: 261010
2016-02-16 21:14:51 +00:00
Quentin Colombet ba2a01645b [GlobalISel] Re-apply r260922-260923 with MSVC-friendly code.
Original message:
Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260998
2016-02-16 19:26:02 +00:00
Aaron Ballman c6a2f2140b A signed bitfield's range is [-1,0], so assigning 1 is technically an overflow. However, the other bitfield requires a signed value (it supports negative offsets), so it is slightly better to retain a signed 1-bit bitfield and use -1 instead of 1. Silences an MSVC warning.
llvm-svn: 260973
2016-02-16 15:35:51 +00:00
Aaron Ballman fc64ef1a15 Reverting r260922-260923; they cause link failures with MSVC.
http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio
http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio

llvm-svn: 260972
2016-02-16 15:29:06 +00:00
Quentin Colombet 1ce38545fb [GlobalISel] Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260922
2016-02-16 00:57:44 +00:00
Zia Ansari 30a02384f7 Implemented stack symbol table ordering/packing optimization to improve data locality and code size from SP/FP offset encoding.
Differential Revision: http://reviews.llvm.org/D15393

llvm-svn: 260917
2016-02-15 23:44:13 +00:00
Matthias Braun 4a6c728cc0 LiveIntervalAnalysis: Support moving of subregister defs in handleMove
This is an updated version which fixes a bug that happened with
uses tied to an earlyclobber operand which end at an unusual slotindex.

If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260906
2016-02-15 19:25:36 +00:00
Matthias Braun b3aefc3a69 MachineVerifier: Add parameter to choose if MachineFunction::verify() aborts
The abort on error behaviour is unpractical for debugger and unittest
usage.

llvm-svn: 260904
2016-02-15 19:25:31 +00:00
Ahmed Bougacha 93cff7fb82 [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

llvm-svn: 260901
2016-02-15 18:07:29 +00:00
Jonas Paulsson 98963fec41 [ScheduleDAGInstrs] isUnsafeMemoryObject() removed
This function was basically useless, since volatile memacesses or MIs with
unmodelled sideffects become global memory objects, and the other little
checks are also done elsewhere.

Reviewed by Andy Trick
http://reviews.llvm.org/D16881

llvm-svn: 260899
2016-02-15 16:43:15 +00:00
Benjamin Kramer 7f75e9403d [AggressiveAntiDepBreaker] Skip some unnecessary BitVector copies.
llvm-svn: 260825
2016-02-13 16:39:39 +00:00
Matthias Braun bbb528f189 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

llvm-svn: 260806
2016-02-13 04:35:31 +00:00
Pirama Arumuga Nainar 7476bc89e9 Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

llvm-svn: 260769
2016-02-13 00:08:05 +00:00
Reid Kleckner 876330d53a [codeview] Describe local variables in registers
llvm-svn: 260746
2016-02-12 21:48:30 +00:00
Andrew Kaylor d1188ddd33 [WinEH] Prevent EH state numbering from skipping nested cleanup pads that never return
Differential Revision: http://reviews.llvm.org/D17208

llvm-svn: 260733
2016-02-12 21:10:16 +00:00
Quentin Colombet 232f447782 Get rid of some GLOBAL_ISEL ifdefs that should be harmless for code size.
More to come, but those were easy.

llvm-svn: 260723
2016-02-12 20:41:24 +00:00
Mehdi Amini 40b369cf5a GlobalISel is always built since r260566, reflect it in LLVMBuild.txt
Other component could not depends on an optional library in llvm-config

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260701
2016-02-12 18:43:14 +00:00
Quentin Colombet ccd7725808 [IRTranslator] Use a single virtual register to represent any Value.
PR26161.

llvm-svn: 260602
2016-02-11 21:48:32 +00:00
Quentin Colombet 8fd6718700 [Target] Add a helper function to check if an opcode is invalid after isel.
llvm-svn: 260590
2016-02-11 21:16:56 +00:00
Matthias Braun c67f5a6ab1 Revert "LiveIntervalAnalysis: Support moving of subregister defs in handleMove"
This is broke a bot:

http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/4703/steps/test-suite/logs/test.log

Reverting while I investigate.

This reverts commit r260565.

llvm-svn: 260586
2016-02-11 21:07:44 +00:00
Sanjay Patel e5df1dfb14 [SelectionDAG] change getConstant() to use the input SDLoc when building splat vectors
The code change is simple enough: instead of attaching an anonymous SDLoc to splatted
vector constants, use the scalar constant's existing SDLoc since that is what is passed 
into getConstant() as a param. But this changes instruction scheduling, so I'll explain
why that happens.

The motivation for this patch starts near:
http://reviews.llvm.org/rL258833
...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'.
But when I made that change locally, several x86 codegen tests wiggled.

It turns out that the lack of SDLoc consistency in getConstant() changes the way 
ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG
scheduler algorithms use IROrder for tie-breaking.

Differential Revision: http://reviews.llvm.org/D16972

llvm-svn: 260582
2016-02-11 20:21:24 +00:00
Quentin Colombet fd9d0a07d8 [GlobalISel] Add the necessary plumbing to lower formal arguments.
llvm-svn: 260579
2016-02-11 19:59:41 +00:00
Peter Collingbourne 7c384ccea2 DwarfDebug: emit type units immediately.
Rather than storing type units in a vector and emitting them at the end
of code generation, emit them immediately and destroy them, reclaiming the
memory we were using for their DIEs.

In one benchmark carried out against Chromium's 50 largest (by bitcode
file size) translation units, total peak memory consumption with type units
decreased by median 17%, or by 7% when compared against disabling type units.

Tested using check-{llvm,clang}, the GDB 7.5 test suite (with
'-fdebug-types-section') and by eyeballing llvm-dwarfdump output on those
Chromium translation units with split DWARF both disabled and enabled, and
verifying that the only changes were to addresses and abbreviation ordering.

Differential Revision: http://reviews.llvm.org/D17118

llvm-svn: 260578
2016-02-11 19:57:46 +00:00
Reid Kleckner 829365aeef [codeview] Fix bug around multi-level wrapper inlining
If there were wrapper functions with no instructions of their own in the
inlining tree, we would fail to emit InlineSite records for them.

llvm-svn: 260571
2016-02-11 19:41:47 +00:00
Quentin Colombet 2e00253750 Play nice with Visual Studio and attributes
llvm-svn: 260568
2016-02-11 19:33:21 +00:00
Quentin Colombet bde158cbc7 [CMake] Produce an empty library for GlobalISel when not building it.
The rational for this change is that LLVMBuild cannot express conditional 
dependencies. Therefore, when we start optionally using GlobalISel library for 
say AArch64, without that change, all the tools that use the AArch64 library 
would need to explicitly link with GlobalISel when we ask for it.

This does not scale.

Instead, we will set the dependencies between the target and GlobalISel and if 
we did not ask to build GlobalISel, the library will just be empty.

Thanks to Chris Bieneman and Mehdi Animi for the idea.

llvm-svn: 260566
2016-02-11 19:18:27 +00:00
Matthias Braun 33c641bddf LiveIntervalAnalysis: Support moving of subregister defs in handleMove
If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260565
2016-02-11 19:03:53 +00:00
Quentin Colombet 74d7d2f00b [GlobalISel] Teach the IRTranslator how to lower returns.
llvm-svn: 260562
2016-02-11 18:53:28 +00:00
Quentin Colombet 9855111b77 [GlobalISel] Add a type to MachineInstr.
We actually need that information only for generic instructions, therefore it
would be nice not to have to pay the extra memory consumption for all
instructions. Especially because a typed non-generic instruction does not make
sense.

The question is then, is it possible to have that information in a union or
something?
My initial thought was that we could have a derived class GenericMachineInstr
with additional information, but in practice it makes little to no sense since
generic MachineInstrs are likely turned into non-generic ones by just switching
the opcode. In other words, we don't want to go through the process of creating
a new, non-generic MachineInstr, object each time we do this switch. The memory
benefit probably is not worth the extra compile time.

Another option would be to keep the type of the MachineInstr in a side table.
This would induce an extra indirection though.

Anyway, I will file a PR to discuss about it and remember we need to come back
to it at some point.

llvm-svn: 260558
2016-02-11 18:22:37 +00:00
Quentin Colombet 37a09a8428 [GlobalISel] Add a hook in TargetConfigPass to run GlobalISel.
llvm-svn: 260553
2016-02-11 17:57:22 +00:00
Quentin Colombet a7fae162e6 [GlobalISel][IRTranslator] Change the ownership of the MIRBuilder field.
llvm-svn: 260551
2016-02-11 17:53:23 +00:00
Quentin Colombet 4f0ec8d2b0 [GlobalISel][IRTranslator] Fix a typo in assert.
llvm-svn: 260550
2016-02-11 17:52:28 +00:00
Quentin Colombet 17c494b91c [GlobalISel][IRTranslator] Teach the pass how to translate Add instructions.
llvm-svn: 260549
2016-02-11 17:51:31 +00:00
Quentin Colombet 2ad1f851a1 [GlobalISel] Add a MachineIRBuilder class.
Helper class to build machine instrs. This is a higher abstraction
than MachineInstrBuilder.

llvm-svn: 260547
2016-02-11 17:44:59 +00:00
Benjamin Kramer e3b963d5ee Drop the hidden visibility from DebugHandlerBase for now.
If a class has hidden visibility all derived classes and all classes
that have it as a member must have hidden visibility too. That may
be fixable here but requires changes to quite a lot of debug info
classes.

This is also one of the things that GCC enforces aggressively while
clang ignores it, making testing more annoying than necessary.

llvm-svn: 260529
2016-02-11 15:41:56 +00:00
Quentin Colombet e1494c353a [GlobalISel][MachineRegisterInfo] Add a method to create generic vregs.
For now, generic virtual registers will not have a register class. We may want
to change that. For instance, if we want to use all the methods from
TargetRegisterInfo with generic virtual registers, we need to either have some
sort of generic register classes that do what we want, or teach those methods
how to deal with nullptr register class.

Although the latter seems easy enough to do, we may still want to differenciate
generic register classes from nullptr to catch cases where nullptr gets
introduced by a bug of some sort.

Anyway, I will file a PR to keep track of that.

llvm-svn: 260474
2016-02-11 00:19:17 +00:00
Quentin Colombet 36ce1b0157 [GlobalISel] Remember the size of generic virtual registers
llvm-svn: 260468
2016-02-10 23:43:48 +00:00
Quentin Colombet 2ecff3bff2 [GlobalISel] More detailed skeleton for the IRTranslator.
llvm-svn: 260456
2016-02-10 22:59:27 +00:00
Reid Kleckner f9c275fe0a [codeview] Describe int local variables using .cv_def_range
Summary:
Refactor common value, scope, and label tracking logic out of DwarfDebug
into a common base class called DebugHandlerBase.

Update an old LLVM IR test case to avoid an assertion in LexicalScopes.

Reviewers: dblaikie, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16931

llvm-svn: 260432
2016-02-10 20:55:49 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Sanjay Patel 73200f72de [SelectionDAG] make getMemBasePlusOffset() accessible; NFCI
I reinvented this functionality in http://reviews.llvm.org/D16828 because it was
hidden away as a static function. The changes in x86 are not based on a complete
audit. I suspect there are other possible uses there, and there are almost certainly
more potential users in other targets.

llvm-svn: 260295
2016-02-09 21:42:04 +00:00
Andrew Kaylor 1224488e0c [regalloc][WinEH] Do not mark intervals as not spillable if they contain a regmask
Differential Revision: http://reviews.llvm.org/D16831

llvm-svn: 260164
2016-02-08 22:52:51 +00:00
Hans Wennborg 850ec6ca18 [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)
This matches GCC and MSVC's behaviour, and saves on code size.

We were already not extending i1 return values on x86_64 after r127766. This
takes that patch further by applying it to x86 target as well, and also for i8
and i16.

The ABI docs have been unclear about the required behaviour here. The new i386
psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return
vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being
updated to say the same [2].

Differential Revision: http://reviews.llvm.org/D16907

 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
 [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ

llvm-svn: 260133
2016-02-08 19:34:30 +00:00
Matt Arsenault 2bba779272 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.

llvm-svn: 260109
2016-02-08 16:28:19 +00:00
Sanjoy Das 86d7d83f2a [StatepointLower] Use None instead of Optional<int>()
llvm-svn: 259956
2016-02-05 23:40:04 +00:00
Wei Mi a62f058989 Some stackslots are allocated to vregs which have no real reference.
LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
after rematerialization. To remove a VNI for a vreg from its LiveInterval,
LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all
removed, PHI VNI are still left in the LiveInterval. Such unused vregs will
be kept in RegsToSpill[] at the end of InlineSpiller::reMaterializeAll and
spiller will allocate stackslot for them.

The fix is to get rid of unused reg by checking whether it has non-dbg
reference instead of whether it has non-empty interval.

llvm-svn: 259895
2016-02-05 18:14:24 +00:00
Matt Arsenault 5923973fe2 Fix printing of f16 machine operands
Only single and double FP immediates are correctly printed by
MachineInstr::print() during debug output. Half float type goes to
APFloat::convertToDouble() and hits assertion it is not a double
semantics. This diff prints half machine operands correctly.

This cannot currently be hit by any in-tree target.

Patch by Stanislav Mekhanoshin

llvm-svn: 259857
2016-02-05 00:50:18 +00:00
Nemanja Ivanovic e8cbae32e9 Enable the %s modifier in inline asm template string
This patch corresponds to review:
http://reviews.llvm.org/D16847

There are some files in glibc that use the output operand modifier even though
it was deprecated in GCC. This patch just adds support for it to prevent issues
with such files.

llvm-svn: 259798
2016-02-04 16:18:08 +00:00
Petar Jovanovic 23e44f5e39 [Power PC] softening long double type
This patch implements softening of long double type (ppcf128) on ppc32
architecture and enables operations for this type for soft float.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D15811

llvm-svn: 259791
2016-02-04 14:43:50 +00:00
Jonas Paulsson 2293685731 [ScheduleDagInstrs] Improved comments
llvm-svn: 259783
2016-02-04 13:08:48 +00:00
Sanjay Patel e9fa3363b4 rangify; NFCI
llvm-svn: 259722
2016-02-03 22:44:14 +00:00
Reid Kleckner eb3bcdd28b [codeview] Remove EmitLabelDiff in favor emitAbsoluteSymbolDiff
llvm-svn: 259700
2016-02-03 21:24:42 +00:00
Reid Kleckner dac21b43d5 [codeview] Use the MCStreamer interface directly instead of AsmPrinter
This is mostly about having shorter lines and standardizing on one
interface, but it also avoids some needless indirection.

No functional change.

llvm-svn: 259697
2016-02-03 21:15:48 +00:00
Keno Fischer 6c1e47a66b [DWARFDebug] Fix another case of overlapping ranges
Summary:
In r257979, I added code to ensure that we wouldn't merge DebugLocEntries if
the pieces they describe overlap. Unfortunately, I failed to cover the case,
where there may have multiple active Expressions in the entry, in which case we
need to make sure that no two values overlap before we can perform the merge.

This fixed PR26148.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16742

llvm-svn: 259696
2016-02-03 21:13:33 +00:00
Tim Shen f99f0d5a7e [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.

The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.

The performance fix keeps the containers required by the
hasPredecessorHelper (which is a lazy DFS) and reuse them. Since
hasPredecessorHelper is called in a loop, the overall efficiency reduced
from O(n^2) to O(n), where n is the number of SDNodes.

The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.

llvm-svn: 259691
2016-02-03 20:58:55 +00:00
Jonas Paulsson ac29f01788 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
Recommited, after some fixing with test cases.

Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll

Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259673
2016-02-03 17:52:29 +00:00
Jun Bum Lim 59df5e89c2 [MachineCopyPropagation] Fix comment. NFC
Reviewers: MatzeB, qcolombet, jmolloy, mcrosier

Subscribers: llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16806

llvm-svn: 259656
2016-02-03 15:56:27 +00:00
Marcello Maggioni bfe87568aa RegCoalescer: Making sure re-materialization defines all subranges
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.

Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
2016-02-03 00:22:32 +00:00
David Majnemer 30579ec851 [codeview] Improve readability of codeview assembly output
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement to those debugging codeview.

llvm-svn: 259601
2016-02-02 23:18:23 +00:00
Matthias Braun 1377fd6781 MachineVerifier: Check that defs/uses are live in subregisters as well.
llvm-svn: 259552
2016-02-02 20:04:51 +00:00
David Majnemer c9911f28e5 [codeview] Correctly handle inlining functions post-dominated by unreachable
CodeView requires us to accurately describe the extent of the inlined
code.  We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining.  However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.

To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.

llvm-svn: 259548
2016-02-02 19:22:34 +00:00
Eugene Zelenko ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Reid Kleckner 1fcd610c94 [codeview] Wire up the .cv_inline_linetable directive
This directive emits the binary annotations that describe line and code
deltas in inlined call sites. Single-stepping through inlined frames in
windbg now works.

llvm-svn: 259535
2016-02-02 17:41:18 +00:00
David Majnemer ccc809e2e6 [RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDef
When rematerializing a computation by replacing the copy, use the copy's
location.  The location of the copy is more representative of the
original program.

This partially fixes PR10003.

llvm-svn: 259469
2016-02-02 06:41:55 +00:00
Matthias Braun 579c9cda13 MachineVerifier: Use report_context() instead of ad-hoc messages.
llvm-svn: 259457
2016-02-02 02:44:25 +00:00
Anna Zaks cad7994c3b [safestack] Make sure the unsafe stack pointer is popped in all cases
The unsafe stack pointer is only popped in moveStaticAllocasToUnsafeStack so it won't happen if there are no static allocas.

Fixes https://llvm.org/bugs/show_bug.cgi?id=26122

Differential Revision: http://reviews.llvm.org/D16339

llvm-svn: 259447
2016-02-02 01:03:11 +00:00
Balaram Makam 92431703d7 AArch64: Implement missed conditional compare sequences.
Summary:
This is an extension to the existing implementation of r242436 which
restricts to only select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leafs. This will additionaly handle the case
when a tree with select input is not a conjunction-disjunction tree
but some of the sub trees are conjunction-disjunction trees.

Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB

Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry

Differential Revision: http://reviews.llvm.org/D16291

llvm-svn: 259387
2016-02-01 19:13:07 +00:00
Geoff Berry b13b2eed11 [PrologEpilogInserter] Add some debug output for callee-save frame object allocation
Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16733

llvm-svn: 259367
2016-02-01 16:47:51 +00:00
Amjad Aboud 8bbce8ad8e Improved macro emission in dwarf.
Changed emitting offset of macinfo entry into compiler unit DIE to use "addSectionLabel" method rather than explicitly calculating size/offset of macro entry.

Differential Revision: http://reviews.llvm.org/D16292

llvm-svn: 259358
2016-02-01 14:09:41 +00:00
David Majnemer 784d4a455b Revert r258580 and r258581.
Those commits created an artificial edge from a cleanup to a synthesized
catchswitch in order to get the MSVC personality routine to execute
cleanups which don't cleanupret and are not wrapped by a catchswitch.

This worked well enough but is not a complete solution in situations
where there the cleanup infinite loops.

However, the real deal breaker behind this approach comes about from a
degenerate case where the cleanup is post-dominated by unreachable *and*
throws an exception.  This ends poorly because the catchswitch will
inadvertently catch the exception.

Because of this we should go back to our previous behavior of not
executing certain cleanups (identical behavior with the Itanium ABI
implementation in clang, GCC and ICC).

N.B. I think this could be salvaged by making the catchpad rethrow the
exception and properly transforming throwing calls in the cleanup into
invokes.

llvm-svn: 259338
2016-02-01 03:29:38 +00:00
Tim Shen 3b428cb764 [SelectionDAG] Eliminate exponential behavior in WalkChainUsers
llvm-svn: 259315
2016-01-31 03:59:34 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Manman Ren c77e0ff785 [Objective-C] Support a new special module flag.
"Objective-C Class Properties" will be put into the objc_imageinfo struct.

rdar://23891898

llvm-svn: 259270
2016-01-29 23:51:00 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Reid Kleckner f3b9ba4941 [codeview] Begin to add support for inlined call sites
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both

This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16333

llvm-svn: 259217
2016-01-29 18:16:43 +00:00
Jonas Paulsson 8c738635b1 Temporarily revert "[ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten."
Some buildbot failures needs to be debugged.

llvm-svn: 259213
2016-01-29 17:22:43 +00:00
Jonas Paulsson 23f12e5c02 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).

AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.

An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.

RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).

Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.

For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).

There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259201
2016-01-29 16:11:18 +00:00
Junmo Park 67bb3f1d27 Minor code cleanup. NFC.
llvm-svn: 259139
2016-01-29 01:39:39 +00:00
Reid Kleckner 2214ed8937 Reland "[CodeView] Use assembler directives for line tables"
This reverts commit r259126 and relands r259117.

This time with updated library dependencies.

llvm-svn: 259130
2016-01-29 00:49:42 +00:00
Reid Kleckner 00d9639c24 Revert "[CodeView] Use assembler directives for line tables"
This reverts commit r259117.

The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.

llvm-svn: 259126
2016-01-29 00:13:28 +00:00
Reid Kleckner c62e379d22 [CodeView] Use assembler directives for line tables
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:

- .cv_file: Similar to DWARF .file directives

- .cv_loc: Similar to the DWARF .loc directive, but starts with a
  function id. CodeView line tables are emitted by function instead of
  by compilation unit, so we needed an extra field to communicate this.
  Rather than overloading the .loc direction further, we decided it was
  better to have our own directive.

- .cv_stringtable: Emits the codeview string table at the current
  position. Currently this just contains the filenames as
  null-terminated strings.

- .cv_filechecksums: Emits the file checksum table for all files used
  with .cv_file so far. There is currently no support for emitting
  actual checksums, just filenames.

This moves the line table emission code down into the assembler.  This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.

David Majnemer collaborated on this patch.

llvm-svn: 259117
2016-01-28 23:31:52 +00:00
David Majnemer 4543ff09a2 [X86] Don't transform X << 1 to X + X during type legalization
While legalizing a 64-bit shift left by 1, the following occurs:

We split the shift operand in half: a high half and a low half.
We then create an ADDC with the low half and a ADDE with the high half +
the carry bit from the ADDC.

This is problematic if X is any_ext'd because the high half computation
is now undef + undef + carry bit and there is no way to ensure that the
two undef values had the same bitwise representation.  This results in
the lowest bit in the high half turning into garbage.

Instead, do not try to turn shifts into arithmetic during type
legalization.

This fixes PR26350.

llvm-svn: 259065
2016-01-28 18:20:05 +00:00
Oliver Stannard 02fa1c80c4 Revert r259035, it introduces a cyclic library dependency
llvm-svn: 259045
2016-01-28 13:19:47 +00:00
Oliver Stannard b4b092ea1b Add backend dignostic printer for unsupported features
Re-commit of r258951 after fixing layering violation.

The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.

In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 259035
2016-01-28 10:07:27 +00:00
Junmo Park 7d6c5f19f1 Minor code cleanups. NFC.
llvm-svn: 259033
2016-01-28 09:42:39 +00:00
Junmo Park b3327b7007 [DAGCombiner] Don't add volatile or indexed stores to ChainedStores
Summary:
findBetterNeighborChains does not handle volatile or indexed stores.
However, it did not check when adding stores to ChainedStores.

Reviewers: arsenm

Differential Revision: http://reviews.llvm.org/D16463

llvm-svn: 259024
2016-01-28 06:23:33 +00:00
NAKAMURA Takumi 628a7a0aef Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"
It broke layering violation in LLVMIR.

clang r258950 "Add backend dignostic printer for unsupported features"
llvm  r258951 "Refactor backend diagnostics for unsupported features"

llvm-svn: 259016
2016-01-28 04:41:32 +00:00
Benjamin Kramer 391be792f2 One more batch of self-containing headers.
llvm-svn: 258974
2016-01-27 19:29:56 +00:00
Oliver Stannard 1e67a9f196 Refactor backend diagnostics for unsupported features
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 258951
2016-01-27 17:30:33 +00:00
Benjamin Kramer 390c33cd18 Move SafeStack to CodeGen.
It depends on the target machinery, that's not available for
instrumentation passes.

llvm-svn: 258942
2016-01-27 16:53:42 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Benjamin Kramer c67cf31f5c Move passes that live in lib/CodeGen out of Scalar.h
llvm-svn: 258938
2016-01-27 16:05:42 +00:00
Benjamin Kramer 820f7548a1 Make some headers self-contained, remove unused includes that violate layering.
llvm-svn: 258937
2016-01-27 16:05:37 +00:00
Benjamin Kramer b3e8a6d2b8 Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs.
llvm-svn: 258917
2016-01-27 10:01:28 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Chad Rosier b46d0f9a71 [ScheduleDAGInstrs] Simplify logic to improve readability. NFC.
The call to isInvariantLoad() already returns false for non-load instructions.

llvm-svn: 258841
2016-01-26 19:33:57 +00:00
Sanjay Patel 59066a0803 tidy up; NFC
llvm-svn: 258838
2016-01-26 19:30:14 +00:00
Sanjay Patel 9eed9a956f fix formatting; NFC
llvm-svn: 258825
2016-01-26 18:14:37 +00:00
Matthias Braun db320773e1 LiveIntervalAnalysis: Improve some comments
As recommended by Justin.

llvm-svn: 258771
2016-01-26 01:40:48 +00:00
Matthias Braun 242b8bb58d LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC
These two functions are hard to reason about. This commit makes the code
more comprehensible:

- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
  with a fixed value instead of a changing iterator I that points to
  different things during the function.
- Remove the early explanation before the function in favor of more
  detailed comments inside the function. Should have more/clearer comments now
  stating which conditions are tested and which invariants hold at
  different points in the functions.

The behaviour of the code was not changed.

I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.

Differential Revision: http://reviews.llvm.org/D16379

llvm-svn: 258756
2016-01-26 00:43:50 +00:00
Dan Gohman 5016c0f99d [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.
When generating calls to memcpy, memmove, and memset, use void* as the return
type rather than void, to match the standard signatures for these functions.

This has no practical effect for most targets, since the return values of
these calls aren't being used anyway, and most calling conventions tolerate
this kind of mismatch. However, this change will help support future
optimizations to utilize the return value to avoid holding the argument
value live across a call.

llvm-svn: 258691
2016-01-25 15:05:56 +00:00
Amjad Aboud c077841492 Fixed few comments.
llvm-svn: 258658
2016-01-24 08:18:55 +00:00
David Majnemer b7d49268c2 [WinEH] Don't miscompile cleanups which conditionally unwind to caller
A cleanup can have paths which unwind or end up in unreachable.
If there is an unreachable path *and* a path which unwinds to caller,
we would mistakenly inject an unwind path to a catchswitch on the
unreachable path.  This results in a verifier assertion firing because
the cleanup unwinds to two different places: to the caller and to the
catchswitch.

This occured because we used getCleanupRetUnwindDest to determine if the
cleanuppad had no cleanuprets.
This is incorrect, getCleanupRetUnwindDest returns null for cleanuprets
which unwind to caller.

llvm-svn: 258651
2016-01-23 23:54:33 +00:00
Simon Pilgrim 02c1b54a4a [SelectionDAG] Generalised the CONCAT_VECTORS creation to support BUILD_VECTOR and UNDEF folding.
llvm-svn: 258646
2016-01-23 22:27:54 +00:00
Simon Pilgrim b9b8fcd831 Tidied up TRUNC combine code. NFC.
Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable).

llvm-svn: 258644
2016-01-23 21:50:40 +00:00
Benjamin Kramer 58e1998520 Don't check if a list is empty with ilist::size.
ilist::size() is O(n) while ilist::empty() is O(1)

llvm-svn: 258636
2016-01-23 20:58:09 +00:00
Junmo Park 75e9d64aa2 Remove extra whitespace. NFC.
llvm-svn: 258617
2016-01-23 06:34:36 +00:00
David Majnemer f1ff538456 [WinEH] Let cleanups post-dominated by unreachable get executed
Cleanups in C++ are a little weird.  They are only guaranteed to be
reliably executed if, and only if, there is a viable catch handler which
can handle the exception.

This means that reachability of a cleanup is lexically determined by it
being nested with a try-block which unwinds to a catch.  It is *cannot*
be reasoned about by examining the control flow edges leaving a cleanup.

Usually this is not a problem.  It becomes a problem when there are *no*
edges out of a cleanup because we believed that code post-dominated by
the cleanup is dead.  In LLVM's case, this code is what informs the
personality routine about the presence of a suitable catch handler.
However, the lack of edges to that catch handler makes the handler
become unreachable which causes us to remove it.  By removing the
handler, the cleanup becomes unreachable.

Instead, inject a catch-all handler with every cleanup that has no
unwind edges.  This will allow us to properly unwind the stack.

This fixes PR25997.

llvm-svn: 258580
2016-01-22 23:20:43 +00:00
Weiming Zhao 13e7cb294c Fix LivePhysRegs::addLiveOuts
Summary:
The testing for returnBB was flipped which may cause ARM ld/st opt pass uses callee saved regs in returnBB when shrink-wrap is used.


Reviewers: t.p.northover, apazos, MatzeB

Subscribers: mcrosier, zzheng, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D16434

llvm-svn: 258569
2016-01-22 22:21:34 +00:00
Sanjay Patel 3388d1fc6d function names start with a lowercase letter; NFC
llvm-svn: 258552
2016-01-22 21:11:47 +00:00
David Majnemer 734d7c3272 [WinEH] Make collectFuncletMembers non-recursive
Use a worklist for the pre-order DFS instead of using recursion.
No functionality change is intended.

llvm-svn: 258521
2016-01-22 18:49:50 +00:00
Dan Gohman 0bf3ae84ca [SelectionDAG] Fold more offsets into GlobalAddresses
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.

llvm-svn: 258482
2016-01-22 03:57:34 +00:00
Eduard Burtescu 68e7f49f8e [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Reid Kleckner b7ecfa5b09 Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
  #include <string>
  #include <iostream>
  static const char kData[] = "asdf jkl;";
  int main() {
    std::string s(kData + 3, sizeof(kData) - 3);
    std::cout << s << '\n';
  }

llvm-svn: 258465
2016-01-22 01:09:29 +00:00
Reid Kleckner 18ec96f0fc Avoid unnecessary stack realignment in musttail thunks with SSE2 enabled
The X86 musttail implementation finds register parameters to forward by
running the calling convention algorithm until a non-register location
is returned. However, assigning a vector memory location has the side
effect of increasing the function's stack alignment. We shouldn't
increase the stack alignment when we are only looking for register
parameters, so this change conditionalizes it.

llvm-svn: 258442
2016-01-21 22:23:22 +00:00
Chad Rosier 406808e344 Partially revert "Add command line options to force function/loop alignments."
This partially reverts r256571 in favor of the solution in r258409.

llvm-svn: 258421
2016-01-21 18:49:15 +00:00
Geoff Berry 10494aca05 [BlockPlacement] Add option to align all non-fall-through blocks.
Summary: This option is being added for testing purposes.

Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16410

llvm-svn: 258409
2016-01-21 17:25:52 +00:00
Manuel Jacob 925d029461 Introduce ConstantFoldCastOperand function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: zzheng, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16380

llvm-svn: 258390
2016-01-21 06:31:08 +00:00
Andrew Wilkins a7a8ab71aa [GlobalISel] make library an optional component
Summary:
Mark the LLVMGlobalISel library as optional in
LLVMBuild.txt, since the library is only built
if LLVM_BUILD_GLOBAL_ISEL is set. Without doing
this, llvm-config includes the library in the
list of components regardless of whether it's
built, and then will error out when asked for
the library names/paths.

Reviewers: qcolombet

Subscribers: joker.eph, llvm-commits, vkalintiris

Differential Revision: http://reviews.llvm.org/D16386

llvm-svn: 258379
2016-01-21 01:41:03 +00:00
Dan Gohman 760bef5e50 [SelectionDAG] Fix constant offset folding to avoid commuting non-commutative operators.
This fixes a miscompile in MultiSource/Benchmarks/MiBench/consumer-lame
introduced in r258296.

llvm-svn: 258366
2016-01-20 23:16:59 +00:00
Chad Rosier 816a1ab9d9 MachineScheduler: Add a command line option to disable post scheduler.
llvm-svn: 258364
2016-01-20 23:08:32 +00:00
Chad Rosier 6338d7c390 MachineScheduler: Honor optnone functions in the pre-ra scheduler.
llvm-svn: 258363
2016-01-20 22:38:25 +00:00
Quentin Colombet 105cf2b179 [GlobalISel] Add the proper cmake plumbing.
This patch adds the necessary plumbing to cmake to build the sources related to
GlobalISel.

To build the sources related to GlobalISel, we need to add -DBUILD_GLOBAL_ISEL=ON.
By default, this is OFF, thus GlobalISel sources will not impact people that do
not explicitly opt-in.

Differential Revision: http://reviews.llvm.org/D15983

llvm-svn: 258344
2016-01-20 20:58:56 +00:00
Sanjay Patel 545a456235 fix formatting; NFC
llvm-svn: 258330
2016-01-20 18:59:16 +00:00
Krzysztof Parzyszek 2451c4835a Proper handling of diamond-like cases in if-conversion
If converter was somewhat careless about "diamond" cases, where there
was no join block, or in other words, where the true/false blocks did
not have analyzable branches. In such cases, it was possible for it to
remove (needed) branches, resulting in a loss of entire basic blocks.

Differential Revision: http://reviews.llvm.org/D16156

llvm-svn: 258310
2016-01-20 13:14:52 +00:00
Dan Gohman edf98c5682 [SelectionDAG] Fold more offsets into GlobalAddresses
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.

This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.

Differential Revision: http://reviews.llvm.org/D16090

llvm-svn: 258296
2016-01-20 07:03:08 +00:00
Eduard Burtescu 23c4d83aa3 [NFC] Replace several manual GEP loops with gep_type_iterator.
Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16335

llvm-svn: 258262
2016-01-20 00:26:52 +00:00
Matthias Braun d4f6409dff MachineScheduler: Allow independent scheduling of sub register defs
Note that this is disabled by default and still requires a patch to
handleMove() which is not upstreamed yet.

If the TrackLaneMasks policy/strategy is enabled the MachineScheduler
will build a schedule graph where definitions of independent
subregisters are no longer serialised.

Implementation comments:
- Without lane mask tracking a sub register def also counts as a use
  (except for the first one with the read-undef flag set), with lane
  mask tracking enabled this is no longer the case.
- Pressure Diffs where previously maintained per definition of a
  vreg with the help of the SSA information contained in the
  LiveIntervals.  With lanemask tracking enabled we cannot do this
  anymore and instead change the pressure diffs for all uses of the vreg
  as it becomes live/dead.  For this changed style to work correctly we
  ignore uses of instructions that define the same register again: They
  won't affect register pressure.
- With lanemask tracking we remove all read-undef flags from
  sub register defs when building the graph and re-add them later when
  all vreg lanes have become dead.

Differential Revision: http://reviews.llvm.org/D14969

llvm-svn: 258259
2016-01-20 00:23:32 +00:00
Matthias Braun 5d458617aa RegisterPressure: Make liveness tracking subregister aware
Differential Revision: http://reviews.llvm.org/D14968

llvm-svn: 258258
2016-01-20 00:23:26 +00:00
Matthias Braun 3907fded1b LiveInterval: Add utility class to rename independent subregister usage
This renaming is necessary to avoid a subregister aware scheduler
accidentally creating liveness "holes" which are rejected by the
MachineVerifier.

Explanation as found in this patch:

Helper class that can divide MachineOperands of a virtual register into
equivalence classes of connected components.
MachineOperands belong to the same equivalence class when they are part of
the same SubRange segment or adjacent segments (adjacent in control
flow); Different subranges affected by the same MachineOperand belong to
the same equivalence class.

Example:
  vreg0:sub0 = ...
  vreg0:sub1 = ...
  vreg0:sub2 = ...
  ...
  xxx        = op vreg0:sub1
  vreg0:sub1 = ...
  store vreg0:sub0_sub1

The example contains 3 different equivalence classes:
  - One for the (dead) vreg0:sub2 definition
  - One containing the first vreg0:sub1 definition and its use,
    but not the second definition!
  - The remaining class contains all other operands involving vreg0.

We provide a utility function here to rename disjunct classes to different
virtual registers.

Differential Revision: http://reviews.llvm.org/D16126

llvm-svn: 258257
2016-01-20 00:23:21 +00:00
Sanjoy Das 16901a3e20 [MachineSink] Don't break ImplicitNulls
Summary:
This teaches MachineSink to not sink instructions that might break the
implicit null check optimization that runs later.  This should not
affect frontends that do not use implicit null checks.

Reviewers: aadg, reames, hfinkel, atrick

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D14632

llvm-svn: 258254
2016-01-20 00:06:14 +00:00
Quentin Colombet 2c49e2e664 [MachineFunction] Constify getter. NFC.
llvm-svn: 258207
2016-01-19 22:31:12 +00:00
Philip Reames 1a196f7daf Revert 258157
According the build bots, clang is using the Registry class somewhere as well. Will reapply with appropriate clang changes at a later point.

llvm-svn: 258159
2016-01-19 18:41:10 +00:00
Philip Reames 0f6650e8e8 [GC] Registry initialization and linkage interactions
The Registry class constructs a linked list of nodes whose storage is inside static variables and nodes are added via static initializers. The trick is that those static initializers are in both the LLVM code base, and some random plugin that might get loaded in at runtime. The existing code tries to use C++ templates and their ODR rules to get a single definition of the registry for each type, but, experimentally, this doesn't quite work as designed. (Well, the entire structure doesn't. It might not actually be an ODR problem.)

Previously, when I tried moving the GCStrategy class (along with it's registry) from CodeGen to IR, I ran into a problem where asking the GCStrategyRegistry a question would return inconsistent results depending on whether you asked from CodeGen (where the static initializers still were) or Transforms. My best guess is that this is a result of either a) an order of initialization error, or b) we ended up with two copies of the registry being created. I remember at the time having convinced myself it was probably (b), but I don't have any of my notes around from that investigation any more.

See http://reviews.llvm.org/rL226311 for the original patch in question.

This patch tries to remove the possibility of (b) above. (a) was already fixed in change 258109.

Differential Revision: http://reviews.llvm.org/D16170

llvm-svn: 258157
2016-01-19 18:34:27 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Philip Reames 3195500297 [GC] Consolidate all built in GCs into a single file [NFC]
Combine a bunch of small files into a single, still rather small, file.  The primary purpose of this is to get all of the static initializers into a single file so as to have a well defined order of initialization.  

llvm-svn: 258109
2016-01-19 03:57:18 +00:00
Simon Pilgrim c4d519d340 Fixed MSVC warning that not all control paths return a value.
llvm-svn: 258099
2016-01-18 22:54:46 +00:00
Sergei Larin d19d4d30d8 Add to the split module utility an SCC based method which allows not to globalize any local variables.
Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
    This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
    Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    
    Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    
    Subscribers: llvm-commits, joker.eph
    
    Differential Revision: http://reviews.llvm.org/D16124

llvm-svn: 258083
2016-01-18 21:07:13 +00:00
Tom Stellard ccdc5391ea TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.

If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.

This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:

(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))

There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.

Reviewers: resistor, arsenm

Subscribers: jroelofs, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15034

llvm-svn: 258067
2016-01-18 19:55:21 +00:00
Junmo Park 3347e7823a Remove extra whitespace. NFC.
llvm-svn: 258035
2016-01-18 06:42:51 +00:00
Eduard Burtescu 90c4449128 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Keno Fischer bc0cb11eb2 [DwarfDebug] Don't merge DebugLocEntries if their pieces overlap
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.

Fixes PR26163.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249

llvm-svn: 257979
2016-01-16 01:15:32 +00:00
Keno Fischer f8eb6a1414 [DwarfDebug] Move MergeValues to .cpp, NFC
llvm-svn: 257977
2016-01-16 01:11:33 +00:00
Reid Kleckner 9533af4f8a [codeview] Remove custom line info struct in favor of DebugLoc
The only functional change would be that we might emit multiple filename
segments on code like this:

  void f() {
  #include "p1/../t.h"
  #include "p2/../t.h"
  }

I believe these get separate DIFile metadata nodes, but will have the
same canonicalized absolute path. Previously by computing the path up
front and comparing it we would merge the line info segments.

llvm-svn: 257966
2016-01-16 00:09:09 +00:00
Dan Gohman 4e9b2a60ab [SelectionDAG] CSE nodes with differing SDNodeFlags
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.

Differential Revision: http://reviews.llvm.org/D15957

llvm-svn: 257940
2016-01-15 21:56:40 +00:00
Joseph Tremoulet 44b3f961e1 [WinEH] Rename CatchReturnInst::getParentPad, NFC
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is.  Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16222

llvm-svn: 257933
2016-01-15 21:16:19 +00:00
Rafael Espindola 79db917139 Don't try to check all uses if lazy loading.
This means that LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN will not be set
in a few cases.

This should have no impact in ld64 since it doesn't use lazy loading
when merging modules and that is when it checks
LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN.

llvm-svn: 257915
2016-01-15 18:23:46 +00:00
James Y Knight ac03dca412 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

llvm-svn: 257902
2016-01-15 16:33:06 +00:00
James Molloy 3ef84c4cbb [CodeGenPrepare] Try and appease sanitizers
dupRetToEnableTailCallOpts(BB) can invalidate BB. It must run *after* we iterate across BB!

llvm-svn: 257886
2016-01-15 10:36:01 +00:00
James Molloy f01488e2bc [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Keno Fischer 81e2e9ef86 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

llvm-svn: 257850
2016-01-15 00:46:17 +00:00
Krzysztof Parzyszek c005e20d3b [Packetizer] Code cleanup, NFC
llvm-svn: 257805
2016-01-14 21:17:04 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Ahmed Bougacha 60b201b662 [CodeGen] Don't assume fp_to_fp16 produces i16 when legalizing it.
Since r230276, we support an improved legalization for f64->f16,
which goes through a temporary f32, improving codegen when
f32->f16 is legal but not f64->f16. This requires unsafe-fp-math.

However, that legalization assumed that the second step, producing
a pseudo-softened f16, had type i16. That's not true on targets
with illegal i16, such as ARM.

Use the initial f64->f16 result type instead.

llvm-svn: 257794
2016-01-14 19:45:36 +00:00
Reid Kleckner 70f5bc99b6 Rename WinCodeViewLineTables to CodeViewDebug, similar to DwarfDebug
Soon it will be responsible for more than line tables.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D16199

llvm-svn: 257792
2016-01-14 19:25:04 +00:00
Xinliang David Li 5f04f926e9 [PGO] [Coverage] put covmap into note section with no 'alloc flag' (Linux)
Coverage mapping data is not referenced by runtime, and they won't be dumped
into profile data. There is no need to allocate memory for covmap sections.
A good side effect of this change is that the coverage map data won't be mistakenly
 garbage collected by the linker (for Gold linker only, BFD linker has an issue where the a bug is filed).
Tested with clang build with instrumentation and -fcoverage-mapping and linker GC. The size of
 covmap section is ~17.6M so the text segment size will be reduced by this amount with this change.

llvm-svn: 257781
2016-01-14 18:09:45 +00:00
James Y Knight 582f556251 Revert "Stop increasing alignment of externally-visible globals on ELF platforms."
This reverts commit r257719, due to PR26144.

llvm-svn: 257775
2016-01-14 16:33:21 +00:00
David Majnemer 3463e696fb [X86] Don't alter HasOpaqueSPAdjustment after we've relied on it
We rely on HasOpaqueSPAdjustment not changing after we've calculated
things based on it.  Things like whether or not we can use 'rep;movs' to
copy bytes around, that sort of thing.  If it changes, invariants in the
backend will quietly break.  This situation arose when we had a call to
memcpy *and* a COPY of the FLAGS register where we would attempt to
reference local variables using %esi, a register that was clobbered by
the 'rep;movs'.

This fixes PR26124.

llvm-svn: 257730
2016-01-14 01:20:03 +00:00
Philip Reames 054123550f Fix Release build warning.
A value used only in an assert.  Again.

llvm-svn: 257728
2016-01-14 00:55:51 +00:00
Philip Reames 8f8e3f245c [GCRoot] Assert preconditions to clarify behavior
This code isn't reachable if the GFI (GCFunctionInfo*) is null.  Clarify this by adding an assert and removing an always taken if.  

llvm-svn: 257724
2016-01-14 00:21:56 +00:00
Reid Kleckner 3c0ff98708 [codeview] Regenerate C++ display name test case and update comments
Clang generates good display names for codeview since r255744, and the
change to make LLVM use them was accidentally included in r257658.

This change just updates the comments and test case to reflect reality
better.

llvm-svn: 257723
2016-01-14 00:12:54 +00:00
James Y Knight 9de6d7becc Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

llvm-svn: 257719
2016-01-13 23:59:19 +00:00
Chih-Hung Hsieh 578864007b [TLS] New lower emutls pass, fix linkage bugs.
Previous implementation in http://reviews.llvm.org/D10522
created external references to __emutls_v.* variables.
Such references are inaccurate and cannot be handled by
all linkers, e.g. Android dynamic and gold linkers for aarch64.

Now a new LowerEmuTLS pass to go through all global variables,
and add emutls_v.* and emutls_t.* variables.
These __emutls* variables have the same linkage and
visibility as the associated user defined TLS variable.

Also removed old code that dump __emutls* variables in AsmPrinter.cpp,
and updated TLS unit tests.

Differential Revision: http://reviews.llvm.org/D15300

llvm-svn: 257718
2016-01-13 23:56:37 +00:00
Reid Kleckner 6b3faefff9 [codeview] Share more enums across the writer and the dumper
Moves some .def files into include/DebugInfo/CodeView.

Aslo remove a 'using namespace' directive from a header in readobj and
update the uses of the endian helper types to compensate.

llvm-svn: 257712
2016-01-13 23:44:57 +00:00
Reid Kleckner 72e2ba7abb [readobj] Expand CodeView dumping functionality
This rewrites and expands the existing codeview dumping functionality in
llvm-readobj using techniques similar to those in lib/Object. This defines a
number of new records and enums useful for reading memory mapped codeview
sections in COFF objects.

The dumper is intended as a testing tool for LLVM as it grows more codeview
output capabilities.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D16104

llvm-svn: 257658
2016-01-13 19:32:35 +00:00
Keno Fischer 78e5c9e6e2 Re-Revert r257105 (Verifier debug info changes)
While I investigate some new buildbot failures. This was originally reapplied
as r257550 and r257558.

llvm-svn: 257563
2016-01-13 02:31:14 +00:00
Matthias Braun 4cc3421a24 AsmPrinter: Fix wrong OS X versions being emitted for darwin triples
The version numbers of the darwin kernel are different from the version
numbers of OS X, so we need adjustments if we had "*-*-darwin" triples.
Use the existing utility functions in TargetTriple for this.

Fixes rdar://22056966

Differential Revision: http://reviews.llvm.org/D14601

llvm-svn: 257555
2016-01-13 01:18:13 +00:00
David Majnemer c3340db77d [CodeView] Mark our lines as statements, not expressions
The line tables for CodeView make a distinction between expressions and
statements.  As it turns out, MSVC always emits them as statements and
we always emit them as expressions.  Let's switch to statements to match
the CodeView that they emit.

llvm-svn: 257553
2016-01-13 01:05:23 +00:00
Keno Fischer 25916079ff Reapply r257105 "[Verifier] Check that debug values have proper size"
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```

llvm-svn: 257550
2016-01-13 00:31:44 +00:00
Matthias Braun b505c76c9a RegisterPressure: Expose RegisterOperands API
Previously the RegisterOperands have only been used internally in
RegisterPressure.cpp. However this datastructure can be useful for other
tasks as well and allows refactoring of PDiff initialisation out of
RPTracker::recede().

This patch:
- Exposes RegisterOperands as public API
- Splits RPTracker::recede() into a part that skips DebugValues and
  maintains the region borders, and the core that changes register
  pressure when given a set of RegisterOperands.
- This allows to move the PDiff initialisation out recede() into a
  method of the PressureDiffs class.
- The upcoming subregister scheduling code will also use
  RegisterOperands to avoid pushing more unrelated functionality into
  recede()/advance().

Differential Revision: http://reviews.llvm.org/D15473

llvm-svn: 257535
2016-01-12 22:57:35 +00:00
David Majnemer c81c8c66d5 [CodeView] Initialize column-end to zero
CodeView, unlike DWARF, can associate code with a range of columns.
However, LLVM can only represent a single column position internally.

We used to claim that the end column and start column were the same
which yielded less than satisfactory results: we would stop printing at
the _beginning_ of the source expression!  Instead, mark the column-end
as 'zero' to indicate that we don't have one (as per the documentation
for IDiaLineNumber::get_lineNumberEnd).

llvm-svn: 257528
2016-01-12 21:58:20 +00:00
Chad Rosier f35395eac1 [NFC] Fix whitespace.
llvm-svn: 257365
2016-01-11 19:17:36 +00:00
Matt Arsenault 5ca3c72c5a LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal
llvm-svn: 257345
2016-01-11 16:37:46 +00:00
Junmo Park 7ceec0b82f [BranchFolding] Set correct mem refs (2nd try)
This is a recommit of r257253 which was reverted in r257270.
Previous testcase can make failure on some targets due to using opt with O3 option.

Original Summary:
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990

llvm-svn: 257317
2016-01-11 07:15:38 +00:00
Daniel Berlin 7256059ef0 Speed up LiveDebugValues
Summary:
Use proper dataflow ordering to speed convergence.
This will converge the testcase on bug 26055 in 2 iterations.

(data structures speedups to come to make even that faster)

Reviewers: kcc, samsonov, echristo, dblaikie, tvvikram

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16039

llvm-svn: 257292
2016-01-10 18:08:32 +00:00
Daniel Berlin ca4d93a82f Don't use random class variables across functions
llvm-svn: 257271
2016-01-10 03:25:42 +00:00
Michael Zolotukhin 0fc89c67cc Revert "[BranchFolding] Set correct mem refs"
This reverts commit 1ff11017d2669b933b29fcbb6451cfcda34ad693.

llvm-svn: 257270
2016-01-09 23:53:16 +00:00
Junmo Park e1582cec34 [BranchFolding] Set correct mem refs
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990

llvm-svn: 257253
2016-01-09 07:30:13 +00:00
Sanjay Patel 1dc7dfb9d9 [DAGCombiner] don't dereference an operand that doesn't exist (PR26070)
The bug was introduced with changes for x86-64 fp128:
http://reviews.llvm.org/rL254653

I don't know why an x86 change is here, so I'll follow up in:
http://reviews.llvm.org/D15134

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=26070

llvm-svn: 257200
2016-01-08 19:53:24 +00:00
Tim Shen 9b68bd48ca Test commit access - add a blank line in comment.
llvm-svn: 257192
2016-01-08 19:20:23 +00:00
Pirama Arumuga Nainar bf5ccdccb2 Do not ASSERTZEXT for i16 result of bitcast from f16 operand
Summary:
During legalization if i16, do not ASSERTZEXT the result of FP_TO_FP16.
Directly return an FP_TO_FP16 node with return type as the
promote-to-type of i16.

This patch also removes extraneous length check.  This legalization
should be valid even if integer and float types are of different
lengths.

This patch breaks a hard-float test for fp16 args.  The test is changed
to allow a vmov to zero-out the top bits, and also ensure that the
return value is in an FP register.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15438

llvm-svn: 257184
2016-01-08 17:46:05 +00:00
David Majnemer 2a6368f609 [WinEH] CatchHandler which don't have catch objects in StackColoring
StackColoring rewrites the frame indicies of operations involving
allocas if it can find that the life time of two objects do not overlap.
MSVC EH needs to be kept aware of this if happens in the event that a
catch object has moved around.  However, we represent the non-existance
of a catch object with a sentinel frame index (INT_MAX).  This sentinel
also happens to be the EmptyKey of the SlotRemap DenseMap.  Testing for
whether or not we need to translate the frame index fails in this case
because we call the count method on the DenseMap with the EmptyKey,
leading to assertions.  Instead, check if it is our sentinel value
before trying to look into the DenseMap.

This fixes PR26073.

llvm-svn: 257182
2016-01-08 17:24:47 +00:00
David Majnemer 086fec23ec [WinEH] Update WinEHFuncInfo if StackColoring merges allocas
Windows EH keeping track of which frame index corresponds to a catchpad
in order to inform the runtime where the catch parameter should be
initialized.  LLVM's optimizations are able to prove that the memory
used by the catch parameter can be reused with another memory
optimization, changing it's frame index.

We need to keep WinEHFuncInfo up to date with respect to this or we will
miscompile/assert.

This fixes PR26069.

llvm-svn: 257158
2016-01-08 08:03:55 +00:00
Junmo Park aa9243a25d Remove extra whitespace. NFC.
llvm-svn: 257144
2016-01-08 04:20:32 +00:00
Matthias Braun bf47f63b74 LiveInterval: A LiveRange is enough for ConnectedVNInfoEqClasses::Classify()
llvm-svn: 257129
2016-01-08 01:16:35 +00:00
Alexey Samsonov 117b104166 [LiveDebugValues] Replace several lines of code with operator[].
llvm-svn: 257114
2016-01-07 23:38:45 +00:00
Keno Fischer ea33a25816 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

llvm-svn: 257107
2016-01-07 22:39:11 +00:00
Keno Fischer b3326be6ad [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

llvm-svn: 257105
2016-01-07 22:18:37 +00:00
Dimitry Andric 2c36421337 Turn off lldb debug tuning by default for FreeBSD
Summary:
In rL242338, debugger tuning was introduced, and the tuning for FreeBSD
was set to lldb by default.  However, for the foreseeable future we
still need to default to gdb tuning, since lldb is not ready for all of
FreeBSD's architectures, and some system tools (like objcopy, etc) have
not yet been adapted to cope with the lldb tuned format, which has
.apple sections.

Therefore, let FreeBSD use gdb by default for now.

Reviewers: emaste, probinson

Subscribers: llvm-commits, emaste

Differential Revision: http://reviews.llvm.org/D15966

llvm-svn: 257103
2016-01-07 22:09:12 +00:00
Amjad Aboud d7cfb48485 Added support for macro emission in dwarf (supporting DWARF version 4).
Differential Revision: http://reviews.llvm.org/D15495

llvm-svn: 257060
2016-01-07 14:28:20 +00:00
Junmo Park 1238610aa1 Remove extra whitespace. NFC.
llvm-svn: 257047
2016-01-07 10:26:32 +00:00
David Majnemer 0e90f46e10 Undo spurious change made in r256965
llvm-svn: 257028
2016-01-07 04:31:35 +00:00
Philip Reames afdbcc6a84 [Statepoints] Add test cases around vectors and stablize test
Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them.  Add a couple of tests which show this working.

Also, add a triple to the same test file to hopefully fix a failing bot.

It turns out we do han

llvm-svn: 257025
2016-01-07 04:15:31 +00:00
Philip Reames 3e2cf5320c [Statepoints] Initial support for relocating vectors of pointers
Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong.

The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate.

Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. 

Differential Revision: http://reviews.llvm.org/D15632

llvm-svn: 257022
2016-01-07 03:32:11 +00:00
Quentin Colombet 9ed52e9a9e [ShrinkWrapping] Give up on irreducible CFGs.
We need to know whether or not a given basic block is in a loop for the analysis
to be correct.
Loop information may be incomplete on irreducible CFGs, therefore we may
generate incorrect code if we use it in those situations.

This fixes PR25988.

llvm-svn: 257012
2016-01-07 01:23:49 +00:00
Sanjay Patel 882a8eed3e rangify; NFCI
llvm-svn: 256998
2016-01-06 23:45:05 +00:00
Weiming Zhao 0f1762caf9 Recommit r256952 "Filtering IR printing for print-after-all/print-before-all"
Fix lit test fail due to outputting an extra line.

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256987
2016-01-06 22:55:03 +00:00
Philip Reames 5eb90a7835 Consolidate MemRefs handling from BranchFolding and correct latent bug
Move the logic from BranchFolding to use the shared infrastructure for merging MMOs introduced in 256909. This has the effect of making BranchFolding more capable.

In the process, fix a latent bug. The existing handling for merging didn't handle the case where one of the instructions being merged had overflowed and dropped MemRefs. This was a latent bug in the places the code was commoned from, but potentially reachable in BranchFolding.

Once this is in, we're left with a single place to consider implementing MMO unique-ing as proposed in http://reviews.llvm.org/D15230.

Differential Revision: http://reviews.llvm.org/D15913

llvm-svn: 256966
2016-01-06 19:33:12 +00:00
David Majnemer eea7582bfa [WinEH] Remove calculateCatchReturnSuccessorColors
The functionality that calculateCatchReturnSuccessorColors provides was
once non-trivial: it was a computation layered on top of funclet
coloring.

These days, LLVM IR directly encodes what
calculateCatchReturnSuccessorColors computed, obsoleting the need for
it.

No functionality change is intended.

llvm-svn: 256965
2016-01-06 19:26:30 +00:00
Michael Kuperstein 037c9984db [ShrinkWrap] Fix FindIDom to only have one kind of failure.
FindIDom() can fail in two different ways - it can either return nullptr or the
block itself, depending on the circumstances. Some users of FindIDom() check
one error condition, while others check the other.

Change it to always return nullptr on failure.
This fixes PR26004.

Differential Revision: http://reviews.llvm.org/D15847

llvm-svn: 256955
2016-01-06 18:40:11 +00:00
Weiming Zhao b243c95c6a Revert r256952 due to lit test fails.
llvm-svn: 256954
2016-01-06 18:31:44 +00:00
Weiming Zhao eac0636805 Filtering IR printing for print-after-all/print-before-all
Summary:
This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc.
Examples:
  -print-after-all -print-funcs=foo,bar

Reviewers: mcrosier, joker.eph

Subscribers: tejohnson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256952
2016-01-06 18:20:25 +00:00
Geoff Berry 12fe2279f3 ScheduleDAGInstrs: Bug fix for missed memory dependency.
Summary:
In buildSchedGraph(), when adding memory dependencies for loads, move
the call to adjustChainDeps() after the call to
addChainDependency(AliasChain) to handle the case where
addChainDependency(AliasChain) ends up not adding a dependency and
instead putting the SU on the RejectMemNodes list.  The call to
adjustChainDeps() must be done after the call to addChainDependency() in
order to process the SU added to the RejectMemNodes list to create
memory dependencies for it.

Reviewers: hfinkel, atrick, jonpa, resistor

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15927

llvm-svn: 256950
2016-01-06 18:14:26 +00:00
Philip Reames 2d2fc4adf1 Fix a warning [NFC]
llvm-svn: 256916
2016-01-06 05:53:09 +00:00
Philip Reames c86ed0055d Extract helper function to merge MemoryOperand lists [NFC]
In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder.

I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface.

Differential Revision: http://reviews.llvm.org/D15757

llvm-svn: 256909
2016-01-06 04:39:03 +00:00
Sanjay Patel 3d07ec973f rangify; NFCI
llvm-svn: 256891
2016-01-06 00:45:42 +00:00
Dan Gohman 797f639e79 [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.

This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.

Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.

Differential Revision: http://reviews.llvm.org/D15544

llvm-svn: 256890
2016-01-06 00:43:06 +00:00
Sanjay Patel 8260d0a9fa use std::max ; NFCI
llvm-svn: 256889
2016-01-06 00:36:59 +00:00
MinSeong Kim 4a9a4e198f [MISched] Explanatory error message when machine model is not complete. NFC
When not all instructions have a scheduling class,
the error message now provides a possible solution.

Differential Revision: http://reviews.llvm.org/D15854

llvm-svn: 256839
2016-01-05 14:50:15 +00:00
Manuel Jacob 83eefa6d20 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

llvm-svn: 256811
2016-01-05 04:03:00 +00:00
Matthias Braun 7e762e4f9c MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

llvm-svn: 256799
2016-01-05 00:45:35 +00:00
Eric Christopher 49a7d6c473 Clarify that the bypassSlowDivision optimization operates on a single BB [v2]
Update some comments to be more explicit.

Change bypassSlowDivision and the functions it calls so that they take
BasicBlock*s and Instruction*s, rather than Function::iterator&s and
BasicBlock::iterator&s.

Change the APIs so that the caller is responsible for updating the
iterator, rather than the callee. This makes control flow much easier
to follow.

Patch by Justin Lebar!

llvm-svn: 256789
2016-01-04 23:18:58 +00:00
Joseph Tremoulet 52f729a613 [WinEH] Update CoreCLR EH state numbering
Summary:
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D15325

llvm-svn: 256760
2016-01-04 16:16:01 +00:00
David Majnemer ca1c9f074f [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

llvm-svn: 256730
2016-01-04 04:49:41 +00:00
Simon Pilgrim 5bf96e41c5 [SelectionDAG] Pulled out common code for CONCAT_VECTORS node creation
Pulled out the similar CONCAT_VECTORS creation code from the 2/3 operand getNode() calls (to handle all UNDEF and all BUILD_VECTOR cases). Added a similar handler to the general getNode() call as well.

llvm-svn: 256709
2016-01-03 18:24:19 +00:00
NAKAMURA Takumi ded575e4eb WinEHPrepare.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
llvm-svn: 256694
2016-01-03 01:41:00 +00:00
Joseph Tremoulet 71e5676de4 [WinEH] Update catchrets with cloned successors
Summary:
Add a pass to update catchrets when their successors get cloned; the
existing pass doesn't catch these because it walks the funclet whose
blocks are being cloned but the catchret is in a child funclet.

Also update the test for removing incoming PHI values; when the
predecessor is a catchret, the relevant color is the catchret's parentPad,
not its block's color.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15840

llvm-svn: 256689
2016-01-02 15:22:36 +00:00
Yaron Keren c47c6ac0a5 Correct misleading formatting of several ifs followed by two statements without braces.
While the original code would work with or without braces, it makes sense to
set HaveSemi to true only if (!HaveSemi), otherwise it's already true, so I
put the assignment inside the if block. This addresses PR25998.

llvm-svn: 256688
2016-01-02 13:40:36 +00:00
David Majnemer dbdc9c274d [WinEH] Add additional verification
Recolor the IR to make sure our computed colors are not hiding any bugs.
Also, verifyFunction if we are running some post-preparation operations;
some of these operations can hide latent bugs.

llvm-svn: 256687
2016-01-02 09:26:36 +00:00
Sanjay Patel ac6e910c42 don't repeat function names in comments; NFC
llvm-svn: 256584
2015-12-29 22:11:50 +00:00
Sanjay Patel d1d9db5889 use auto with dyn_casted values; NFC
llvm-svn: 256581
2015-12-29 22:00:37 +00:00
Sanjay Patel 7a7abc9a3b use auto with dyn_casted values; NFC
llvm-svn: 256579
2015-12-29 21:49:08 +00:00
Sanjay Patel b120ae96d3 fix formatting; NFC
llvm-svn: 256574
2015-12-29 19:34:53 +00:00
Sanjay Patel faeee6f44d use range-based for-loop; NFCI
llvm-svn: 256572
2015-12-29 18:30:09 +00:00
Chad Rosier 6b4326367a Add command line options to force function/loop alignments.
These are being added for testing purposes.
http://reviews.llvm.org/D15648

llvm-svn: 256571
2015-12-29 18:18:07 +00:00
Sanjay Patel 59309cc090 don't repeat function names in comments; NFC
llvm-svn: 256569
2015-12-29 18:14:06 +00:00
Sanjay Patel 7dd45697ca use range-based for-loops; NFCI
llvm-svn: 256566
2015-12-29 17:15:22 +00:00
Chandler Carruth e0115344e6 [ptr-traits] Sink a constructor definition to the .cpp file and add
missing includes so that the pointee types for DenseMap pointer keys and
such are complete prior to us querying the pointer traits for them.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256550
2015-12-29 09:24:39 +00:00
Michael Kuperstein 2ea81baf3a [X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.

The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.

Differential Revision: http://reviews.llvm.org/D15054

llvm-svn: 256494
2015-12-28 14:39:21 +00:00
Craig Topper 4b1808d8e7 [SelectionDAG] Teach LegalizeVectorOps to not unroll CTLZ_ZERO_UNDEF and CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
llvm-svn: 256476
2015-12-27 21:33:47 +00:00
David Majnemer 081e8fe4c0 [WinEH] Add comments explaining the EH tables
This is aids in debugging WinEH, similar functionality is present for
DWARF EH.

llvm-svn: 256455
2015-12-27 06:07:12 +00:00
David Majnemer 11234ed7d3 [CodeGen] Use generic printAsOperand machinery instead of hand rolling it
We already know how to properly print out basic blocks in
printAsOperand, we should not roll it ourselves in
AsmPrinter::EmitBasicBlockStart.  No functionality change is intended.

llvm-svn: 256413
2015-12-25 09:37:26 +00:00
Craig Topper 73275a2951 Use range-based for loops. NFC
llvm-svn: 256363
2015-12-24 05:20:40 +00:00
Philip Reames cb0f947a2a [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spill inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintian information about stack slots. I original considered (and prototyped in http://reviews.llvm.org/D15632), the idea of using the existing isSpillSlot flag, but end up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us - in the future - to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from. (deopt, gc, regalloc).

Differential Revision: http://reviews.llvm.org/D15759

llvm-svn: 256352
2015-12-23 23:44:28 +00:00
Philip Reames 4e66c84722 [MemOperands] Clarify code around dropping memory operands [NFC]
Clarify a comment about what it means to drop memory operands from an instruction.  While I'm adding change the name of the method slightly to make it a bit more clear what's going on when reading calling code.

llvm-svn: 256346
2015-12-23 19:16:04 +00:00
Philip Reames 42bd26f29d [MachineLICM] Fix handling of memoperands
As far as I can tell, the correct interpretation of an empty memoperands list is that we didn't have sufficient room to store information about the MachineInstr, NOT that the MachineInstr doesn't access any particular bit of memory. This appears to be fairly consistent in a number of places, but I'm not 100% sure of this interpretation. I'd really appreciate someone more knowledgeable confirming my reading of the code.

This patch fixes two latent bugs in MachineLICM - given the above assumption - and adds comments to document the meaning and required handling. I don't have test cases; these were noticed by inspection.

Differential Revision: http://reviews.llvm.org/D15730

llvm-svn: 256335
2015-12-23 17:05:57 +00:00
David Majnemer c640f863e0 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

llvm-svn: 256317
2015-12-23 03:59:04 +00:00
Philip Reames ee8f055327 [GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC]
Reasons:
1) The existing form was a form of false generality.  None of the implemented GCStrategies use anything other than a type.  Its becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point.
2) The API was awkward when applied to vectors-of-pointers.  The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives.  
3) The rewriting implementation effectively assumes the type based predicate as well.  We should be consistent.

llvm-svn: 256312
2015-12-23 01:42:15 +00:00
Cong Hou e93b8e1539 [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Manuel Jacob 4e4f60ded0 Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics.
Summary:
These were deprecated 11 months ago when a generic
llvm.experimental.gc.result intrinsic, which works for all types, was added.

Reviewers: sanjoy, reames

Subscribers: sanjoy, chenli, llvm-commits

Differential Revision: http://reviews.llvm.org/D15719

llvm-svn: 256262
2015-12-22 18:44:45 +00:00
Chad Rosier a108010385 Typo. NFC.
llvm-svn: 256242
2015-12-22 15:06:47 +00:00
Keno Fischer 4eccf11373 [ASMPrinter] Fix missing handling of DW_OP_bit_piece
In r256077, I added printing for DIExpressions in DEBUG_VALUE comments,
but neglected to handle DW_OP_bit_piece operands. Thanks to
Mikael Holmen and Joerg Sonnenberger for spotting this.

llvm-svn: 256236
2015-12-22 07:14:50 +00:00
David Majnemer 03e2cc3007 [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Adrian Prantl ce8581389b Fix PR24563 (LiveDebugVariables unconditionally propagates all DBG_VALUEs)
LiveDebugVariables unconditionally propagates all DBG_VALUE down the
dominator tree, which happens to work fine if there already is another
DBG_VALUE or the DBG_VALUE happends to describe a single-assignment vreg
but is otherwise wrong if the DBG_VALUE is coming from only one of the
predecessors.

In r255759 we introduced a proper data flow analysis scheduled after
LiveDebugVariables that correctly propagates DBG_VALUEs across basic block
boundaries. With the new pass in place, the incorrect propagation in
LiveDebugVariables can be retired witout loosing any of the benefits
where LiveDebugVariables happened to do the right thing.

llvm-svn: 256188
2015-12-21 20:03:00 +00:00
Amjad Aboud 60b5e1b6c0 Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

llvm-svn: 256155
2015-12-21 14:07:14 +00:00
Manuel Jacob 5b90b147d4 Remove unnecessary casts. NFC.
llvm-svn: 256101
2015-12-19 18:38:42 +00:00
Matt Arsenault d206d6cc54 SelectionDAG: Cleanup integer bin op promotion functions.
SDIV and UDIV had special handling, but this is the same handling
that min/max need.

llvm-svn: 256098
2015-12-19 17:18:43 +00:00
Keno Fischer 00cbf9a69a Clean up the processing of dbg.value in various places
Summary:
First up is instcombine, where in the dbg.declare -> dbg.value conversion,
the llvm.dbg.value needs to be called on the actual loaded value, rather
than the address (since the whole point of this transformation is to be
able to get rid of the alloca). Further, now that that's cleaned up, we
can remove a hack in the backend, that would add an implicit OP_deref if
the argument to dbg.value was an alloca. This stems from before the
existence of DIExpression and is no longer necessary since the deref can
be expressed explicitly.

Now, in order to make sure that the tests pass with this change, we need to
correct the printing of DEBUG_VALUE comments to take into account the
expression, which wasn't taken into account before.

Unfortunately, for both these changes, there were a number of incorrect
test cases (mostly the wrong number of DW_OP_derefs, but also a couple
where the test itself was broken more badly). aprantl and I have gone
through and adjusted these test case in order to make them pass with
these fixes and in some cases to make sure they're actually testing
what they are meant to test.

Reviewers: aprantl

Subscribers: dsanders

Differential Revision: http://reviews.llvm.org/D14186

llvm-svn: 256077
2015-12-19 02:02:44 +00:00
Matt Arsenault 10a509292c Fix broken type legalization of min/max
This was using an anyext when promoting the type
when zext/sext is required.

llvm-svn: 256074
2015-12-19 01:39:48 +00:00
Cong Hou fd0d62b87e Use getEdgeProbability() instead of getEdgeWeight() in BFI and remove getEdgeWeight() interfaces from MBPI.
This patch removes all getEdgeWeight() interfaces from CodeGen directory. As
getEdgeProbability() is a little more expensive than getEdgeWeight(), I will
compose a patch soon in which BPI only stores probabilities instead of edge
weights so that getEdgeProbability() will have O(1) time.


Differential revision: http://reviews.llvm.org/D15489

llvm-svn: 256039
2015-12-18 21:53:24 +00:00
Cong Hou b9e8d483b5 Fix PR25838.
This is a quick fix to PR25838. The issue comes from the restriction that we
cannot normalize probabilities containing both known and unknown ones. A patch
that removes this restriction is under the review now:

http://reviews.llvm.org/D15548

llvm-svn: 255867
2015-12-17 01:29:08 +00:00
Eric Christopher bfba572425 Fix funciton->function typo.
llvm-svn: 255841
2015-12-16 23:10:53 +00:00
Manman Ren 3e3edc91f9 CXX_FAST_TLS calling convention: target independent portion.
Update supportSplitCSR's interface to take machine function instead of the
calling convention.

Review comments for http://reviews.llvm.org/D15341

llvm-svn: 255818
2015-12-16 20:45:48 +00:00
Paul Robinson 6c27a2c40e Set debugger tuning from TargetOptions (NFC)
Differential Revision: http://reviews.llvm.org/D15427

llvm-svn: 255810
2015-12-16 19:58:30 +00:00
Tom Stellard 5ce530608f MachineScheduler: Add a target hook for deciding which RegPressure sets to
increase

Summary:
This patch adds a function called getRegPressureSetScore() to
TargetRegisterInfo.  The MachineScheduler uses this when comparing
instruction that increase the register pressure of different sets
to determine which set is safer to increase.

This hook is useful for GPU targets where the number of registers in the
class is not the best metric for determing which presser set is safer to
increase.

Future work may include adding more parameters to this function, like
for example, the current pressure level of the set or the amount that
the pressure will be increased/decreased.

Reviewers: qcolombet, escha, arsenm, atrick, MatzeB

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14806

llvm-svn: 255795
2015-12-16 18:31:01 +00:00
Krzysztof Parzyszek 2005d7dc01 [Packetizer] Add a check whether an instruction should be packetized now
Add a function VLIWPacketizerList::shouldAddToPacket, which will allow
specific implementations to decide if it is profitable to add given
instruction to the current packet.

llvm-svn: 255780
2015-12-16 16:38:16 +00:00
Vikram TV 859ad29b52 Recommit LiveDebugValues pass after fixing a couple of minor issues.
llvm-svn: 255759
2015-12-16 11:09:48 +00:00
Cong Hou 08ec3d91bb Minor change to TailDuplication.cpp to turn on normalization when removing successor
llvm-svn: 255752
2015-12-16 06:03:30 +00:00
Chen Li 3e8330a1fe [SelectionDAGBuilder] Adds support for landingpads of token type
Summary: This patch adds a check in visitLandingPad to see if landingpad's result type is token type. If so, do not create DAG nodes for its exception pointer and selector value. This patch enables the back end to handle landingpads of token type.

Reviewers: JosephTremoulet, majnemer, rnk

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15405

llvm-svn: 255749
2015-12-16 04:48:42 +00:00
Philip Reames 23319014a9 Speculative fix for windows build
llvm-svn: 255743
2015-12-16 01:24:05 +00:00
Philip Reames 61a24ab6cc [IR] Add support for floating pointer atomic loads and stores
This patch allows atomic loads and stores of floating point to be specified in the IR and adds an adapter to allow them to be lowered via existing backend support for bitcast-to-equivalent-integer idiom.

Previously, the only way to specify a atomic float operation was to bitcast the pointer to a i32, load the value as an i32, then bitcast to a float. At it's most basic, this patch simply moves this expansion step to the point we start lowering to the backend.

This patch does not add canonicalization rules to convert the bitcast idioms to the appropriate atomic loads. I plan to do that in the future, but for now, let's simply add the support. I'd like to get instruction selection working through at least one backend (x86-64) without the bitcast conversion before canonicalizing into this form.

Similarly, I haven't yet added the target hooks to opt out of the lowering step I added to AtomicExpand. I figured it would more sense to add those once at least one backend (x86) was ready to actually opt out.

As you can see from the included tests, the generated code quality is not great. I plan on submitting some patches to fix this, but help from others along that line would be very welcome. I'm not super familiar with the backend and my ramp up time may be material.

Differential Revision: http://reviews.llvm.org/D15471

llvm-svn: 255737
2015-12-16 00:49:36 +00:00
Wolfgang Pieb 60b7ca6713 Test commit: fixed spelling error in comment.
llvm-svn: 255721
2015-12-16 00:08:18 +00:00
Reid Kleckner 7850c9f5ca [WinEH] Make llvm.x86.seh.recoverfp work on x64
It adjusts from RSP-after-prologue to RBP, which is what SEH filters
need to do before they can use llvm.localrecover.

Fixes SEH filter captures, which were broken in r250088.

Issue reported by Alex Crichton.

llvm-svn: 255707
2015-12-15 23:40:58 +00:00
David Majnemer 3bb88c0210 [WinEH] Use operand bundles to describe call sites
SimplifyCFG allows tail merging with code which terminates in
unreachable which, in turn, makes it possible for an invoke to end up in
a funclet which it was not originally part of.

Using operand bundles on invokes allows us to determine whether or not
an invoke was part of a funclet in the source program.

Furthermore, it allows us to unambiguously answer questions about the
legality of inlining into call sites which the personality may have
trouble with.

Differential Revision: http://reviews.llvm.org/D15517

llvm-svn: 255674
2015-12-15 21:27:27 +00:00
Michael Kuperstein 801ee74167 Do not try to use i8 and i16 versions of FP_TO_U/SINT soft float library calls
It appears that neither compiler-rt nor the gnu soft-float libraries actually
implement these conversions. Instead of emitting calls to library functions
that don't exist, handle it similarly to the way we handle i8 -> float and
i16 -> float conversions: call the i32 library function, and adjust the type.

Differential Revision: http://reviews.llvm.org/D15151

llvm-svn: 255643
2015-12-15 12:55:50 +00:00
Cong Hou 3ba9cf6020 Improve the successor list update in TailDuplication.cpp.
This patch improves a temporary fix in r255530 so that we can normalize
successor list without trigger assertion failures in tail duplication pass.

llvm-svn: 255638
2015-12-15 10:10:40 +00:00
Elena Demikhovsky 6015f5c823 Type legalizer for masked gather and scatter intrinsics.
Full type legalizer that works with all vectors length - from 2 to 16, (i32, i64, float, double).

This intrinsic, for example
void @llvm.masked.scatter.v2f32(<2 x float>%data , <2 x float*>%ptrs , i32 align , <2 x i1>%mask )
requires type widening for data and type promotion for mask.

Differential Revision: http://reviews.llvm.org/D13633

llvm-svn: 255629
2015-12-15 08:40:41 +00:00
Quentin Colombet b82786e0ff [ShrinkWrapping] Do not choose restore point inside loops.
The post-dominance property is not sufficient to guarantee that a restore point
inside a loop is safe.
E.g.,
 while(1) {
   Save
   Restore
   if (...)
     break;
   use/def CSRs
 }
All the uses/defs of CSRs are dominated by Save and post-dominated
by Restore. However, the CSRs uses are still reachable after
Restore and before Save are executed.

This fixes PR25824

llvm-svn: 255613
2015-12-15 03:28:11 +00:00
Chih-Hung Hsieh 7993e18e80 [X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
  Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
  When target has MMX registers, configure MVT::f128 in FR128RegClass,
  with TypeSoftenFloat action, and custom actions for some opcodes.
  Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
  Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
  Fix infinite loop when f128 type can have
  VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.

Differential Revision: http://reviews.llvm.org/D11438

llvm-svn: 255558
2015-12-14 22:08:36 +00:00
Krzysztof Parzyszek dac7102874 [Packetizer] Add AliasAnalysis as a parameter to the packetizer
This will make the depedence graph more accurate if an alias analysis
is provided. If nullptr is specified in its place, the behavior will
remain as it is currently.

llvm-svn: 255540
2015-12-14 20:35:13 +00:00
Cong Hou c5f510bc23 Remove the successor probabilities normalization in tail duplication pass.
The normalization may cause assertion failures on SystemZ and some out-of-tree
tests. The root cause is that unknown probabilities are materialized into known
ones by calling getSuccProbability(), which is then used to add another
successor to the same MBB which results in mixed known and unknown
probabilities. But currently those mixed probabilities cannot be normalized.

I will compose another patch to fix the root issue.

llvm-svn: 255530
2015-12-14 19:11:54 +00:00
David Majnemer bbfc7219ef [IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function.  This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.

Depends on D15478.

Differential Revision: http://reviews.llvm.org/D15479

llvm-svn: 255522
2015-12-14 18:34:23 +00:00
Paul Robinson accc3e0376 FastISel needs to remove dead code when it bails out.
When FastISel fails to translate an instruction it hands off code
generation to SelectionDAG. Before it does so, it may have generated
local value instructions to feed phi nodes in successor blocks. These
instructions will then be generated again by SelectionDAG, causing
duplication and less efficient code, including extra spill
instructions.

Patch by Wolfgang Pieb!

Differential Revision: http://reviews.llvm.org/D11768

llvm-svn: 255520
2015-12-14 18:33:18 +00:00
Matt Arsenault d079285e05 AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.

llvm-svn: 255512
2015-12-14 17:25:38 +00:00
Sanjay Patel af674fbfd9 getParent() ^ 3 == getModule() ; NFCI
llvm-svn: 255511
2015-12-14 17:24:23 +00:00
Cong Hou c00e65aa89 Fix a type issue in r255455. Should not use unsigned type as std::abs()'s template type.
llvm-svn: 255461
2015-12-13 17:00:25 +00:00
Cong Hou 663dd018c1 Replace <cstdint> by llvm/Support/DataTypes.h for the typedef of uint64_t. NFC.
llvm-svn: 255458
2015-12-13 09:52:14 +00:00
Cong Hou c0a33e0f62 Add the missing header file <cstdint> needed by uint64_t
llvm-svn: 255457
2015-12-13 09:32:21 +00:00
Cong Hou c106989fd5 Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259

llvm-svn: 255455
2015-12-13 09:26:17 +00:00
Manuel Jacob 1578ec8860 Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0.
Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256.  This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view.  On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.

This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls.  If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.

The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86).  NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction.  All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator).  X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.

I call this a "partial fix" because there are still cases that can't be
lowered.  A fatal error is reported in these cases.

Reviewers: arsenm, theraven, compnerd, hfinkel

Subscribers: hfinkel, llvm-commits, alex

Differential Revision: http://reviews.llvm.org/D7241

llvm-svn: 255441
2015-12-12 21:33:31 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
Matt Arsenault fabab4b7dd SelectionDAG: Match min/max if the scalar operation is legal
llvm-svn: 255388
2015-12-11 23:16:47 +00:00
Hal Finkel cd8664c3c2 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

llvm-svn: 255387
2015-12-11 23:11:52 +00:00
Matthias Braun 60d69e2865 CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.

This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).

Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.

This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033

Differential Revision: http://reviews.llvm.org/D15320

llvm-svn: 255362
2015-12-11 19:42:09 +00:00
Manman Ren abc7c1d1d2 CXX_FAST_TLS calling convention: target independent portion.
The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given calling convention and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

rdar://problem/23557469

Differential Revision: http://reviews.llvm.org/D15340

llvm-svn: 255353
2015-12-11 18:24:30 +00:00
Eric Christopher 325e8d06dc Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,
x)) combines for ppc_fp128, since signbit computation is more
complicated.

Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html

Patch by Tim Shen!

llvm-svn: 255305
2015-12-10 22:09:06 +00:00
Cong Hou 5146b2d1da Delete a duplicate branch in IfConversion.cpp. NFC.
llvm-svn: 255291
2015-12-10 19:57:22 +00:00
Simon Pilgrim 06ea4be281 [DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension
PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.

llvm-svn: 255289
2015-12-10 19:47:06 +00:00
Sanjay Patel 87c6c0797e remove duplicated comments and don't repeat function names in comments; NFC
llvm-svn: 255257
2015-12-10 16:34:21 +00:00
Jonas Paulsson e451eeff5c [PostRA scheduling] Allow a target to do scheduling when it wants post RA.
SystemZ needs to do its scheduling after branch relaxation, which can
only happen after block placement, and therefore the standard
PostRAScheduler point in the pass sequence is too early.

TargetMachine::targetSchedulesPostRAScheduling() is a new method that
signals on returning true that target will insert the final scheduling
pass on its own.

Reviewed by Hal Finkel

llvm-svn: 255234
2015-12-10 09:10:07 +00:00
Matthias Braun 7d8e41e82c RegisterPressure: Factor out liveness dead-def detection logic; NFCI
Detecting additional dead-defs without a dead flag that are only visible
through liveness information should be part of the register operand
collection not intertwined with the register pressure update logic.

llvm-svn: 255192
2015-12-10 01:04:15 +00:00
Dan Gohman dab313e0ed PeepholeOptimizer: Ignore dead implicit defs
Target-specific instructions may have uninteresting physreg clobbers,
for target-specific reasons. The peephole pass doesn't need to concern
itself with such defs, as long as they're implicit and marked as dead.

llvm-svn: 255182
2015-12-10 00:37:51 +00:00
Sanjay Patel 87d2ae23ac use range-based for loops; NFCI
llvm-svn: 255171
2015-12-09 22:45:45 +00:00
Robert Lougher f0033b29d4 Fix cycle in selection DAG introduced by extractelement legalization
During selection DAG legalization, extractelement is replaced with a load
instruction.  To do this, a temporary store to the stack is used unless an
existing store is found that can be re-used.
    
If re-using a store, the chain going out of the store must be replaced by
the one going out of the new load (this ensures that any stores that must
take place after the store happens after the load, else the value might
be overwritten before it is loaded).
    
The problem is, if the extractelement index is dependent on the store
replacing the chain will introduce a cycle in the selection DAG (the load
uses the index, and by replacing the chain we will make the index dependent
on the load).
    
To fix this, if the index is dependent on the store, the store is skipped.
This is conservative as we may end up creating an unnecessary extra store
to the stack.  However, the situation is not expected to occur very often.

Differential Revision: http://reviews.llvm.org/D15330

llvm-svn: 255114
2015-12-09 14:34:10 +00:00
Mehdi Amini ceca971576 Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"
This reverts commit r255096.

Break the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255101
2015-12-09 08:17:42 +00:00
Vikram TV 0876d2d5cf Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933
llvm-svn: 255096
2015-12-09 05:49:14 +00:00
Reid Kleckner 8de1fe23ed [CGP] Reimplement r255055 a different way
llvm-svn: 255070
2015-12-08 23:00:03 +00:00
Reid Kleckner e18f92bfe9 Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around"
This reverts commit r255055.

Breakage has been reported.

llvm-svn: 255063
2015-12-08 22:33:23 +00:00
Reid Kleckner 7c005324d5 [CGP] Check that we have an insert point before moving llvm.dbg.value around
llvm-svn: 255055
2015-12-08 21:50:52 +00:00
Justin Bogner 3135ba9b38 AsmPrinter: Use emitGlobalConstantFP to emit elements of constant data
It's strange to duplicate the logic for emitting FP values into
emitGlobalConstantDataSequential, and it's even stranger that we end
up printing the verbose assembly comments differently between the two
paths. Just call into emitGlobalConstantFP rather than crudely
duplicating its logic.

llvm-svn: 254988
2015-12-08 02:37:48 +00:00
Sanjay Patel dc627ad63f fix return values to match bool return type; NFC
llvm-svn: 254968
2015-12-07 23:34:30 +00:00
Elena Demikhovsky 33e61eceb4 AVX-512: Fixed masked load / store instruction selection for KNL.
Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store.

This intrinsic comes with all legal types:
<8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru),
but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only.

All data operands should be widened to 512-bit vector.
The mask operand should be widened to v16i1 with zeroes.

Differential Revision: http://reviews.llvm.org/D15265

llvm-svn: 254909
2015-12-07 13:39:24 +00:00
Craig Topper e5e035a3a8 Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef.
llvm-svn: 254843
2015-12-05 07:13:35 +00:00
Cong Hou 833fe143f5 Normalize successors' probabilities when building MBBs for jump table.
llvm-svn: 254837
2015-12-05 05:00:55 +00:00
Matthias Braun b17e8b1c1d ScheduleDAGInstrs: Move LiveIntervals field to ScheduleDAGMI
Now that ScheduleDAGInstrs doesn't need it anymore we can move the field
down the class hierarcy to ScheduleDAGMI.

llvm-svn: 254759
2015-12-04 19:54:24 +00:00
Rafael Espindola 71bd70cc30 Revert "[BranchFolding] Merge MMOs during tail merge"
This reverts commit r254694.

It broke bootstrap.

llvm-svn: 254700
2015-12-04 04:15:05 +00:00
Junmo Park c0731ca183 [BranchFolding] Merge MMOs during tail merge
Summary:
If we remove the MMOs from Load/Store instructions,
they are treated as volatile. This makes other optimization passes unhappy.
eg. Load/Store Optimization

So, it looks better to merge, not remove.

Reviewers: gberry, mcrosier

Subscribers: gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D14797

llvm-svn: 254694
2015-12-04 02:29:25 +00:00
Junmo Park 7cc13f2e58 (no commit message)
llvm-svn: 254686
2015-12-04 02:06:59 +00:00
Matthias Braun 97d0ffbe06 ScheduleDAGInstrs: Rework schedule graph builder.
Re-comitting with a change that avoids undefined uses getting put into
the VRegUses list.

The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
  (not used for now)

The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.

Differential Revision: http://reviews.llvm.org/D9068

llvm-svn: 254683
2015-12-04 01:51:19 +00:00
Matthias Braun c07cbc8d3c raw_ostream: << operator for callables with raw_ostream argument
This is a revised version of r254655 which uses a Printable wrapper
class to avoid ambiguous overload problems.

Differential Revision: http://reviews.llvm.org/D14348

llvm-svn: 254681
2015-12-04 01:31:59 +00:00
Evgeniy Stepanov 2bb9c5ca22 Emit function alias to data as a function symbol.
CFI emits jump slots for indirect functions as a byte array
constant, and declares function-typed aliases to these constants.

This change fixes AsmPrinter to emit these aliases as function
symbols and not data symbols.

llvm-svn: 254674
2015-12-04 00:45:43 +00:00
JF Bastien 1ac69947b6 CodeGen peephole: fold redundant phys reg copies
Code generation often exposes redundant physical register copies through
virtual registers such as:

  %vreg = COPY %PHYSREG
  ...
  %PHYSREG = COPY %vreg

There are cases where no intervening clobber of %PHYSREG occurs, and the
later copy could therefore be removed. In some cases this further allows
us to remove the initial copy.

This patch contains a motivating example which comes from the x86 build
of Chrome, specifically cc::ResourceProvider::UnlockForRead uses
libstdc++'s implementation of hash_map. That example has two tests live
at the same time, and after machine sinking LLVM has confused itself
enough and things spilling EFLAGS is a great idea even though it's
never restored and the comparison results are both live.

Before this patch we have:
  DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def>
  %vreg1<def> = COPY %EFLAGS; GR64:%vreg1
  %EFLAGS<def> = COPY %vreg1; GR64:%vreg1
  JNE_1 <BB#1>, %EFLAGS<imp-use>

Both copies are useless. This patch tries to eliminate the later copy in
a generic manner.

dec is especially confusing to LLVM when compared with sub.

I wrote this patch to treat all physical registers generically, but only
remove redundant copies of non-allocatable physical registers because
the allocatable ones caused issues (e.g. when calling conventions weren't
properly modeled) and should be handled later by the register allocator
anyways.

The following tests used to failed when the patch also replaced allocatable
registers:
  CodeGen/X86/StackColoring.ll
  CodeGen/X86/avx512-calling-conv.ll
  CodeGen/X86/copy-propagation.ll
  CodeGen/X86/inline-asm-fpstack.ll
  CodeGen/X86/musttail-varargs.ll
  CodeGen/X86/pop-stack-cleanup.ll
  CodeGen/X86/preserve_mostcc64.ll
  CodeGen/X86/tailcallstack64.ll
  CodeGen/X86/this-return-64.ll
This happens because COPY has other special meaning for e.g. dependency
breakage and x87 FP stack.

Note that all other backends' tests pass.

Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15157

llvm-svn: 254665
2015-12-03 23:43:56 +00:00
Justin Bogner 8d6f4e36e8 AsmPrinter: Simplify emitting FP elements in sequential data. NFC
Use APFloat APIs here Rather than manually type-punning through
unions.

llvm-svn: 254664
2015-12-03 23:28:35 +00:00
Matthias Braun 149b859c55 Revert "raw_ostream: << operator for callables with raw_stream argument"
This commit provoked "error C2593: 'operator <<' is ambiguous" on MSVC.

This reverts commit r254655.

llvm-svn: 254661
2015-12-03 23:00:28 +00:00
Matthias Braun e957a9bb1b raw_ostream: << operator for callables with raw_stream argument
This allows easier construction of print helpers. Example:

Printable PrintLaneMask(unsigned LaneMask) {
  return Printable([LaneMask](raw_ostream &OS) {
    OS << format("%08X", LaneMask);
  });
}

// Usage:
OS << PrintLaneMask(Mask);

Differential Revision: http://reviews.llvm.org/D14348

llvm-svn: 254655
2015-12-03 22:17:26 +00:00
Chih-Hung Hsieh ed7d81e5d4 [X86] Part 1 to fix x86-64 fp128 calling convention.
Almost all these changes are conditioned and only apply to the new
x86-64 f128 type configuration, which will be enabled in a follow up
patch. They are required together to make new f128 work. If there is
any error, we should fix or revert them as a whole.
These changes should have no impact to current configurations.

* Relax type legalization checks to accept new f128 type configuration,
  whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has
  TLI.isTypeLegal true.
* Relax GetSoftenedFloat to return in some cases f128 type SDValue,
  which is TLI.isTypeLegal but not "softened" to i128 node.
* Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration,
  to generate optimized bitwise operators for libm functions.
* Enhance related Lower* functions to handle f128 type.
* Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions
  to keep new f128 type in register, and convert f128 operators to library calls.
* Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
* Add ExpandConstant to handle i128 constants, ExpandNode
  to handle ISD::Constant node.
* Add one more parameter to getCommonSubClass and firstCommonClass,
  to guarantee that returned common sub class will contain the specified
  simple value type.
  This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
* Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
* Fix printOperand to handle null operand.
* Enhance ISD::BITCAST node to handle f128 constant.
* Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
* Enhance X86AsmPrinter to emit f128 values in comments.

Differential Revision: http://reviews.llvm.org/D15134

llvm-svn: 254653
2015-12-03 22:02:40 +00:00
Andrew Kaylor 9efb2332e2 [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets
Differential Revision: http://reviews.llvm.org/D14996

llvm-svn: 254629
2015-12-03 18:55:28 +00:00
Matthias Braun 2fd672a221 Revert "ScheduleDAGInstrs: Rework schedule graph builder."
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.

This reverts commit r254577.

llvm-svn: 254586
2015-12-03 03:01:10 +00:00
Matthias Braun d35fe3d984 ScheduleDAGInstrs: Rework schedule graph builder.
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
  (not used for now)

The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.

Differential Revision: http://reviews.llvm.org/D9068

llvm-svn: 254577
2015-12-03 02:05:27 +00:00
Matthias Braun b0083608b4 RegisterPressure: Use range based for, fix else style; NFC
llvm-svn: 254575
2015-12-03 01:44:45 +00:00
David Majnemer 70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Reid Kleckner 1f11b4e3a7 Use std::string instead of strdup() and free() in WinCodeViewLineTables
llvm-svn: 254557
2015-12-02 22:34:30 +00:00
Kyle Butt cf6a8bfe51 [CodeGen]: Fix bad interaction with AntiDep breaking and inline asm.
AggressiveAntiDepBreaker was renaming registers specified by the user
for inline assembly. While this will work for compiler-specified
registers, it won't work for user-specified registers, and at the time
this runs, I don't currently see a way to distinguish them.

llvm-svn: 254532
2015-12-02 18:58:51 +00:00
Fiona Glaser 1075f6323f Fix accidental off by one change
Didn't break any tests, but did unnecessary extra work.

llvm-svn: 254529
2015-12-02 18:46:23 +00:00
Fiona Glaser e25b06fa23 Scheduler / Regalloc: use unique_ptr[] instead of std::vector
vector.resize() is significantly slower than memset in many STLs
and the cost of initializing these vectors is significant on targets
with many registers. Since we don't need the overhead of a vector,
use a simple unique_ptr instead.

llvm-svn: 254526
2015-12-02 18:32:59 +00:00
Tim Northover f520eff782 AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.

llvm-svn: 254524
2015-12-02 18:12:57 +00:00
Cong Hou cb07d7016a Fix a bug in IfConversion.cpp.
The bug is introduced in r254377 which failed some tests on ARM, where a new
probability is assigned to a successor but the provided BB may not be a
successor.

llvm-svn: 254463
2015-12-01 21:50:20 +00:00
Sanjay Patel 0b2a94916d use range-based for loops; NFCI
llvm-svn: 254453
2015-12-01 19:57:43 +00:00
Sanjay Patel b53791e5a7 don't repeat function/variable names in comments; NFC
llvm-svn: 254445
2015-12-01 19:32:35 +00:00
Sanjay Patel 96824deebc fix typo; NFC
llvm-svn: 254442
2015-12-01 19:19:18 +00:00
Elena Demikhovsky 0781d7b2b4 Fixed a failure in cost calculation for vector GEP
Cost calculation for vector GEP failed with due to invalid cast to GEP index operand.
The bug is fixed, added a test.

http://reviews.llvm.org/D14976

llvm-svn: 254408
2015-12-01 12:08:36 +00:00
Yury Gribov d7dbb66eb8 Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.
The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset
from native stack pointer to the address of the most recent dynamic alloca on
the caller's stack. These intrinsics are intendend for use in combination with
@llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic
alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning
routines.

Patch by Max Ostapenko.

Differential Revision: http://reviews.llvm.org/D14983

llvm-svn: 254404
2015-12-01 11:40:55 +00:00
Cong Hou 4aef7ef881 Allow known and unknown probabilities coexist in MBB's successor list.
Previously it is not allowed for each MBB to have successors with both known and
unknown probabilities. However, this may be too strict as at this stage we could
not always guarantee that. It is better to remove this restriction now, and I
will work on validating MBB's successors' probabilities first (for example,
check if the sum is approximate one).

llvm-svn: 254402
2015-12-01 11:05:39 +00:00
Cong Hou d97c100dc4 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
(This is the second attempt to submit this patch. The first caused two assertion
 failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254377
2015-12-01 05:29:22 +00:00
Matthias Braun 50f7f585ed RegisterPressure: If we do not collect dead defs the list must be empty
llvm-svn: 254372
2015-12-01 04:20:06 +00:00
Matthias Braun ba6b225bf9 RegisterPressure: Remove support for recede()/advance() at MBB boundaries
Nobody was checking the returnvalue of recede()/advance() so we can
simply replace this code with asserts.

llvm-svn: 254371
2015-12-01 04:20:04 +00:00
Matthias Braun f9f8b92d93 RegisterPressure: Split RegisterOperands analysis code from result object; NFC
This is in preparation to expose the RegisterOperands class as
RegisterPressure API.

llvm-svn: 254368
2015-12-01 04:19:56 +00:00
Hans Wennborg 1dbaf67537 Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces."
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."

Asserts were firing in Chromium builds. See PR25687.

llvm-svn: 254366
2015-12-01 03:49:42 +00:00
Cong Hou 1ccca9e673 Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction.
The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future.

llvm-svn: 254356
2015-12-01 00:55:42 +00:00
Evgeniy Stepanov fd07995363 Extend debug info for function parameters in SDAG.
SDAG currently can emit debug location for function parameters when
an llvm.dbg.declare points to either a function argument SSA temp,
or to an AllocaInst. This change extends this logic by adding a
fallback case when neither of the above is true.

This is required for SafeStack, which may copy the contents of a
byval function argument into something that is not an alloca, and
then describe the target as the new location of the said argument.

llvm-svn: 254352
2015-12-01 00:34:30 +00:00
Cong Hou fa1917c673 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254348
2015-12-01 00:02:51 +00:00
Paul Robinson a2550a6da3 Have 'optnone' respect the -fast-isel=false option.
This is primarily useful for debugging optnone v. ISel issues.

Differential Revision: http://reviews.llvm.org/D14792

llvm-svn: 254335
2015-11-30 21:56:16 +00:00
Craig Topper 6066164454 Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC
llvm-svn: 254260
2015-11-29 18:05:22 +00:00
Craig Topper d0573179dc [SelectionDAG] Use std::any_of instead of a manually coded loop. NFC
llvm-svn: 254242
2015-11-29 04:37:11 +00:00
Jonas Paulsson f12b925bb1 [Stack realignment] Handling of aligned allocas.
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.

It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

llvm-svn: 254227
2015-11-28 11:02:32 +00:00
Artyom Skrobov 314ee04268 Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

llvm-svn: 254085
2015-11-25 19:41:11 +00:00
Eric Christopher 4675c439aa Fix some places where we were assuming that memory type had been legalized
to a simple type when lowering a truncating store of a vector type. In this
case for an EVT we'll return Expand as we should in all of the cases anyhow.

The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection
found the rest.

llvm-svn: 254061
2015-11-25 09:11:53 +00:00
Matthias Braun 147110da84 LiveVariables should not clobber MachineOperand::IsDead, ::IsKill on reserved physical registers
Patch by Nick Johnson <Nicholas.Paul.Johnson@deshawresearch.com>

Differential Revision: http://reviews.llvm.org/D14875

llvm-svn: 254012
2015-11-24 20:06:56 +00:00
Cong Hou 1938f2eb98 Let SelectionDAG start to use probability-based interface to add successors.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.


Differential revision: http://reviews.llvm.org/D14361

llvm-svn: 253965
2015-11-24 08:51:23 +00:00
Davide Italiano c304a0ddc1 [DIE] Make DIE.h NDEBUG conditional-free.
Switch dump()/print() method definitions to LLVM_DUMP_METHOD instead.

llvm-svn: 253945
2015-11-24 02:21:43 +00:00
Andrew Kaylor d0430e8580 [WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842

llvm-svn: 253902
2015-11-23 19:16:15 +00:00
Simon Pilgrim 1dfe53e180 Remove duplicate getValueType() calls. NFCI.
llvm-svn: 253823
2015-11-22 16:49:38 +00:00
Krzysztof Parzyszek 6753f33388 Avoid dependency between TableGen and CodeGen
Duplicate a few common definitions between DFAPacketizer.cpp and
DFAPacketizerEmitter.cpp to avoid including files from CodeGen
in TableGen.

llvm-svn: 253820
2015-11-22 15:20:19 +00:00
Krzysztof Parzyszek b46557292c Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

Reapply the previous patch, this time without circular dependencies.

llvm-svn: 253793
2015-11-21 20:00:45 +00:00
Krzysztof Parzyszek 4ca21fc1aa Revert r253790: it breaks all builds for some reason.
llvm-svn: 253791
2015-11-21 17:38:33 +00:00
Krzysztof Parzyszek 220a9bc018 Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

llvm-svn: 253790
2015-11-21 17:23:52 +00:00
Jonas Paulsson 8f0d2b7f1f [DAGCombiner] Bugfix for lost chain depenedency.
When MergeConsecutiveStores() combines two loads and two stores into
wider loads and stores, the chain users of both of the original loads
must be transfered to the new load, because it may be that a chain
user only depends on one of the loads.

New test case: test/CodeGen/SystemZ/dag-combine-01.ll

Reviewed by James Y Knight.

Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6
llvm-svn: 253779
2015-11-21 13:25:07 +00:00
Geoff Berry 5256fcada0 [CodeGenPrepare] Create more extloads and fewer ands
Summary:
Add and instructions immediately after loads that only have their low
bits used, assuming that the (and (load x) c) will be matched as a
extload and the ands/truncs fed by the extload will be removed by isel.

Reviewers: mcrosier, qcolombet, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14584

llvm-svn: 253722
2015-11-20 22:34:39 +00:00
Arnaud A. de Grandmaison 4e89e9f846 [ShrinkWrap] Teach ShrinkWrap to handle targets requiring a register scavenger.
The included test only checks for a compiler crash for now. Several people are
facing this issue, so we first resolve the crash, and will increase shrinkwrap's
coverage later in a follow-up patch.

llvm-svn: 253718
2015-11-20 21:54:27 +00:00
Daniel Sanders b700203c8b Partially revert r253662: some unrelated work was accidentally committed with it.
Sorry.

llvm-svn: 253663
2015-11-20 13:16:35 +00:00
Daniel Sanders be9db3c00a Revert the revert 253497 and 253539 - These commits aren't the cause of the clang-cmake-mips failures.
Sorry for the noise.

llvm-svn: 253662
2015-11-20 13:13:53 +00:00
Tobias Edler von Koch 4d45090659 [LTO] Add option to emit assembly from LTOCodeGenerator
This adds a new API, LTOCodeGenerator::setFileType, to choose the output file
format for LTO CodeGen. A corresponding change to use this new API from
llvm-lto and a test case is coming in a separate commit.

Differential Revision: http://reviews.llvm.org/D14554

llvm-svn: 253622
2015-11-19 23:59:24 +00:00
Reid Kleckner cc2f6c35a3 [WinEH] Disable most forms of demotion
Now that the register allocator knows about the barriers on funclet
entry and exit, testing has shown that this is unnecessary.

We still demote PHIs on unsplittable blocks due to the differences
between the IR CFG and the Machine CFG.

llvm-svn: 253619
2015-11-19 23:23:33 +00:00
Krzysztof Parzyszek df537b97b1 Expand subregisters in MachineFrameInfo::getPristineRegs
http://reviews.llvm.org/D14719

llvm-svn: 253600
2015-11-19 21:18:52 +00:00
Sanjay Patel 4699b8ab6a [CGP] despeculate expensive cttz/ctlz intrinsics
This is another step towards allowing SimplifyCFG to speculate harder, but then have 
CGP clean things up if the target doesn't like it.

Previous patches in this series:
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297

D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input 
(that definition can probably be blamed on x86).

For example, if we have the usual speculated-by-select expensive op pattern like this:

  %tobool = icmp eq i64 %A, 0
  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)   ; is_zero_undef == true
  %cond = select i1 %tobool, i64 64, i64 %0
  ret i64 %cond

There's an instcombine that will turn it into:

  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false)   ; is_zero_undef == false

This CGP patch is looking for that case and despeculating it back into:

  entry:
    %tobool = icmp eq i64 %A, 0
    br i1 %tobool, label %cond.end, label %cond.true

  cond.true:
    %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)    ; is_zero_undef == true
    br label %cond.end

  cond.end:
    %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
    ret i64 %cond

This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), 
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.

The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554

Differential Revision: http://reviews.llvm.org/D14630

llvm-svn: 253573
2015-11-19 16:37:10 +00:00
Hans Wennborg dcc2500452 X86: More efficient legalization of wide integer compares
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.

Example:

  define i32 @test_slt(i64 %a, i64 %b) {
  entry:
    %cmp = icmp slt i64 %a, %b
    br i1 %cmp, label %bb1, label %bb2
  bb1:
    ret i32 1
  bb2:
    ret i32 2
  }

Before this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          setae   %al
          cmpl    16(%esp), %ecx
          setge   %cl
          je      .LBB2_2
          movb    %cl, %al
  .LBB2_2:
          testb   %al, %al
          jne     .LBB2_4
          movl    $1, %eax
          retl
  .LBB2_4:
          movl    $2, %eax
          retl

After this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          sbbl    16(%esp), %ecx
          jge     .LBB1_2
          movl    $1, %eax
          retl
  .LBB1_2:
          movl    $2, %eax
          retl

Differential Revision: http://reviews.llvm.org/D14496

llvm-svn: 253572
2015-11-19 16:35:08 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Simon Pilgrim c1a46b729b [DAGCombiner] Vector constant folding for comparisons
This patch adds support for vector constant folding of integer/float comparisons.

This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCASE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations)

Differential Revision: http://reviews.llvm.org/D14683

llvm-svn: 253504
2015-11-18 21:17:19 +00:00
Betul Buyukkurt 6fac1741c9 [PGO] Value profiling support
This change introduces an instrumentation intrinsic instruction for
value profiling purposes, the lowering of the instrumentation intrinsic
and raw reader updates. The raw profile data files for llvm-profdata
testing are updated.

llvm-svn: 253484
2015-11-18 18:14:55 +00:00
Jonas Paulsson af722f8287 [SelectionDAGBuilder] Make sure DemoteReg ends up in right reg-class.
The virtual register containing the address for returned value on
stack should in the DAG be represented with a CopyFromReg node and not
a Register node. Otherwise, InstrEmitter will not make sure that it
ends up in the right register class for the target instruction.

SystemZ needs this, becuause the reg class for address registers is a
subset of the general 64 bit register class.

test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with
-verify-machineinstrs.

Reviewed by Hal Finkel.

llvm-svn: 253461
2015-11-18 14:59:00 +00:00
Rafael Espindola 449711cb36 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

llvm-svn: 253436
2015-11-18 06:02:15 +00:00
Cong Hou 136bc65ec8 Remove a redundant assertion in MachineBasicBlock.cpp. NFC.
llvm-svn: 253426
2015-11-18 01:55:56 +00:00
Cong Hou 11c1420173 Remove redundant code in MachineBasicBlock.cpp. NFC.
llvm-svn: 253425
2015-11-18 01:45:10 +00:00
Cong Hou 41cf1a5dfb Improving edge probabilities computation when choosing the best successor in machine block placement.
When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.

if (...)
  do {
    B0;
    ... // some branches
    B1;
  } while(...);
else
  B2;
B3;


Differential revision: http://reviews.llvm.org/D10825

llvm-svn: 253414
2015-11-18 00:52:52 +00:00
David Blaikie 6196aa06c9 Generalize ownership/passing semantics to allow dsymutil to own abbreviations via unique_ptr
While still allowing CodeGen/AsmPrinter in llvm to own them using a bump
ptr allocator. (might be nice to replace the pointers there with
something that at least automatically calls their dtors, if that's
necessary/useful, rather than having it done explicitly (I think a typed
BumpPtrAllocator already does this, or maybe a unique_ptr with a custom
deleter, etc))

llvm-svn: 253409
2015-11-18 00:34:10 +00:00
Reid Kleckner c20276d0b2 [WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.

The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.

The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic.  This
intrinsic generates no code and simply marks the alloca in use.

Reviewers: JCTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14668

llvm-svn: 253378
2015-11-17 21:10:25 +00:00
Pat Gavlin c8ea157811 Lower statepoints with multi-def targets.
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with mutli-def targets.

llvm-svn: 253339
2015-11-17 16:04:21 +00:00
Dan Gohman 7aa4abac24 Use TargetRegisterInfo for printing MachineOperand register comments
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.

This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.

Differential Revision: http://reviews.llvm.org/D14709

llvm-svn: 253338
2015-11-17 16:01:28 +00:00
Rafael Espindola 65e4902156 Drop prelink support.
The way prelink used to work was

* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.

There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
  implementation is pretty broken. It talks about relocations that are
  "resolved by the static linker". If they are resolved, there are none
  left for the prelinker. What one needs to track is if an expression
  will require only dynamic relocations that point to the same DSO.

At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)

This patch removes support for it. That is, it stops printing the
".local" sections.

llvm-svn: 253280
2015-11-17 00:51:23 +00:00
Matthias Braun fe9d6f211f Assume lane masks are always precise
Allowing imprecise lane masks in case of more than 32 sub register lanes
lead to some tricky corner cases, and I need another bugfix for another
one. Instead I rather declare lane masks as precise and let tablegen
abort if we do not have enough bits.

This does not affect any in-tree target, even AMDGPU only needs 16 lanes
at the moment. If the 32 lanes turn out to be a problem in the future,
then we can easily change the LaneBitmask typedef to uint64_t.

Differential Revision: http://reviews.llvm.org/D14557

llvm-svn: 253279
2015-11-17 00:50:55 +00:00
Keno Fischer b011c63d19 [DIBuilder] Make createReferenceType take size and align
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.

Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14275

llvm-svn: 253186
2015-11-16 07:57:32 +00:00
Akira Hatanaka b11ef0897c Reduce the size of MCRelaxableFragment.
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.

This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
 
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.

rdar://problem/21736951

Differential Revision: http://reviews.llvm.org/D14346

llvm-svn: 253127
2015-11-14 06:35:56 +00:00
Quentin Colombet 2cdcfd23cd [ShrinkWrapping] Disable the optimization for functions with sanitize like
attribute.

Even if the target supports shrink-wrapping, the prologue and epilogue
must not move because a crash can happen anywhere and sanitizers need
to be able to unwind from the PC of the crash.

llvm-svn: 253116
2015-11-14 01:55:17 +00:00
Matthias Braun e6edd48d69 MachineScheduler: Print initial pressure in debug dump
llvm-svn: 253097
2015-11-13 22:30:31 +00:00
Matthias Braun 3b099db61d MachineScheduler: Improve debug output for "only one node in readyset"
When there is only 1 node left in the ready queue and it is picked call
the reason "ONLY1" instead of "NOCAND".

llvm-svn: 253096
2015-11-13 22:30:29 +00:00
James Molloy bb1dbf530a [SDAG] Fix expansion of BITREVERSE
Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash.

What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size.

Use an APInt for the shift and VT.getScalarSizeInBits().

llvm-svn: 253023
2015-11-13 10:02:36 +00:00
Sanjoy Das ac9c5b1901 [ImplicitNulls] Add some clarifying comments; NFC
llvm-svn: 253020
2015-11-13 08:14:00 +00:00
Tom Stellard 0967c91e0c Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Sanjoy Das e8b81649cf [ImplicitNulls] Fix wrapping by breaking up a condition, NFC
llvm-svn: 252947
2015-11-12 20:51:49 +00:00
Sanjoy Das edc394f1ed [ImplicitNull] Extract out a HazardDetector class, NFC
This will make later functional changes easier to follow.

llvm-svn: 252946
2015-11-12 20:51:44 +00:00
Quentin Colombet aeb85934b6 [ShrinkWrap] Fix a typo in a comment.
llvm-svn: 252918
2015-11-12 18:16:27 +00:00
Quentin Colombet 94dc1e0d34 [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.
ShrinkWrapping does not understand exception handling constraints for now, so
make sure we do not mess with them by aborting on functions that use EH
funclets.

llvm-svn: 252917
2015-11-12 18:13:42 +00:00
Andrew Kaylor fb16a3ac9a [WinEH] Fix problem with removing an element from a SetVector while iterating.
Patch provided by Yaron Keren. (Thanks!)

llvm-svn: 252913
2015-11-12 17:36:03 +00:00
James Molloy 90111f79f9 [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic
Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support.

The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

llvm-svn: 252878
2015-11-12 12:29:09 +00:00
Matthias Braun b9610a6bc2 LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

Differential Revision: http://reviews.llvm.org/D11172

llvm-svn: 252839
2015-11-12 01:02:47 +00:00
Reid Kleckner b9204a584c [WinEH] Don't forward branches across empty EH pad BBs
For really simple SEH catchpads, we tried to forward the invoke unwind
edge across the empty block.

llvm-svn: 252822
2015-11-11 23:09:31 +00:00
Geoff Berry 2ddfc5e60f [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

llvm-svn: 252789
2015-11-11 19:42:52 +00:00
Matt Arsenault d8fed1b793 Add target preference for GatherAllAliases max depth
llvm-svn: 252775
2015-11-11 18:44:33 +00:00
Dehao Chen 54511353e3 clang-format lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
llvm-svn: 252769
2015-11-11 18:09:47 +00:00
Dehao Chen 72fdf444b7 Emit discriminator for inlined callsites.
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.

Reviewers: dnovillo, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14511

llvm-svn: 252768
2015-11-11 18:08:18 +00:00
Matthias Braun 2c98d0f477 MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef()
This way we can not only add but also remove read undef flags.

llvm-svn: 252678
2015-11-11 00:41:58 +00:00
Matthias Braun 4353b30542 TableGen: Emit LaneMask for register classes without subregisters as ~0u
This makes it slightly easier to handle classes with and without
subregister uniformly.

llvm-svn: 252671
2015-11-10 23:23:05 +00:00
Sanjay Patel 33ec5dbe35 less indent; NFCI
llvm-svn: 252643
2015-11-10 20:09:02 +00:00
Matt Arsenault aa118e299c LegalizeDAG: Implement promote for scalar_to_vector
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252632
2015-11-10 18:48:11 +00:00
Matt Arsenault a46aa641f2 LegalizeDAG: Implement promote for insert_vector_elt
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252631
2015-11-10 18:48:08 +00:00
Matt Arsenault 0b7958a59b LegalizeDAG: Implement promote for extract_vector_elt
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252630
2015-11-10 18:48:04 +00:00
Sanjay Patel 766589efdc add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
Andy Ayers 809cbe9ea0 Support for emitting inline stack probes
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.

For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.

Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).

Add test case

Fix tests

llvm-svn: 252578
2015-11-10 01:50:49 +00:00
Matt Arsenault 6d87f28afd Remove unnecessary call to getAllocatableRegClass
I'm not sure what the point of this was. I'm not sure why
you would ever define an instruction that produces an unallocatable
register class. No tests fail with this removed, and it seems like
it should be a verifier error to define such an instruction.

This was problematic for AMDGPU because it would make bad decisions
by arbitrarily changing the register class when unsetting isAllocatable
for VS_32/VS_64, which is currently set as a workaround to this problem.

AMDGPU uses the VS_32/VS_64 register classes to represent operands which
can use either VGPRs or SGPRs. When  isAllocatable is unset for these,
this would need to pick  either the SGPR or VGPR class and insert either
a copy we don't want, or an illegal copy we would need to deal with
later. A semi-arbitrary register class ordering decision is made in tablegen,
which resulted in always picking a VGPR class because it happens to have
more registers than the SGPR register class. We really just want to
use whatever register class the original register had.

llvm-svn: 252565
2015-11-10 00:30:14 +00:00
Matthias Braun 7e624d5f11 MachineVerifier: Streamline live interval related error reporting
Simply perform additional report_context() calls after a report()
instead of adding more and more overloaded variations of report().  Also
improve several instances where information was output in an ad-hoc way
probably because no matching report() overload was available.

llvm-svn: 252552
2015-11-09 23:59:33 +00:00
Matthias Braun 716b43306b MachineVerifier: Add missing linebreak
MachineInstr::print() with SkipOppers==true does not produce a
linebreak, so we have to do that in MachineVerifier::report().

llvm-svn: 252551
2015-11-09 23:59:29 +00:00
Matthias Braun 45718db0a1 MachineVerifier: MI::print has no TargetMachine overload
The code was passing a target machine pointer which degraded to a true
operand to SkipOppers.

llvm-svn: 252550
2015-11-09 23:59:25 +00:00
Matthias Braun 42b4b63056 MachineVerifier: print list of live intervals if available
llvm-svn: 252549
2015-11-09 23:59:23 +00:00
Sanjay Patel 533c10c651 add a SelectionDAG method to check if no common bits are set in two nodes; NFCI
This was suggested in:
http://reviews.llvm.org/D13956

and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519

This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.

A corresponding function for IR instructions already exists in ValueTracking.

llvm-svn: 252539
2015-11-09 23:31:38 +00:00
David Majnemer 2652b75700 [WinEH] Don't emit CATCHRET from visitCatchPad
Instead, emit a CATCHPAD node which will get selected to a target
specific sequence.

llvm-svn: 252528
2015-11-09 23:07:48 +00:00
Reid Kleckner 64b003f05d [WinEH] Tweak funclet prologue/epilogue insertion to pass verifier
For some reason we'd never run MachineVerifier on WinEH code, and you
explicitly have to ask for it with llc. I added it to a few test cases
to get some coverage.

Fixes PR25461.

llvm-svn: 252512
2015-11-09 21:04:00 +00:00
Andrew Kaylor fdd48fa1e1 [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with additional fixes for determinism problems
Differential Revision: http://reviews.llvm.org/D14454

llvm-svn: 252508
2015-11-09 19:59:02 +00:00
Oliver Stannard 563585789c [CodeGen] Always promote f16 if not legal
We don't currently have any runtime library functions for operations on
f16 values (other than conversions to and from f32 and f64), so we
should always promote it to f32, even if that is not a legal type. In
that case, the f32 values would be softened to f32 library calls.

SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type,
as it may ne a no-op or require a different library call.

getCopyFromParts and getCopyToParts now need to cope with a
floating-point value stored in a larger integer part, as is the case for
any target that needs to store an f16 value in a 32-bit integer
register.

Differential Revision: http://reviews.llvm.org/D12856

llvm-svn: 252459
2015-11-09 11:03:18 +00:00
Yaron Keren 9ffee46d45 Erase unused FunctionDIs variables after r252219.
llvm-svn: 252401
2015-11-07 10:21:25 +00:00
Joseph Tremoulet f748c8937e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00
Tom Stellard 05691a678e DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

llvm-svn: 252349
2015-11-06 21:58:37 +00:00
Quentin Colombet 9a8efc08d3 [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.
Previously we were conservatively assuming that RegMask operands clobber
callee saved registers.

llvm-svn: 252341
2015-11-06 21:00:13 +00:00
Matthias Braun 9198c671e8 MachineScheduler: Add regpressure information to debug dump
llvm-svn: 252340
2015-11-06 20:59:02 +00:00
Reid Kleckner b8fd162fc5 [WinEH] Mark funclet entries and exits as clobbering all registers
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.

Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer

Subscribers: MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D14407

llvm-svn: 252318
2015-11-06 17:06:38 +00:00
NAKAMURA Takumi 9947cacebf Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple parents"
It behaved flaky due to iterating pointer key values on std::set and std::map.

llvm-svn: 252279
2015-11-06 10:07:33 +00:00
Reid Kleckner e535c1f856 Range-for some LiveIntervals code under review
llvm-svn: 252267
2015-11-06 02:01:02 +00:00
Andrew Kaylor f477585a2b Fix build warnings
llvm-svn: 252255
2015-11-06 01:08:35 +00:00
Andrew Kaylor 29cd576554 [WinEH] Clone funclets with multiple parents
Windows EH funclets need to always return to a single parent funclet.  However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated.

These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet.

Differential Revision: http://reviews.llvm.org/D13274?id=39098

llvm-svn: 252249
2015-11-06 00:20:50 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Reid Kleckner 6ddae31045 [WinEH] Fix funclet prologues with stack realignment
We already had a test for this for 32-bit SEH catchpads, but those don't
actually create funclets. We had a bug that only appeared in funclet
prologues, where we would establish EBP and ESI as our FP and BP, and
then downstream prologue code would overwrite them.

While I was at it, I fixed Win64+funclets+stackrealign. This issue
doesn't come up as often there due to the ABI requring 16 byte stack
alignment, but now we can rest easy that AVX and WinEH will work well
together =P.

llvm-svn: 252210
2015-11-05 21:09:49 +00:00
Sanjay Patel 387e66e79f replace MachineCombinerPattern namespace and enum with enum class; NFCI
Also, remove an enum hack where enum values were used as indexes into an array.

We may want to make this a real class to allow pattern-based queries/customization (D13417).

llvm-svn: 252196
2015-11-05 19:34:57 +00:00
Eugene Zelenko ffec81ca00 Fix some Clang-tidy modernize warnings, other minor fixes.
Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg.

Differential revision: http://reviews.llvm.org/D14312

llvm-svn: 252087
2015-11-04 22:32:32 +00:00
Cong Hou 23a3bf0147 Add new interfaces to MBB for manipulating successors with probabilities instead of weights. NFC.
This is part-1 of the patch that replaces all edge weights in MBB by
probabilities, which only adds new interfaces. No functional changes.

Differential revision: http://reviews.llvm.org/D13908

llvm-svn: 252083
2015-11-04 21:37:58 +00:00
Igor Laevsky 35fe692025 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
 allow to lower call safepoints with relocates and results in a different 
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

llvm-svn: 252028
2015-11-04 01:16:10 +00:00
Peter Collingbourne 94d778697a CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().

It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.

NFCI.

Differential Revision: http://reviews.llvm.org/D14168

llvm-svn: 252014
2015-11-03 23:40:03 +00:00
Simon Pilgrim 191ac7c679 [SelectionDAG] Use existing constant nodes instead of recreating them. NFC.
llvm-svn: 251990
2015-11-03 22:21:38 +00:00
Rafael Espindola 43e2e251ea Delete dead code.
llvm-svn: 251960
2015-11-03 18:55:58 +00:00
Igor Laevsky f637b4a52e [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258

llvm-svn: 251957
2015-11-03 18:37:40 +00:00
Michael Kuperstein 73dc85293f [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

llvm-svn: 251904
2015-11-03 08:17:25 +00:00
Matthias Braun 6f4ed269b9 RegisterPressure: Improve assert message
llvm-svn: 251885
2015-11-03 01:53:36 +00:00
Matthias Braun 11859b5c8f RegisterPressure: Slightly nicer pressure diff dumping
llvm-svn: 251884
2015-11-03 01:53:33 +00:00
Matthias Braun 93563e7032 ScheduleDAGInstrs: Remove IsPostRA flag; NFC
ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.

The order of the LiveIntervals* and bool RemoveKillFlags paramters have
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.

Differential Revision: http://reviews.llvm.org/D14245

llvm-svn: 251883
2015-11-03 01:53:29 +00:00
Sanjay Patel 0ed9aeaa5f [CGP] widen switch condition and case constants to target's register width (2nd try)
This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1:
  cmpwi  3, 99
  bgt    0, .LBB0_9
 BB#2:
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#3:
  cmplwi         4, 10
  bne    0, .LBB0_12
 BB#4:
  li 3, 1
  blr
.LBB0_5:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 65436
  beq    0, .LBB0_13
 BB#6:
  cmplwi         3, 65526
  beq    0, .LBB0_15
 BB#7:
  cmplwi         3, 65535
  bne    0, .LBB0_12
 BB#8:
  li 3, 4
  blr
.LBB0_9:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 100
  beq    0, .LBB0_14
...

After:
BB#0:
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
 BB#1:
  lis 3, 0
  ori 3, 3, 65525
  cmpw   4, 3
  bgt    0, .LBB0_9
 BB#2:
  cmplwi         4, 1000
  beq    0, .LBB0_14
 BB#3:
  cmplwi         4, 65436
  bne    0, .LBB0_13
 BB#4:
  li 3, 6
  blr
.LBB0_5:
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#6:
  cmplwi         4, 10
  beq    0, .LBB0_12
 BB#7:
  cmplwi         4, 100
  bne    0, .LBB0_13
 BB#8:
  li 3, 2
  blr
.LBB0_9:
  cmplwi         4, 65526
  beq    0, .LBB0_15
 BB#10:
  cmplwi         4, 65535
  bne    0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251857
2015-11-02 23:22:49 +00:00
Sanjay Patel dfc825eb36 revert r251849; need to move tests to arch-specific folders
llvm-svn: 251851
2015-11-02 23:05:20 +00:00
Sanjay Patel b90a078de9 [CGP] widen switch condition and case constants to target's register width
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of 
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any 
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1: 
  cmpwi	 3, 99
  bgt	 0, .LBB0_9
 BB#2:            
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#3:            
  cmplwi	 4, 10
  bne	 0, .LBB0_12
 BB#4:                      
  li 3, 1
  blr
.LBB0_5:                             
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 65436
  beq	 0, .LBB0_13
 BB#6:                            
  cmplwi	 3, 65526
  beq	 0, .LBB0_15
 BB#7:                       
  cmplwi	 3, 65535
  bne	 0, .LBB0_12
 BB#8:                       
  li 3, 4
  blr
.LBB0_9:                       
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 100
  beq	 0, .LBB0_14
...

After:
BB#0:        
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi	 4, 999
  ble 0, .LBB0_5
 BB#1:          
  lis 3, 0
  ori 3, 3, 65525
  cmpw	 4, 3
  bgt	 0, .LBB0_9
 BB#2:         
  cmplwi	 4, 1000
  beq	 0, .LBB0_14
 BB#3:    
  cmplwi	 4, 65436
  bne	 0, .LBB0_13
 BB#4:       
  li 3, 6
  blr
.LBB0_5:   
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#6: 
  cmplwi	 4, 10
  beq	 0, .LBB0_12
 BB#7:             
  cmplwi	 4, 100
  bne	 0, .LBB0_13
 BB#8:             
  li 3, 2
  blr
.LBB0_9:       
  cmplwi	 4, 65526
  beq	 0, .LBB0_15
 BB#10:      
  cmplwi	 4, 65535
  bne	 0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251849
2015-11-02 22:46:24 +00:00
Cong Hou b90b9e0531 In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.
In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:

A
|
B⇆C
|
D

When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa.


Differential revision: http://reviews.llvm.org/D11662

llvm-svn: 251833
2015-11-02 21:24:00 +00:00
James Y Knight 646c4032e7 Fix two issues in MergeConsecutiveStores:
1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there's no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

llvm-svn: 251816
2015-11-02 18:48:08 +00:00
Jonas Paulsson 72640f1c9f [MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves.
Since the verifier will give false reports if it incorrectly thinks MI is
loading or storing using an FI, it is necessary to scan memoperands and
find out how the FI is used in the instruction. This should be relatively
rare.

Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag.

Reviewed by Quentin Colombet.

llvm-svn: 251620
2015-10-29 08:28:35 +00:00
Matthias Braun f2f194455f Revert "ScheduleDAGInstrs: Remove IsPostRA flag"
It broke 3 arm testcases.

This reverts commit r251608.

llvm-svn: 251615
2015-10-29 05:06:41 +00:00
Matthias Braun dc7580aa88 MachineScheduler: Fix typo in debug message
Maybe I just missed the humor there ;-)

llvm-svn: 251609
2015-10-29 03:57:28 +00:00
Matthias Braun 7ffadd0087 ScheduleDAGInstrs: Remove IsPostRA flag
This was a layering violation in ScheduleDAGInstrs (and
MachineSchedulerBase) they both shouldn't know directly whether they are
used by the PostMachineScheduler or the MachineScheduler.

llvm-svn: 251608
2015-10-29 03:57:24 +00:00
Matthias Braun b0c437bc76 MachineScheduler: Use ranged for and slightly simplify the code
llvm-svn: 251607
2015-10-29 03:57:17 +00:00
Tim Northover 2d4d161519 ARM: support .watchos_version_min and .tvos_version_min.
These MachO file directives are used by linkers and other tools to provide
compatibility information, much like the existing .ios_version_min and
.macosx_version_min.

llvm-svn: 251569
2015-10-28 22:36:05 +00:00
Sanjoy Das 1d1929aace [ValueTracking] Use !range metadata more aggressively in KnownBits
Summary:
Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more
aggressively.

Reviewers: majnemer, nlewycky, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14100

llvm-svn: 251487
2015-10-28 03:20:15 +00:00
Sanjoy Das 4ff3cf6d92 [SelectionDAG] Don't inspect !range metadata for extended loads
Summary:
Don't call `computeKnownBitsFromRangeMetadata` for extended loads --
this can cause a mismatch between the width of the !range metadata and
the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the
future).  This isn't a problem now, but will be after a future change.

Note: this can be made more aggressive in the future.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14107

llvm-svn: 251486
2015-10-28 03:20:10 +00:00
James Y Knight 14eedd189b Make the SelectionDAG graph printer use SDNode::PersistentId labels.
r248010 changed the -debug output to use short ids, but did not
similarly modify the graph printer. Change to be consistent, for ease of
cross-reference.

llvm-svn: 251465
2015-10-27 23:09:03 +00:00
Sanjay Patel bbd4c79c8f Use the 'arcp' fast-math-flag when combining repeated FP divisors
This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. 
This was originally part of D8900.

Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and 
possibly other changes.

Differential Revision: http://reviews.llvm.org/D9708

llvm-svn: 251450
2015-10-27 20:27:25 +00:00
Cong Hou 07eeb8001e Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled.
When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights.

We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled.

In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list.

Differential revision: http://reviews.llvm.org/D13963

llvm-svn: 251429
2015-10-27 17:59:36 +00:00
Mehdi Amini 891c0973df Do not use "else" when both branches return (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251398
2015-10-27 08:12:08 +00:00
Steve King fee370be72 Fix llc crash processing S/UREM for -Oz builds caused by rL250825.
When taking the remainder of a value divided by a constant, visitREM()
attempts to convert the REM to a longer but faster sequence of instructions.
This conversion calls combine() on a speculative DIV instruction. Commit
rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes.
Flow eventually hits unreachable().

This patch adds a test case and a check to prevent visitREM() from trying
to convert the REM instruction in cases where a DIVREM is possible.
See http://reviews.llvm.org/D14035

llvm-svn: 251373
2015-10-27 00:14:06 +00:00
Ivan Krasin 465fbe25c4 Fix indents. It's a follow up to r251353.
llvm-svn: 251364
2015-10-26 22:35:40 +00:00
Ivan Krasin 298639a5fd Move imported entities into DwarfCompilationUnit to speed up LTO linking.
Summary:
In particular, this CL speeds up the official Chrome linking with LTO by
1.8x.

See more details in https://crbug.com/542426

Reviewers: dblaikie

Subscribers: jevinskie

Differential Revision: http://reviews.llvm.org/D13918

llvm-svn: 251353
2015-10-26 21:36:35 +00:00
David Blaikie 7b54b525cd Remove assert(false) in favor of asserting the if conditional it is contained within.
Also adjust the code to avoid 3 redundant map lookups.

llvm-svn: 251327
2015-10-26 18:41:13 +00:00
Evgeniy Stepanov d1aad26589 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.

llvm-svn: 251324
2015-10-26 18:28:25 +00:00
Elena Demikhovsky 092858588a Scalarizer for masked.gather and masked.scatter intrinsics.
When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added isLegalMaskedGather() isLegalMaskedScatter() APIs.

Differential Revision: http://reviews.llvm.org/D13722

llvm-svn: 251237
2015-10-25 15:37:55 +00:00
Michael Kuperstein eaa16005af [X86] Use correct calling convention for MCU psABI libcalls
When using the MCU psABI, compiler-generated library calls should pass
some parameters in-register. However, since inreg marking for x86 is currently
done by the front end, it will not be applied to backend-generated calls.

This is a workaround for PR3997, which describes a similar issue for -mregparm.

Differential Revision: http://reviews.llvm.org/D13977

llvm-svn: 251223
2015-10-25 08:14:05 +00:00
Rafael Espindola 84921b9860 Refactor: Simplify boolean conditional return statements in lib/CodeGen.
Patch by Richard.

llvm-svn: 251213
2015-10-24 23:11:13 +00:00
Simon Pilgrim 3448cbcc51 [DAGCombiner] Tidy up ConstantFP commutation. NFCI
Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code.

llvm-svn: 251203
2015-10-24 20:06:18 +00:00
Simon Pilgrim 7430804fe1 [DAGCombiner] Generalize masking of constant rotates.
We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded.

Followup to D13851.

llvm-svn: 251197
2015-10-24 18:44:52 +00:00
Simon Pilgrim d5ef318b5b [X86][XOP] Add support for lowering vector rotations
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.

This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.

Differential Revision: http://reviews.llvm.org/D13851

llvm-svn: 251188
2015-10-24 13:17:26 +00:00
Joseph Tremoulet 3d0fbf1d74 [CodeGen] Mark setjmp/catchret MBBs address-taken
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.

Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks.  I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.

This fixes PR25168.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13774

llvm-svn: 251113
2015-10-23 15:06:05 +00:00
Davide Italiano fbb958c24b [CodeGen] Remove usage of NDEBUG in header.
Moreover, this seems unused.

llvm-svn: 251081
2015-10-23 00:17:40 +00:00
Matthias Braun 61f4d6439c MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic
llvm-svn: 251037
2015-10-22 18:07:31 +00:00
Craig Topper 8fe40e0ed5 Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC
llvm-svn: 251032
2015-10-22 17:05:00 +00:00
Zia Ansari 8f509a7044 [X86] - Catch extra combine opportunities for redundant imuls.
When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.

I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).

Differential Revision: http://reviews.llvm.org/D13740

llvm-svn: 251028
2015-10-22 16:14:45 +00:00
David Majnemer a8f17871e4 [WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel
It's a relic from the earlier implementation, let's remove it.

llvm-svn: 250964
2015-10-21 23:20:39 +00:00
Matt Arsenault 29f9663f97 LegalizeDAG: Implement promote for build_vector
This will be used in future commits for AMDGPU to promote
operations on i64 vectors into operations on 32-bit vector
components.

This will be used / tested in future AMDGPU commits.

llvm-svn: 250945
2015-10-21 21:10:10 +00:00
Elena Demikhovsky 3ad76a1acd Masked Load/Store optimization for scalar code
When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks.
I added optimization for constant mask vector.

Differential Revision: http://reviews.llvm.org/D13855

llvm-svn: 250893
2015-10-21 11:50:54 +00:00
Jonas Paulsson 17ad04535f Let MachineVerifier be aware of mem-to-mem instructions.
A mem-to-mem instruction (that both loads and stores), which store to an
FI, cannot pass the verifier since it thinks it is loading from the FI.

For the mem-to-mem instruction, do a looser check in visitMachineOperand()
and only check liveness at the reg-slot while analyzing a frame index operand.

Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs,
which now runs with this flag.

Reviewed by Evan Cheng and Quentin Colombet.

llvm-svn: 250885
2015-10-21 07:39:47 +00:00
Krzysztof Parzyszek fdb7b693a7 Tail duplication can mix incompatible registers in phi nodes
Do not tail duplicate blocks where the successor has a phi node,
and the corresponding value in that phi node uses a subregister.

http://reviews.llvm.org/D13922

llvm-svn: 250877
2015-10-21 02:40:06 +00:00
Artyom Skrobov c736863a85 Two switch blocks in VectorLegalizer::LegalizeOp already have a
default: llvm_unreachable("This action is not supported yet!");

-- so I'm adding one to the third switch block, too.

This is a follow-up fix for http://reviews.llvm.org/D13862

llvm-svn: 250830
2015-10-20 15:06:37 +00:00
Artyom Skrobov 7fd67e25aa Adding support for TargetLoweringBase::LibCall
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:

- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
  computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall

This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.

The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.

The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.

Reviewers: t.p.northover, rengolin

Subscribers: t.p.northover, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13862

llvm-svn: 250826
2015-10-20 13:14:52 +00:00
Artyom Skrobov b844fa7fc0 Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner.
Summary:
In addition to moving the code over, this patch amends the DIV,REM -> DIVREM
combining to run on all affected nodes at once: if the nodes are converted
to DIVREM one at a time, then the resulting DIVREM may get legalized by the
backend into something target-specific that we won't be able to recognize
and correlate with the remaining nodes.

The motivation is to "prepare terrain" for D13862: when we set DIV and REM
to be legalized to libcalls, instead of the DIVREM, we otherwise lose the
ability to combine them together. To prevent this, we need to take the
DIV,REM -> DIVREM combining out of the lowering stage.

Reviewers: RKSimon, eli.friedman, rengolin

Subscribers: john.brawn, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13733

llvm-svn: 250825
2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith a25ad0685a AsmPrinter: Remove implicit ilist iterator conversion, NFC
llvm-svn: 250776
2015-10-20 00:36:08 +00:00
Cong Hou 7745dbc5c4 Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.
Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:

1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header.
2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop.
3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain.

Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.

Differential revision: http://reviews.llvm.org/D10717

llvm-svn: 250754
2015-10-19 23:16:40 +00:00
Sanjay Patel 69a50a1e17 [CGP] transform select instructions into branches and sink expensive operands
This was originally checked in at r250527, but reverted at r250570 because of PR25222.
There were at least 2 problems: 
1. The cost check was checking for an instruction with an exact cost of TCC_Expensive;
that should have been >=.
2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions;
we can't sink instructions that may have side effects / are not safe to execute speculatively.

Fixed those conditions in sinkSelectOperand() and added test cases.

Original commit message:
This is a follow-up to the discussion in D12882.

Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250743
2015-10-19 21:59:12 +00:00
Owen Anderson faf5187ee0 Restore the original behavior of SelectionDAG::getTargetIndex().
It looks like an extra negation snuck in as apart of restoring it.

llvm-svn: 250726
2015-10-19 19:27:40 +00:00
Benjamin Kramer 2002aadaad Put back SelectionDAG::getTargetIndex.
While technically this is untested dead code, it has out-of-tree users.
This reverts a part of r250434.

llvm-svn: 250717
2015-10-19 18:26:16 +00:00
Matthias Braun e734195ce3 Revert "RegisterPressure: allocatable physreg uses are always kills"
This reverts commit r250596.

Reverted for now as the commit triggers assert in the AMDGPU target
pending investigation.

llvm-svn: 250713
2015-10-19 17:44:22 +00:00
Elena Demikhovsky 20662e39f1 Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.

Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.

Differential Revision: http://reviews.llvm.org/D13850

llvm-svn: 250686
2015-10-19 07:43:38 +00:00
Simon Pilgrim 04d52d26f6 Use SDValue bool check. NFCI.
llvm-svn: 250653
2015-10-18 12:33:54 +00:00
Simon Pilgrim c2c154e078 Move one-use variable inside test. NFC.
llvm-svn: 250651
2015-10-18 11:47:23 +00:00
Simon Pilgrim 24057b9566 [DAG] Ensure vector constant folding uses correct scalar undef types
Minor fix to D13665 found during post-commit review.

llvm-svn: 250616
2015-10-17 16:49:43 +00:00
Matthias Braun 65e6d4a3f8 RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC
Also do some cleanups comment improvements.

llvm-svn: 250598
2015-10-17 01:03:44 +00:00
Matthias Braun cdd2792aa6 RegisterPressure: allocatable physreg uses are always kills
This property was already used in the code path when no liveness
intervals are present. Unfortunately the code path that uses liveness
intervals tried to query a cached live interval for an allocatable
physreg, those are usually not computed so a conservative default was
used.

This doesn't affect any of the lit testcases. This is a foreclosure to
upcoming changes which should be NFC but without this patch this tidbit
wouldn't be NFC.

llvm-svn: 250596
2015-10-17 00:46:57 +00:00
Matthias Braun 5105e05e8f RegisterPressure: Remove 0 entries from PressureChange
This should not change behaviour because as far as I can see all code
reading the pressure changes has no effect if the PressureInc is 0.
Removing these entries however does avoid unnecessary computation, and
results in a more stable debug output. I want the stable debug output to
check that some upcoming changes are indeed NFC and identical even at
the debug output level.

llvm-svn: 250595
2015-10-17 00:35:59 +00:00
Matthias Braun 96e411b90c RegisterPressure: Hide non-const iterators of PressureDiff
It is too easy to accidentally violate the ordering requirements when
modifying the PressureDiff entries through iterators.

llvm-svn: 250590
2015-10-17 00:08:48 +00:00
Joseph Tremoulet 55b51e9dcc [WinEH] Fix eh.exceptionpointer intrinsic lowering
Summary:
Some shared code for handling eh.exceptionpointer and eh.exceptioncode
needs to not share the part that truncates to 32 bits, which is intended
just for exception codes.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13747

llvm-svn: 250588
2015-10-17 00:08:08 +00:00
Reid Kleckner 28e490342b [WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.

The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.

llvm-svn: 250583
2015-10-16 23:43:27 +00:00
Matthias Braun fdee8ec2bd RegisterPressure: Use range based for, cleanup
llvm-svn: 250579
2015-10-16 23:25:09 +00:00
Benjamin Kramer b43d33bf0f Revert "This is a follow-up to the discussion in D12882."
Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528.

llvm-svn: 250570
2015-10-16 23:00:29 +00:00
Joseph Tremoulet d11a998e81 [WinEH] Fix CatchRetSuccessorColorMap accounting
Summary:
We now use the block for the catchpad itself, rather than its normal
successor, as the funclet entry.
Putting the normal successor in the map leads downstream funclet
membership computations to erroneous results.

Reviewers: majnemer, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D13798

llvm-svn: 250552
2015-10-16 21:22:54 +00:00
David Majnemer e696583dba [WinEH] Remove dead code/includes from WinEHPrepare
No functionality change is intended.

llvm-svn: 250545
2015-10-16 19:59:52 +00:00
Joseph Tremoulet 53e9cbd95a [WinEH] Fix endpad coloring/numbering
Summary:
When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop
trying to propagate the cleanup's parent's color to the catchendpad, since
what's needed is the cleanup's grandparent's color and the catchendpad
will get that color from the catchpad linkage already.  We already had
this exclusion for invokes, but were missing it for
cleanupendpad/cleanupret.

Also add a missing line that tags cleanupendpads' states in the
EHPadStateMap, without with lowering invokes that target cleanupendpads
which unwind to other handlers (and so don't have the -1 state) will fail.

This fixes the reduced IR repro in PR25163.


Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13797

llvm-svn: 250534
2015-10-16 18:08:16 +00:00
Sanjay Patel 374dd8d88e This is a follow-up to the discussion in D12882.
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations. 
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its 
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250527
2015-10-16 16:54:30 +00:00
Evgeniy Stepanov 9addbc9fc1 Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android."
Breaks the hexagon buildbot.

llvm-svn: 250461
2015-10-15 21:26:49 +00:00
Adrian Prantl 96b1551d53 Replace a forward declaration with an #include.
When building with modules the forward-declared inner class
DebugLocStream::ListBuilder causes clang to fall over.

llvm-svn: 250459
2015-10-15 20:58:55 +00:00
Evgeniy Stepanov 142947e9f0 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

llvm-svn: 250456
2015-10-15 20:50:16 +00:00
Benjamin Kramer bacc7ba7aa [SelectionDAG] Remove dead code. NFC.
Carefully selected parts without deleting graph stuff and dumping methods.

llvm-svn: 250434
2015-10-15 17:54:06 +00:00
Benjamin Kramer 7fa42c8a8c [AsmPrinter] Prune dead code. NFC.
I left all (dead) print and dump methods in place.

llvm-svn: 250433
2015-10-15 17:16:32 +00:00
Artyom Skrobov 4bca0bb010 A doccomment for CombineTo, and some NFC refactorings
Summary:
Caching SDLoc(N), instead of recreating it in every single
function call, keeps the code denser, and allows to unwrap long lines.

Reviewers: sunfish, atrick, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13726

llvm-svn: 250305
2015-10-14 17:18:35 +00:00
Artyom Skrobov a5b9ad22b3 Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC)
Summary: The two implementations had more code in common than not.

Reviewers: sunfish, MatzeB, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13724

llvm-svn: 250302
2015-10-14 16:54:14 +00:00
Joseph Tremoulet 28c89bbb36 [WinEH] Add CoreCLR EH table emission
Summary:
Emit the handler and clause locations immediately after the standard
xdata.
Clauses are emitted in the same order and format used to communiate them
to the CLR Execution Engine.
Add a lit test to verify correct table generation on a small but
interesting example function.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13451

llvm-svn: 250219
2015-10-13 20:18:27 +00:00
Duncan P. N. Exon Smith e400a7d412 SelectionDAG: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250214
2015-10-13 19:47:46 +00:00
Joseph Tremoulet 1e2f062ec5 [WinEH] Iterate state changes instead of invokes
Summary:
Add an iterator that can walk across blocks and which visits the state
transitions rather than state ranges, with explicit transitions to -1
indicating the presence of top-level calls that may throw and cause the
current function to unwind to caller.  This will simplify code that needs
to identify nested try regions.

Refactor SEH and C++EH table generation to use the new
InvokeStateChangeIterator, and remove the InvokeLabelIterator they were
using.


Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13623

llvm-svn: 250179
2015-10-13 16:44:30 +00:00
Matt Arsenault e5d9515fb7 DAGCombiner: Don't stop finding better chain on 2 aliases
The comment says this was stopped because it was unlikely to be
profitable. This is not true if you want to combine vector loads
with multiple components.

For a simple case that looks like

t0 = load t0 ...
t1 = load t0 ...
t2 = load t0 ...
t3 = load t0 ...

t4 = store t0:1, t0:1
  t5 = store t4, t1:0
    t6 = store t5, t2:0
	  t7 = store t6, t3:0

We want to get all of these stores onto a chain
that is a TokenFactor of these N loads. This mostly
solves the AMDGPU merge-stores.ll regressions
with -combiner-alias-analysis for merging vector
stores of vector loads.

llvm-svn: 250138
2015-10-13 00:49:00 +00:00
Matt Arsenault 61dc235f20 DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

llvm-svn: 250129
2015-10-12 23:59:50 +00:00
Cong Hou bf22f5063a Assign correct edge weights to unwind destinations when lowering invoke statement.
When lowering invoke statement, all unwind destinations are directly added as successors of call site block, and the weight of those new edges are not assigned properly. Actually, default weight 16 are used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations.

Differential revision: http://reviews.llvm.org/D13354

llvm-svn: 250119
2015-10-12 23:02:58 +00:00
Simon Pilgrim c8832fc233 [SelectionDAG] Add common vector constant folding helper function
We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical).

This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases.

After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode().

Differential Revision: http://reviews.llvm.org/D13665

llvm-svn: 250118
2015-10-12 23:00:11 +00:00
Matt Arsenault 07a72bad0b Enable verifier after PeepholeOptimizer
No tests fail with this enabled so I assume it was an accident
that it isn't enabled now.

llvm-svn: 250070
2015-10-12 17:43:56 +00:00
Reid Kleckner 9abb3c06a6 Don't call PrepareEHLandingPad on non EH pads
This was a minor bug in r249492. Calling PrepareEHLandingPad on a
non-landingpad was a no-op, but it attempted to get the generic pointer
register class, which apparently doesn't exist for some targets.

llvm-svn: 250068
2015-10-12 17:42:32 +00:00
David Majnemer 99c1d13e52 [WinEH] Remove CatchObjRecoverIdx
CatchObjRecoverIdx was used for the old scheme, it is no longer
relevant.

llvm-svn: 250065
2015-10-12 16:44:22 +00:00
Oliver Stannard cca893ffac [Debug] Look through bitcasts to find argument registers
On targets where f32 is not legal, we have to look through a BITCAST SDNode to
find the register that an argument is stored in when emitting debug info, or we
will not be able to emit a DW_AT_location for it.

Differential Revision: http://reviews.llvm.org/D13005

llvm-svn: 250056
2015-10-12 15:52:36 +00:00
Simon Pilgrim d45c88bbb5 [DAGCombiner] Improved FMA combine support for vectors
Enabled constant canonicalization for all constants.

Improved combining of constant vectors.

llvm-svn: 249993
2015-10-11 19:48:12 +00:00
Simon Pilgrim 5eac2607b9 [DAGCombiner] Tidyup FMINNUM/FMAXNUM constant folding
Enable constant folding for vector splats as well as scalars.

Enable constant canonicalization for all scalar and vector constants.

llvm-svn: 249978
2015-10-11 16:02:28 +00:00
David Majnemer bfa5b98201 [WinEH] Remove more dead code
wineh-parent is dead, so is ValueOrMBB.

llvm-svn: 249920
2015-10-10 00:04:29 +00:00
Reid Kleckner 14e773500e [WinEH] Delete the old landingpad implementation of Windows EH
The new implementation works at least as well as the old implementation
did.

Also delete the associated preparation tests. They don't exercise
interesting corner cases of the new implementation. All the codegen
tests of the EH tables have already been ported.

llvm-svn: 249918
2015-10-09 23:34:53 +00:00
Reid Kleckner eb7cd6c889 [SEH] Update SEH codegen tests to use the new IR
Also Fix a buglet where SEH tables had ranges that spanned funclets.

The remaining tests using the old landingpad IR are preparation tests,
and will be deleted along with the old preparation.

llvm-svn: 249917
2015-10-09 23:05:54 +00:00
Duncan P. N. Exon Smith f1ff53ecc2 CodeGen: Remove implicit ilist iterator conversions, NFC
Finish removing implicit ilist iterator conversions from LLVMCodeGen.
I'm sure there are lots more of these in lib/CodeGen/*/.

llvm-svn: 249915
2015-10-09 22:56:24 +00:00