Commit Graph

104393 Commits

Author SHA1 Message Date
Simon Pilgrim 6e85e92b6c Remove trailing whitespace. NFCI.
llvm-svn: 306121
2017-06-23 16:35:32 +00:00
Tim Northover 4b4eec7009 GlobalISel: remove G_SEQUENCE instruction.
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.

llvm-svn: 306120
2017-06-23 16:15:55 +00:00
Tim Northover b57bf2ac79 GlobalISel: convert buildSequence to use non-deprecated instructions.
G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to
be taught how to emulate it with alternatives. We use G_MERGE_VALUES where
possible, and a sequence of G_INSERTs if not.

llvm-svn: 306119
2017-06-23 16:15:37 +00:00
Jun Bum Lim 506cfb7ab7 [InlineCost] Do not take INT_MAX when Cost is negative
Summary: visitSwitchInst should not take INT_MAX when Cost is negative. Instead of INT_MAX , we also use a valid upperbound cost when overflow occurs in Cost.

Reviewers: hans, echristo, dmgreen

Reviewed By: dmgreen

Subscribers: mcrosier, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D34436

llvm-svn: 306118
2017-06-23 16:12:37 +00:00
Ulrich Weigand eaf0051ba3 [SystemZ] Remove unnecessary serialization before volatile loads
This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad
introduced by r196905.  Nothing in the semantics of the "volatile"
keyword or the definition of the z/Architecture actually requires
that volatile loads are preceded by a serialization operation, and
no other compiler on the platform actually implements this.

Since we've now seen a use case where this additional serialization
causes noticable performance degradation, this patch removes it.

The patch still leaves in the serialization before atomic loads,
which is now implemented directly in lowerATOMIC_LOAD.  (This also
seems overkill, but that can be addressed separately.)

llvm-svn: 306117
2017-06-23 15:56:14 +00:00
Tom Stellard af552dc352 AMDGPU/GlobalISel: Mark 32-bit G_AND as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D34349

llvm-svn: 306112
2017-06-23 15:17:17 +00:00
Jonas Paulsson 82f15a7168 [SystemZ] Fix trap issue and enable expensive checks.
The isBarrier/isTerminator flags have been removed from the SystemZ trap
instructions, so that tests do not fail with EXPENSIVE_CHECKS. This was just
an issue at -O0 and did not affect code output on benchmarks.

(Like Eli pointed out: "targets are split over whether they consider their
"trap" a terminator; x86, AArch64, and NVPTX don't, but ARM, MIPS, PPC, and
SystemZ do. We should probably try to be consistent here.". This is still the
case, although SystemZ has switched sides).

SystemZ now returns true in isMachineVerifierClean() :-)

These Generic tests have been modified so that they can be run with or without
EXPENSIVE_CHECKS: CodeGen/Generic/llc-start-stop.ll and
CodeGen/Generic/print-machineinstrs.ll

Review: Ulrich Weigand, Simon Pilgrim, Eli Friedman
https://bugs.llvm.org/show_bug.cgi?id=33047
https://reviews.llvm.org/D34143

llvm-svn: 306106
2017-06-23 14:30:46 +00:00
Anna Thomas 91eed9ac1a [RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC
The single exit block allowed in runtime unrolling is guaranteed to be
the Latch's successor, so rename it as LatchExitBlock.

llvm-svn: 306105
2017-06-23 14:28:01 +00:00
Anna Thomas d67165c93c [InstCombine] Recognize and simplify three way comparison idioms
Summary:
Many languages have a three way comparison idiom where comparing two values
produces not a boolean, but a tri-state value. Typical values (e.g. as used in
the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1
for greater than.

We actually do a great job already of converting three way comparisons into
binary comparisons when the result produced has one a single use. Unfortunately,
such values can have more than one use, and in that case, our existing
optimizations break down.

The patch adds a peephole which converts a three-way compare + test idiom into a
binary comparison on the original inputs. It focused on replacing the test on
the result of the three way compare and does nothing about removing the three
way compare itself. That's left to other optimizations (which do actually kick
in commonly.)
We currently recognize one idiom on signed integer compare. In the future, we
plan to recognize and simplify other comparison idioms on
other signed/unsigned datatypes such as floats, vectors etc.

This is a resurrection of Philip Reames' original patch:
https://reviews.llvm.org/D19452

Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34278

llvm-svn: 306100
2017-06-23 13:41:45 +00:00
Petar Jovanovic 78811c2c07 Revert r306095: [mips] Fix reg positions in the aui/daui instructions
ELF/mips-plt-r6.s in lld-test is failing. Reverting the change.

Original commit message:

  [mips] Fix register positions in the aui/daui instructions

  Swapped the position of the rt and rs register in the aut/daui
  instructions for mips32r6 and mips64r6. With this change, the format of
  the generated instructions complies with specifications and GCC.
  Patch by Milos Stojanovic.

llvm-svn: 306099
2017-06-23 13:33:46 +00:00
Pavel Labath ec000f42fa [ADT] Add llvm::to_float
Summary:
The function matches the interface of llvm::to_integer, but as we are
calling out to a C library function, I let it take a Twine argument, so
we can avoid a string copy at least in some cases.

I add a test and replace a couple of existing uses of strtod with this
function.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34518

llvm-svn: 306096
2017-06-23 12:55:02 +00:00
Petar Jovanovic d5f7711ebb [mips] Fix register positions in the aui/daui instructions
Swapped the position of the rt and rs register in the aut/daui instructions
for mips32r6 and mips64r6. With this change, the format of the generated
instructions complies with specifications and GCC.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D33988

llvm-svn: 306095
2017-06-23 12:47:18 +00:00
Stefan Maksimovic b794c0a5ca [mips][msa] Splat.d endianness check
Before this change, it was always the first element of a vector that got splatted since the lower 6 bits of vshf.d $wd were always zero for little endian.
Additionally, masking has been performed for vshf via which splat.d is created.

Vshf has a property where if its first operand's elements have either bit 6 or 7 set, destination element is set to zero.
Initially masked with 63 to avoid this property, which would result in generation of and.v + vshf.d in all cases.
Masking with one results in generating a single splati.d instruction when possible.

Differential Revision: https://reviews.llvm.org/D32216

llvm-svn: 306090
2017-06-23 09:09:31 +00:00
Craig Topper 2c20c42cb6 [JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4)
Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition.

But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare.

This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it.

Differential Revision: https://reviews.llvm.org/D33262

llvm-svn: 306085
2017-06-23 05:41:35 +00:00
Craig Topper 7927996140 [JumpThreading] Use some temporary variables to reduce the number of times we call the same methods. NFC
A future patch will add even more uses of these variables.

llvm-svn: 306084
2017-06-23 05:41:32 +00:00
Rafael Espindola 58173b9720 COFF: Produce an error on invalid pcrel relocs.
X86_64 COFF only has support for 32 bit pcrel relocations. Produce an
error on all others.

Note that gnu as has extended the relocation values to support
this. It is not clear if we should support the gnu extension.

llvm-svn: 306082
2017-06-23 04:07:44 +00:00
Chandler Carruth 4ab0f4910a [LoopSimplify] Factor the logic to form dedicated exits into a utility.
I want to use the same logic as LoopSimplify to form dedicated exits in
another pass (SimpleLoopUnswitch) so I wanted to factor it out here.

I also noticed that there is a pretty significantly more efficient way
to implement this than the way the code in LoopSimplify worked. We don't
need to actually retain the set of unique exit blocks, we can just
rewrite them as we find them and use only a set to deduplicate.

This did require changing one part of LoopSimplify to not re-use the
unique set of exits, but it only used it to check that there was
a single unique exit. That part of the code is about to walk the exiting
blocks anyways, so it seemed better to rewrite it to use those exiting
blocks to compute this property on-demand.

I also had to ditch a statistic, but it doesn't seem terribly valuable.

Differential Revision: https://reviews.llvm.org/D34049

llvm-svn: 306081
2017-06-23 04:03:04 +00:00
Rafael Espindola 34e94a8783 COFF: handle "undef - ." expressions.
This is another thing that the ELF implementation can do but is
missing from COFF.

llvm-svn: 306078
2017-06-23 02:15:56 +00:00
Craig Topper b60f866a8b [LVI] Teach LVI to reason about ORs of icmps similar to how it reasons about ANDs of icmps
Summary: LVI can reason about an AND of icmps on the true dest of a branch. I believe we can do similar for the false dest of ORs. This allows us to get the same answer for the demorganed versions of some of the AND test cases as you can see.

Reviewers: anna, reames

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34431

llvm-svn: 306076
2017-06-23 01:08:16 +00:00
Farhana Aleen 9bd593e0d7 Fixed a (product) build error that was due to an unused variable
Details: There was a use but it was in the assert which was not
         exercised during product build.

Reviewers: Andrew Kaylor

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32658

llvm-svn: 306073
2017-06-22 23:56:31 +00:00
Sanjay Patel 359ae44fb4 [x86] add/sub (X==0) --> sbb(cmp X, 1)
This is very similar to the transform in:
https://reviews.llvm.org/rL306040
...but in this case, we use cmp X, 1 to set the carry bit as needed.

Again, we can show that all of these are logically equivalent (although
InstCombine currently canonicalizes to a form not seen here), and if
we believe IACA, then this is the smallest/fastest code. Eg, with SNB:

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 1.0       |     |           |           |     |     |    | cmp edi, 0x1
|   2    |           | 1.0 |           |           |     | 1.0 | CP | sbb eax, eax


The larger motivation is to clean up all select-of-constants combining/lowering 
because we're missing some common cases.

llvm-svn: 306072
2017-06-22 23:47:15 +00:00
Andrew Kaylor d49711996f Restrict the definition of loop preheader to avoid EH blocks
Differential Revision: https://reviews.llvm.org/D34487

llvm-svn: 306070
2017-06-22 23:27:16 +00:00
whitequark 08b20356c3 Define behavior of "stack-probe-size" attribute when inlining.
Also document the attribute, since "probe-stack" already is.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D34528

llvm-svn: 306069
2017-06-22 23:22:36 +00:00
Farhana Aleen 4b652a5335 Supported lowerInterleavedStore() in X86InterleavedAccess.
Reviewers: RKSimon, DavidKreitzer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32658

llvm-svn: 306068
2017-06-22 22:59:04 +00:00
Eric Christopher 5a7c2f1700 Remove the LoadCombine pass. It was never enabled and is unsupported.
Based on discussions with the author on mailing lists.

llvm-svn: 306067
2017-06-22 22:58:12 +00:00
Rafael Espindola d2edd137df Change creation of relative relocations on COFF.
For whatever reason, when processing

  .globl foo
foo:
  .data
bar:
  .long foo-bar

llvm-mc creates a relocation with the section:

0x0 IMAGE_REL_I386_REL32 .text

This is different than when the relocation is relative from the
beginning. For example, a file with

call foo

produces

0x0 IMAGE_REL_I386_REL32 foo

I would like to refactor the logic for converting "foo - ." into a
relative relocation so that it is shared with ELF. This is the first
step and just changes the coff implementation to match what ELF (and
COFF in the case of calls) does.

llvm-svn: 306063
2017-06-22 21:57:04 +00:00
Jacob Gravelle a31ec61c46 [WebAssembly] WebAssemblyFastISel getelementptr variable index support
Summary:
Previously -fast-isel getelementptr would constant-fold non-constant i8
load/stores.

Reviewers: sunfish

Subscribers: jfb, dschuff, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D34044

llvm-svn: 306060
2017-06-22 21:26:08 +00:00
Krzysztof Parzyszek 9b7c1d2dcf [Hexagon] Properly update kill flags in HexagonNewValueJump
The feeder instruction will be moved to right before the compare, so
the updating code should not be looking for kills past the compare.

llvm-svn: 306059
2017-06-22 21:11:44 +00:00
Lang Hames 266202236f [ORC] Switch the object layer API from addObjectSet to addObject (singular), and
move the ObjectCache from the IRCompileLayer to SimpleCompiler.

This is the first in a series of patches aimed at cleaning up and improving the
robustness and performance of the ORC APIs.

llvm-svn: 306058
2017-06-22 21:06:54 +00:00
Reid Kleckner 40a47a8702 [MC] Allow assembling .secidx and .secrel32 for undefined symbols
There's nothing incorrect about emitting such relocations against
symbols defined in other objects. The code in EmitCOFFSec* was missing
the visitUsedExpr part of MCStreamer::EmitValueImpl, so these symbols
were not being registered with the object file assembler.

This will be used to make reduced test cases for LLD.

llvm-svn: 306057
2017-06-22 21:02:14 +00:00
Krzysztof Parzyszek 1a0da8d5a3 [Hexagon] Use LivePhysRegs to fix up kills in HexagonGenMux
Remove the previous, manual shuffling of the kill flags. 

llvm-svn: 306054
2017-06-22 20:43:02 +00:00
Rafael Espindola 656669bae4 Simplify WinCOFFObjectWriter::recordRelocation.
It looks like that when this code was written recordRelocation could
be called with A-B where A and B are in the same section. The
expression evaluation logic these days makes sure those are folded, so
some of this code was dead.

llvm-svn: 306053
2017-06-22 20:27:33 +00:00
Anna Thomas 72c90c87f8 [LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block
Summary:
Currently, we incorrectly update exit blocks of loops when there are multiple
edges from a single exiting block to the exit block. This can happen when we
have switches as the terminator of the exiting blocks.
The fix here is to correctly update the phi nodes in the exit block, and remove
all incoming values *except* for one which is from the preheader.

Note: Currently, this error can manifest only while deleting non-executed loops. However, it
is possible to trigger this error in invariant loops, once we enhance the logic
around the exit conditions for the loop check.

Reviewers: chandlerc, dberlin, sanjoy, efriedma

Reviewed by: efriedma

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34516

llvm-svn: 306048
2017-06-22 20:20:56 +00:00
Craig Topper 792fc92be2 [AVX-512] Remove and autoupgrade the masked integer compare intrinsics
Summary:
These intrinsics aren't used by clang and haven't been for a while.

There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said these intrinsics aren't used by clang even before this patch so this codegen reflects our clang behavior today.

Reviewers: spatel, RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34389

llvm-svn: 306047
2017-06-22 20:11:01 +00:00
Ekaterina Vaartis c4a6322153 [MC] Fix const qualifier warning
llvm-svn: 306045
2017-06-22 19:08:30 +00:00
Craig Topper d3711ee93e [BasicAA] Add type check and Value equality check around code added in r305481.
This matches the checks done at the beginning of isKnownNonEqual that this code is partially emulating.

Without this we can get assertion failures due to the bit widths of the KnownBits not matching.

llvm-svn: 306044
2017-06-22 19:04:14 +00:00
Adrian McCarthy 4aedc81b8c Fix build break by using llvm::make_unique instead of std::make_unique.
llvm-svn: 306043
2017-06-22 18:57:51 +00:00
Adrian McCarthy 31bcb6f680 Add IDs and clone methods to NativeRawSymbol
All NativeRawSymbols will have a unique symbol ID (retrievable via
getSymIndexId).  For now, these are initialized to 0, but soon the
NativeSession will be responsible for creating the raw symbols, and it will
assign unique IDs.

The symbol cache in the NativeSession will also require the ability to clone
raw symbols, so I've provided implementations for that as well.

llvm-svn: 306042
2017-06-22 18:43:18 +00:00
Adrian McCarthy 6a4b080a5f Make IPDBSession::getGlobalScope a non-const method
There doesn't seem to be a compelling reason why this method should be const
other than it was possible with the DIA implementation.  The native session
is going to act as a symbol factory and cache.  This could be acheived with
mutable (and the existing const_cast), but it seems cleaner to accept that
this method affects the state of the session.

This change eliminates an existing const_cast.

llvm-svn: 306041
2017-06-22 18:42:23 +00:00
Sanjay Patel 41a34e4111 [x86] add/sub (X==0) --> sbb(neg X)
Our handling of select-of-constants is lumpy in IR (https://reviews.llvm.org/D24480),
lumpy in DAGCombiner, and lumpy in X86ISelLowering. That's why we only had the 'sbb'
codegen in 1 out of the 4 tests. This is a step towards smoothing that out.

First, show that all of these IR forms are equivalent:
http://rise4fun.com/Alive/mx

Second, show that the 'sbb' version is faster/smaller. IACA output for SandyBridge
(later Intel and AMD chips are similar based on Agner's tables):

This is the "obvious" x86 codegen (what gcc appears to produce currently):

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1*   |           |     |           |           |     |     |    | xor eax, eax
|   1    | 1.0       |     |           |           |     |     | CP | test edi, edi
|   1    |           |     |           |           |     | 1.0 | CP | setnz al
|   1    |           | 1.0 |           |           |     |     | CP | neg eax


This is the adc version:
|   1*   |           |     |           |           |     |     |    | xor eax, eax
|   1    | 1.0       |     |           |           |     |     | CP | cmp edi, 0x1
|   2    |           | 1.0 |           |           |     | 1.0 | CP | adc eax, 0xffffffff


And this is sbb:
|   1    | 1.0       |     |           |           |     |     |    | neg edi
|   2    |           | 1.0 |           |           |     | 1.0 | CP | sbb eax, eax

If IACA is trustworthy, then sbb became a single uop in Broadwell, so this will be
clearly better than the alternatives going forward.

llvm-svn: 306040
2017-06-22 18:11:19 +00:00
Sam Clegg 58ad080ef0 MC: Fix dumping of MCFragment values
Without this cast the "char" overload of operator<< is
chosen and the values is output as an ascii rather than
an integer.

Differential Revision: https://reviews.llvm.org/D34486

llvm-svn: 306039
2017-06-22 17:57:01 +00:00
Kevin Enderby abf10f2d2e Updated llvm-objdump symbolic disassembly with x86_64 Mach-O MH_KEXT_BUNDLE
file types so it symbolically disassembles operands using the external
relocation entries.

rdar://31521343

llvm-svn: 306037
2017-06-22 17:41:22 +00:00
Rafael Espindola 8a261c2565 Add a common error checking for some invalid expressions.
This refactors a bit of duplicated code and fixes an assertion failure
on ELF.

llvm-svn: 306035
2017-06-22 17:25:35 +00:00
David Stuttard f677966e2e [AMDGPU] Add intrinsics for tbuffer load and store - build error fix
Variable was unused in non-debug build (used in assert) causing compile time
warning and eventual build failure

llvm-svn: 306034
2017-06-22 17:15:49 +00:00
David Stuttard 70e8bc1bf3 [AMDGPU] Add intrinsics for tbuffer load and store
Intrinsic already existed for llvm.SI.tbuffer.store

Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.*

Added CodeGen tests for the 2 new variants added.
Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr

Differential Revision: https://reviews.llvm.org/D30687

llvm-svn: 306031
2017-06-22 16:29:22 +00:00
Craig Topper dffbbcb3fd [InstCombine] Teach foldSelectICmpAndOr to recognize (select (icmp slt (trunc (X)), 0), Y, (or Y, C2))
Summary:
InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).

This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used.

Reviewers: spatel, majnemer, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34184

llvm-svn: 306029
2017-06-22 16:23:30 +00:00
Teresa Johnson a690e3cea2 [ThinLTO] Remove unnecessary include of Linker.h (NFC)
The ModuleLinker is no longer used by ThinLTO, so this is not needed.

Patch by Benoit Belley <Benoit.Belley@autodesk.com>

llvm-svn: 306028
2017-06-22 16:18:48 +00:00
Craig Topper 0de5e6a729 [InstCombine] Add one use checks to or/and->xnor folding
If the components of the and/or had multiple uses, this transform created an additional instruction.

This patch makes sure we remove one of the components.

Differential Revision: https://reviews.llvm.org/D34498

llvm-svn: 306027
2017-06-22 16:12:02 +00:00
Krzysztof Parzyszek f63ad39e7d [Hexagon] Handle a global operand to A2_addi when creating duplexes
llvm-svn: 306012
2017-06-22 15:53:31 +00:00
Sanjay Patel d1e811979c [InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138)
There are 2 parts to this patch made simultaneously to avoid a regression.

We're reversing the canonicalization that moves bitwise vector ops before bitcasts. 
We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks 
of the patch. The motivation is that there's only one fold that currently depends on 
the existing canonicalization (see next), but there are many folds that would 
automatically benefit from the new canonicalization. 
PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these 
patterns in IR.

There's an or(and,andn) pattern that requires an adjustment in order to continue matching
to 'select' because the bitcast changes position. This match is unfortunately complicated 
because it requires 4 logic ops with optional bitcast and sext ops.

Test diffs:

  1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference - 
     bitcast comes before logic.
  2. There are also tests with no diffs in bitcast.ll that verify that we're still doing 
     folds that were enabled by the previous canonicalization.
  3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns 
     to look through bitcasts.
  4. logical-select.ll contains several tests for the or(and,andn) --> select fold to 
     verify that we are still handling those cases. The lone diff shows the movement of 
     the bitcast from the new canonicalization rule.

Differential Revision: https://reviews.llvm.org/D33517

llvm-svn: 306011
2017-06-22 15:46:54 +00:00
whitequark cebe8241ca [X86] Add support for "probe-stack" attribute
This commit adds prologue code emission for stack probe function
calls.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D34387

llvm-svn: 306010
2017-06-22 15:42:53 +00:00
Florian Hahn 5991b5be74 [ARM] Create relocations for beq.w branches to ARM function syms.
Summary:
The ARM ELF ABI requires the linker to do interworking for wide
conditional branches from Thumb code to ARM code. 

That was pointed out by @peter.smith in the comments for D33436.

Reviewers: rafael, peter.smith, echristo

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith

Differential Revision: https://reviews.llvm.org/D34447

llvm-svn: 306009
2017-06-22 15:32:41 +00:00
Sanjay Patel e800df8eac [InstCombine] add peekThroughBitcast() helper; NFC
This is an NFC portion of D33517. We have similar helpers in the backend.

llvm-svn: 306008
2017-06-22 15:28:01 +00:00
Petar Jovanovic 636851b845 [mips] Allow $AT to be used as a register name
This patch allows $AT to be used as a register name in assembly files.
Currently only $at is recognized as a valid register name.

Patch by Stanislav Ocovaj.

Differential Revision: https://reviews.llvm.org/D34348

llvm-svn: 306007
2017-06-22 15:24:16 +00:00
Nirav Dave f2c349ccec [DAG] Add Target Store Merge pass ordering function
Allow targets to specify if they should merge stores before or after
legalization.

llvm-svn: 306006
2017-06-22 15:07:49 +00:00
Pavel Labath efd57a8aec Revert "[Support] Add RetryAfterSignal helper function" and subsequent fix
The fix in r306003 uncovered a pretty fundamental problem that libc++
implementation of std::result_of does not handle the prototype of
open(2) correctly (presumably because it contains ...). This makes the
whole function unusable in its current form, so I am also reverting the
original commit (r305892), which introduced the function, at least until
I figure out a way to solve the libc++ issue.

llvm-svn: 306005
2017-06-22 14:18:55 +00:00
Krzysztof Parzyszek 69ffba4595 [Hexagon] Recognize potential offset overflow for store-imm to stack
Reserve an extra scavenging stack slot if the offset field in store-
-immediate instructions may overflow.

llvm-svn: 306004
2017-06-22 14:11:23 +00:00
Sagar Thakur 15126308c8 Revert [mips] Adds support for R_MIPS_26, HIGHER, HIGHEST relocations in RuntimeDyld
Reverting due to build bot failures

llvm-svn: 306000
2017-06-22 12:48:04 +00:00
Sam Kolton ca5a30ed74 [AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit encoding
Summary:
Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA.
There are no real instructions for them, but there are pseudo instructions.

Reviewers: arsenm, vpykhtin, cfang

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34403

llvm-svn: 305999
2017-06-22 12:42:14 +00:00
Kristof Beyls 9665249fd8 Don't conditionalize Neon instructions, even in IT blocks.
This has been deprecated since ARMARM v7-AR, release C.b, published back
in 2012.

This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was
introduced to check that conditionalization of Neon instructions did
happen when generating Thumb2. However, the test had evolved and was no
longer testing that. Rather than trying to adapt that test, this commit
introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can
now use the MIR framework to write nicer/more maintainable tests.

llvm-svn: 305998
2017-06-22 12:11:38 +00:00
Sagar Thakur f8858d0979 [mips] Adds support for R_MIPS_26, HIGHER, HIGHEST relocations in RuntimeDyld
After the N64 static relocation model support was added to llvm it is required to add its support in RuntimeDyld also because lldb uses ExecutionEngine for evaluating expressions.

Reviewed by sdardis
Differential: D31649

llvm-svn: 305997
2017-06-22 11:49:19 +00:00
Simon Dardis 1c73fcc131 [mips] Implement the ".rdata" MIPS assembly directive.
Rather than creating a separate ".rdata" section distinct from the
customary ".rodata" in ELF, ".rdata" switches to the ".rodata" section.

This patch relands r305949 and r305950 with the correct commit message
and addresses nit raised during review.

Patch By: John Baldwin!

Differential Revision: https://reviews.llvm.org/D34452

llvm-svn: 305995
2017-06-22 10:41:51 +00:00
John Brawn ed78aaf093 [ARM] Add .w aliases of MOV with shifted operand
These appear to have been simply missing.

Differential Revision: https://reviews.llvm.org/D34461

llvm-svn: 305993
2017-06-22 10:30:53 +00:00
John Brawn 192f74a84d [ARM] Clean up choice of narrow instructions in ARMAsmParser, NFC
This patch makes a couple of changes to how we decide whether to use the narrow
or wide encoding of thumb2 instructions:
 * Common out the detection of the .w qualifier
 * Check for the CPSR operand in a consistent way

Differential Revision: https://reviews.llvm.org/D34460

llvm-svn: 305992
2017-06-22 10:29:31 +00:00
Diana Picus b512e91515 Revert "Enable vectorizer-maximize-bandwidth by default."
This reverts commit r305960 because it broke self-hosting on AArch64.

llvm-svn: 305990
2017-06-22 10:00:28 +00:00
Igor Breger 1c29be7e4f [GlobalISel][X86] Support vector type G_INSERT legalization/selection.
Summary:
Support vector type G_INSERT legalization/selection.
Split from https://reviews.llvm.org/D33665

Reviewers: qcolombet, t.p.northover, zvi, guyblank

Reviewed By: guyblank

Subscribers: guyblank, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33956

llvm-svn: 305989
2017-06-22 09:43:35 +00:00
Florian Hahn b489e56ae2 [ARM] Add macro fusion for AES instructions.
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.

Reviewers: evandro, javed.absar, rengolin, t.p.northover

Reviewed By: javed.absar

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34142

llvm-svn: 305988
2017-06-22 09:39:36 +00:00
Elena Demikhovsky 2dac0b4d58 AVX-512: Lowering Masked Gather intrinsic - fixed a bug
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.

In this patch I'm fixing <2 x i32>bug and optimizing <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.

Differential revision: https://reviews.llvm.org/D34343

llvm-svn: 305987
2017-06-22 06:47:41 +00:00
Sam Kolton 3c4933fcc6 [AMDGPU] SDWA: add support for GFX9 in peephole pass
Summary:
Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026

Reviewers: vpykhtin, rampitec, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34241

llvm-svn: 305986
2017-06-22 06:26:41 +00:00
Hiroshi Inoue 1d5693c915 [PowerPC] fix potential verification errors
This patch fixes trivial mishandling of 32-bit/64-bit instructions that may cause verification errors with -verify-machineinstrs.

llvm-svn: 305984
2017-06-22 04:33:44 +00:00
Reid Kleckner b7d716c06f [llvm-readobj] Dump the COFF image load config
This includes the safe SEH tables and the control flow guard function
table. LLD will emit the guard table soon, and I need a tool that dumps
them for testing.

llvm-svn: 305979
2017-06-22 01:10:29 +00:00
Reid Kleckner ef5817579b [wasm] Fix WebAssembly asm backend after r305968
llvm-svn: 305978
2017-06-22 01:07:05 +00:00
Davide Italiano 7a6c5c12ad Revert "[Target] Implement the ".rdata" MIPS assembly directive."
This reverts commit r305949 and r305950 as they didn't have the
correct commit message.

llvm-svn: 305973
2017-06-22 00:11:41 +00:00
Sam Clegg fe6414b043 [WebAssembly] Cleanup WasmObjectWriter.cpp. NFC
- Use auto where appropriate
- Use early return to reduce nesting
- Remove stray comment line
- Use C++ foreach over explicit iterator

Differential Revision: https://reviews.llvm.org/D34477

llvm-svn: 305971
2017-06-21 23:46:41 +00:00
Stanislav Mekhanoshin 3ed38c601a [AMDGPU] Add FP_CLASS to the add/setcc combine
This is one of the nodes which also compile as v_cmp_*.

Differential Revision: https://reviews.llvm.org/D34485

llvm-svn: 305970
2017-06-21 23:46:22 +00:00
Eugene Zelenko 72208a8226 [ProfileData, Support] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 305969
2017-06-21 23:19:47 +00:00
Rafael Espindola 88d9e37ec8 Use a MutableArrayRef. NFC.
llvm-svn: 305968
2017-06-21 23:06:53 +00:00
Rafael Espindola 6da25f4fc4 Fix build.
llvm-svn: 305967
2017-06-21 23:02:57 +00:00
Bob Haarman 4d2711fbb5 [codeview] respect signedness of APSInts when printing to YAML
Summary:
This fixes a bug where we always treat APSInts in Codeview as
signed when writing them to YAML. One symptom of this problem is that
llvm-pdbdump raw would show Enumerator Values that differ between the
original PDB and a PDB that has been round-tripped through YAML.

Reviewers: zturner

Reviewed By: zturner

Subscribers: llvm-commits, fhahn

Differential Revision: https://reviews.llvm.org/D34013

llvm-svn: 305965
2017-06-21 22:31:52 +00:00
Stanislav Mekhanoshin a8b26936d0 [AMDGPU] Combine add and adde, sub and sube
If one of the arguments of adde/sube is zero we can fold another
add/sub into it.

Differential Revision: https://reviews.llvm.org/D34374

llvm-svn: 305964
2017-06-21 22:30:01 +00:00
Sam Clegg 705f798bff Mark dump() methods as const. NFC
Add const qualifier to any dump() method where adding one
was trivial.

Differential Revision: https://reviews.llvm.org/D34481

llvm-svn: 305963
2017-06-21 22:19:17 +00:00
Stanislav Mekhanoshin e3eb42cef6 [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc
This simplification allows to avoid generating v_cndmask_b32
to serialize condition code between compare and use.

Differential Revision: https://reviews.llvm.org/D34300

llvm-svn: 305962
2017-06-21 22:05:06 +00:00
Dehao Chen 014db29b89 Enable vectorizer-maximize-bandwidth by default.
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.

I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.

Reviewers: hfinkel, mkuper, davidxl, chandlerc

Reviewed By: chandlerc

Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 305960
2017-06-21 22:01:32 +00:00
Krzysztof Parzyszek 5b933fee3c [Hexagon] Use MachineInstrBuilder instead of changing instruction in place
llvm-svn: 305953
2017-06-21 21:03:34 +00:00
Sam Clegg 9fa8af6f82 Rename WinCOFFStreamer.cpp -> MCWinCOFFStreamer.cpp
For consistency with other MC*Streamer.cpp files and
the header file.

Differential Revision: https://reviews.llvm.org/D34466

llvm-svn: 305952
2017-06-21 20:58:17 +00:00
Davide Italiano 75ed943def [Target] Implement the ".rdata" MIPS assembly directive.
Patch by John Baldwin < jhb at freebsd dot org >!

Differential Revision:  https://reviews.llvm.org/D34452

llvm-svn: 305949
2017-06-21 20:40:27 +00:00
Davide Italiano 9b8e3d308f [Solaris] emit .init_array instead of .ctors on Solaris (Sparc/x86)
Patch by Fedor Sergeev.

Differential Revision:  https://reviews.llvm.org/D33868

llvm-svn: 305948
2017-06-21 20:36:32 +00:00
Craig Topper 34caf5396f [Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC
llvm-svn: 305946
2017-06-21 19:39:35 +00:00
Craig Topper 99a2e89920 [Reassociate] Const correct a helper function. NFC
llvm-svn: 305945
2017-06-21 19:39:33 +00:00
Wolfgang Pieb 258927e3da [DWARF] Support for DW_FORM_strx3 and complete support for DW_FORM_strx{1,2,4}
(consumer).

Reviewer: aprantl

Differential Revision:  https://reviews.llvm.org/D34418

llvm-svn: 305944
2017-06-21 19:37:44 +00:00
Krzysztof Parzyszek fd048cc0ec [Hexagon] Handle more types of immediate operands in expand-condsets
llvm-svn: 305943
2017-06-21 19:21:30 +00:00
Craig Topper a074c101e5 [InstCombine] Cleanup using commutable matchers. Make a couple helper methods standalone static functions. Put 'if' around variable declaration instead of after. NFC
llvm-svn: 305941
2017-06-21 18:57:00 +00:00
whitequark ed54b4a798 Add a "probe-stack" attribute
This attribute is used to ensure the guard page is triggered on stack
overflow. Stack frames larger than the guard page size will generate
a call to __probestack to touch each page so the guard page won't
be skipped.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D34386

llvm-svn: 305939
2017-06-21 18:46:50 +00:00
Michael Kruse 47f856095a [BasicAA] Use MayAlias instead of PartialAlias for fallback.
Using various methods, BasicAA tries to determine whether two
GetElementPtr memory locations alias when its base pointers are known
to be equal. When none of its heuristics are applicable, it falls back
to PartialAlias to, according to a comment, protect TBAA making a wrong
decision in case of unions and malloc. PartialAlias is not correct,
because a PartialAlias result implies that some, but not all, bytes
overlap which is not necessarily the case here.

AAResults returns the first analysis result that is not MayAlias.
BasicAA is always the first alias analysis. When it returns
PartialAlias, no other analysis is queried to give a more exact result
(which was the intention of returning PartialAlias instead of MayAlias).
For instance, ScopedAA could return a more accurate result.

The PartialAlias hack was introduced in r131781 (and re-applied in
r132632 after some reverts) to fix llvm.org/PR9971 where TBAA returns a
wrong NoAlias result due to a union. A test case for the malloc case
mentioned in the comment was not provided and I don't think it is
affected since it returns an omnipotent char anyway.

Since r303851 (https://reviews.llvm.org/D33328) clang does emit specific
TBAA for unions anymore (but "omnipotent char" instead). Hence, the
PartialAlias workaround is not required anymore.

This patch passes the test-suite and check-llvm/check-clang of a
self-hoisted build on x64.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D34318

llvm-svn: 305938
2017-06-21 18:25:37 +00:00
Peter Collingbourne afaeed5322 Object: Have the irsymtab builder take a string table builder. NFCI.
This will be needed in order to share the irsymtab string table with
the bitcode string table.

Differential Revision: https://reviews.llvm.org/D33971

llvm-svn: 305937
2017-06-21 18:23:19 +00:00
Sanjay Patel 2a6f9f8adf [CGP, memcmp] replace CreateZextOrTrunc with CreateZext because it can never trunc
llvm-svn: 305936
2017-06-21 18:20:52 +00:00
Sanjay Patel a10f5b626d [CGP] fix variables to be unsigned in memcmp expansion
llvm-svn: 305935
2017-06-21 18:06:13 +00:00
Dehao Chen 50f2aa19e8 Do not inline recursive direct calls in sample loader pass.
Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.

Reviewers: iteratee, davidxl

Reviewed By: iteratee

Subscribers: sanjoy, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D34456

llvm-svn: 305934
2017-06-21 17:57:43 +00:00
Reid Kleckner d0e6e24a53 [PDB] Add symbols to the PDB
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34432

llvm-svn: 305933
2017-06-21 17:25:56 +00:00
Lei Huang 84dbbfdeb9 [PowerPC] define target hook isReallyTriviallyReMaterializable()
Define target hook isReallyTriviallyReMaterializable() to explicitly specify
PowerPC instructions that are trivially rematerializable.  This will allow
the MachineLICM pass to accurately identify PPC instructions that should always
be hoisted.

Differential Revision: https://reviews.llvm.org/D34255

llvm-svn: 305932
2017-06-21 17:17:56 +00:00
Craig Topper 5b173f2bb3 [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits
Summary:
I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.

This patch adds range metadata to these intrinsics based on the known bits.

Currently the patch punts if we already have range metadata present.

Reviewers: spatel, RKSimon, davide, majnemer

Reviewed By: RKSimon

Subscribers: sanjoy, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D32582

llvm-svn: 305927
2017-06-21 16:32:35 +00:00
Craig Topper ae86cc725d [InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes
Summary:
Previously this folding had no checks to see if it was going to result in less instructions. This was pointed out during the review of D34184

This patch adds code to count how many instructions its going to create vs how many its going to remove so we can make a proper decision.

Reviewers: spatel, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34437

llvm-svn: 305926
2017-06-21 16:07:13 +00:00
Craig Topper cbac691c4b [Reassociate] Support xor reassociating for splat vectors
Summary: This patch adds support for xors of splat vectors.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34354

llvm-svn: 305925
2017-06-21 16:07:09 +00:00
Dmitry Preobrazhensky 851a3d9f05 [AMDGPU][MC][GFX9] Corrected VOP3P relevant code to fix disassembler failures
See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509

Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin

Differential Revision: https://reviews.llvm.org/D34360

llvm-svn: 305923
2017-06-21 16:00:54 +00:00
Nirav Dave c1b6aa77bb [DAG] Move BaseIndexOffset into separate Libarary. NFC.
Move BaseIndexOffset analysis out of DAGCombiner for use in other
files.

llvm-svn: 305921
2017-06-21 15:40:43 +00:00
Nirav Dave 9a69d444a3 [DAG] Remove Node csonstruction from BaseIndexOffset match. NFCI.
Move GlobalAddress Offset decomposition from initial match into
comparision check and removing the possibility of constructing a new
offseted global address when examining addresses.

llvm-svn: 305917
2017-06-21 15:07:30 +00:00
Dmitry Preobrazhensky dc4ac823ec [AMDGPU][MC] Corrected V_*QSAD* instructions to check that dest register is different than any of the src
See Bug 33279: https://bugs.llvm.org//show_bug.cgi?id=33279

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D34003

llvm-svn: 305915
2017-06-21 14:41:34 +00:00
Sanjay Patel cec6a500a8 [x86] fix formatting; NFC
llvm-svn: 305914
2017-06-21 14:27:11 +00:00
Christof Douma c1c28051d2 [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.
Implemented support to AArch64 codegen for ARMv8.1 Large System
Extensions atomic instructions. Where supported, these instructions can
provide atomic operations with higher performance.

Currently supported operations include: fetch_add, fetch_or, fetch_xor,
fetch_smin, fetch_min/max (signed and unsigned), swap, and
compare_exchange.

This implementation implies sequential-consistency ordering, more
relaxed ordering is under development.

Subtarget->hasLSE is currently supported for Cavium ThunderX2T99.

Patch by Ananth Jasty.

Differential Revision: https://reviews.llvm.org/D33586

Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c
llvm-svn: 305893
2017-06-21 10:58:31 +00:00
Pavel Labath 1f6aea2eb3 [Support] Add RetryAfterSignal helper function
Summary:
This function retries an operation if it was interrupted by a signal
(failed with EINTR). It's inspired by the TEMP_FAILURE_RETRY macro in
glibc, but I've turned that into a template function. I've also added a
fail-value argument, to enable the function to be used with e.g.
fopen(3), which is documented to fail for any reason that open(2) can
fail (which includes EINTR).

The main user of this function will be lldb, but there were also a
couple of uses within llvm that I could simplify using this function.

Reviewers: zturner, silvas, joerg

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33895

llvm-svn: 305892
2017-06-21 10:55:34 +00:00
Florian Hahn 8552e591a1 [AArch64] Add early exit to promoteLoadFromStore.
There should be at most a single kill flag for the
promoted operand between the store/load pair.
Discussed in https://reviews.llvm.org/D34402.

llvm-svn: 305889
2017-06-21 09:51:52 +00:00
Strahinja Petrovic d280ea4f76 [MIPS] Fix for selecting of DINS/INS instruction
This patch adds one more condition in selection DINS/INS
instruction, which fixes MultiSource/Applications/JM/ldecod/
for mips32r2 (and mips64r2 n32 abi).

Differential Revision: https://reviews.llvm.org/D33725

llvm-svn: 305888
2017-06-21 09:25:51 +00:00
Javed Absar e3a0cc2ca0 Use range-loop in machine-scheduler. NFCI.
Converts to range-loop usage in machine scheduler.
This makes the code neater and easier to read,
and also keeps pace of the machine scheduler
implementation with C++11 features.

Reviewed by: Matthias Braun
Differential Revision: https://reviews.llvm.org/D34320

llvm-svn: 305887
2017-06-21 09:10:10 +00:00
Sam Kolton 549c89d2c9 [AMDGPU] SDWA: merge VI and GFX9 pseudo instructions
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9.

Reviewers: dp, arsenm, vpykhtin

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov

Differential Revision: https://reviews.llvm.org/D34026

llvm-svn: 305886
2017-06-21 08:53:38 +00:00
Florian Hahn 80e485179e [AArch64] Preserve register flags when promoting a load from store.
Summary:
This patch updates promoteLoadFromStore to use the store MachineOperand as the
source operand of the of the new instruction instead of creating a new
register MachineOperand. This way, the existing register flags are
preserved. 

This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). 


Reviewers: MatzeB, t.p.northover, junbuml

Reviewed By: MatzeB

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34402

llvm-svn: 305885
2017-06-21 08:47:23 +00:00
Guy Blank 52d73fce85 [DAGCombiner] Add another combine from build vector to shuffle
Add support for combining a build vector to a shuffle.
When the build vector is of extracted elements from 2 vectors (vec1, vec2) where vec2 is 2 times smaller than vec1.

llvm-svn: 305883
2017-06-21 07:38:41 +00:00
Max Kazantsev eac01d4c62 [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation time
MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high.
When constructing SCEVs of expressions like:

  x1 = a * a
  x2 = x1 * x1
  x3 = x2 * x2
    ...

We actually have huge SCEVs with max allowed amount of operands inlined.
Such expressions are easy to get from unrolling of loops looking like

  x = a
  for (i = 0; i < n; i++)
    x = x * x

Or more tricky cases where big powers are involved. If some non-linear analysis
tries to work with a SCEV that has 1000 operands, it may lead to excessively long
compilation. The attached test does not pass within 1 minute with default threshold.

This patch decreases its default value to 32, which looks much more reasonable if we
use analyzes with complexity O(N^2) or O(N^3) working with SCEV.

Differential Revision: https://reviews.llvm.org/D34397

llvm-svn: 305882
2017-06-21 07:28:13 +00:00
Dean Michael Berris 28ecff5cf1 [XRay] Reduce synthetic references emitted by XRay
Summary:
When we're building with XRay instrumentation, we use a trick that
preserves references from the function to a function sled index. This
index table lives in a separate section, and without this trick the
linker is free to garbage-collect this section and all the segments it
refers to. Until we're able to tell the linkers to preserve these
sections, we use this reference trick to keep around both the index and
the entries in the instrumentation map.

Before this change we emitted both a synthetic reference to the label in
the instrumentation map, and to the entry in the function map index.
This change removes the first synthetic reference and only emits one
synthetic reference to the index -- the index entry has the references
to the labels in the instrumentation map, so the linker will still
preserve those if the function itself is preserved.

This reduces the amount of synthetic references we emit from 16 bytes to
just 8 bytes in x86_64, and similarly to other platforms.

Reviewers: dblaikie

Subscribers: javed.absar, kpw, pelikan, llvm-commits

Differential Revision: https://reviews.llvm.org/D34340

llvm-svn: 305880
2017-06-21 06:39:42 +00:00
Serguei Katkov 0b0dc57dd8 [ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliased
Right now areMemoryOpsAliased has an assertion justified as:

MMO1 should have a value due it comes from operation we'd like to use
as implicit null check.
assert(MMO1->getValue() && "MMO1 should have a Value!");
However, it is possible for that invariant to not be upheld in the
following situation (conceptually):

Null check %RAX
NotNullSucc:

%RAX = LEA %RSP, 16            // I0
%RDX = MOV64rm %RAX            // I1
With the current code, we will have an early exit from
ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable.
However, I1 will look plausible (since it loads from %RAX) and
will go ahead and call areMemoryOpsAliased(I1, I0). This will cause
us to fail the assert mentioned above since I1 does not load from an
IR level value and thus is allowed to have a non-Value base address.

The fix is to bail out earlier whenever we see an unsuitable
instruction overwrite PointerReg. This would guarantee that when we
call areMemoryOpsAliased, we're guaranteed to be looking at an
instruction that loads from or stores to an IR level value.

Original Patch Author: sanjoy
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34385

llvm-svn: 305879
2017-06-21 06:38:23 +00:00
Davide Italiano 0ec715be1f [NewGVN] Fix a bug that made the store verifier less effective.
We weren't actually checking for duplicated stores, as the condition
was always actually false. This was found by Coverity, and I have
no clue how to trigger this in real-world code (although I
 tried for a bit).

llvm-svn: 305867
2017-06-20 22:57:40 +00:00
Rafael Espindola 3ac4c09daf clang-format a region.
It will make a followup patch easier to read.

llvm-svn: 305865
2017-06-20 22:53:29 +00:00
Reid Kleckner 91ef9de643 [codeview] YAMLize all section offsets and indices in symbol records
We forgot to serialize these because llvm-readobj didn't dump them. They
are typically all zeros in an object file. The linker fills them in with
relocations before adding them to the PDB. Now we can properly round
trip these symbols through pdb2yaml -> yaml2pdb.

I made these fields optional with a zero default so that we can elide
them from our test cases.

llvm-svn: 305857
2017-06-20 21:19:22 +00:00
Adrian Prantl 25422dcccb Fix a crash in DwarfDebug::validThroughout.
The instruction it falls over on is an IMPLICT_DEF that also happens
to be the only instruction in its lexical scope. That LexicalScope has
never been created because its range is empty. This patch skips over
all meta-instructions instead of just DBG_VALUEs.

Thanks to David Blaikie for providing a testcase!

llvm-svn: 305853
2017-06-20 21:08:52 +00:00
Anna Thomas f765cad13e [Statepoint] Add helper functions for GCRelocate and GCResult
These functions isGCRelocate and isGCResult are
similar to isStatepoint(const Value*).

llvm-svn: 305847
2017-06-20 20:54:57 +00:00
Saleem Abdulrasool 8199dadab8 Support: chunk writing on Linux
This is a workaround for large file writes.  It has been witnessed that
write(2) failing with EINVAL (22) due to a large value (>2G).  Thanks to
James Knight for the help with coming up with a sane test case.

llvm-svn: 305846
2017-06-20 20:51:51 +00:00
Matt Arsenault 67cd347e93 AMDGPU: Allow vectorization of packed types
llvm-svn: 305844
2017-06-20 20:38:06 +00:00
Reid Kleckner 665e1c9240 [codeview] Fully initialize DataSym when mapping from YAML
In the object file, the section index and relative offset are typically
zero, so make these YAML fields optional with a default.

It looks like there may be more partially initialized symbol records,
but this should fix the msan bot.

llvm-svn: 305842
2017-06-20 20:34:37 +00:00
Stanislav Mekhanoshin a9d846c6ef [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32
If there is an immediate operand we shall not shrink V_SUBB_U32
and V_ADDC_U32, it does not fit e32 encoding.

Differential Revison: https://reviews.llvm.org/D34291

llvm-svn: 305840
2017-06-20 20:33:44 +00:00
Matt Arsenault 9698f1c862 AMDGPU: Start adding global_* instructions
llvm-svn: 305838
2017-06-20 19:54:14 +00:00
Aditya Nandakumar c6a419123a [GISel]: Add G_FMA opcode for fused multiply adds
https://reviews.llvm.org/D34372

Reviewed by dsanders

llvm-svn: 305824
2017-06-20 19:25:23 +00:00
Matt Arsenault ff3f912e74 AMDGPU: Do operand folding in program order
Before it was possible to partially fold use instructions
before the defs. After the xor is folded into a copy, the same
mov can end up in the fold list twice, so on the second attempt
it will fail expecting to see a register to fold.

llvm-svn: 305821
2017-06-20 18:56:32 +00:00
Zachary Turner 297b6eb20d [PDB] Don't write uninitialized bytes to a PDB file.
There were certain fields that we didn't know how to write, as
well as various padding bytes that we would ignore.  This leads
to garbage data in the PDB.  While not strictly necessary, we
should initialize these bytes to something meaningful, as it
makes for easier binary comparison between PDBs.

llvm-svn: 305819
2017-06-20 18:50:55 +00:00
Matthias Braun 7a482e2302 RegisterScavenging: Followup to r305625
This does some improvements/cleanup to the recently introduced
scavengeRegisterBackwards() functionality:

- Rewrite findSurvivorBackwards algorithm to use the existing
  LiveRegUnit::accumulateBackward() code. This also avoids the Available
  and Candidates bitset and just need 1 LiveRegUnit instance
  (= 1 bitset).
- Pick registers in allocation order instead of register number order.

llvm-svn: 305817
2017-06-20 18:43:14 +00:00
Matt Arsenault 76858f5a1d AMDGPU: Preserve undef when folding register operands
If the source was a copy of an undef register, this would
produce a read of an undefined register which is a verifier
error.

llvm-svn: 305816
2017-06-20 18:41:31 +00:00
Stanislav Mekhanoshin 465a1ff193 [AMDGPU] Eliminate SGPR to VGPR copy when possible
SGPRs are generally cheaper, so try to use them over VGPRs.

Differential Revision: https://reviews.llvm.org/D34130

llvm-svn: 305815
2017-06-20 18:32:42 +00:00
Matt Arsenault 7f67b35901 AMDGPU: Fix crash with undef vreg input operand
llvm-svn: 305814
2017-06-20 18:28:02 +00:00
Hiroshi Inoue e7a35539c5 [PowerPC] fix trivial typos in comment, NFC
llvm-svn: 305813
2017-06-20 17:53:33 +00:00
Yuka Takahashi ba5d4af490 [GSoC] Flag value completion for clang
This is patch for GSoC project, bash-completion for clang.

To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.

In this patch, Options.td was mainly changed in order to add value class
in Options.inc.

llvm-svn: 305805
2017-06-20 16:31:31 +00:00
Sanjay Patel 0656629b87 [x86] enable CGP memcmp() expansion for 2/4/8 byte sizes
There are a couple of potential improvements as seen in the IR and asm:
1. We're unnecessarily extending to a larger type to compare values.
2. The codegen for (select cond, 1, -1) could avoid a cmov.
(or we could change the order of the compares, so we have a select with 0 operand)

llvm-svn: 305802
2017-06-20 15:58:30 +00:00
Simon Pilgrim 4822b5b649 [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with >=16bit elements
Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this

llvm-svn: 305801
2017-06-20 15:19:02 +00:00
Tim Northover 208ddc5bdc DAG: correctly legalize UMULO.
We were incorrectly sign extending into the high word (as you would for
SMULO) when legalizing UMULO in terms of a wider full multiplication.

Patch by James Duley.

llvm-svn: 305800
2017-06-20 15:01:38 +00:00
Sanjay Patel 4ccbd58d70 [InstCombine] fix code/test comments for r305792; NFC
These diffs were in the last version of the patch in D33342,
but I accidentally committed the previous rev. 

llvm-svn: 305793
2017-06-20 12:45:46 +00:00
Sanjay Patel adca825dc1 [InstCombine] try to canonicalize xor-of-icmps to and-of-icmps
We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, 
but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, 
we can use the truth table definition of xor:

X ^ Y --> (X | Y) & !(X & Y)

...to see if we can convert the xor to and/or and then use the existing folds.

http://rise4fun.com/Alive/J9v

Differential Revision: https://reviews.llvm.org/D33342

llvm-svn: 305792
2017-06-20 12:40:55 +00:00
Daniel Sanders a6e2cebf98 [globalisel][tablegen] Add support for COPY_TO_REGCLASS.
Summary:
As part of this
* Emitted instructions now have named MachineInstr variables associated
  with them. This isn't particularly important yet but it's a small step
  towards multiple-insn emission.
* constrainSelectedInstRegOperands() is no longer hardcoded. It's now added
  as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses
  an alternate constraint mechanism ConstrainOperandToRegClassAction() which
  supports arbitrary constraints such as that defined by COPY_TO_REGCLASS.

Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar

Reviewed By: ab

Subscribers: javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33590

llvm-svn: 305791
2017-06-20 12:36:34 +00:00
Simon Pilgrim b233c0a5d2 [X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODO
Target shuffle combining now supports the matching of INSERT_VECTOR_ELT/PINSRW/PINSRB for merging multiple insertions into shuffles/bitmasks.

llvm-svn: 305788
2017-06-20 10:33:34 +00:00
Igor Breger 0dee0f458e [GlobalISel][X86] fix compilation error ( -Werror=unused-function )
llvm-svn: 305786
2017-06-20 09:40:57 +00:00
Haojian Wu 6bd5cc6239 [SelectionDAG] Fix an use-after-free issue introduced in r305775.
vector.back() will be invalidated when memory reallocation happens.

llvm-svn: 305785
2017-06-20 09:29:43 +00:00
Igor Breger 1dcd5e8dc8 [GlobalISel][X86] Get correct RegClass for given RegBank.
Summary:
In some cases RegClass depends on target feature. Hight (16-31) vector registers exist only if AVX512f available.
Split from https://reviews.llvm.org/D33665

Reviewers: qcolombet, t.p.northover, zvi, guyblank

Reviewed By: t.p.northover, guyblank

Subscribers: guyblank, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33952

Conflicts:
	test/CodeGen/X86/GlobalISel/select-memop-scalar.mir

llvm-svn: 305784
2017-06-20 09:15:10 +00:00
Igor Breger 14535f0fc2 [GlobalISel] combine not symmetric merge/unmerge nodes.
Summary:
In some cases legalization ends up with not symmetric merge/unmerge nodes.
Transform it to merge/unmerge nodes.

Reviewers: t.p.northover, qcolombet, zvi

Reviewed By: t.p.northover

Subscribers: rovka, kristof.beyls, guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D33626

llvm-svn: 305783
2017-06-20 08:54:17 +00:00
Max Kazantsev 0bcf6ec85c [SCEV][NFC] Fix a misleading description of AddOpsInlineThreshold
The description of this option was copy-pasted from another one and does not
correspond to reality.

Differential Revision: https://reviews.llvm.org/D34390

llvm-svn: 305782
2017-06-20 08:37:31 +00:00
NAKAMURA Takumi 9a90b68707 WasmObjectWriter.cpp: Tweak a comment line. [-Wdocumentation]
llvm-svn: 305777
2017-06-20 07:21:19 +00:00
Alexandros Lamprineas 2b2b420563 [ARM] Support constant pools in data when generating execute-only code.
Resubmission of r305387, which was reverted at r305390. The Address
Sanitizer caught a stack-use-after-scope of a Twine variable. This
is now fixed by passing the Twine directly as a function parameter.

The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305776
2017-06-20 07:20:52 +00:00
Max Kazantsev b5c3362873 [SelectionDAG] Get rid of recursion in CalcNodeSethiUllmanNumber
The recursive implementation of CalcNodeSethiUllmanNumber may
overflow stack on extremely long pred chains. This patch replaces it
with an equivalent iterative implementation.

Differential Revision: https://reviews.llvm.org/D33769

llvm-svn: 305775
2017-06-20 07:07:09 +00:00
Sam Clegg 1fb8daa69a Fix unused function build error in lld
The lld-x86_64-darwin13 is failing with:
 error: unused function 'operator<<'

Wrap the declation in ifndef NDEBUG, which matches
what is done in MipsELFObjectWriter.cpp.

Differential Revision: https://reviews.llvm.org/D34384

llvm-svn: 305771
2017-06-20 05:05:10 +00:00
Sam Clegg 7f055dee27 [WebAssembly] Fix build failures introduced in r305769
This fixes two build failures that only occur in certain
configurations:
- error: unused function 'operator<<'
- error: control reaches end of non-void function

Differential Revision: https://reviews.llvm.org/D34382

llvm-svn: 305770
2017-06-20 04:47:58 +00:00
Sam Clegg b7787fd076 [WebAssembly] Add support for weak symbols in the binary format
This also introduces the updated format for the
"linking" section which can represent extra
symbol information.  See:
https://github.com/WebAssembly/tool-conventions/pull/10

Differential Revision: https://reviews.llvm.org/D34019

llvm-svn: 305769
2017-06-20 04:04:59 +00:00
Nirav Dave 47a78a2502 [DAG] Simplify BaseIndexOffset. NFCI.
Remove tail calls and cleanup codeflow.

llvm-svn: 305768
2017-06-20 02:48:39 +00:00
Vedant Kumar b1d331a36e [Coverage] PR33517: Check for failure to load func records
With PR33517, it became apparent that symbol table creation can fail
when presented with malformed inputs. This patch makes that sort of
error detectable, so llvm-cov etc. can fail more gracefully.

Specifically, we now check that function records loaded from corrupted coverage
mapping data are rejected, e.g when the recorded function name is garbage.

Testing: check-{llvm,clang,profile}, some unit test updates.
llvm-svn: 305767
2017-06-20 02:05:35 +00:00
Vedant Kumar b5794ca90c [ProfileData] PR33517: Check for failure of symtab creation
With PR33517, it became apparent that symbol table creation can fail
when presented with malformed inputs. This patch makes that sort of
error detectable, so llvm-cov etc. can fail more gracefully.

Specifically, we now check that function names within the symbol table
aren't empty.

Testing: check-{llvm,clang,profile}, some unit test updates.
llvm-svn: 305765
2017-06-20 01:38:56 +00:00
Matt Arsenault c595185f8f AMDGPU: Fix scratch wave offset relative FI expansion
The offset may not be an inline immediate, so this needs
to be materialized into a register. The post-RA run of
SIShrinkInstructions is able to fold it later if it can.

llvm-svn: 305761
2017-06-19 23:47:21 +00:00
Eugene Zelenko f292a2feca [ExecutionEngine] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 305760
2017-06-19 23:37:52 +00:00
Stanislav Mekhanoshin 50c2f251f5 [AMDGPU] Add infer address spaces pass before SROA
It adds it for the target after inlining but before SROA where
we can get most out of it.

Differential Revision: https://reviews.llvm.org/D34366

llvm-svn: 305759
2017-06-19 23:17:36 +00:00
Eugene Zelenko 8361b0a9bb [Target] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 305757
2017-06-19 22:43:19 +00:00
Eugene Zelenko de6cce2236 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 305755
2017-06-19 22:05:08 +00:00
Zachary Turner be548aceef Mark LLVMTestingSupport as not installed in LLVMBuild.
This is causing downstream issues with llvm-config.

llvm-svn: 305754
2017-06-19 22:01:50 +00:00
Geoff Berry 5e46600e3a [AArch64][Falkor] Fix MOVZ sched predicate to not assert on non-imm operands (e.g. blockaddress).
llvm-svn: 305752
2017-06-19 21:57:44 +00:00
Geoff Berry e9972cabbd [AArch64][Kryo] Add missing write latency for LDAXP, LDXP second destination.
Fixes PR33491 and PR33512.

llvm-svn: 305751
2017-06-19 21:57:42 +00:00
Geoff Berry 3cc4b9f780 [AArch64][Falkor] Refine load/store increment latencies.
Also fix LDXP & LDAXP write latency to avoid similar assert as PR33491 and PR33512.

llvm-svn: 305750
2017-06-19 21:56:21 +00:00
Matt Arsenault f5d61d7943 Fix typos
llvm-svn: 305749
2017-06-19 21:54:25 +00:00
Matt Arsenault e0e68a757e AMDGPU: Cleanup CreateLiveInRegister
llvm-svn: 305748
2017-06-19 21:52:45 +00:00
Xin Tong bb8dbcf915 [BDCE] Add comments. NFC
llvm-svn: 305739
2017-06-19 20:10:41 +00:00
Ana Pazos f731bde064 [PATCH] [PGO] Fixed cast operation in emIntrinsicVisitor::instrumentOneMemIntrinsic.
Reviewers: xur, efriedma, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34293

llvm-svn: 305737
2017-06-19 20:04:33 +00:00
Nico Weber 4c5c02a448 Revert r305382, it caused PR33513.
llvm-svn: 305735
2017-06-19 19:48:59 +00:00
Sanjay Patel a351a61cf2 [CGP, PowerPC] try to constant fold before creating loads for memcmp expansion
This is the last step needed to avoid regressions for x86 before we flip the switch to allow 
expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, 
so we need to do that here too.

FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that 
code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 
constant tests because its alignment requirements are too strict (shouldn't require alignment 
for a constant operand).

Differential Revision: https://reviews.llvm.org/D34071

llvm-svn: 305734
2017-06-19 19:48:35 +00:00
David Blaikie 6ab0eb4764 Remove convenient but probably not worthwhile macro for lambda workaround
Cleanup from r305405

llvm-svn: 305731
2017-06-19 19:01:08 +00:00
Eric Beckmann ddcfbf7d0a Have writeCOFFWriter return Expected<unique_ptr>.
Summary: Have writeCOFFWriter return Expected<unique_ptr> instead of requiring being passed an uninitialized unique_ptr.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D34307

llvm-svn: 305730
2017-06-19 18:49:05 +00:00
Taewook Oh 9083547ae3 Improve profile-guided heuristics to use estimated trip count.
Summary:
Existing heuristic uses the ratio between the function entry
frequency and the loop invocation frequency to find cold loops. However,
even if the loop executes frequently, if it has a small trip count per
each invocation, vectorization is not beneficial. On the other hand,
even if the loop invocation frequency is much smaller than the function
invocation frequency, if the trip count is high it is still beneficial
to vectorize the loop.

This patch uses estimated trip count computed from the profile metadata
as a primary metric to determine coldness of the loop. If the estimated
trip count cannot be computed, it falls back to the original heuristics.

Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32451

llvm-svn: 305729
2017-06-19 18:48:58 +00:00
Bjorn Pettersson 475fcd9cd8 [InstCombine] Make sure AddReachableCodeToWorklist sets MadeIRChange
Summary:
Some optimizations in AddReachableCodeToWorklist did not update
the MadeIRChange state. This could happen both when removing
trivially dead instructions (DCE) and at constant folds.

It is essential that changes to the IR is reported correctly,
since for example InstCombinePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().

The new test case early_dce_clobbers_callgraph.ll is a reproducer
for some asserts that started to trigger after changes in the
inliner in r305245. With this patch the test case passes again.

Reviewers: sanjoy, craig.topper, dblaikie

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34346

llvm-svn: 305725
2017-06-19 18:00:27 +00:00
Hans Wennborg ca69fc1cb7 Revert r304824 "Fix PR23384 (part 3 of 3)"
This seems to be interacting badly with ASan somehow, causing false reports of
heap-buffer overflows: PR33514.

> Summary:
> The patch makes instruction count the highest priority for
> LSR solution for X86 (previously registers had highest priority).
>
> Reviewers: qcolombet
>
> Differential Revision: http://reviews.llvm.org/D30562
>
> From: Evgeny Stupachenko <evstupac@gmail.com>

llvm-svn: 305720
2017-06-19 17:57:15 +00:00
Reid Kleckner 44cdb10964 [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

llvm-svn: 305713
2017-06-19 17:21:45 +00:00
Reid Kleckner 18d90e17ad [CodeView] Fix dumping of public symbol record flags
I noticed nonsensical type information while dumping PDBs produced by
MSVC.

llvm-svn: 305708
2017-06-19 16:54:51 +00:00
Davide Italiano daa9c0e403 [NewGVN] Simplify findConditionEquivalence(). NFCI.
llvm-svn: 305707
2017-06-19 16:46:15 +00:00
Dinar Temirbulatov e2c6991c07 Remove brackets, NFC.
llvm-svn: 305706
2017-06-19 16:44:07 +00:00
Craig Topper a7529b68cc [InstCombine] Cleanup some duplicated one use checks
Summary:
These 4 patterns have the same one use check repeated twice for each. Once without a cast and one with. But the cast has no effect on what method is called.

For the OR case I believe it is always profitable regardless of the number of uses since we'll never increase the instruction count.

For the AND case I believe it is profitable if the pair of xors has one use such that we'll get rid of it completely. Or if the C value is something freely invertible, in which case the not doesn't cost anything.

Reviewers: spatel, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34308

llvm-svn: 305705
2017-06-19 16:23:49 +00:00
Craig Topper ef85498e05 [Reassociate] Support some reassociation of vector xors
Summary:
Currently we don't try to do anything with vector xors.

This patch adds support for removing duplicate pairs from a chain of vector xors as its pretty easy to support. We still dont' try to combine the xors with and/ors, but I might try that in a future patch.

Reviewers: mcrosier, davide, resistor

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34338

llvm-svn: 305704
2017-06-19 16:23:46 +00:00
Craig Topper 4350734d36 [Reassociate] Make one of the helper methods static because it doesn't use any class variables. NFC
llvm-svn: 305703
2017-06-19 16:23:43 +00:00
Nirav Dave 8dcd008d18 Allow truncated and extend memory operations in Store Merge. NFCI.
As all store merges checks are based on the memory operation
performed, allow use of truncated stores and extended loads as valid
input candidates for merging.

Relanding after fixing selection between truncated and normal store.

llvm-svn: 305701
2017-06-19 15:32:28 +00:00
Anna Thomas 7949f4529a [JumpThreading][LVI] Invalidate LVI information after blocks are merged
Summary:
After a single predecessor is merged into a basic block, we need to invalidate
the LVI information for the new merged block, when LVI is not provably true for
all of instructions in the new block.
The test cases added show the correct LVI information using the LVI printer
pass.

Reviewers: reames, dberlin, davide, sanjoy

Reviewed by: dberlin, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34108

llvm-svn: 305699
2017-06-19 15:23:33 +00:00
Xin Tong b412831d11 [TRE] Improve code motion in TRE, use AA to tell whether a load can be moved before a call that writes to memory.
Summary: use AA to tell whether a load can be moved before a call that writes to memory.

Reviewers: dberlin, davide, sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D34115

llvm-svn: 305698
2017-06-19 15:21:18 +00:00
Florian Hahn fd44ca6c76 [AArch64] Fix order of checks in shouldScheduleAdjacent.
We need to check the opcode of FirstMI before accessing the operands. This
caused a buildbot failure during bootstrapping on AArch64.

llvm-svn: 305694
2017-06-19 13:45:41 +00:00
Tom Stellard ff63ee0db5 AMDGPU/GlobalISel: Mark G_BITCAST s32 <--> <2 x s16> legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D34129

llvm-svn: 305692
2017-06-19 13:15:45 +00:00
Igor Breger bd2dedaa38 [GlobalISel][X86] Fold FI/G_GEP into LDR/STR instruction addressing mode.
Summary: Implement some of the simplest addressing modes.It should help to test ABI.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33888

llvm-svn: 305691
2017-06-19 13:12:57 +00:00
Florian Hahn 5f746c8e27 Recommit rL305677: [CodeGen] Add generic MacroFusion pass
Use llvm::make_unique to avoid ambiguity with MSVC.

This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.

Differential Revision: https://reviews.llvm.org/D34144

llvm-svn: 305690
2017-06-19 12:53:31 +00:00
Diana Picus 78aaf7db04 [ARM] GlobalISel: Support G_ICMP for s8 and s16
Widen to s32 (like all other binary ops).

llvm-svn: 305683
2017-06-19 11:47:28 +00:00
Florian Hahn e16d3106f3 Revert r305677 [CodeGen] Add generic MacroFusion pass.
This causes Windows buildbot failures do an ambiguous call.

llvm-svn: 305681
2017-06-19 11:26:15 +00:00
Florian Hahn ee1b096f8a [CodeGen] Add generic MacroFusion pass.
Summary:
This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.

Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB

Reviewed By: MatzeB

Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34144

llvm-svn: 305677
2017-06-19 10:51:38 +00:00
Diana Picus 621894ac76 [ARM] GlobalISel: Support G_ICMP for i32 and pointers
Add support throughout the pipeline:
- mark as legal for s32 and pointers
- map to GPRs
- lower to a sequence of instructions, which moves 0 or 1 into the
  result register based on the flags set by a CMPrr

We have copied from FastISel a helper function which maps CmpInst
predicates into ARMCC codes. Ideally, we should be able to move it
somewhere that both FastISel and GlobalISel can use.

llvm-svn: 305672
2017-06-19 09:40:51 +00:00
Max Kazantsev 35b2a18eb9 [SCEV] Teach SCEVExpander to expand BinPow
Current implementation of SCEVExpander demonstrates a very naive behavior when
it deals with power calculation. For example, a SCEV for x^8 looks like

  (x * x * x * x * x * x * x * x)

If we try to expand it, it generates a very straightforward sequence of muls, like:

  x2 = mul x, x
  x3 = mul x2, x
  x4 = mul x3, x
      ...
  x8 = mul x7, x

This is a non-efficient way of doing that. A better way is to generate a sequence of
binary power calculation. In this case the expanded calculation will look like:

  x2 = mul x, x
  x4 = mul x2, x2
  x8 = mul x4, x4

In some cases the code size reduction for such SCEVs is dramatic. If we had a loop:

  x = a;
  for (int i = 0; i < 3; i++)
    x = x * x;

And this loop have been fully unrolled, we have something like:

  x = a;
  x2 = x * x;
  x4 = x2 * x2;
  x8 = x4 * x4;

The SCEV for x8 is the same as in example above, and if we for some reason
want to expand it, we will generate naively 7 multiplications instead of 3.
The BinPow expansion algorithm here allows to keep code size reasonable.

This patch teaches SCEV Expander to generate a sequence of BinPow multiplications
if we have repeating arguments in SCEVMulExpressions.

Differential Revision: https://reviews.llvm.org/D34025

llvm-svn: 305663
2017-06-19 06:24:53 +00:00
Daniel Berlin 36b08b2088 NewGVN: Fix PR 33461, caused by slightly overzealous verification.
llvm-svn: 305657
2017-06-19 00:24:00 +00:00
Zachary Turner 26dbc5420d Delete TypeDatabase.
Merge the functionality into the random access type collection.
This class was only being used in 2 places, so getting rid of it
simplifies the code.

llvm-svn: 305653
2017-06-18 20:52:45 +00:00
Craig Topper c85be52fd8 [APFloat] Move the integerPartWidth constant into APFloatBase. Remove integerPart typedef at file scope and just use the one in APFloatBase everywhere. NFC
llvm-svn: 305652
2017-06-18 18:15:41 +00:00
Craig Topper d96177cf72 [Reassociate] Use APInt::isNullValue() instead of comparing with 0. NFC
This should compile to slightly better code.

llvm-svn: 305651
2017-06-18 18:15:38 +00:00
Kamil Rytarowski a841233a76 Implement AllocateRWX and ReleaseRWX for NetBSD
Summary:
NetBSD ships with PaX MPROTECT disallowing RWX mappings.
There is a solution to bypass this restriction with double mapping
RX (code) and RW (data) using mremap(2) MAP_REMAPDUP.
The initial mapping must be mmap(2)ed with protection:
PROT_MPROTECT(PROT_EXEC).

This functionality to bypass PaX MPROTECT appeared in NetBSD-7.99.72.

This patch fixes 20 failing tests:
-    LLVM :: DebugInfo/debuglineinfo-macho.test
-    LLVM :: DebugInfo/debuglineinfo.test
-    LLVM :: ExecutionEngine/RuntimeDyld/Mips/ELF_Mips64r2N64_PIC_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/Mips/ELF_N32_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/Mips/ELF_N64R6_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/Mips/ELF_O32R6_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/Mips/ELF_O32_PIC_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/COFF_i386.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/COFF_x86_64.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF-relaxed.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_STT_FILE.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_x64-64_PC8_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_x64-64_PIC_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_x86-64_PIC-small-relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_x86-64_debug_frame.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/ELF_x86_64_StubBuf.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/MachO_empty_ehframe.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/MachO_i386_DynNoPIC_relocations.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/MachO_i386_eh_frame.s
-    LLVM :: ExecutionEngine/RuntimeDyld/X86/MachO_x86-64_PIC_relocations.s

Sponsored by <The NetBSD Foundation>

Reviewers: joerg, lhames

Reviewed By: joerg

Subscribers: sdardis, llvm-commits, arichardson

Differential Revision: https://reviews.llvm.org/D33874

llvm-svn: 305650
2017-06-18 16:52:32 +00:00
Xin Tong 9d2a5b1cf7 Add argmononly attribute to strlen and wcslen, i.e. they only read memory (string) passed to them.
Summary:
This allows strlen to be moved out of the loop in case its argument is
not modified in the loop in LICM.

Reviewers: hfinkel, davide, sanjoy, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34323

llvm-svn: 305641
2017-06-18 03:10:26 +00:00
Galina Kistanova 90e4c3f357 Fixed the warning introduced by r305625 to make ubuntu-gcc7.1-werror bot green.
llvm-svn: 305640
2017-06-17 21:05:28 +00:00
Sanjoy Das b70ddd8901 [SROA] Add support for non-integral pointers
Summary: C.f. http://llvm.org/docs/LangRef.html#non-integral-pointer-type

Reviewers: chandlerc, loladiro

Reviewed By: loladiro

Subscribers: reames, loladiro, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D32203

llvm-svn: 305639
2017-06-17 20:28:13 +00:00
Xin Tong 025780ba6e [TRE] Add assertion for folding trivial return block
llvm-svn: 305637
2017-06-17 16:55:12 +00:00
Xin Tong d5b4d0b53a [TRE] Update comments. NFC
llvm-svn: 305636
2017-06-17 16:18:36 +00:00
NAKAMURA Takumi fc7f3b7514 [CMake] Introduce LLVM_TARGET_TRIPLE_ENV as an option to override LLVM_DEFAULT_TARGET_TRIPLE at runtime.
No behavior is changed if LLVM_TARGET_TRIPLE_ENV is blank or undefined.

If LLVM_TARGET_TRIPLE_ENV is "TEST_TARGET_TRIPLE" and $TEST_TARGET_TRIPLE is not blank,
llvm::sys::getDefaultTargetTriple() returns $TEST_TARGET_TRIPLE.
Lit resets config.target_triple and config.environment[LLVM_TARGET_TRIPLE_ENV] to change the default target.

Without changing LLVM_DEFAULT_TARGET_TRIPLE nor rebuilding, lit can be run;

  TEST_TARGET_TRIPLE=i686-pc-win32 bin/llvm-lit -sv path/to/test/
  TEST_TARGET_TRIPLE=i686-pc-win32 ninja check-clang-tools

Differential Revision: https://reviews.llvm.org/D33662

llvm-svn: 305632
2017-06-17 03:19:08 +00:00
Eric Christopher c70d07b7ea Rework logic and comment out the default relocation models for PPC.
llvm-svn: 305630
2017-06-17 02:25:56 +00:00
Eric Christopher 5ec30ef4e4 Turn a large if block into a smaller early return for clarity.
llvm-svn: 305629
2017-06-17 02:25:55 +00:00
Eric Christopher ded727c5a8 Remove the old and unused PPC32 and PPC64TargetMachine classes.
llvm-svn: 305628
2017-06-17 02:25:53 +00:00
Eric Christopher 3b78693b9a Remove unused forward declaration.
llvm-svn: 305627
2017-06-17 02:25:51 +00:00
Eric Christopher a5b50cd645 Tidy up some calls to getRegister for readability.
llvm-svn: 305626
2017-06-17 02:25:49 +00:00
Matthias Braun 537d039104 RegScavenging: Add scavengeRegisterBackwards()
Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse
to place spills as the very first instruciton of a basic block and thus
artifically increase pressure (test in
test/CodeGen/PowerPC/scavenging.mir:spill_at_begin)

This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 305625
2017-06-17 02:08:18 +00:00
Tim Shen d123c194e0 [PPC] Remove isBarrier from CFENCE8's definition.
Summary:
This is my misunderstanding on isBarrier. It's not for memory barriers,
but for other control flow purposes. lwsync doesn't have it either.

This fixes a simple crash with -verify-machineinstrs like below:

  define void @Foo() {
  entry:
    %tmp = load atomic i64, i64* undef acquire, align 8
    unreachable
  }

I deliberately don't want to check in the test, since there is little
chance to regress on such a mistake. Such a test adds noise to the code
base.

I plan to check in first, since it fixes a crash, and the fix is obvious.

Reviewers: kbarton, echristo

Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D34314

llvm-svn: 305624
2017-06-17 01:25:34 +00:00
Davide Italiano 9382c5560b [SelectionDAG] Update Loop info after splitting critical edges.
The analysis is expected to be preserved by SelectionDAG.

llvm-svn: 305621
2017-06-17 00:56:27 +00:00
Zachary Turner b0fdd214b7 Don't crash if a type record can't be found.
This was a regression introduced in a previous patch.  Adding
back the code that handles this case.

llvm-svn: 305617
2017-06-17 00:02:24 +00:00
Sam Clegg 9d24fb7ff3 [WebAssembly] Use __stack_pointer global when writing wasm binary
This ensures that symbolic relocations are generated for stack
pointer manipulations.

These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB.
This change also adds support for reading relocations of this
type in WasmObjectFile.cpp.

Since its a globally imported symbol this does mean that
the get_global/set_global instruction won't be valid until
the objects are linked that global used in no longer an
imported global.

Differential Revision: https://reviews.llvm.org/D34172

llvm-svn: 305616
2017-06-16 23:59:10 +00:00
Zachary Turner ad859bd472 [CodeView] Fix random access of type names.
Suppose we had a type index offsets array with a boundary at type index
N. Then you request the name of the type with index N+1, and that name
requires the name of index N-1 (think a parameter list, for example). We
didn't handle this, and we would print something like (<unknown UDT>,
<unknown UDT>).

The fix for this is not entirely trivial, and speaks to a larger
problem. I think we need to kill TypeDatabase, or at the very least kill
TypeDatabaseVisitor. We need a thing that doesn't do any caching
whatsoever, just given a type index it can compute the type name "the
slow way". The reason for the bug is that we don't have anything like
that. Everything goes through the type database, and if we've visited a
record, then we're "done". It doesn't know how to do the expensive thing
of re-visiting dependent records if they've not yet been visited.

What I've done here is more or less copied the code (albeit greatly
simplified) from TypeDatabaseVisitor, but wrapped it in an interface
that just returns a std::string. The logic of caching the name is now in
LazyRandomTypeCollection. Eventually I'd like to move the record
database here as well and the visited record bitfield here as well, at
which point we can actually just delete TypeDatabase. I don't see any
reason for it if a "sequential" collection is just a special case of a
random access collection with an empty partial offsets array.

Differential Revision: https://reviews.llvm.org/D34297

llvm-svn: 305612
2017-06-16 23:42:44 +00:00
Zachary Turner 59224cba2e Remove some dead code / includes.
I'm trying to get rid of the TypeDatabase class, so the first
step is to minimize its footprint.

llvm-svn: 305611
2017-06-16 23:42:15 +00:00
Yonghong Song a63178f756 bpf: fix a strict-aliasing issue
Davide Italiano reported the following issue if llvm
is compiled with gcc -Wstrict-aliasing -Werror:
.....
lib/Target/BPF/CMakeFiles/LLVMBPFCodeGen.dir/BPFISelDAGToDAG.cpp.o
../lib/Target/BPF/BPFISelDAGToDAG.cpp: In member function ‘virtual
void {anonymous}::BPFDAGToDAGISel::PreprocessISelDAG()’:
../lib/Target/BPF/BPFISelDAGToDAG.cpp:264:26: warning: dereferencing
type-punned pointer will break strict-aliasing rules
[-Wstrict-aliasing]
       val = *(uint16_t *)new_val;
.....

The error is caused by my previous commit (revision 305560).

This patch fixed the issue by introducing an union to avoid
type casting.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 305608
2017-06-16 23:28:04 +00:00
Craig Topper 61e684adcc [ConstantRange] Implement getSignedMin/Max in a less complicated and faster way
Summary: As far as I can tell we should be able to implement these almost the same way we do unsigned, but using signed comparisons and checks for min signed value instead of min unsigned value.

Reviewers: pete, davide, sanjoy

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33815

llvm-svn: 305607
2017-06-16 23:26:23 +00:00
Craig Topper 288b3c9e69 [SelectionDAG] Use APInt::isSubsetOf. NFC
llvm-svn: 305606
2017-06-16 23:19:14 +00:00
Craig Topper ea5b8bc9ef [SelectionDAG] Use APInt::isNullValue/isOneValue. NFC
llvm-svn: 305605
2017-06-16 23:19:12 +00:00
Craig Topper b681907c50 [TargetLowering] Use ConstantSDNode::isOne and getSExtValue instead of getting the underlying APInt first. NFC
llvm-svn: 305604
2017-06-16 23:19:10 +00:00
Wei Mi c7ba876323 Revert rL305578. There is still some buildbot failure to be fixed.
llvm-svn: 305603
2017-06-16 23:14:35 +00:00
Adrian Prantl 274bcbc139 Improve the accuracy of variable ranges .debug_loc location lists.
For the following motivating example
  bool c();
  void f();
  bool start() {
    bool result = c();
    if (!c()) {
      result = false;
      goto exit;
    }
    f();
    result = true;
  exit:
    return result;
  }

we would previously generate a single DW_AT_const_value(1) because
only the DBG_VALUE in the second-to-last basic block survived
codegen. This patch improves the heuristic used to determine when a
DBG_VALUE is available at the beginning of its variable's enclosing
lexical scope:

- Stop giving singular constants blanket permission to take over the
  entire scope. There is still a special case for constants in the
  function prologue that we also miight want to retire later.

- Use the lexical scope information to determine available-at-entry
  instead of proximity to the function prologue.

After this patch we generate a location list with a more accurate
narrower availability for the constant true value. As a pleasant side
effect, we also generate inline locations instead of location lists
where a loacation covers the entire range of the enclosing lexical
scope.

Measured on compiling llc with four targets this doesn't have an
effect on compile time and reduces the size of the debug info for llc
by ~600K.

rdar://problem/30286912

llvm-svn: 305599
2017-06-16 22:40:04 +00:00
Spyridoula Gravani 32614fcf42 [DWARF] Corrected behavior for when no .apple_names section is present in the object.
The verifier should not output any message in such a case.
Added test case with no .apple_name section in the file to verify new functionality.
Made existing test case more specific.

llvm-svn: 305597
2017-06-16 22:03:21 +00:00
Eric Beckmann 7687f04672 Clean up some things in the WindowsResource changes.
llvm-svn: 305596
2017-06-16 22:00:42 +00:00
Benjamin Kramer 74b4ded0c2 [Object] Remove redundant std::move.
Found by -Wpessimizing-move.

llvm-svn: 305595
2017-06-16 21:27:12 +00:00
Eric Beckmann d135e8c039 Switch external cvtres.exe for llvm's own resource library.
In this patch, I flip the switch in DriverUtils from using the external
cvtres.exe tool to using the Windows Resource library in llvm.

I also fixed a bug where .rsrc sections were marked as discardable
memory and therefore were placed in the wrong order in the final PE.

Furthermore, I modified WindowsResource to write the coff directly to a
memory buffer instead of to file, also had it use the machine types
already declared in COFF.h instead creating my own enum.

Finally, I flipped the switch to allow all unit tests that had
previously run only on windows due to a winres dependency to run
cross-platform.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34265

llvm-svn: 305592
2017-06-16 21:13:24 +00:00
Anna Thomas 6bc14c65ad [InstCombine] Set correct insertion point for selects generated while folding phis
Summary:
When we fold vector constants that are operands of phi's that feed into select,
we need to set the correct insertion point for the *new* selects that get generated.
The correct insertion point is the incoming block for the phi.
Such cases can occur with patch r298845, which fixed folding of
vector constants, but the new selects could be inserted incorrectly (as the added
test case shows).

Reviewers: majnemer, spatel, sanjoy

Reviewed by: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34162

llvm-svn: 305591
2017-06-16 21:08:37 +00:00
Davide Italiano ec5b0257bf [SCCP] Simplify the code a bit. NFCI.
llvm-svn: 305583
2017-06-16 20:50:31 +00:00
Davide Italiano 0b1190aa8d [SCCP] Clarify a comment about unhandled instructions.
llvm-svn: 305579
2017-06-16 20:27:17 +00:00
Wei Mi a2493b6ad9 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit fixes two bugs: The first one is to use CurrentBlock instead of
PREInstr's Parent as param of performScalarPREInsertion because the Parent
of a clone instruction may be uninitialized. The second one is stop PRE when
CurrentBlock to its predecessor is a backedge and an operand of CurInst is
defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 305578
2017-06-16 20:21:01 +00:00
Davide Italiano d95d871af0 [SCCP] Remove redundant instruction visitors.
Whenever we don't know what to do with an instruction, we send
it to overdefined anyway.

llvm-svn: 305575
2017-06-16 19:43:57 +00:00
Matthias Braun 35530d7129 Revert "RegScavenging: Add scavengeRegisterBackwards()"
Revert because of reports of some PPC input starting to spill when it
was predicted that it wouldn't and no spillslot was reserved.

This reverts commit r305516.

llvm-svn: 305566
2017-06-16 17:48:08 +00:00
Xinliang David Li c3f8e83253 Fix function name /NFC
llvm-svn: 305564
2017-06-16 16:54:13 +00:00
Yonghong Song ac2e25026f bpf: avoid load from read-only sections
If users tried to have a structure decl/init code like below
   struct test_t t = { .memeber1 = 45 };
It is very likely that compiler will generate a readonly section
to hold up the init values for variable t. Later load of t members,
e.g., t.member1 will result in a read from readonly section.

BPF program cannot handle relocation. This will force users to
write:
  struct test_t t = {};
  t.member1 = 45;
This is just inconvenient and unintuitive.

This patch addresses this issue by implementing BPF PreprocessISelDAG.
For any load from a global constant structure or an global array of
constant struct, it attempts to
translate it into a constant directly. The traversal of the
constant struct and other constant data structures are similar
to where the assembler emits read-only sections.

Four different unit test cases are also added to cover
different scenarios.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 305560
2017-06-16 15:41:16 +00:00
Yonghong Song 5c65439617 bpf: set missing types in insn tablegen file
o This is discovered during my study of 32-bit subregister
  support.
o This is no impact on current functionality since we
  only support 64-bit registers.
o Searching the web, looks like the issue has been discovered
  before, so fix it now.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 305559
2017-06-16 15:30:55 +00:00
Daniel Neilson 3faabbbe85 [Atomics] Rename and change prototype for atomic memcpy intrinsic
Summary:

Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic.

Reviewers: reames, sanjoy, efriedma

Reviewed By: reames

Subscribers: mzolotukhin, anna, llvm-commits, skatkov

Differential Revision: https://reviews.llvm.org/D33240

llvm-svn: 305558
2017-06-16 14:43:59 +00:00
Simon Dardis 5852c4c108 Revert "[mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SP"
This reverts commit r305455. This commit was reported as breaking one of
the sanitizer buildbots. Reverting until lab.llvm.org comes back online.

llvm-svn: 305557
2017-06-16 14:00:33 +00:00
Krzysztof Parzyszek 3a40b34123 [Hexagon] Don't kill live registers when creating mux out of tfr
The second part of r305300: when placing the mux at the later location,
make sure that it won't use any register that was killed between the
two original instructions. Remove any such kills and transfer them to
the mux.

llvm-svn: 305553
2017-06-16 12:24:03 +00:00
Hiroshi Inoue 3c358f8c68 [MachineBlockPlacement] trivial fix in comments, NFC
- Topologocal is abbreviated as "topo" in comments, but "top" is used in only one comment. Modify it for consistency.
- Capitalize "succ" and "pred" for consistency in one figure.
- Other trivial fixes.

llvm-svn: 305552
2017-06-16 12:23:04 +00:00
Craig Topper da6ea0d3e8 [InstCombine] Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2) if K1 and K2 are a 1-bit mask
Summary: This is the demorganed version of the case we already handle for the OR of iszero.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34244

llvm-svn: 305548
2017-06-16 05:10:37 +00:00
Rui Ueyama af0f33a853 Fix buildbots.
llvm-svn: 305542
2017-06-16 02:42:33 +00:00
Rui Ueyama 881819ebfb Fix msan buildbot.
This patch should fix sanitizer-x86_64-linux-fast bot.

The problem was that the contents of this stream are aligned to 4 byte,
and the paddings were created just by incrementing `Offset`, so paddings
had undefined values. When the entire stream is written to an output,
it triggered msan.

llvm-svn: 305541
2017-06-16 02:17:35 +00:00
Craig Topper 4d76da4fc9 [CorrelatedValuePropagation] Remove superfluous semicolon. NFC
llvm-svn: 305538
2017-06-16 01:53:20 +00:00
Eugene Zelenko af6158962d [BinaryFormat, Option, TableGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 305537
2017-06-16 00:43:26 +00:00
Evgeniy Stepanov f63d41469c Fix build warning on 32-bit targets where sizeof(size_t) < sizeof(long long).
llvm-svn: 305535
2017-06-16 00:32:11 +00:00
Evgeniy Stepanov 4d4ee93d25 [cfi] CFI-ICall for ThinLTO.
Implement ControlFlowIntegrity for indirect function calls in ThinLTO.
Design follows the RFC in llvm-dev, see
https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ

llvm-svn: 305533
2017-06-16 00:18:29 +00:00
Xinliang David Li eea0ade2eb [PartialInlining] Code Refactoring
This is a NFC code refactoring and interface cleanup. This paves the
way to enable outlining-only mode for the partial inliner.

llvm-svn: 305530
2017-06-15 23:56:59 +00:00
Zachary Turner 4e950647fb [llvm-pdbutil] Add support for dumping lines and inlinee lines.
llvm-svn: 305529
2017-06-15 23:56:19 +00:00
Ahmed Bougacha d41a9c4e66 Revert "[DAG] Allow truncated and extend memory operations in Store Merge. NFCI."
This reverts commit r305468, as it caused PR33475.

llvm-svn: 305527
2017-06-15 23:29:47 +00:00
Zachary Turner 0e327d0360 [llvm-pdbutil] Add back support for dumping file checksums.
When dumping module source files, also dump checksums.

llvm-svn: 305526
2017-06-15 23:12:41 +00:00
Zachary Turner f8a2e04812 [llvm-pdbutil] Add back the ability to dump hashes and index offsets.
This was regressed in a previous patch that re-wrote the dumper,
and I'm incrementally adding back the pieces that are missing.

llvm-svn: 305524
2017-06-15 23:04:42 +00:00
Alfred Huang f9b521fdaf [AMDGPU] Testing commit access only, no real change
llvm-svn: 305523
2017-06-15 23:02:55 +00:00
Kostya Serebryany 589eae515e [libFuzzer] change the default max_len from 64 to 4096. This will affect cases where libFuzzer is run w/o initial corpus or with a corpus of very small items.
llvm-svn: 305521
2017-06-15 22:43:40 +00:00
Zachary Turner 6305545527 Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

llvm-svn: 305517
2017-06-15 22:24:24 +00:00
Matthias Braun a42c537912 RegScavenging: Add scavengeRegisterBackwards()
Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64
problems reported in the stage2 build last time, which I cannot
reproduce right now.

This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 305516
2017-06-15 22:14:55 +00:00
Craig Topper 2ba991ff2c [InstCombine] Add two FIXMEs for bad single use checks. NFC
llvm-svn: 305510
2017-06-15 21:38:48 +00:00
Zachary Turner da504b794c Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

llvm-svn: 305505
2017-06-15 20:55:51 +00:00
Spyridoula Gravani 7a27a26db0 [DWARF] Removed dead code. The verifier functionality is provided by
the DWARFVerifier class (as it should).

llvm-svn: 305503
2017-06-15 20:40:08 +00:00
Teresa Johnson 152277952e Split PGO memory intrinsic optimization into its own source file
Summary:
Split the PGOMemOPSizeOpt pass out from IndirectCallPromotion.cpp into
its own file.

Reviewers: davidxl

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D34248

llvm-svn: 305501
2017-06-15 20:23:57 +00:00
Zachary Turner b560fdf3b8 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

llvm-svn: 305495
2017-06-15 19:34:41 +00:00
Alexander Timofeev 0f9c84cd93 DivergencyAnalysis patch for review
llvm-svn: 305494
2017-06-15 19:33:10 +00:00
Craig Topper f2d3e6d3d5 [InstCombine] Make the context instruction parameter of foldOrOfICmps a reference to discourage passing nullptr and to remove the '&' from all of the call sites. NFC
llvm-svn: 305493
2017-06-15 19:09:51 +00:00
Lei Huang b4733ca8c5 [MachineLICM] Hoist TOC-based address instructions
Add condition for MachineLICM to safely hoist instructions that utilize
non constant registers that are reserved.

On PPC, global variable access is done through the table of contents (TOC)
which is always in register X2.  The ABI reserves this register in any
functions that have calls or access global variables.

A call through a function pointer involves saving, changing and restoring
this register around the call and thus MachineLICM does not consider it to
be invariant. We can however guarantee the register is preserved across the
call and thus is invariant.

Differential Revision: https://reviews.llvm.org/D33562

llvm-svn: 305490
2017-06-15 18:29:59 +00:00
Benjamin Kramer 00a970a84b Fold variable into assert.
Silences an unused variable warning in Release builds.

llvm-svn: 305488
2017-06-15 17:58:24 +00:00
Craig Topper 6eec9e21a5 [InstCombine] Handle (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2) when the one of the Ands is commuted relative to the other
Currently we expect A to be on the same side in both Ands but nothing guarantees that.

While there also switch to using matchers for some of the code.

Differential Revision: https://reviews.llvm.org/D34230

llvm-svn: 305487
2017-06-15 17:55:20 +00:00
Peter Collingbourne 7e85b26549 Silence warning with assertions disabled.
llvm-svn: 305485
2017-06-15 17:41:32 +00:00
Arnold Schwaighofer ae9312c487 ISel: Fix FastISel of swifterror values
The code assumed that we process instructions in basic block order.  FastISel
processes instructions in reverse basic block order. We need to pre-assign
virtual registers before selecting otherwise we get def-use relationships wrong.

This only affects code with swifterror registers.

rdar://32659327

llvm-svn: 305484
2017-06-15 17:34:42 +00:00
Peter Collingbourne dbd2fed6a1 Apply summary-based dead stripping to regular LTO modules with summaries.
If a regular LTO module has a summary index, then instead of linking
it into the combined regular LTO module right away, add it to the
combined summary index and associate it with a special module that
represents the combined regular LTO module.

Any such modules are linked during LTO::run(), at which time we use
the results of summary-based dead stripping to control whether to
link prevailing symbols.

Differential Revision: https://reviews.llvm.org/D33922

llvm-svn: 305482
2017-06-15 17:26:13 +00:00
Craig Topper 587525468d [BasicAA] Don't call isKnownNonEqual if we might be have gone through a PHINode.
This is a fix for the test case in PR32314.

Basic Alias Analysis can ask if two nodes are known non-equal after looking through a phi node to find a GEP. isAddOfNonZero saw an add of a constant from the same phi and said that its output couldn't be equal. But Basic Alias Analysis was really asking about the value from the previous loop iteration.

This patch at least makes that case not happen anymore, I'm not sure if there were still other ways this can fail. As was discussed in the bug, it looks like fixing BasicAA would be difficult so this patch seemed like a possible workaround

Differential Revision: https://reviews.llvm.org/D33136

llvm-svn: 305481
2017-06-15 17:16:56 +00:00
Hiroshi Inoue 7a08bb1458 [PowerPC] fix potential verification errors on CFENCE8
This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs.

Differential Revision: https://reviews.llvm.org/D34208

llvm-svn: 305479
2017-06-15 16:51:28 +00:00
Simon Dardis 24ca9da2de [mips] Fix documentation of member variable. NFCI.
llvm-svn: 305478
2017-06-15 16:28:28 +00:00
Nirav Dave 9d79cade42 [DAG] As StoreMerge now generates only legal nodes remove unecessary guard when run post-legalization NFCI.
llvm-svn: 305477
2017-06-15 16:27:49 +00:00
Simon Pilgrim 07cfc80186 Remove trailing whitespace. NFCI.
llvm-svn: 305476
2017-06-15 16:20:27 +00:00
Nirav Dave be2674a598 [DAG] Defer Pre/Post IndexStore merge to after mergestore. NFCI.
In preparation for doing storemerge post-legalization, reorder
visitSTORE passes to move pre/post-index combining after store
merge. Reordered passes other than store merge are unaffected.

llvm-svn: 305473
2017-06-15 15:05:48 +00:00
Simon Pilgrim 4d432b2c6b [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was assuming v8i16 vectors
We can use this with v16i16/v32i16 as well.

Found during fuzz testing.

llvm-svn: 305472
2017-06-15 14:52:30 +00:00
Nirav Dave 85e92223b4 [AArch64] Add indexed check to splitStores. NFC.
Add explicit check for unhandled cases in preparation for delaying
splitStores to post-legalization.

llvm-svn: 305471
2017-06-15 14:47:44 +00:00
Simon Pilgrim b98cb3808c Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions).
This is causing windows buildbot failures

llvm-svn: 305470
2017-06-15 14:39:34 +00:00
Nirav Dave 9464c72850 [DAG] Allow truncated and extend memory operations in Store Merge. NFCI.
As all store merges checks are based on the memory operation
performed, allow use of truncated stores and extended loads as valid
input candidates for merging.

llvm-svn: 305468
2017-06-15 14:04:07 +00:00
Nirav Dave 6a41822ba7 [DAG] Make MergeStores generate legalized stores. NFCI.
Realized merged stores as truncstores if store will be realized as
such by legalization.

llvm-svn: 305467
2017-06-15 13:34:54 +00:00
Nirav Dave 9a4998980d [DAG] Use correct size for truncated store merge of load. NFCI.
Avoid non-legal memory ops by checking correct size when merging
stores of loads into a extload-truncstore pair.

llvm-svn: 305466
2017-06-15 13:28:06 +00:00
Ayman Musa 56912cda71 [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions).
AVX512 compare instructions return v*i1 types.
In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type.
Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes.
The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class.

When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction.

Differential Revision: https://reviews.llvm.org/D33188

llvm-svn: 305465
2017-06-15 13:02:37 +00:00
Max Kazantsev dc80366d52 [ScalarEvolution] Apply Depth limit to getMulExpr
This is a fix for PR33292 that shows a case of extremely long compilation
of a single .c file with clang, with most time spent within SCEV.

We have a mechanism of limiting recursion depth for getAddExpr to avoid
long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr
and back that do not propagate the info about depth. As result of this, a chain

  getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr

can be extremely long, with every segment of getAddExpr's being up to max depth long.
This leads either to long compilation or crash by stack overflow. We face this situation while
analyzing big SCEVs in the test of PR33292.

This patch applies the same limit on max expression depth for getAddExpr and getMulExpr.

Differential Revision: https://reviews.llvm.org/D33984

llvm-svn: 305463
2017-06-15 11:48:21 +00:00
Diana Picus 02e11010b2 [ARM] GlobalISel: Add support for i32 modulo
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.

The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.

llvm-svn: 305459
2017-06-15 10:53:31 +00:00
Diana Picus 8fd1601d32 [ARM] GlobalISel: Lower only homogeneous struct args
Lowering mixed struct args, params and returns used G_INSERT, which is a
bit more convoluted to support through the entire pipeline. Since they
don't occur that often in practice, it's probably wiser to leave them
out until later.

Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which
has good support in the legalizer. These occur e.g. as the return of
__aeabi_idivmod, so it's nice to be able to support them.

llvm-svn: 305458
2017-06-15 09:42:02 +00:00
Florian Hahn 0a26d2c298 [AArch64] Enable FeatureFuseAES for the generic processor model.
Summary:
Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back
gives a double digit speedup on benchmarks using those instructions on
Cortex-A processors. In GCC, this optimization is part of the generic
processor model as well.

This change should not have a major performance impact on processors
that do not optimize AES instruction pairs, although I only had access
to Cortex-A processors for benchmarking.


Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover

Reviewed By: evandro

Subscribers: sbaranga, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33836

llvm-svn: 305457
2017-06-15 09:31:23 +00:00
Zoran Jovanovic d9299293ad [mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SP
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
ADDIU instruction is transformed into 16-bit instruction ADDIUSP
ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP
Differential Revision: https://reviews.llvm.org/D33887

llvm-svn: 305455
2017-06-15 09:14:33 +00:00
George Karpenkov 406c113103 Fixing section name for Darwin platforms for sanitizer coverage
On Darwin, section names have a 16char length limit.

llvm-svn: 305429
2017-06-14 23:40:25 +00:00
Peter Collingbourne 5aa56d2d6e IR: Tweak the API around adding modules to the summary index.
The current name (addModulePath) and return value
(ModulePathStringTableTy::iterator) is a little confusing. This
API adds a module, not just a path. And the iterator is basically
just an implementation detail of the summary index. Address
both of those issues by renaming to addModule and introducing a
ModuleSummaryIndex::ModuleInfo type that the function returns.

Differential Revision: https://reviews.llvm.org/D34124

llvm-svn: 305422
2017-06-14 22:35:27 +00:00
Daniel Berlin 6d2db9edb2 PredicateInfo: Don't insert conditional info when a conditional branch jumps to the same target regardless of condition
llvm-svn: 305416
2017-06-14 21:19:52 +00:00
Daniel Berlin 51e878e01d NewGVN: This is wrong by inspection, it will not cause an issue currently due to other limitations, i believe. This also means i can't make a test for it.
llvm-svn: 305415
2017-06-14 21:19:28 +00:00
Sanjay Patel 83cb007940 [x86] avoid unnecessary shuffle mask math in combineX86ShufflesRecursively()
This is a follow-up to https://reviews.llvm.org/D34174 / https://reviews.llvm.org/rL305398.

We mentioned replacing the multiplies with shifts, but the real win seems to be in
bypassing the extra ops in the common case when the RootRatio and OpRatio are one.

This gives us another 1-2% overall win for the test in PR32037:
https://bugs.llvm.org/show_bug.cgi?id=32037

llvm-svn: 305414
2017-06-14 20:37:11 +00:00
David Callahan 5960d9b1c3 Allow -profile-guided-section-prefix more than once
Summary:
At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times.  Our build system makes it pretty tricky to guarantee this.  We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails:

```
clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times!
```

While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by to `cl::ZeroOrMore`.  Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | the documentation ]]:

> The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times.
> ...
> If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained.

Reviewers: danielcdh

Reviewed By: danielcdh

Subscribers: twoh, david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D34219

llvm-svn: 305413
2017-06-14 20:35:33 +00:00
Davide Italiano 0dc4778067 [EarlyCSE] Make PhiToCheck in removeMSSA() a set.
This way we end up not looking at PHI args already removed.
MemSSA now goes through the updater so we can prune
it to avoid having redundant MemoryPHI arguments, but that
doesn't quite work for the general case.

Discussed with Daniel Berlin, fixes PR33406.

llvm-svn: 305409
2017-06-14 19:29:53 +00:00
Frederich Munch dceb612eeb Hide dbgs() stream for when built with -fmodules.
Summary: Make DebugCounter::print and dump methods to be const correct.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34214

llvm-svn: 305408
2017-06-14 19:16:22 +00:00
Peter Collingbourne f0e26e7270 MC, Object: Reserve a section type, SHT_LLVM_ODRTAB, for the ODR table.
This is part of the ODR checker proposal:
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113820.html

Per discussion on the gnu-gabi mailing list [1] the section type range
0x6fff4c00..0x6fff4cff is reserved for LLVM.

[1] https://sourceware.org/ml/gnu-gabi/2017-q2/msg00030.html

Differential Revision: https://reviews.llvm.org/D33978

llvm-svn: 305407
2017-06-14 18:52:12 +00:00
Galina Kistanova 3c0505d30c Specified ReportError as noreturn friendly to old compilers.
llvm-svn: 305405
2017-06-14 17:32:53 +00:00
Lei Huang f689f69fea Test commit - NFC.
Modified a comment to confirm commit access functionality.

llvm-svn: 305402
2017-06-14 17:25:55 +00:00
Craig Topper f93b7b1c1f [ValueTracking] Correct early out in computeKnownBitsFromOperator to work with non power of 2 bit widths
There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth which creates a mask in the lower bits for power of 2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2 this creates a non-sensical mask.

This patch corrects this by rounding up to a power of 2 before doing the subtract and mask.

Differential Revision: https://reviews.llvm.org/D34165

llvm-svn: 305400
2017-06-14 17:04:59 +00:00
Sanjay Patel ce0b99563a [x86] replace div/rem with shift/mask for better shuffle combining perf
We know that shuffle masks are power-of-2 sizes, but there's no way (?) for LLVM to know that, 
so hack combineX86ShufflesRecursively() to be much faster by replacing div/rem with shift/mask.

This makes the motivating compile-time bug in PR32037 ( https://bugs.llvm.org/show_bug.cgi?id=32037 ) 
about 9% faster overall.

Differential Revision: https://reviews.llvm.org/D34174

llvm-svn: 305398
2017-06-14 17:00:57 +00:00
Zachary Turner cb30e705d8 [gtest] Create a shared include directory for gtest utilities.
Many times unit tests for different libraries would like to use
the same helper functions for checking common types of errors.

This patch adds a common library with helpers for testing things
in Support, and introduces helpers in here for integrating the
llvm::Error and llvm::Expected<T> classes with gtest and gmock.

Normally, we would just be able to write:

   EXPECT_THAT(someFunction(), succeeded());

but due to some quirks in llvm::Error's move semantics, gmock
doesn't make this easy, so two macros EXPECT_THAT_ERROR() and
EXPECT_THAT_EXPECTED() are introduced to gloss over the difficulties.
Consider this an exception, and possibly only temporary as we
look for ways to improve this.

Differential Revision: https://reviews.llvm.org/D33059

llvm-svn: 305395
2017-06-14 16:41:50 +00:00
Zachary Turner a8cfc29c9a Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This was originally reverted because of some non-deterministic
failures on certain buildbots.  Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.

llvm-svn: 305393
2017-06-14 15:59:27 +00:00
Alexandros Lamprineas 1c15ee2631 Revert "[ARM] Support constant pools in data when generating execute-only code."
This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

I've upset a buildbot which runs the address sanitizer:
ERROR: AddressSanitizer: stack-use-after-scope
lib/Target/ARM/ARMISelLowering.cpp:2690
That Twine variable is used illegally.

llvm-svn: 305390
2017-06-14 15:00:08 +00:00
Simon Dardis 9790e39f45 [mips] Fix multiprecision arithmetic.
For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC,
get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs.

For MIPS, only the DSP ASE has a carry flag, so in the general case it is not
useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes.

Also improve the generation code in such cases for targets with
TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the
comparison node rather than using it in selects. Similarly for ISD::SUBE /
ISD::SUBC.

Address optimization breakage by moving the generation of MIPS specific integer
multiply-accumulate nodes to before legalization.

This revolves PR32713 and PR33424.

Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue!

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D33494

llvm-svn: 305389
2017-06-14 14:46:30 +00:00
Alexandros Lamprineas c582d6e133 [ARM] Support constant pools in data when generating execute-only code.
The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305387
2017-06-14 13:22:41 +00:00
Florian Hahn ffc498dfcc Align definition of DW_OP_plus with DWARF spec [3/3]
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
 
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
 
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.

Patch by Sander de Smalen.

Reviewers: echristo, pcc, aprantl

Reviewed By: aprantl

Subscribers: fhahn, javed.absar, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D33894

llvm-svn: 305386
2017-06-14 13:14:38 +00:00
Simon Dardis 941a49b6d6 [mips] Fix machine verifier errors in the long branch pass
This patch fixes two systemic machine verifier errors in the long
branch pass. The first is the incorrect basic block successors
and the second was the incorrect construction of several jump
instructions.

This partially resolves PR27458 and the associated PR32146.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D33378

llvm-svn: 305382
2017-06-14 12:16:47 +00:00
Nemanja Ivanovic 7855185bbb Revert r304907 as it is causing some failures that I cannot reproduce.
Reverting this until a test case can be provided to aid the investigation.

llvm-svn: 305372
2017-06-14 07:05:42 +00:00
Zachary Turner 0085dce221 Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This is causing failures on linux bots with an invalid stream
read.  It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.

llvm-svn: 305371
2017-06-14 06:24:24 +00:00
Zachary Turner ba59c2f63b Use make_shared instead of make_unique.
llvm-svn: 305369
2017-06-14 05:48:33 +00:00
Zachary Turner 6bffe44659 Fix some more errors.
llvm-svn: 305368
2017-06-14 05:44:38 +00:00
Zachary Turner a3da4467fa [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.

Differential Revision: https://reviews.llvm.org/D34127

llvm-svn: 305366
2017-06-14 05:31:00 +00:00
Peter Collingbourne b78a68db7b Support: Remove MSVC 2013 workarounds in ThreadPool class.
I have confirmed that these are no longer needed with MSVC 2015.

Differential Revision: https://reviews.llvm.org/D34187

llvm-svn: 305347
2017-06-14 00:36:21 +00:00
Kostya Serebryany 546a286cef [libFuzzer] really restrict the new test to Linux (fails on Mac/Windows currently)
llvm-svn: 305346
2017-06-14 00:34:42 +00:00
Spyridoula Gravani e41823bb89 Added partial verification for .apple_names accelerator table in llvm-dwarfdump output.
This patch adds code which verifies that each bucket in the .apple_names
accelerator table is either empty or has a valid hash index.

Differential Revision: https://reviews.llvm.org/D34177

llvm-svn: 305344
2017-06-14 00:17:55 +00:00
Galina Kistanova 41def9b72c Reverted r305339 as MSVC is not happy with noreturn in lambda.
llvm-svn: 305343
2017-06-13 23:57:51 +00:00
Daniel Sanders 4e52366c2a [globalisel][legalizer] G_LOAD/G_STORE NarrowScalar should not emit G_GEP x, 0.
Summary:
When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting
	%0 = G_CONSTANT ty 0
	%1 = G_GEP %x, %0
since it's cheaper to not emit the redundant instructions than it is to fold them
away later.

Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls

Reviewed By: qcolombet

Subscribers: javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D32746

llvm-svn: 305340
2017-06-13 23:42:32 +00:00
Galina Kistanova 680c7605a7 Specified LLVM_ATTRIBUTE_NORETURN for ReportError.
llvm-svn: 305339
2017-06-13 23:39:42 +00:00
Kostya Serebryany d0fb427862 [libFuzzer] restrict the new test to Linux (fails on Mac currently)
llvm-svn: 305335
2017-06-13 23:09:11 +00:00
Kostya Serebryany f2d4dcb888 [libFuzzer] initial support of -fsanitize-coverage=inline-8bit-counters in libFuzzer. This is not fully functional yet, but simple tests work
llvm-svn: 305331
2017-06-13 22:31:21 +00:00
Davide Italiano 36559b2527 [AMDGPU] Remove now dead defaultOffsetS13(). NFCI.
Fixes the GCC7 build with -Werror.

llvm-svn: 305329
2017-06-13 22:24:24 +00:00
Vedant Kumar 9c056c9e1b [InstrProf] Don't take the address of alwaysinline available_externally functions
Doing so breaks compilation of the following C program
(under -fprofile-instr-generate):

 __attribute__((always_inline)) inline int foo() { return 0; }

 int main() { return foo(); }

At link time, we fail because taking the address of an
available_externally function creates an undefined external reference,
which the TU cannot provide.

Emitting the function definition into the object file at all appears to
be a violation of the langref: "Globals with 'available_externally'
linkage are never emitted into the object file corresponding to the LLVM
module."

Differential Revision: https://reviews.llvm.org/D34134

llvm-svn: 305327
2017-06-13 22:12:35 +00:00
Eric Beckmann 0096d78bf9 Use reference to iterate through string table instead of copying.
Summary: just a quick patch

Subscribers: ruiu, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34171

llvm-svn: 305324
2017-06-13 21:05:42 +00:00
Eric Beckmann 1f76ca5a2d Fix a bug introduced in r305092 on big-endian systems.
Summary:
We were writing the length of the string based on system-endianness, and
not universally little-endian.  This fixes that.

Reviewers: zturner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D34159

llvm-svn: 305322
2017-06-13 20:53:31 +00:00
Teresa Johnson 8015f88525 [PGO] Update VP metadata after memory intrinsic optimization
Summary:
Leave an updated VP metadata on the fallback memcpy intrinsic after
specialization. This can be used for later possible expansion based on
the average of the remaining values.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34164

llvm-svn: 305321
2017-06-13 20:44:08 +00:00
Eric Beckmann 338663348a Fix alignment complaint.
Summary: Apparently we need to write using a void* pointer on some architectures, or else alignment error is caused.

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D34166

llvm-svn: 305320
2017-06-13 20:36:19 +00:00
Frederich Munch 6391c7e2a1 Revert r305313 & r305303, self-hosting build-bot isn’t liking it.
llvm-svn: 305318
2017-06-13 19:05:24 +00:00
Sam Clegg ae03c1e724 [WebAssembly] Cleanup WebAssemblyWasmObjectWriter
Differential Revision: https://reviews.llvm.org/D34131

llvm-svn: 305316
2017-06-13 18:51:50 +00:00
Eric Beckmann 907fb81327 Improve error messages in order to help with fixing a big-endian bug.
Summary: Added output to stderr so that we can actually see what is happening when the test fails on big endian.

Reviewers: zturner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34155

llvm-svn: 305314
2017-06-13 18:17:36 +00:00
Geoff Berry 13d5dcb093 [AArch64][Falkor] Fix sched details for FDIV, FSQRT, SDIV, UDIV
llvm-svn: 305310
2017-06-13 17:43:39 +00:00
Kit Barton 0b216305db Test commit - NFC.
Modified a comment to confirm commit access functionality.

llvm-svn: 305309
2017-06-13 17:35:29 +00:00
Krzysztof Parzyszek b3a8d20e27 [Hexagon] Generate store-immediate instructions for stack objects
Store-immediate instructions have a non-extendable offset. Since the
actual offset for a stack object is not known until much later, only
generate these stores when the stack size (at the time of instruction
selection) is small.

llvm-svn: 305305
2017-06-13 17:10:16 +00:00
Florian Hahn c9c403c0d4 Align definition of DW_OP_plus with DWARF spec [1/3]
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
 
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
 
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.

Patch by Sander de Smalen.

Reviewers: pcc, echristo, aprantl

Reviewed By: aprantl

Subscribers: fhahn, aprantl, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33892

llvm-svn: 305304
2017-06-13 16:54:44 +00:00
Frederich Munch 4c73b40dca Force RegisterStandardPasses to construct std::function in the IPO library.
Summary: Fixes an issue using RegisterStandardPasses from a statically linked object before PassManagerBuilder::addGlobalExtension is called from a dynamic library.

Reviewers: efriedma, theraven

Reviewed By: efriedma

Subscribers: mehdi_amini, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33515

llvm-svn: 305303
2017-06-13 16:48:41 +00:00
Krzysztof Parzyszek c83c267b84 [Hexagon] Generate multiply-high instruction in isel
llvm-svn: 305302
2017-06-13 16:21:57 +00:00
Yonghong Song 7e9d2cb553 bpf: clang-format on BPFAsmPrinter.cpp
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 305301
2017-06-13 16:17:20 +00:00
Krzysztof Parzyszek de2ac17b7b [Hexagon] Don't kill live registers when creating mux out of tfr
When a mux instruction is created from a pair of complementary conditional
transfers, it can be placed at the location of either the earlier or the
later of the transfers. Since it will use the operands of the original
transfers, putting it in the earlier location may hoist a kill of a source
register that was originally further down. Make sure the kill flag is
removed if the register is still used afterwards.

llvm-svn: 305300
2017-06-13 16:07:36 +00:00
Reid Kleckner 8cbdd0c0f2 [PDB] Add a module descriptor for every object file
Summary:
Expose the module descriptor index and fill it in for section
contributions.

Reviewers: zturner

Subscribers: llvm-commits, ruiu, hiraditya

Differential Revision: https://reviews.llvm.org/D34126

llvm-svn: 305296
2017-06-13 15:49:13 +00:00
Simon Dardis c38d391f56 [MIPS] BuildCondBr should preserve MO flags
While simplifying branches in the MachineInstr representation, the
routine BuildCondBr must preserve flags on register MachineOperands. In
particular, it must preserve the <undef> flag.

This fixes a bug that is unlikely to occur in any real scenario, but
which bugpoint is likely to introduce.

Patch By Nick Johnson!

Reviewers: ahatanak, sdardis

Differential Revision: https://reviews.llvm.org/D34041

llvm-svn: 305290
2017-06-13 14:11:29 +00:00
Krzysztof Parzyszek 9bd4d91037 [Hexagon] Stop pmpy recognition when shift conversion fails
The conversion of shifts from right shifts to left shifts may fail.
In such case, the pmpy recognition cannot proceed.

llvm-svn: 305289
2017-06-13 13:51:49 +00:00
Oliver Stannard 852fbd2fea [ARM] Add scheduling classes for VFNM[AS]
The VFNM[AS] instructions did not have scheduling information attached, which
was causing assertion failures with the Cortex-A57 scheduling model and
-fp-contract=fast, because the Cortex-A57 sched model claims to be complete.

Differential Revision: https://reviews.llvm.org/D34139

llvm-svn: 305288
2017-06-13 13:04:32 +00:00
Simon Pilgrim 9ff06a0c7e Strip UTF8 BOM that got added in rL305091
Seems my recent move to VS2017 has resulted in a few text editor issues.....

llvm-svn: 305285
2017-06-13 10:17:57 +00:00
Simon Pilgrim 2b3b717768 [X86][SSE] Refactor getTargetConstantBitsFromNode to avoid large APInts (PR32037)
Much of PR32037's compile time regression is due to getTargetConstantBitsFromNode always creating large (>64bit) APInts during the bitcasting from the source data to the destination bitwidth.

This commit avoids this bitcast stage if the data is already the correct bitwidth.

llvm-svn: 305284
2017-06-13 10:13:48 +00:00
Simon Pilgrim 7ce9926ce4 Strip UTF8 BOM that got added for some reason in rL305163
llvm-svn: 305282
2017-06-13 09:58:27 +00:00
NAKAMURA Takumi 3807ab24c6 PPCISelLowering.cpp: Fix warnings in r305214. [-Wdocumentation]
llvm-svn: 305277
2017-06-13 07:34:32 +00:00
Craig Topper 8b8767662c [AVX-512] Mark masked VPCMP instructions as commutable.
llvm-svn: 305276
2017-06-13 07:13:50 +00:00
Craig Topper e1d8103d8f [AVX-512] Mark masked version of vpcmpeq as being commutable.
llvm-svn: 305275
2017-06-13 07:13:47 +00:00
Craig Topper 42d0339257 [X86] Add masked integer compare instructions to load folding tables.
llvm-svn: 305274
2017-06-13 07:13:44 +00:00
David Blaikie 6d0f39476a Inliner: Avoid calling shouldInline until it's absolutely necessary
This restores the order of evaluation (& conditionalized evaluation) of
isTriviallyDeadInstruction, InlineHistoryIncludes, and shouldInline
(with the addition of a shouldInline call after
isTriviallyDeadInstruction) from before r305245.

llvm-svn: 305267
2017-06-13 02:24:09 +00:00
Sam Clegg 7736855dee [WebAssembly] Fix symbol type for addresses of external functions
These symbols were previously not being marked as functions
so were appearing as globals instead, and with the incorrect
relocation type.

Without this fix, objects that take address of external
functions include them as global imports rather than function
imports which then fails at link time.

Differential Revision: https://reviews.llvm.org/D34068

llvm-svn: 305263
2017-06-13 01:42:21 +00:00
George Burgess IV f613749382 Fix signed/unsigned comparison warning; NFC
llvm-svn: 305262
2017-06-13 01:28:49 +00:00
Eric Beckmann 382eaabb1f Revert "Revert "Fix alignment bug in COFF emission.""
This revert was done so that my other patch to add test framework could
land separately.  Now the revert can be reverted and this patch can
reland.

This reverts commit 18b3c75b2b0d32601fb60a06b9672c33d6f0dff9.

llvm-svn: 305259
2017-06-13 00:19:43 +00:00
Eric Beckmann 1301759792 Update the test framework for llvm-cvtres to be more comprehensive.
Summary: Added test cases for multiple machine types, file merging, multiple languages, and more resource types.  Also fixed new bugs these tests exposed.

Subscribers: javed.absar, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34047

llvm-svn: 305258
2017-06-13 00:16:32 +00:00
Eric Beckmann 56951cb031 Revert "Fix alignment bug in COFF emission."
I accidentally combined this patch with one for adding more tests, they
should be separated.

This reverts commit 3da218a523be78df32e637d3446ecf97c9ea0465.

llvm-svn: 305257
2017-06-13 00:15:47 +00:00
Eric Beckmann 5ee9eca868 Fix alignment bug in COFF emission.
Summary: Fix alignment issue in D34020, by aligning all sections to 8 bytes.

Reviewers: zturner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D34072

llvm-svn: 305256
2017-06-13 00:06:10 +00:00
Sam Clegg d99f6078e4 [WebAssembly] MC: Fix value of R_WEBASSEMBLY_TABLE_INDEX relocations
Previously we were writing the value function index space
value but for these types of relocations we want to be
writing the table element index space value.

Add a test case for these relocation types that fails
without this change.

Differential Revision: https://reviews.llvm.org/D33962

llvm-svn: 305253
2017-06-12 23:52:44 +00:00
Craig Topper 6364bfa0f7 [IR] Stop deleting other signatures of User::operator new when we override one signature in a class derived from User
User has 3 signatures for operator new today. They take a single size, a size and a number of users, and a size, number of users, and descriptor size.

Historically there used to only be one signature that took size and a number of uses. Long ago derived classes implemented their own versions that took just a size and would call the size and use count version. Then they left an unimplemented signature for the size and use count signature from User. As we moved to C++11 this unimplemented signature because = delete.

Since then operator new has picked up two new signatures for operator new. But when the 3 argument version was added it was never added to the delete list in all of the derived classes where the 2 argument version is deleted. This makes things inconsistent.

I believe once one version of operator new is created in a derived class name hiding will take care of making all of the base class signatures unavailable. So I don't think the deleted lines are needed at all.

This patch removes all of the deletes in cases where there is an override or there is already a delete of another signature (that should trigger name hiding too).

Differential Revision: https://reviews.llvm.org/D34120

llvm-svn: 305251
2017-06-12 23:25:15 +00:00
Zachary Turner 606d766538 [pdb] Don't choke on unknown symbol types.
When we get an unknown symbol type, we might as well at least
dump it.  Same goes for round-tripping through YAML, we can
dump the record contents as raw bytes even if we don't know
how to interpret it semantically.

llvm-svn: 305248
2017-06-12 23:10:31 +00:00
David Blaikie ae8c4af4ac Inliner: Don't remove calls to readnone+nounwind (but not always_inline) functions in the AlwaysInliner
llvm-svn: 305245
2017-06-12 23:01:17 +00:00
Adrian Prantl f45e6462ca Fix an assertion failure when duplicate dbg.declares are present.
This fixes PR33157.
https://bugs.llvm.org//show_bug.cgi?id=33157

We might also think about disallowing duplicate dbg.declare intrinsics
entirely, but this may complicate some passes needlessly.

llvm-svn: 305244
2017-06-12 22:41:06 +00:00
Sanjay Patel 2ad88f81f0 fix typos/formatting; NFC
llvm-svn: 305243
2017-06-12 22:34:37 +00:00
David Blaikie 602a5bbb32 Support: Don't set RLIMIT_AS on child processes when applying a memory limit
It doesn't seem relevant to set an address space limit - this isn't
important in any sense that I'm aware & it gets in the way of things
that use a lot of address space, like llvm-symbolizer.

This came up when I realized that bugpoint regression tests were much
slower with -gsplit-dwarf than plain -g. Turned out that bugpoint
subprocesses (opt, etc) were crashing and doing symbolization - but
bugpoint runs those subprocesses with a 400MB memory limit. So with
plain -g, mmaping the opt binary would exceed the memory limit, fail,
and thus be really fast - no symbolization occurred. Whereas with
-gsplit-dwarf, comically, having less to map in, it would succeed and
then spend lots of time symbolizing.

I've fixed at least the critical part of bugpoint's perf problem there
by adding an option to allow bugpoint to disable symbolization. Thus
improving the perfromance for -gsplit-dwarf and making the -g-esque
speed available without this quirk/accidental benefit.

llvm-svn: 305242
2017-06-12 22:16:49 +00:00
Zachary Turner 68ea80d0a7 Slightly better fix for dealing with no-id-stream PDBs.
The last fix required the user to manually add the required
feature.  This caused an LLD test to fail because I failed to
update LLD.  In practice we can hide this logic so it can just
be transparently added when we write the PDB.

llvm-svn: 305236
2017-06-12 21:46:51 +00:00
Zachary Turner 990d0c8158 [llvm-pdbdump] Don't fail on PDBs with no ID stream.
Older PDBs don't have this.  Its presence is detected by using
the various "feature" flags that come at the end of the PDB
Stream.  Detect this, and don't try to dump the ID stream if the
features tells us it's not present.

llvm-svn: 305235
2017-06-12 21:34:53 +00:00
Anna Thomas 4b027e8f89 [RS4GC] Drop invalid metadata after pointers are relocated
Summary:
After RS4GC, we should drop metadata that is no longer valid. These metadata
is used by optimizations scheduled after RS4GC, and can cause a miscompile.
One such metadata is invariant.load which is used by LICM sinking transform.
After rewriting statepoints, the address of a load maybe relocated. With
invariant.load metadata on a load instruction, LICM sinking assumes the
loaded value (from a dererenceable address) to be invariant, and
rematerializes the load operand and the load at the exit block.
This transforms the IR to have an unrelocated use of the
address after a statepoint, which is incorrect.
Other metadata we conservatively remove are related to
dereferenceability and noalias metadata.

This patch drops such metadata on store and load instructions after
rewriting statepoints.

Reviewers: reames, sanjoy, apilipenko

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33756

llvm-svn: 305234
2017-06-12 21:26:53 +00:00
Tom Stellard ee6e6452df AMDGPU/GlobalISel: Mark 32-bit G_ADD as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D33992

llvm-svn: 305232
2017-06-12 20:54:56 +00:00
George Burgess IV 1f990e5b4f [ADT] Reduce duplication between {Contextual,}FoldingSet; NFC
This is a precursor to another change (coming soon) that aims to make
FoldingSet's API more type-safe. Without this, the type-safety change
would just duplicate 4 more public methods between the already very
similar classes.

This renames FoldingSetImpl to FoldingSetBase so it's consistent with
the FooBase -> FooImpl<T> -> Foo<T> convention we seem to have with
other containers.

llvm-svn: 305231
2017-06-12 20:52:53 +00:00
Tim Northover 7a61316e89 AArch64: don't try to emit an add (shifted reg) for SP.
The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr
rather than the SP, so we need to use different variants.

Situations where this actually comes up are rare enough (see test-case) that I
think falling back to DAG is fine.

llvm-svn: 305230
2017-06-12 20:49:53 +00:00
Zachary Turner d334cebac4 Fix a null pointer dereference in llvm-pdbutil pretty.
Static data members were causing a problem because I mistakenly
assumed all members would affect a class's layout and so the
Layout member would be non-null.

llvm-svn: 305229
2017-06-12 20:46:35 +00:00
Matthias Braun 76f063090b SplitKit: Fix partially live subreg splitting
Fix thinko/typo in subreg aware liverange splitting logic. I'm not sure
how to write a proper testcase for this. The original problem only
happens on an out-of-tree target. Forcing subreg enabled targets to
spill and split in a predictable way is near impossible.

llvm-svn: 305228
2017-06-12 20:30:52 +00:00
Peter Collingbourne 89061b2224 IR: Replace the "Linker Options" module flag with "llvm.linker.options" named metadata.
The new metadata is easier to manipulate than module flags.

Differential Revision: https://reviews.llvm.org/D31349

llvm-svn: 305227
2017-06-12 20:10:48 +00:00
Reid Kleckner 2f3f503d13 [llvm-ar] Make llvm-lib behave more like the MSVC archiver
Summary:
Use the filepath used to open the archive member as the archive member
name instead of the file basename. This path might be absolute or
relative.  This is important because the archive member name will show
up in the PDB, and we want our PDBs to look as much like MSVC's as
possible.

This also helps avoid an issue in our PDB module descriptor writing
code, which assumes that all module names are unique. Relative paths
still aren't guaranteed to be unique, but they're much better than
basenames, which definitely aren't unique.

Reviewers: ruiu, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33575

llvm-svn: 305223
2017-06-12 19:45:35 +00:00
Sylvestre Ledru 337804d86a Same expressions on both sides of the return
Summary:
I guess we want PointerToMemberFunction & PointerToDataMember


Fix coverity cid 1376038 


Reviewers: zturner

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34110

llvm-svn: 305219
2017-06-12 18:53:46 +00:00
Tony Jiang 1a8eec141a [PowerPC] Match vec_revb builtins to P9 instructions.
Power9 has instructions that will reverse the bytes within an element for all
sizes (half-word, word, double-word and quad-word). These can be used for the
vec_revb builtins in altivec.h. However, we implement these to match vector
shuffle nodes as that will cover both the builtins and vector shuffles that
occur in the SDAG through other means.

Differential Revision: https://reviews.llvm.org/D33690

llvm-svn: 305214
2017-06-12 18:24:36 +00:00
Tony Jiang 30a49d1a3d [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions.
Note that if we need the result of both the divide and the modulo then we
compute the modulo based on the result of the divide and not using the new
hardware instruction.

Commit on behalf of STEFAN PINTILIE.
Differential Revision: https://reviews.llvm.org/D33940

llvm-svn: 305210
2017-06-12 17:58:42 +00:00
Matt Arsenault 05c26472fa AMDGPU: Don't add same implicit use multiple times
For the last component, the same register use
was added as an implicit use and another implicit kill use.

llvm-svn: 305205
2017-06-12 17:19:20 +00:00
Geoff Berry 06c9dc3d9c [SelectionDAG] Allow sin/cos -> sincos optimization on GNU triples w/ just -fno-math-errno
Summary:
This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU
target triples.  This optimization was being inhibited when -ffast-math
wasn't set because sincos in GLibC does not set errno, while sin and cos
do.  However, this optimization will only run if the attributes on the
sin/cos calls include readnone, which is how clang represents the fact
that it doesn't care about the errno values set by these functions (via
the -fno-math-errno flag).

Reviewers: hfinkel, bogner

Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond

Differential Revision: https://reviews.llvm.org/D32921

llvm-svn: 305204
2017-06-12 17:15:41 +00:00
Matt Arsenault d9b77848f2 AMDGPU: Teach isLegalAddressingMode about flat offsets
Also fix reporting r+r as a valid addressing mode without
offsets.

llvm-svn: 305203
2017-06-12 17:06:35 +00:00
Matt Arsenault db7c6a8731 AMDGPU: Start selecting flat instruction offsets
llvm-svn: 305201
2017-06-12 16:53:51 +00:00
Matt Arsenault 89ad17ce4c AMDGPU: Verify that flat offsets aren't used pre-GFX9
For convenience the operand is always present in the instruction,
but it isn't valid to use except on GFX9.

llvm-svn: 305200
2017-06-12 16:37:55 +00:00
Haicheng Wu ef790ffd56 [Falkor] Enable SW Prefetch.
SW prefetch is good for Falkor.

Differential Revision: http://reviews.llvm.org/D34084

llvm-svn: 305199
2017-06-12 16:34:19 +00:00
Matt Arsenault fd02314113 AMDGPU: Start adding offset fields to flat instructions
llvm-svn: 305194
2017-06-12 15:55:58 +00:00
Than McIntosh 14d61436c0 StackColoring: smarter check for slot overlap
Summary:
The old check for slot overlap treated 2 slots `S` and `T` as
overlapping if there existed a CFG node in which both of the slots could
possibly be active. That is overly conservative and caused stack blowups
in Rust programs. Instead, check whether there is a single CFG node in
which both of the slots are possibly active *together*.

Fixes PR32488.

Patch by Ariel Ben-Yehuda <ariel.byd@gmail.com>

Reviewers: thanm, nagisa, llvm-commits, efriedma, rnk

Reviewed By: thanm

Subscribers: dotdash

Differential Revision: https://reviews.llvm.org/D31583

llvm-svn: 305193
2017-06-12 14:56:02 +00:00
Sanjay Patel d4765a38b4 [DAG] add helper to bind memop chains; NFCI
This step is just intended to reduce code duplication rather than change any functionality.

A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper.

Differential Revision: https://reviews.llvm.org/D33649

llvm-svn: 305192
2017-06-12 14:41:48 +00:00
Sanjay Patel 2e33bbaff0 [InstCombine] lshr (sext iM X to iN), N-M --> zext (ashr X, min(N-M, M-1)) to iN
This is a follow-up to https://reviews.llvm.org/D33879 / https://reviews.llvm.org/rL304939 ,
and was discussed in https://reviews.llvm.org/D33338.

We prefer this form because a narrower shift may be cheaper, and we can more easily fold a
zext than a sext.

http://rise4fun.com/Alive/slVe

Name: shz
%s = sext i8 %x to i12
%r = lshr i12 %s, 4
=>
%a = ashr i8 %x, 4
%r = zext i8 %a to i12 

llvm-svn: 305190
2017-06-12 14:23:43 +00:00
Daniel Neilson c0112ae8da Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

Reviewers: chandlerc, rnk, reames

Reviewed By: reames

Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D33903

llvm-svn: 305189
2017-06-12 14:22:21 +00:00
Simon Pilgrim b079c8b35b [X86][SSE] Change memop fragment to inherit from vec128load with local alignment controls
First possible step towards merging SSE/AVX memory folding pattern fragments.

Also allows us to remove the duplicate non-temporal load logic.

Differential Revision: https://reviews.llvm.org/D33902

llvm-svn: 305184
2017-06-12 10:01:27 +00:00
Craig Topper 69fead95c7 [AVX-512] Add VPCONFLICT and VPLZCNT to load folding tables.
llvm-svn: 305180
2017-06-12 04:57:31 +00:00
Yaron Keren 7d46392124 Address http://bugs.llvm.org/pr32207 by making BannerPrinted local to runOnSCC and skipping banner for function declarations.
Reviewed By: Mehdi AMINI

Differential Revision: https://reviews.llvm.org/D34086

llvm-svn: 305179
2017-06-12 02:18:50 +00:00
Sanjay Patel dcbfbb11d9 [x86] use vperm2f128 rather than vinsertf128 when there's a chance to fold a 32-byte load
I was looking closer at the x86 test diffs in D33866, and the first change seems like it 
shouldn't happen in the first place. So this patch will resolve that.

Using Agner's tables and AMD docs, vperm2f128 and vinsertf128 have identical timing for 
any given CPU model, so we should be able to interchange those without affecting perf. 
But as we can see in some of the diffs here, using vperm2f128 allows load folding, so 
we should take that opportunity to reduce code size and register pressure.

A secondary advantage is making AVX1 and AVX2 codegen more similar. Given that vperm2f128 
was introduced with AVX1, we should be selecting it in all of the same situations that we 
would with AVX2. If there's some reason that an AVX1 CPU would not want to use this 
instruction, that should be fixed up in a later pass.

Differential Revision: https://reviews.llvm.org/D33938

llvm-svn: 305171
2017-06-11 21:18:58 +00:00
Xinliang David Li 7ed6cd32ea [PartialInlining] Support shrinkwrap life_range markers
Differential Revision: http://reviews.llvm.org/D33847

llvm-svn: 305170
2017-06-11 20:46:05 +00:00
Simon Pilgrim 516938452f Fix unused variable warning on non-debug EXPENSIVE_CHECKS builds
llvm-svn: 305163
2017-06-11 12:49:29 +00:00
Amaury Sechet 2127452ff7 [DAGCombine] Make sure we check the ResNo from UADDO before combining
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.

Reviewers: joerg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34088

llvm-svn: 305162
2017-06-11 11:36:38 +00:00
Davide Italiano 83122058cf [MemorySSA] preservesAll() implies preserves<MemorySSA>(). NFCI.
llvm-svn: 305160
2017-06-11 01:05:45 +00:00
David Blaikie a91885a08c dwarfdump: Handle relocs to zlib (.zdebug*) compressed sections
llvm-svn: 305152
2017-06-10 19:32:50 +00:00
Vedant Kumar c6e9e3007b Fix a ubsan failure introduced by r305092
lib/Object/WindowsResource.cpp:578:3: runtime error: store to
misaligned address 0x7fa09aedebbe for type 'unsigned int', which
requires 4 byte alignment
0x7fa09aedebbe: note: pointer points here
00 00 03 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00
            ^

llvm-svn: 305149
2017-06-10 18:07:24 +00:00
Geoff Berry 3cca1da20c [EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default).
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline.  We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.

This is turned off by default.  A follow-up change will turn it on to
allow for easier reversion in case it breaks something.

llvm-svn: 305146
2017-06-10 15:20:03 +00:00
Galina Kistanova 038f9854ec Added llvm_unreachable as ReportError cannot be specified as noreturn.
llvm-svn: 305143
2017-06-10 07:50:14 +00:00
Wei Ding 7c3e5115a5 AMDGPU : Fix ISA Version Definitions.
Differential Revision: http://reviews.llvm.org/D28531

llvm-svn: 305137
2017-06-10 03:53:19 +00:00
Andrew Kaylor 647025f9e1 [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin
Differential Revision: https://reviews.llvm.org/D33737

llvm-svn: 305132
2017-06-09 23:18:11 +00:00
Sanjay Patel 2843cad435 [CGP] add a reference to DataLayout in MemCmpExpansion; NFCI
We're currently passing endian-ness around as a param (and not uniformly),
so this eliminates the need for that. I'd like to add a constant fold
call too, and that requires a DL.

llvm-svn: 305129
2017-06-09 23:01:05 +00:00
I-Jui (Ray) Sung 21fde385fa [AArch64] Add fallback in FastISel fp16 conversions
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
  back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)

Reviewers: kristof.beyls, jmolloy, SjoerdMeijer

Reviewed By: SjoerdMeijer

Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33734

llvm-svn: 305127
2017-06-09 22:40:50 +00:00
Craig Topper 7ad13f259f [LVI] Fix spelling error in comment. NFC
llvm-svn: 305115
2017-06-09 21:21:17 +00:00
Craig Topper 6dd9dcf26e [LVI] Const correct and rename the LVILatticeVal parameter to getPredicateResult. NFC
Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input.

llvm-svn: 305114
2017-06-09 21:18:16 +00:00
Zachary Turner 3226fe95bb [pdb] Support CoffSymbolRVA debug subsection.
llvm-svn: 305108
2017-06-09 20:46:52 +00:00
Yaxun Liu 6455b0dbf3 [SROA] Fix APInt size when load/store have different address space
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.

Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.

This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.

Diffferential Revision: https://reviews.llvm.org/D33298

llvm-svn: 305107
2017-06-09 20:46:29 +00:00
Keno Fischer 5329174cb1 [Sink] Fix predicate in legality check
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.

Reviewers: majnemer

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33179

llvm-svn: 305102
2017-06-09 19:31:10 +00:00
Stanislav Mekhanoshin 1a61ab8172 [AMDGPU] Add intrinsics for alignbit and alignbyte instructions
Differential Revision: https://reviews.llvm.org/D34046

llvm-svn: 305098
2017-06-09 19:03:00 +00:00
Zachary Turner 7e62cd17d6 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

llvm-svn: 305093
2017-06-09 17:54:36 +00:00
Eric Beckmann d9de6389fc Implement COFF emission for parsed Windows Resource ( .res) files.
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.

Reviewers: hiraditya!, zturner, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34020

llvm-svn: 305092
2017-06-09 17:34:30 +00:00
Simon Pilgrim 3d37b1a277 [X86][SSE] Add support for PACKSS nodes to faux shuffle extraction
If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle

llvm-svn: 305091
2017-06-09 17:29:52 +00:00
Craig Topper 31ce4ec2fd [LazyValueInfo] Don't run the more complex predicate handling code for EQ and NE in getPredicateResult
Summary:
Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges.

This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE.

Reviewers: anna, reames, resistor, Farhana

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34000

llvm-svn: 305086
2017-06-09 16:16:20 +00:00
Krzysztof Parzyszek 7aca2fd830 [Hexagon] Fixes and updates to the selection patterns
- Add some missing patterns.
- Use C4_cmplte in branch patterns.
- Fix signedness of immediate operand in M2_accii.

llvm-svn: 305085
2017-06-09 15:26:21 +00:00
Zvi Rackover 3a2c4b48bb SelectionDAG: Remove deleted nodes from legalized set to avoid clash with newly created nodes
Summary:
During DAG legalization loop in SelectionDAG::Legalize(),
bookkeeping of the SDNodes that were already legalized is implemented
with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers
to objects, not the objects themselves. Unfortunately, if SDNode is
deleted during legalization for some reason, LegalizedNodes set is not
informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse
space deallocated after deletion of unused nodes, for creation of new
ones. Because of this, new nodes, created during legalization often can
have pointers identical to ones that have been previously legalized,
added to the LegalizedNodes set, and deleted afterwards. This in turn
causes, that newly created nodes, sharing the same pointer as deleted
old ones, are present in LegalizedNodes *already at the moment of
creation*, so we never call Legalize on them.
The fix facilitates the fact, that DAG notifies listeners about each
modification. I have registered DAGNodeDeletedListener inside
SelectionDAG::Legalize, with a callback function that removes any
pointer of any deleted SDNode from the LegalizedNodes set. With this
modification, LegalizeNodes set does not contain pointers to nodes that
were deleted, so newly created nodes can always be inserted to it, even
if they share pointers with old deleted nodes.

Patch by pawel.szczerbuk@intel.com

The issue this patch addresses causes failures in an out-of-tree target,
and i was not able to create a reproducer for an in-tree target, hence
there is no test-case.

Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33891

llvm-svn: 305084
2017-06-09 14:53:45 +00:00
Simon Dardis 212cccb2f4 Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

The previous version of this patch had a "conditional move or jump depends on
uninitialized value".

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 305083
2017-06-09 14:37:08 +00:00
Sanjay Patel 70db424601 [SimplifyLibCalls] fix formatting; NFC
llvm-svn: 305081
2017-06-09 14:22:03 +00:00
Sanjay Patel fef83e8fb9 [ValueTracking] fix typo; NFC
llvm-svn: 305080
2017-06-09 14:21:18 +00:00
David Stuttard 82618baa0f [AMDGPU] Fix for issue in alloca to vector promotion pass
Summary:
Alloca promotion pass not dealing with non-canonical input

Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)

Also added some test cases in non-canonical form to check that it no longer crashes

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31710

llvm-svn: 305079
2017-06-09 14:16:22 +00:00
Javed Absar 9e1ff8654f [ARM] Custom machine-scheduler. NFCI.
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka. 
Differential Revision: https://reviews.llvm.org/D34039

llvm-svn: 305078
2017-06-09 14:07:21 +00:00
Nirav Dave 670109d89a [MC] Fix compiler crash in AsmParser::Lex
When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.

Patch by Alexandru Guduleasa!

Reviewers: niravd, grosbach, llvm-commits

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D33993

llvm-svn: 305077
2017-06-09 14:04:03 +00:00
Krzysztof Parzyszek 7881415510 [Hexagon] Add LLVM header to HexagonPatterns.td
llvm-svn: 305074
2017-06-09 13:30:58 +00:00
Serge Rogatch 85427c0da7 [XRay] Fix computation of function size subject to XRay threshold
Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.

The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.

I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better  estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.

Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown

Differential Revision: https://reviews.llvm.org/D34027

llvm-svn: 305072
2017-06-09 13:23:23 +00:00
Nirav Dave 43a4d8122f Prevent RemoveDeadNodes from deleted already deleted node.
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.

Reviewers: bkramer, spatel, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33731

llvm-svn: 305070
2017-06-09 12:57:35 +00:00
Oliver Stannard ad0973557c [ARM] Add scheduling info for VFMS
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

llvm-svn: 305064
2017-06-09 09:19:09 +00:00
Stefan Maksimovic add20f8f17 Test commit: remove whitespace
llvm-svn: 305059
2017-06-09 07:57:05 +00:00
David Blaikie 4a60d370e8 bugpoint: disabling symbolication of bugpoint-executed programs
Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).

Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.

I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?

 - should be OK since it defaults to "pretty", so if the crash is very
 early in opt parsing, etc, then crash reports will still be symbolized.

I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.

Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)

Reviewers: chandlerc, dannyb

Differential Revision: https://reviews.llvm.org/D33804

llvm-svn: 305056
2017-06-09 07:29:03 +00:00
Serguei Katkov 38414b57f9 [IndVars] Add an option to be able to disable LFTR
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.

Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979

llvm-svn: 305055
2017-06-09 06:11:59 +00:00
George Burgess IV a20352e13e [LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.

If we do, we get incorrect optimizations in cases like:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] - 128;
}

which gets optimized to:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] | 128;
}

Because:
- InstCombine turned `sub i32 %from.i, 128` into
  `add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
  vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
  "couldn't wrap", and changed the `add` to an `or`.

InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.

llvm-svn: 305053
2017-06-09 03:56:15 +00:00
David Blaikie cb9327b02d Inliner: Don't touch indirect calls
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.

llvm-svn: 305052
2017-06-09 03:29:20 +00:00
Rui Ueyama 365d4d0000 Fix -Wunused-variable.
llvm-svn: 305051
2017-06-09 03:26:45 +00:00
Craig Topper a420562257 [InstCombine] Pass a proper context instruction to all of the calls into InstSimplify
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.

Reviewers: dberlin, hfinkel, spatel

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33604

llvm-svn: 305049
2017-06-09 03:21:29 +00:00
Bob Haarman fdf499bf2d [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

llvm-svn: 305043
2017-06-09 01:18:10 +00:00
Zachary Turner 28c22c83e3 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

llvm-svn: 305041
2017-06-09 00:53:59 +00:00
Saleem Abdulrasool 1f62f57b37 sink DebugCompressionType into MC for exposing to clang
This is a preparatory change to expose the debug compression style to
clang.  It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.

Minor tweak to the ELF Object Writer to use a variable for re-used
values.  Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.

llvm-svn: 305038
2017-06-09 00:40:19 +00:00
Zachary Turner deb391309c [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

llvm-svn: 305037
2017-06-09 00:28:08 +00:00
Zachary Turner 1bf7762049 [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

llvm-svn: 305034
2017-06-08 23:49:01 +00:00
Evgeniy Stepanov d02dbf6b1c [CFI] Remove LinkerSubsectionsViaSymbols.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

This is the second attempt to land this change after fixing PR33316.

llvm-svn: 305031
2017-06-08 23:38:22 +00:00
Craig Topper 2aa4d39f5e [ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC
llvm-svn: 305030
2017-06-08 23:38:19 +00:00
Galina Kistanova 415ec9260f Fixed warning: dereferencing type-punned pointer will break strict-aliasing rules.
No need in reinterpret_cast<StringTableOffset &> here, as struct coff_symbol Name is a unin
with the member StringTableOffset Offset. This union member could be accessed directly.

llvm-svn: 305029
2017-06-08 23:35:52 +00:00
Craig Topper c1993fa1a3 [IR] Remove getNumSuccessorsV/getSuccessorV/setSuccessorV from the TerminatorInst subclasses as much as possible now that Value has been de-virtualized
These used to be virtual methods that would enable doing the right thing with only a TerminatorInst pointer. I believe they were also acting as vtable anchors in my cases. I think the fact that they had a separate name ending in V was to allow a version without V to be called without a virtual call in a pre-C++11 final keyword world.

Where possible the base methods in TerminatorInst dispatch directly to the public methods in the classes that have the same signature. For some classes this wasn't possible so I've left private method versions that match the name and signature of the version in TerminatorInst. All versions have been moved into the class definitions since we no longer need vtable anchors here.

Differential Revision: https://reviews.llvm.org/D34011

llvm-svn: 305028
2017-06-08 23:23:08 +00:00
Peter Collingbourne e357fbd243 Write summaries for merged modules when splitting modules for ThinLTO.
This is to prepare to allow for dead stripping of globals in the
merged modules.

Differential Revision: https://reviews.llvm.org/D33921

llvm-svn: 305027
2017-06-08 23:01:49 +00:00
Kostya Serebryany 2c2fb8896b [sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308
llvm-svn: 305026
2017-06-08 22:58:19 +00:00
Peter Collingbourne dc8c01891f Object: Move datalayout check into irsymtab::build. NFCI.
This check is a requirement of the irsymtab builder, not of any
particular caller.

Differential Revision: https://reviews.llvm.org/D33970

llvm-svn: 305023
2017-06-08 22:04:24 +00:00
Peter Collingbourne 8dde4cba4c Bitcode: Introduce a BitcodeFileContents data type. NFCI.
This data type includes the contents of a bitcode file.
Right now a bitcode file can only contain modules, but
a later change will add a symbol table.

Differential Revision: https://reviews.llvm.org/D33969

llvm-svn: 305019
2017-06-08 22:00:24 +00:00
Matthias Braun 1ee25e0c3f RegAllocPBQP: Do not assign reserved physical register
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.

(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.

(2) MachineVerifier: updated the test per MatzeB's review.

(3) +testcase

Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!

Differential Revision: https://reviews.llvm.org/D33947

llvm-svn: 305016
2017-06-08 21:30:54 +00:00
Krzysztof Parzyszek b1ada4e742 [Hexagon] Re-enable machine verifier after codegen passes
Remove "false" from the arguments to "addPass" in Hexagon's target pass
config.

llvm-svn: 305015
2017-06-08 21:25:36 +00:00
Krzysztof Parzyszek 8a7fb0fe51 [Hexagon] Skip mux generation when predicate register is undefined
llvm-svn: 305014
2017-06-08 20:56:36 +00:00
Evgeniy Stepanov 60d411b66e [MachO] Fix codegen of alias of alias.
Fixes PR33316.

llvm-svn: 305012
2017-06-08 20:49:03 +00:00
Dehao Chen e2a428bad7 Do not early-inline recursive calls in sample profile loader.
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.

Reviewers: davidxl, dnovillo, iteratee

Reviewed By: iteratee

Subscribers: iteratee, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34017

llvm-svn: 305009
2017-06-08 20:11:57 +00:00
Sanjay Patel 3b8974b51b fix formatting; NFC
llvm-svn: 305008
2017-06-08 20:00:09 +00:00
Sanjay Patel 5e370850d4 [CGP] don't expand a memcmp with nobuiltin attribute
This matches the behavior used in the SDAG when expanding memcmp.

For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776

One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801

Differential Revision: https://reviews.llvm.org/D34043

llvm-svn: 305007
2017-06-08 19:47:25 +00:00
Matt Arsenault f1202e650a AMDGPU: Work around build special casing .inc files
It complains because it assumes these were autogenerated files
in the source directory.

llvm-svn: 305005
2017-06-08 19:25:21 +00:00
Matt Arsenault 3c7581bbeb AMDGPU: Use correct register names in inline assembly
Fixes using physical registers in inline asm from clang.

llvm-svn: 305004
2017-06-08 19:03:20 +00:00
Nirav Dave 6a38cc6d67 [Hexagon] Speedup NumNodesBlocking calculation. NFCI.
llvm-svn: 305003
2017-06-08 18:49:25 +00:00
Guozhi Wei f31c56df2a [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.

This patch fixed PR32442.

Differential Revision: https://reviews.llvm.org/D31407

llvm-svn: 305001
2017-06-08 18:27:24 +00:00
Mark Searles e5c7832311 [AMDGPU] Force qsads instrs to use different dest register than source registers
The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.

Differential Revision: https://reviews.llvm.org/D33783

llvm-svn: 304998
2017-06-08 18:21:19 +00:00
Galina Kistanova e128958552 Changed a comparison operator for std::stable_sort to implement strict weak ordering.
This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.

llvm-svn: 304993
2017-06-08 17:27:40 +00:00
Zaara Syeda 79acbbe513 [Power9] Exploit vector integer extend instructions
This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword

Differential Revision: https://reviews.llvm.org/D33510

llvm-svn: 304992
2017-06-08 17:14:36 +00:00
Craig Topper db52809e77 [LazyValueInfo] Make LVILatticeVal intersect method take arguments by reference so we don't copy ConstantRanges unless we need to.
llvm-svn: 304990
2017-06-08 17:08:58 +00:00
Sanjay Patel e7c5041c2a [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion
The test diff for PowerPC shows we can better optimize if this case is one block.

For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed 
cheap and SDAG can't optimize across blocks. 

Instead of this:

_cmp_eq8:
  movq  (%rdi), %rax
  cmpq  (%rsi), %rax
  je  LBB23_1
## BB#2:                                ## %res_block
  movl  $1, %ecx
  jmp LBB23_3
LBB23_1:
  xorl  %ecx, %ecx
LBB23_3:                                ## %endblock
  xorl  %eax, %eax
  testl %ecx, %ecx
  sete  %al
  retq

We get this:

cmp_eq8:   
  movq  (%rdi), %rcx
  xorl  %eax, %eax
  cmpq  (%rsi), %rcx
  sete  %al
  retq

And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). 
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable 
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we 
can lower the IR produced here optimally.

Differential Revision: https://reviews.llvm.org/D34005

llvm-svn: 304987
2017-06-08 16:53:18 +00:00
Andrew V. Tischenko 8cb1d0931f Add scheduler classes to integer/float horizontal operations.
This patch will close PR32801.
Differential Revision: https://reviews.llvm.org/D33203

llvm-svn: 304986
2017-06-08 16:44:13 +00:00
Zachary Turner 15eb237fd3 [PDB] Don't crash on /debug:fastlink PDBs.
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value.  This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.

llvm-svn: 304982
2017-06-08 16:00:40 +00:00
Nirav Dave 62fb8498d3 InferAddressSpaces: Avoid assertion failure with replacing identical
cloned constexpr

Have cloneConstantExprWithNewAddressSpaces return nullptr when
returning initial ConstantExpr.

Reviewers: arsenm

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D33995

llvm-svn: 304975
2017-06-08 13:20:55 +00:00
Andrew V. Tischenko e0531025f8 This patch closes PR28513: an optimization of multiplication by different constants.
The initial patch was rejected: I fixed the issue and re-apply it.

llvm-svn: 304972
2017-06-08 10:20:13 +00:00
John Brawn da4a68a1d2 [BPI] Don't assume that strcmp returning >0 is more likely than <0
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.

Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.

Differential Revision: https://reviews.llvm.org/D33934

llvm-svn: 304970
2017-06-08 09:44:40 +00:00
Peter Collingbourne c00c2b246b Object: Factor out the code for creating the irsymtab for an arbitrary bitcode file.
This code now lives in lib/Object. The idea is that it can now be reused by
IRObjectFile among other things.

Differential Revision: https://reviews.llvm.org/D31921

llvm-svn: 304958
2017-06-08 01:26:14 +00:00
Eugene Zelenko 6ac7a34816 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304954
2017-06-07 23:53:32 +00:00
David Blaikie 7a9b788830 GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected
llvm-svn: 304945
2017-06-07 21:37:39 +00:00
Sanjay Patel 66f7fdb300 [InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2)
This was discussed in D33338. We have larger pattern-matching ending in a truncate that 
we can reduce or remove by handling these smaller patterns first. Further motivation is 
that narrower shift ops are easier for value tracking and zext is better than sext.

http://rise4fun.com/Alive/rhh

Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7

=>

%r = zext i1 %x to i8

Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7

=>

%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8

Differential Revision: https://reviews.llvm.org/D33879

llvm-svn: 304939
2017-06-07 20:32:08 +00:00
Krzysztof Parzyszek 5ba13825f0 [Hexagon] Generate 'inbounds' GEPs in HexagonCommonGEP
llvm-svn: 304937
2017-06-07 20:04:33 +00:00
Nirav Dave 772ea3ae1a [DAG] Improve Store Merge candidate pruning. NFC.
When considering merging stores values are the results of loads only
consider stores whose values come from loads from the same base.

This fixes much of the longer compile times in PR33330.

llvm-svn: 304934
2017-06-07 18:51:56 +00:00
Xinliang David Li 4f49bee764 Fix builin_expect lowering bug
PR33346

Skip cases when expected value is not constant int.

llvm-svn: 304933
2017-06-07 18:32:24 +00:00
Alina Sbirlea 33e5872367 [mssa] Fix case when there is no definition in a block prior to an inserted use.
Summary:
Check that the first access before one being tested is valid.
Before this patch, if there was no definition prior to the Use being tested,
the first time Iter was deferenced, it hit the sentinel.

Reviewers: dberlin, gbiv

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D33950

llvm-svn: 304926
2017-06-07 16:46:53 +00:00
Sanjay Patel 8ce1e3b759 [CGP] avoid zext/trunc of a memcmp expansion compare
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.

The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:

_cmp_eq4_zexted_to_i64:
  movl  (%rdi), %ecx
  movl  (%rsi), %edx
  xorl  %eax, %eax
  cmpq  %rdx, %rcx
  sete  %al

_cmp_eq4_better:
  movl  (%rdi), %ecx
  xorl  %eax, %eax
  cmpl  (%rsi), %ecx
  sete  %al

llvm-svn: 304923
2017-06-07 16:16:45 +00:00
Dmitry Preobrazhensky 5a2f881b39 [AMDGPU][MC] Corrected error message for s_waitcnt helpers
See Bug 32711: https://bugs.llvm.org//show_bug.cgi?id=32711

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D33781

llvm-svn: 304922
2017-06-07 16:08:02 +00:00
Peter Collingbourne aaae7eed5c LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else).
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.

Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.

Differential Revision: https://reviews.llvm.org/D33925

llvm-svn: 304921
2017-06-07 15:49:14 +00:00
Sanjay Patel cf531ca50c [CGP] pass size as param in MemCmpExpansion; NFCI
Avoid extracting the constant int twice.

llvm-svn: 304920
2017-06-07 15:05:13 +00:00
Petar Jovanovic 2f5f8e947a [mips][dsp] Modify repl.ph to accept signed immediate values
Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33594

llvm-svn: 304918
2017-06-07 14:48:46 +00:00
Sanjay Patel af515d9497 [CGP] pass size as param in MemCmpExpansion; NFCI
Avoid extracting the constant int twice.

llvm-svn: 304917
2017-06-07 14:45:49 +00:00
Sanjay Patel 4137d51bc1 [CGP] getParent()->getParent() --> getFunction(); NFCI
llvm-svn: 304916
2017-06-07 14:29:52 +00:00
Jonas Paulsson ae8d22cee2 [SystemZ] Propagate MachineMemOperands
In emitCondStore() and emitMemMemWrapper().

Review: Ulrich Weigand
llvm-svn: 304913
2017-06-07 14:08:34 +00:00
Simon Pilgrim be8866f691 [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.
This will allow commutation of target-specific DAG nodes in future patches

Differential Revision: https://reviews.llvm.org/D33882

llvm-svn: 304911
2017-06-07 14:05:04 +00:00
Tom Stellard 2860a428f7 AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33949

llvm-svn: 304910
2017-06-07 13:54:51 +00:00
Sanjay Patel 6e8e7cc70e [x86] avoid flipping sign bits for vector icmp by using known bits
If we know that both operands of an unsigned integer vector comparison are non-negative, 
then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality
integer vector compare predicate provided by SSE/AVX).

We're intentionally not changing the condition code to signed in order to preserve the
existing transforms that use min/max/psubus below here.

This should solve PR33276:
https://bugs.llvm.org/show_bug.cgi?id=33276

Differential Revision: https://reviews.llvm.org/D33862

llvm-svn: 304909
2017-06-07 13:46:34 +00:00
Sanjay Patel 6007000824 [CGP] add helper function for generating compare of load pairs; NFCI
In the special (but also the likely common) case, we can avoid
the multi-block complexity of the general algorithm, so moving
this part off on its own will make it re-usable.

llvm-svn: 304908
2017-06-07 13:33:00 +00:00
Nemanja Ivanovic d8623f0825 [PowerPC] Eliminate integer compare instructions - vol. 5
Adds handling for i64 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33720

llvm-svn: 304907
2017-06-07 13:18:06 +00:00
Petar Jovanovic 3c039d968e [mips] do not use FastISel when -mxgot is present
The clang compiler by default uses FastISel when invoked with -O0, which
is also the default. In that case, passing of -mxgot does not get honored,
i.e. the code path that is to deal with large got is not taken.
Clang produces same output regardless of -mxgot being present or not.
This change checks whether -mxgot is passed as an option, and turns off
FastISel if it is.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33593

llvm-svn: 304906
2017-06-07 12:59:53 +00:00
Florian Hahn 28a61d64e2 [ARM] Use FixupKind variable in processFixupValue (cleanup, NFC).
llvm-svn: 304905
2017-06-07 12:58:08 +00:00
Sanjay Patel ab0ecc00b7 [CGP] fix formatting in MemCmpExpansion; NFC
llvm-svn: 304903
2017-06-07 12:44:36 +00:00
Diana Picus 0b4190a9d6 [ARM] GlobalISel: Purge G_SEQUENCE
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.

This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.

There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.

llvm-svn: 304902
2017-06-07 12:35:05 +00:00
Nemanja Ivanovic bb67f847d6 [PowerPC] Eliminate integer compare instructions - vol. 3
Adds handling for i32 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33718

llvm-svn: 304901
2017-06-07 12:23:41 +00:00
Diana Picus 0196427b03 [ARM] GlobalISel: Support G_XOR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code

llvm-svn: 304898
2017-06-07 11:57:30 +00:00
Simon Dardis 7c96ba1920 evert "[mips] Fix test mips64fpldst.ll with machine verifier enabled"
This reverts commit r301394. It broke some internal buildbots, reverting
while the issue is being investigated.

llvm-svn: 304896
2017-06-07 11:21:37 +00:00
Simon Pilgrim 58f5be2771 [X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining
We were checking that the index was in range of the destination vector type, not the (larger) source vector type

llvm-svn: 304894
2017-06-07 10:30:35 +00:00
Diana Picus eeb0aad8e4 [ARM] GlobalISel: Support G_OR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code

llvm-svn: 304890
2017-06-07 10:14:23 +00:00