Commit Graph

104393 Commits

Author SHA1 Message Date
Vedant Kumar c6e9e3007b Fix a ubsan failure introduced by r305092
lib/Object/WindowsResource.cpp:578:3: runtime error: store to
misaligned address 0x7fa09aedebbe for type 'unsigned int', which
requires 4 byte alignment
0x7fa09aedebbe: note: pointer points here
00 00 03 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00
            ^

llvm-svn: 305149
2017-06-10 18:07:24 +00:00
Geoff Berry 3cca1da20c [EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default).
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline.  We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.

This is turned off by default.  A follow-up change will turn it on to
allow for easier reversion in case it breaks something.

llvm-svn: 305146
2017-06-10 15:20:03 +00:00
Galina Kistanova 038f9854ec Added llvm_unreachable as ReportError cannot be specified as noreturn.
llvm-svn: 305143
2017-06-10 07:50:14 +00:00
Wei Ding 7c3e5115a5 AMDGPU : Fix ISA Version Definitions.
Differential Revision: http://reviews.llvm.org/D28531

llvm-svn: 305137
2017-06-10 03:53:19 +00:00
Andrew Kaylor 647025f9e1 [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin
Differential Revision: https://reviews.llvm.org/D33737

llvm-svn: 305132
2017-06-09 23:18:11 +00:00
Sanjay Patel 2843cad435 [CGP] add a reference to DataLayout in MemCmpExpansion; NFCI
We're currently passing endian-ness around as a param (and not uniformly),
so this eliminates the need for that. I'd like to add a constant fold
call too, and that requires a DL.

llvm-svn: 305129
2017-06-09 23:01:05 +00:00
I-Jui (Ray) Sung 21fde385fa [AArch64] Add fallback in FastISel fp16 conversions
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
  back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)

Reviewers: kristof.beyls, jmolloy, SjoerdMeijer

Reviewed By: SjoerdMeijer

Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33734

llvm-svn: 305127
2017-06-09 22:40:50 +00:00
Craig Topper 7ad13f259f [LVI] Fix spelling error in comment. NFC
llvm-svn: 305115
2017-06-09 21:21:17 +00:00
Craig Topper 6dd9dcf26e [LVI] Const correct and rename the LVILatticeVal parameter to getPredicateResult. NFC
Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input.

llvm-svn: 305114
2017-06-09 21:18:16 +00:00
Zachary Turner 3226fe95bb [pdb] Support CoffSymbolRVA debug subsection.
llvm-svn: 305108
2017-06-09 20:46:52 +00:00
Yaxun Liu 6455b0dbf3 [SROA] Fix APInt size when load/store have different address space
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.

Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.

This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.

Diffferential Revision: https://reviews.llvm.org/D33298

llvm-svn: 305107
2017-06-09 20:46:29 +00:00
Keno Fischer 5329174cb1 [Sink] Fix predicate in legality check
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.

Reviewers: majnemer

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33179

llvm-svn: 305102
2017-06-09 19:31:10 +00:00
Stanislav Mekhanoshin 1a61ab8172 [AMDGPU] Add intrinsics for alignbit and alignbyte instructions
Differential Revision: https://reviews.llvm.org/D34046

llvm-svn: 305098
2017-06-09 19:03:00 +00:00
Zachary Turner 7e62cd17d6 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

llvm-svn: 305093
2017-06-09 17:54:36 +00:00
Eric Beckmann d9de6389fc Implement COFF emission for parsed Windows Resource ( .res) files.
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.

Reviewers: hiraditya!, zturner, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34020

llvm-svn: 305092
2017-06-09 17:34:30 +00:00
Simon Pilgrim 3d37b1a277 [X86][SSE] Add support for PACKSS nodes to faux shuffle extraction
If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle

llvm-svn: 305091
2017-06-09 17:29:52 +00:00
Craig Topper 31ce4ec2fd [LazyValueInfo] Don't run the more complex predicate handling code for EQ and NE in getPredicateResult
Summary:
Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges.

This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE.

Reviewers: anna, reames, resistor, Farhana

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34000

llvm-svn: 305086
2017-06-09 16:16:20 +00:00
Krzysztof Parzyszek 7aca2fd830 [Hexagon] Fixes and updates to the selection patterns
- Add some missing patterns.
- Use C4_cmplte in branch patterns.
- Fix signedness of immediate operand in M2_accii.

llvm-svn: 305085
2017-06-09 15:26:21 +00:00
Zvi Rackover 3a2c4b48bb SelectionDAG: Remove deleted nodes from legalized set to avoid clash with newly created nodes
Summary:
During DAG legalization loop in SelectionDAG::Legalize(),
bookkeeping of the SDNodes that were already legalized is implemented
with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers
to objects, not the objects themselves. Unfortunately, if SDNode is
deleted during legalization for some reason, LegalizedNodes set is not
informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse
space deallocated after deletion of unused nodes, for creation of new
ones. Because of this, new nodes, created during legalization often can
have pointers identical to ones that have been previously legalized,
added to the LegalizedNodes set, and deleted afterwards. This in turn
causes, that newly created nodes, sharing the same pointer as deleted
old ones, are present in LegalizedNodes *already at the moment of
creation*, so we never call Legalize on them.
The fix facilitates the fact, that DAG notifies listeners about each
modification. I have registered DAGNodeDeletedListener inside
SelectionDAG::Legalize, with a callback function that removes any
pointer of any deleted SDNode from the LegalizedNodes set. With this
modification, LegalizeNodes set does not contain pointers to nodes that
were deleted, so newly created nodes can always be inserted to it, even
if they share pointers with old deleted nodes.

Patch by pawel.szczerbuk@intel.com

The issue this patch addresses causes failures in an out-of-tree target,
and i was not able to create a reproducer for an in-tree target, hence
there is no test-case.

Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33891

llvm-svn: 305084
2017-06-09 14:53:45 +00:00
Simon Dardis 212cccb2f4 Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

The previous version of this patch had a "conditional move or jump depends on
uninitialized value".

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 305083
2017-06-09 14:37:08 +00:00
Sanjay Patel 70db424601 [SimplifyLibCalls] fix formatting; NFC
llvm-svn: 305081
2017-06-09 14:22:03 +00:00
Sanjay Patel fef83e8fb9 [ValueTracking] fix typo; NFC
llvm-svn: 305080
2017-06-09 14:21:18 +00:00
David Stuttard 82618baa0f [AMDGPU] Fix for issue in alloca to vector promotion pass
Summary:
Alloca promotion pass not dealing with non-canonical input

Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)

Also added some test cases in non-canonical form to check that it no longer crashes

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31710

llvm-svn: 305079
2017-06-09 14:16:22 +00:00
Javed Absar 9e1ff8654f [ARM] Custom machine-scheduler. NFCI.
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka. 
Differential Revision: https://reviews.llvm.org/D34039

llvm-svn: 305078
2017-06-09 14:07:21 +00:00
Nirav Dave 670109d89a [MC] Fix compiler crash in AsmParser::Lex
When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.

Patch by Alexandru Guduleasa!

Reviewers: niravd, grosbach, llvm-commits

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D33993

llvm-svn: 305077
2017-06-09 14:04:03 +00:00
Krzysztof Parzyszek 7881415510 [Hexagon] Add LLVM header to HexagonPatterns.td
llvm-svn: 305074
2017-06-09 13:30:58 +00:00
Serge Rogatch 85427c0da7 [XRay] Fix computation of function size subject to XRay threshold
Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.

The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.

I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better  estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.

Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown

Differential Revision: https://reviews.llvm.org/D34027

llvm-svn: 305072
2017-06-09 13:23:23 +00:00
Nirav Dave 43a4d8122f Prevent RemoveDeadNodes from deleted already deleted node.
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.

Reviewers: bkramer, spatel, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33731

llvm-svn: 305070
2017-06-09 12:57:35 +00:00
Oliver Stannard ad0973557c [ARM] Add scheduling info for VFMS
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

llvm-svn: 305064
2017-06-09 09:19:09 +00:00
Stefan Maksimovic add20f8f17 Test commit: remove whitespace
llvm-svn: 305059
2017-06-09 07:57:05 +00:00
David Blaikie 4a60d370e8 bugpoint: disabling symbolication of bugpoint-executed programs
Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).

Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.

I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?

 - should be OK since it defaults to "pretty", so if the crash is very
 early in opt parsing, etc, then crash reports will still be symbolized.

I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.

Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)

Reviewers: chandlerc, dannyb

Differential Revision: https://reviews.llvm.org/D33804

llvm-svn: 305056
2017-06-09 07:29:03 +00:00
Serguei Katkov 38414b57f9 [IndVars] Add an option to be able to disable LFTR
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.

Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979

llvm-svn: 305055
2017-06-09 06:11:59 +00:00
George Burgess IV a20352e13e [LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.

If we do, we get incorrect optimizations in cases like:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] - 128;
}

which gets optimized to:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] | 128;
}

Because:
- InstCombine turned `sub i32 %from.i, 128` into
  `add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
  vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
  "couldn't wrap", and changed the `add` to an `or`.

InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.

llvm-svn: 305053
2017-06-09 03:56:15 +00:00
David Blaikie cb9327b02d Inliner: Don't touch indirect calls
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.

llvm-svn: 305052
2017-06-09 03:29:20 +00:00
Rui Ueyama 365d4d0000 Fix -Wunused-variable.
llvm-svn: 305051
2017-06-09 03:26:45 +00:00
Craig Topper a420562257 [InstCombine] Pass a proper context instruction to all of the calls into InstSimplify
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.

Reviewers: dberlin, hfinkel, spatel

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33604

llvm-svn: 305049
2017-06-09 03:21:29 +00:00
Bob Haarman fdf499bf2d [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

llvm-svn: 305043
2017-06-09 01:18:10 +00:00
Zachary Turner 28c22c83e3 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

llvm-svn: 305041
2017-06-09 00:53:59 +00:00
Saleem Abdulrasool 1f62f57b37 sink DebugCompressionType into MC for exposing to clang
This is a preparatory change to expose the debug compression style to
clang.  It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.

Minor tweak to the ELF Object Writer to use a variable for re-used
values.  Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.

llvm-svn: 305038
2017-06-09 00:40:19 +00:00
Zachary Turner deb391309c [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

llvm-svn: 305037
2017-06-09 00:28:08 +00:00
Zachary Turner 1bf7762049 [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

llvm-svn: 305034
2017-06-08 23:49:01 +00:00
Evgeniy Stepanov d02dbf6b1c [CFI] Remove LinkerSubsectionsViaSymbols.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

This is the second attempt to land this change after fixing PR33316.

llvm-svn: 305031
2017-06-08 23:38:22 +00:00
Craig Topper 2aa4d39f5e [ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC
llvm-svn: 305030
2017-06-08 23:38:19 +00:00
Galina Kistanova 415ec9260f Fixed warning: dereferencing type-punned pointer will break strict-aliasing rules.
No need in reinterpret_cast<StringTableOffset &> here, as struct coff_symbol Name is a unin
with the member StringTableOffset Offset. This union member could be accessed directly.

llvm-svn: 305029
2017-06-08 23:35:52 +00:00
Craig Topper c1993fa1a3 [IR] Remove getNumSuccessorsV/getSuccessorV/setSuccessorV from the TerminatorInst subclasses as much as possible now that Value has been de-virtualized
These used to be virtual methods that would enable doing the right thing with only a TerminatorInst pointer. I believe they were also acting as vtable anchors in my cases. I think the fact that they had a separate name ending in V was to allow a version without V to be called without a virtual call in a pre-C++11 final keyword world.

Where possible the base methods in TerminatorInst dispatch directly to the public methods in the classes that have the same signature. For some classes this wasn't possible so I've left private method versions that match the name and signature of the version in TerminatorInst. All versions have been moved into the class definitions since we no longer need vtable anchors here.

Differential Revision: https://reviews.llvm.org/D34011

llvm-svn: 305028
2017-06-08 23:23:08 +00:00
Peter Collingbourne e357fbd243 Write summaries for merged modules when splitting modules for ThinLTO.
This is to prepare to allow for dead stripping of globals in the
merged modules.

Differential Revision: https://reviews.llvm.org/D33921

llvm-svn: 305027
2017-06-08 23:01:49 +00:00
Kostya Serebryany 2c2fb8896b [sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308
llvm-svn: 305026
2017-06-08 22:58:19 +00:00
Peter Collingbourne dc8c01891f Object: Move datalayout check into irsymtab::build. NFCI.
This check is a requirement of the irsymtab builder, not of any
particular caller.

Differential Revision: https://reviews.llvm.org/D33970

llvm-svn: 305023
2017-06-08 22:04:24 +00:00
Peter Collingbourne 8dde4cba4c Bitcode: Introduce a BitcodeFileContents data type. NFCI.
This data type includes the contents of a bitcode file.
Right now a bitcode file can only contain modules, but
a later change will add a symbol table.

Differential Revision: https://reviews.llvm.org/D33969

llvm-svn: 305019
2017-06-08 22:00:24 +00:00
Matthias Braun 1ee25e0c3f RegAllocPBQP: Do not assign reserved physical register
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.

(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.

(2) MachineVerifier: updated the test per MatzeB's review.

(3) +testcase

Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!

Differential Revision: https://reviews.llvm.org/D33947

llvm-svn: 305016
2017-06-08 21:30:54 +00:00
Krzysztof Parzyszek b1ada4e742 [Hexagon] Re-enable machine verifier after codegen passes
Remove "false" from the arguments to "addPass" in Hexagon's target pass
config.

llvm-svn: 305015
2017-06-08 21:25:36 +00:00
Krzysztof Parzyszek 8a7fb0fe51 [Hexagon] Skip mux generation when predicate register is undefined
llvm-svn: 305014
2017-06-08 20:56:36 +00:00
Evgeniy Stepanov 60d411b66e [MachO] Fix codegen of alias of alias.
Fixes PR33316.

llvm-svn: 305012
2017-06-08 20:49:03 +00:00
Dehao Chen e2a428bad7 Do not early-inline recursive calls in sample profile loader.
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.

Reviewers: davidxl, dnovillo, iteratee

Reviewed By: iteratee

Subscribers: iteratee, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34017

llvm-svn: 305009
2017-06-08 20:11:57 +00:00
Sanjay Patel 3b8974b51b fix formatting; NFC
llvm-svn: 305008
2017-06-08 20:00:09 +00:00
Sanjay Patel 5e370850d4 [CGP] don't expand a memcmp with nobuiltin attribute
This matches the behavior used in the SDAG when expanding memcmp.

For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776

One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801

Differential Revision: https://reviews.llvm.org/D34043

llvm-svn: 305007
2017-06-08 19:47:25 +00:00
Matt Arsenault f1202e650a AMDGPU: Work around build special casing .inc files
It complains because it assumes these were autogenerated files
in the source directory.

llvm-svn: 305005
2017-06-08 19:25:21 +00:00
Matt Arsenault 3c7581bbeb AMDGPU: Use correct register names in inline assembly
Fixes using physical registers in inline asm from clang.

llvm-svn: 305004
2017-06-08 19:03:20 +00:00
Nirav Dave 6a38cc6d67 [Hexagon] Speedup NumNodesBlocking calculation. NFCI.
llvm-svn: 305003
2017-06-08 18:49:25 +00:00
Guozhi Wei f31c56df2a [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.

This patch fixed PR32442.

Differential Revision: https://reviews.llvm.org/D31407

llvm-svn: 305001
2017-06-08 18:27:24 +00:00
Mark Searles e5c7832311 [AMDGPU] Force qsads instrs to use different dest register than source registers
The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.

Differential Revision: https://reviews.llvm.org/D33783

llvm-svn: 304998
2017-06-08 18:21:19 +00:00
Galina Kistanova e128958552 Changed a comparison operator for std::stable_sort to implement strict weak ordering.
This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.

llvm-svn: 304993
2017-06-08 17:27:40 +00:00
Zaara Syeda 79acbbe513 [Power9] Exploit vector integer extend instructions
This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword

Differential Revision: https://reviews.llvm.org/D33510

llvm-svn: 304992
2017-06-08 17:14:36 +00:00
Craig Topper db52809e77 [LazyValueInfo] Make LVILatticeVal intersect method take arguments by reference so we don't copy ConstantRanges unless we need to.
llvm-svn: 304990
2017-06-08 17:08:58 +00:00
Sanjay Patel e7c5041c2a [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion
The test diff for PowerPC shows we can better optimize if this case is one block.

For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed 
cheap and SDAG can't optimize across blocks. 

Instead of this:

_cmp_eq8:
  movq  (%rdi), %rax
  cmpq  (%rsi), %rax
  je  LBB23_1
## BB#2:                                ## %res_block
  movl  $1, %ecx
  jmp LBB23_3
LBB23_1:
  xorl  %ecx, %ecx
LBB23_3:                                ## %endblock
  xorl  %eax, %eax
  testl %ecx, %ecx
  sete  %al
  retq

We get this:

cmp_eq8:   
  movq  (%rdi), %rcx
  xorl  %eax, %eax
  cmpq  (%rsi), %rcx
  sete  %al
  retq

And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). 
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable 
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we 
can lower the IR produced here optimally.

Differential Revision: https://reviews.llvm.org/D34005

llvm-svn: 304987
2017-06-08 16:53:18 +00:00
Andrew V. Tischenko 8cb1d0931f Add scheduler classes to integer/float horizontal operations.
This patch will close PR32801.
Differential Revision: https://reviews.llvm.org/D33203

llvm-svn: 304986
2017-06-08 16:44:13 +00:00
Zachary Turner 15eb237fd3 [PDB] Don't crash on /debug:fastlink PDBs.
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value.  This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.

llvm-svn: 304982
2017-06-08 16:00:40 +00:00
Nirav Dave 62fb8498d3 InferAddressSpaces: Avoid assertion failure with replacing identical
cloned constexpr

Have cloneConstantExprWithNewAddressSpaces return nullptr when
returning initial ConstantExpr.

Reviewers: arsenm

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D33995

llvm-svn: 304975
2017-06-08 13:20:55 +00:00
Andrew V. Tischenko e0531025f8 This patch closes PR28513: an optimization of multiplication by different constants.
The initial patch was rejected: I fixed the issue and re-apply it.

llvm-svn: 304972
2017-06-08 10:20:13 +00:00
John Brawn da4a68a1d2 [BPI] Don't assume that strcmp returning >0 is more likely than <0
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.

Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.

Differential Revision: https://reviews.llvm.org/D33934

llvm-svn: 304970
2017-06-08 09:44:40 +00:00
Peter Collingbourne c00c2b246b Object: Factor out the code for creating the irsymtab for an arbitrary bitcode file.
This code now lives in lib/Object. The idea is that it can now be reused by
IRObjectFile among other things.

Differential Revision: https://reviews.llvm.org/D31921

llvm-svn: 304958
2017-06-08 01:26:14 +00:00
Eugene Zelenko 6ac7a34816 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304954
2017-06-07 23:53:32 +00:00
David Blaikie 7a9b788830 GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected
llvm-svn: 304945
2017-06-07 21:37:39 +00:00
Sanjay Patel 66f7fdb300 [InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2)
This was discussed in D33338. We have larger pattern-matching ending in a truncate that 
we can reduce or remove by handling these smaller patterns first. Further motivation is 
that narrower shift ops are easier for value tracking and zext is better than sext.

http://rise4fun.com/Alive/rhh

Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7

=>

%r = zext i1 %x to i8

Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7

=>

%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8

Differential Revision: https://reviews.llvm.org/D33879

llvm-svn: 304939
2017-06-07 20:32:08 +00:00
Krzysztof Parzyszek 5ba13825f0 [Hexagon] Generate 'inbounds' GEPs in HexagonCommonGEP
llvm-svn: 304937
2017-06-07 20:04:33 +00:00
Nirav Dave 772ea3ae1a [DAG] Improve Store Merge candidate pruning. NFC.
When considering merging stores values are the results of loads only
consider stores whose values come from loads from the same base.

This fixes much of the longer compile times in PR33330.

llvm-svn: 304934
2017-06-07 18:51:56 +00:00
Xinliang David Li 4f49bee764 Fix builin_expect lowering bug
PR33346

Skip cases when expected value is not constant int.

llvm-svn: 304933
2017-06-07 18:32:24 +00:00
Alina Sbirlea 33e5872367 [mssa] Fix case when there is no definition in a block prior to an inserted use.
Summary:
Check that the first access before one being tested is valid.
Before this patch, if there was no definition prior to the Use being tested,
the first time Iter was deferenced, it hit the sentinel.

Reviewers: dberlin, gbiv

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D33950

llvm-svn: 304926
2017-06-07 16:46:53 +00:00
Sanjay Patel 8ce1e3b759 [CGP] avoid zext/trunc of a memcmp expansion compare
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.

The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:

_cmp_eq4_zexted_to_i64:
  movl  (%rdi), %ecx
  movl  (%rsi), %edx
  xorl  %eax, %eax
  cmpq  %rdx, %rcx
  sete  %al

_cmp_eq4_better:
  movl  (%rdi), %ecx
  xorl  %eax, %eax
  cmpl  (%rsi), %ecx
  sete  %al

llvm-svn: 304923
2017-06-07 16:16:45 +00:00
Dmitry Preobrazhensky 5a2f881b39 [AMDGPU][MC] Corrected error message for s_waitcnt helpers
See Bug 32711: https://bugs.llvm.org//show_bug.cgi?id=32711

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D33781

llvm-svn: 304922
2017-06-07 16:08:02 +00:00
Peter Collingbourne aaae7eed5c LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else).
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.

Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.

Differential Revision: https://reviews.llvm.org/D33925

llvm-svn: 304921
2017-06-07 15:49:14 +00:00
Sanjay Patel cf531ca50c [CGP] pass size as param in MemCmpExpansion; NFCI
Avoid extracting the constant int twice.

llvm-svn: 304920
2017-06-07 15:05:13 +00:00
Petar Jovanovic 2f5f8e947a [mips][dsp] Modify repl.ph to accept signed immediate values
Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33594

llvm-svn: 304918
2017-06-07 14:48:46 +00:00
Sanjay Patel af515d9497 [CGP] pass size as param in MemCmpExpansion; NFCI
Avoid extracting the constant int twice.

llvm-svn: 304917
2017-06-07 14:45:49 +00:00
Sanjay Patel 4137d51bc1 [CGP] getParent()->getParent() --> getFunction(); NFCI
llvm-svn: 304916
2017-06-07 14:29:52 +00:00
Jonas Paulsson ae8d22cee2 [SystemZ] Propagate MachineMemOperands
In emitCondStore() and emitMemMemWrapper().

Review: Ulrich Weigand
llvm-svn: 304913
2017-06-07 14:08:34 +00:00
Simon Pilgrim be8866f691 [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.
This will allow commutation of target-specific DAG nodes in future patches

Differential Revision: https://reviews.llvm.org/D33882

llvm-svn: 304911
2017-06-07 14:05:04 +00:00
Tom Stellard 2860a428f7 AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33949

llvm-svn: 304910
2017-06-07 13:54:51 +00:00
Sanjay Patel 6e8e7cc70e [x86] avoid flipping sign bits for vector icmp by using known bits
If we know that both operands of an unsigned integer vector comparison are non-negative, 
then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality
integer vector compare predicate provided by SSE/AVX).

We're intentionally not changing the condition code to signed in order to preserve the
existing transforms that use min/max/psubus below here.

This should solve PR33276:
https://bugs.llvm.org/show_bug.cgi?id=33276

Differential Revision: https://reviews.llvm.org/D33862

llvm-svn: 304909
2017-06-07 13:46:34 +00:00
Sanjay Patel 6007000824 [CGP] add helper function for generating compare of load pairs; NFCI
In the special (but also the likely common) case, we can avoid
the multi-block complexity of the general algorithm, so moving
this part off on its own will make it re-usable.

llvm-svn: 304908
2017-06-07 13:33:00 +00:00
Nemanja Ivanovic d8623f0825 [PowerPC] Eliminate integer compare instructions - vol. 5
Adds handling for i64 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33720

llvm-svn: 304907
2017-06-07 13:18:06 +00:00
Petar Jovanovic 3c039d968e [mips] do not use FastISel when -mxgot is present
The clang compiler by default uses FastISel when invoked with -O0, which
is also the default. In that case, passing of -mxgot does not get honored,
i.e. the code path that is to deal with large got is not taken.
Clang produces same output regardless of -mxgot being present or not.
This change checks whether -mxgot is passed as an option, and turns off
FastISel if it is.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33593

llvm-svn: 304906
2017-06-07 12:59:53 +00:00
Florian Hahn 28a61d64e2 [ARM] Use FixupKind variable in processFixupValue (cleanup, NFC).
llvm-svn: 304905
2017-06-07 12:58:08 +00:00
Sanjay Patel ab0ecc00b7 [CGP] fix formatting in MemCmpExpansion; NFC
llvm-svn: 304903
2017-06-07 12:44:36 +00:00
Diana Picus 0b4190a9d6 [ARM] GlobalISel: Purge G_SEQUENCE
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.

This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.

There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.

llvm-svn: 304902
2017-06-07 12:35:05 +00:00
Nemanja Ivanovic bb67f847d6 [PowerPC] Eliminate integer compare instructions - vol. 3
Adds handling for i32 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33718

llvm-svn: 304901
2017-06-07 12:23:41 +00:00
Diana Picus 0196427b03 [ARM] GlobalISel: Support G_XOR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code

llvm-svn: 304898
2017-06-07 11:57:30 +00:00
Simon Dardis 7c96ba1920 evert "[mips] Fix test mips64fpldst.ll with machine verifier enabled"
This reverts commit r301394. It broke some internal buildbots, reverting
while the issue is being investigated.

llvm-svn: 304896
2017-06-07 11:21:37 +00:00
Simon Pilgrim 58f5be2771 [X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining
We were checking that the index was in range of the destination vector type, not the (larger) source vector type

llvm-svn: 304894
2017-06-07 10:30:35 +00:00
Diana Picus eeb0aad8e4 [ARM] GlobalISel: Support G_OR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code

llvm-svn: 304890
2017-06-07 10:14:23 +00:00
Diana Picus 8445858a93 [ARM] GlobalISel: Support G_AND
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)

llvm-svn: 304885
2017-06-07 09:17:41 +00:00
Florian Hahn 1d38129b92 [Linker] Remove warning when linking ARM and Thumb IR modules.
Summary:
This patch updates Triple::isCompatibleWith to make armxx and thumbxx
triples compatible, as long as the subarch, vendor, os, envorionment and
object format match. Thumb/ARM code generation should be controlled
using the thumb-mode per-function target feature rather than by the
triple to allow mixing Thumb and ARM functions.

D33448 updates Clang's codegen to add thumb-mode for all functions with
armxx or thumbxx triples.

Reviewers: echristo, t.p.northover, rafael, kristof.beyls, rengolin, tejohnson

Reviewed By: tejohnson

Subscribers: rinon, eugenis, pcc, srhines, aemerson, mehdi_amini, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33287

llvm-svn: 304884
2017-06-07 09:17:01 +00:00
Florian Hahn 9afd9d9254 [ARM] Create relocations for unconditional branches.
Summary:
Relocations are required for unconditional branches to function symbols with
different execution mode. Without this patch, incorrect branches are
generated for tail calls between functions with different execution
mode.


Reviewers: peter.smith, rafael, echristo, kristof.beyls

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33898

llvm-svn: 304882
2017-06-07 08:54:47 +00:00
Craig Topper 73ba1c84be [InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare.

llvm-svn: 304876
2017-06-07 07:40:37 +00:00
Craig Topper 29c282eac8 [InstCombine] Fix two asserts that were accidentally checking that an APInt pointer is non-zero instead of checking that the APInt self is non-zero.
I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used.

llvm-svn: 304875
2017-06-07 07:40:29 +00:00
NAKAMURA Takumi 92c99cd6dc Update libdeps to add BinaryFormat, introduced in r304864.
llvm-svn: 304869
2017-06-07 04:48:49 +00:00
NAKAMURA Takumi ef9d9481b5 Reorder and reformat.
llvm-svn: 304868
2017-06-07 04:48:45 +00:00
Zachary Turner 830b6fd350 Add dependency from LibDriver to BinaryFormat.
llvm-svn: 304867
2017-06-07 04:39:50 +00:00
Zachary Turner 8a9e2c6bad Add dependency from AsmParser to BinaryFormat.
This breaks the MinGW build, but not other builds for some reason.

llvm-svn: 304866
2017-06-07 04:24:33 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Craig Topper 7945248267 [LazyValueInfo] Remove redundant calls to ConstantRange::contains. The same exact call was made in the if above and we already know it returned true. NFC
llvm-svn: 304857
2017-06-07 00:58:09 +00:00
Craig Topper 50b1e5135e [Constants] Use isUIntN/isIntN from MathExtras instead of reimplementing the same code. NFC
llvm-svn: 304856
2017-06-07 00:58:05 +00:00
Craig Topper 93ac6e14cd [Constants] Use APInt::isNullValue/isOneValue/uge to simplify some code and take advantage of APInt optimizations. NFC
llvm-svn: 304855
2017-06-07 00:58:02 +00:00
Quentin Colombet 9e9d638676 [InlineSpiller] Only account for real spills in the hoisting logic
Spills of undef values shouldn't impact the placement of the relevant
spills. Drive by review.

llvm-svn: 304850
2017-06-07 00:22:07 +00:00
Sanjay Patel f57015d4cc [CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the 
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.

I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time 
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But 
regardless of that, I think this is a win for the more general cases.

http://rise4fun.com/Alive/cbQ

Differential Revision: https://reviews.llvm.org/D33963

llvm-svn: 304849
2017-06-07 00:17:08 +00:00
Zachary Turner 1bfb9f47af Fix uninitialized read.
llvm-svn: 304846
2017-06-06 23:54:23 +00:00
Adrian Prantl 318d1195f2 Introduce -brief command line option to llvm-dwarfdump
This patch introduces a new command line option, called brief, to
llvm-dwarfdump.  When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.

Patch by Spyridoula Gravani!

rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867

llvm-svn: 304844
2017-06-06 23:28:45 +00:00
Chandler Carruth abd32bad37 Fix the includes in lib/Fuzzer on Windows that have ordering
dependencies and add comments to tell future maintainers about those
requirements.

llvm-svn: 304843
2017-06-06 23:28:01 +00:00
Davide Italiano c88f2c712f [CFLAA] Remove unused include. NFCI.
llvm-svn: 304842
2017-06-06 23:16:19 +00:00
Eugene Zelenko fb69e66cff [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304839
2017-06-06 22:22:41 +00:00
Dimitry Andric bc3feaaa88 Allow VersionPrinter to print to arbitrary raw_ostreams
Summary:
I would like to add printing of registered targets to clang's version
information.  For this to work correctly, the VersionPrinter logic in
CommandLine.cpp should support printing to arbitrary raw_ostreams,
instead of always defaulting to outs().

Add a raw_ostream& parameter to the function pointer type used for
VersionPrinter, and while doing so, introduce a typedef for convenience.

Note that VersionPrinter::print() will still default to using outs(),
the clang part will necessarily go into a separate review.

Reviewers: beanz, chandlerc, dberris, mehdi_amini, zturner

Reviewed By: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33899

llvm-svn: 304835
2017-06-06 21:54:04 +00:00
David Blaikie c662b50150 GlobalsModRef+OptNone: Don't prove readnone/other properties from an optnone function
Seems like at least one reasonable interpretation of optnone is that the
optimizer never "looks inside" a function. This fix is consistent with
that interpretation.

Specifically this came up in the situation:

f3 calls f2 calls f1
f2 is always_inline
f1 is optnone

The application of readnone to f1 (& thus to f2) caused the inliner to
kill the call to f2 as being trivially dead (without even checking the
cost function, as it happens - not sure if that's also a bug).

llvm-svn: 304833
2017-06-06 20:51:15 +00:00
Sanjay Patel b4b7df95de [CGP] fix formatting/typos in MemCmpExpansion; NFC
llvm-svn: 304830
2017-06-06 20:30:47 +00:00
Matthias Braun 7e23fc05c1 llc: Add ability to parse mir from stdin
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.

llvm-svn: 304825
2017-06-06 20:06:57 +00:00
Evgeny Stupachenko 3b88291581 Fix PR23384 (part 3 of 3)
Summary:
The patch makes instruction count the highest priority for
 LSR solution for X86 (previously registers had highest priority).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30562

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
2017-06-06 20:04:16 +00:00
Sanjay Patel 2726ea2ac0 [DAG] remove duplicated code for isOnlyUsedInZeroEqualityComparison(); NFCI
llvm-svn: 304822
2017-06-06 19:40:09 +00:00
Anna Thomas 4acfc7e16e [LVI Printer] Rely on the LVI analysis functions rather than the LVI cache
Summary:
LVIPrinter pass was previously relying on the LVICache. We now directly call the
the LVI functions which solves the value if the LVI information is not already
available in the cache. This has 2 benefits over the printing of LVI cache:
1. higher coverage (i.e. catches errors) in LVI code when cache value is
invalidated.
2. relies on the core functions, and not dependent on the LVI cache (which may
be scrapped at some point).
It would still catch any cache invalidation errors, since we first go through
the cache.

Reviewers: reames, dberlin, sanjoy

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32135

llvm-svn: 304819
2017-06-06 19:25:31 +00:00
Sam Clegg acd7d2b00b [WebAssembly] MC: Refactor relocation handling
The change cleans up and unifies the handling of relocation
entries in WasmObjectWriter.  Type index relocation no longer
need to be handled separately.

The only externally visible change should be that type
index relocations are no longer grouped at the end.

Differential Revision: https://reviews.llvm.org/D33918

llvm-svn: 304816
2017-06-06 19:15:05 +00:00
Matthias Braun 8b5f9e4438 MIRPrinter: Avoid assert() when printing empty INLINEASM strings.
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.

This fixes https://llvm.org/PR33317

llvm-svn: 304815
2017-06-06 19:00:58 +00:00
Konstantin Zhuravlyov 1e2b87893b AMDGPU/NFC: Move amdgpu code object metadata to support
Differential Revision: https://reviews.llvm.org/D31437

llvm-svn: 304812
2017-06-06 18:35:50 +00:00
Daniel Berlin eafdd862e5 NewGVN: Fix PR/33187. This is a bug caused by two things:
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.

llvm-svn: 304807
2017-06-06 17:15:28 +00:00
Anna Thomas b2a212c070 [Atomics][LoopIdiom] Recognize unordered atomic memcpy
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.

Background:  http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

Patch by Daniel Neilson!

Reviewers: reames, anna, skatkov

Reviewed By: reames, anna

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33243

llvm-svn: 304806
2017-06-06 16:45:25 +00:00
Stanislav Mekhanoshin e4cda7417c [AMDGPU] Return correct value from SDWA pass
Differential Revision: https://reviews.llvm.org/D33927

llvm-svn: 304805
2017-06-06 16:42:30 +00:00
Sam Clegg 6dc65e9105 [WebAssembly] Remove unused methods from MCWasmObjectTargetWriter
These methods looks like they were originally came from
MCELFObjectTargetWriter but they are never called by the
WasmObjectWriter.

Remove these methods meant the declaration of WasmRelocationEntry
could also move into the cpp file.

Differential Revision: https://reviews.llvm.org/D33905

llvm-svn: 304804
2017-06-06 16:38:59 +00:00
Petar Jovanovic 64fb7a8ebd [mips] Add madd4 subtarget feature
Addition of a feature and a predicate used to control generation of madd.fmt
and similar instructions.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33400

llvm-svn: 304801
2017-06-06 15:33:01 +00:00
Anna Thomas 7218032019 [IRCE] Canonicalize pre/post loops after the blocks are added into parent loop
Summary:
We were canonizalizing the pre loop (into loop-simplify form) before
the post loop blocks were added into parent loop. This is incorrect when IRCE is
done on a subloop. The post-loop blocks are created, but not yet added to the
parent loop. So, loop-simplification on the pre-loop incorrectly updates
LoopInfo.

This patch corrects the ordering so that pre and post loop blocks are added to
parent loop (if any), and then the loops are canonicalized to LCSSA and
LoopSimplifyForm.

Reviewers: reames, sanjoy, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33846

llvm-svn: 304800
2017-06-06 14:54:01 +00:00
Simon Pilgrim 3446ff4df5 Fix spelling mistake in getRThroughput static function names. NFCI.
llvm-svn: 304799
2017-06-06 14:25:34 +00:00
Simon Pilgrim f7113fd270 [X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it non-temporal (PR32744)
Extension to D33728

llvm-svn: 304798
2017-06-06 14:18:39 +00:00
Tom Stellard 8cd60a5067 AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33890

llvm-svn: 304797
2017-06-06 14:16:50 +00:00
Chandler Carruth aaeada6c75 Fix another ordering constraint with windows.h and comment about
a revers constraint that we got right (by chance).

llvm-svn: 304792
2017-06-06 12:43:20 +00:00
Chandler Carruth 185ddeffd4 Fix one place where I missed a commented requirement for a particular
include ordering.

I've changed the structure so that clang-format will preserve this going
forward.

llvm-svn: 304788
2017-06-06 12:11:24 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Peter Smith d16c55de6d [ARM] Add curly braces around switch case [NFC]
My previous commit r304702 introduced a new case into a switch statement.
This case defined a variable but I forgot to add the curly brackets around the
case to limit the scope.

This change puts the curly braces back in so that the next person that adds a
case doesn't get a build failure. Thanks to avieira for the spot.

Differential Revision: https://reviews.llvm.org/D33931

llvm-svn: 304785
2017-06-06 10:22:49 +00:00
Joey Gouly 61eaa63b65 [InstSimplify] Constant fold the new GEP in SimplifyGEPInst.
llvm-svn: 304784
2017-06-06 10:17:14 +00:00
Vivek Pandya 56d87ef5d7 [Improve CodeGen Testing] This patch renables MIRPrinter print fields which have value equal to its default.
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D32304

llvm-svn: 304779
2017-06-06 08:16:19 +00:00
Craig Topper aa9a24bd8b [InstSimplify] Remove some redundant code from InstSimplify now that llvm::isKnownNonEqual handles vectors.
isKnownNonEqual is called a little earlier in this function and can handle the case that we were checking here as well as more complex cases.

llvm-svn: 304775
2017-06-06 07:13:17 +00:00
Craig Topper 3002d5b0bf [ValueTracking] Remove scalar only restriction from isKnownNonEqual. The computeKnownBits and isKnownNonZero calls this code relies on should work fine for vectors.
This will be used by another commit to remove some code from InstSimplify that is redundant for scalars, but was needed for vectors due to this issue.

llvm-svn: 304774
2017-06-06 07:13:15 +00:00
Craig Topper 2dfb4804f2 [InstSimplify] Use the getTrue/getFalse helpers and make sure we use the computed result type instead of hardcoding to i1. NFC
Currently, isKnownNonEqual punts on vectors so the hardcoding to i1 doesn't matter. But I plan to fix that in a future patch.

llvm-svn: 304773
2017-06-06 07:13:13 +00:00
Craig Topper 8e662f7f81 [ValueTracking] Use the computeKnownBits version that returns a KnownBits object instead of taking one by reference. NFC
llvm-svn: 304772
2017-06-06 07:13:11 +00:00
Craig Topper 8365df825e [ValueTracking] Use APInt::intersects to avoid some temporary APInts. NFC
llvm-svn: 304771
2017-06-06 07:13:09 +00:00
Craig Topper c2790ecda8 [InstSimplify] Use ICmpInst::isEquality predicate method. NFC
llvm-svn: 304770
2017-06-06 07:13:04 +00:00
Mandeep Singh Grang 5e1697ef28 [llvm] Remove double semicolons
Reviewers: craig.topper, arsenm, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33924

llvm-svn: 304767
2017-06-06 05:08:36 +00:00
Xin Tong 9d6f08a8d4 Add a dominanance check interface that uses caching for instructions within same basic block.
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.

This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.

This is a preparation patch for https://reviews.llvm.org/D32720

Reviewers: dberlin, hfinkel, davide

Subscribers: davide, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33380

llvm-svn: 304764
2017-06-06 02:34:41 +00:00
Chandler Carruth 41ed4034dd [x86] Revert the X86FoldTablesEmitter due to more miscompiles.
In testing, we've found yet another miscompile caused by the new tables.
And this one is even less clear how to fix (we could teach it to fold
a 16-bit load instead of the 32-bit load it wants, or block folding
entirely).

Also, the approach to excluding instructions seems increasingly to not
scale well.

I have left a more detailed analysis on the review log for the original
patch (https://reviews.llvm.org/D32684) along with suggested path
forward. I will land an additional test case that I wrote which covers
the code that was miscompiling (folding into the output of `pextrw`) in
a subsequent commit to keep this a pure revert.

For each commit reverted here, I've restricted the revert to the
non-test code touching the x86 fold table emission until the last commit
where I did revert the test updates. This means the *new* test cases
added for `insertps` and `xchg` remain untouched (and continue to pass).

Reverted commits:
r304540: [X86] Don't fold into memory operands into insertps in the ...
r304347: [TableGen] Adapt more places to getValueAsString now ...
r304163: [X86] Don't fold away the memory operand of an xchg.
r304123: Don't capture a temporary std::string in a StringRef.
r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..."

Original commit was in r304088, and after a string of fixes was reverted
previously in r304121 to fix build bots, and then re-landed in r304122.

llvm-svn: 304762
2017-06-06 02:15:31 +00:00
Wolfgang Pieb 77d3e938f8 [DWARF] Adding support for the DWARF v5 string offsets table (consumer/reader part only).
Reviewers: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D32779

llvm-svn: 304759
2017-06-06 01:22:34 +00:00
Matthias Braun 7bda195812 CodeGen: Refactor MIR parsing
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.

This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.

Differential Revision: https://reviews.llvm.org/D33809

llvm-svn: 304758
2017-06-06 00:44:35 +00:00
Matthias Braun c7c06f158c CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function

Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.

llvm-svn: 304754
2017-06-06 00:26:13 +00:00
Quentin Colombet c668935d85 [InlineSpiller] Don't spill fully undef values
Althought it is not wrong to spill undef values, it is useless and harms
both code size and runtime. Before spilling a value, check that its
content actually matters.

http://www.llvm.org/PR33311

llvm-svn: 304752
2017-06-05 23:51:27 +00:00
Evgeny Stupachenko f2b3b467e5 Fix PR23384 (part 2 of 3) NFC
Summary:
The patch moves LSR cost comparison to target part.

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30561

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304750
2017-06-05 23:37:00 +00:00
Matt Arsenault 540d133138 Remove double semicolon
llvm-svn: 304749
2017-06-05 23:01:31 +00:00
Matthias Braun e89461a400 Remove some #include from StackProtector.h; NFC
llvm-svn: 304748
2017-06-05 22:59:21 +00:00
Matt Arsenault 9e5b5053d1 RenameIndependentSubregs: Fix handling of undef tied operands
If a tied source operand was undef, it would be replaced but not
update the other tied operand, which would end up using different
virtual registers.

llvm-svn: 304747
2017-06-05 22:58:57 +00:00
Evgeny Stupachenko 4d94e99446 LSR: Calculate instruction cost only if InsnsCost is set to true (NFC)
Summary:

The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option.
Currently even if the option set to false we calculate and print (in debug mode) instruction costs.

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D33914

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304746
2017-06-05 22:44:18 +00:00
Volkan Keles ebe6bb9006 [GlobalISel] IRTranslator: Add MachineMemOperand to target memory intrinsics
Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33724

llvm-svn: 304743
2017-06-05 22:17:17 +00:00
Davide Italiano fb4d5c095b [SelectionDAG] Update the dominator after splitting critical edges.
Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.

i.e.

  DominatorTree is not up to date!
  Computed:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}

  Actual:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
        [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}

This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).

We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.

Differential Revision:  https://reviews.llvm.org/D33800

llvm-svn: 304742
2017-06-05 22:16:41 +00:00
Zachary Turner 88101dadcc [CodeView] Fix endianness bug.
We should be outputting in little endian, but we were writing
in host endianness.

llvm-svn: 304741
2017-06-05 22:12:23 +00:00
Zachary Turner 349c18f837 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

llvm-svn: 304738
2017-06-05 21:40:33 +00:00
Konstantin Zhuravlyov 5b0bf2ff0d AMDGPU: Remove deprecated and unused elf definitions
Differential Revision: https://reviews.llvm.org/D33689

llvm-svn: 304737
2017-06-05 21:33:40 +00:00
Saleem Abdulrasool 4c47434b25 CodeGen: add support for emitting ObjC image info
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well.  The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.

llvm-svn: 304736
2017-06-05 21:26:39 +00:00
Craig Topper f6e138d794 [ConstantRange] Remove costly udivrem from ConstantRange::truncate
Truncate currently uses a udivrem call which is going to be slow particularly for larger than 64-bit widths.

As far as I can tell all we were trying to do was modulo LowerDiv by (MaxValue+1) and make sure whatever value was effectively subtracted from LowerDiv was also subtracted from UpperDiv.

This patch recognizes that MaxValue+1 is a power of 2 so we can just use a bitwise AND to accomplish a modulo operation or isolate the upper bits.

Differential Revision: https://reviews.llvm.org/D32672

llvm-svn: 304733
2017-06-05 20:48:05 +00:00
Mark Searles 602ee930bf [AMDGPU] Fix uninit'ed var (RevisitLoop)
Differential Revision: https://reviews.llvm.org/D33907

llvm-svn: 304729
2017-06-05 19:29:01 +00:00
Sanjay Patel 6350de76fa [DAGCombine] Fix unchecked calls to DAGCombiner::*ExtPromoteOperand
Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode. 
Falling back to any extend in this case instead of failing outright seems correct to me.

No test case because:
The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload 
TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked 
illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks 
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16  , Legal);.

Patch by Jacob Young!

Differential Revision: https://reviews.llvm.org/D33633

llvm-svn: 304723
2017-06-05 17:01:10 +00:00
Simon Pilgrim 807b708d13 [X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)
Missed SSE41 non-temporal load case in previous commit

Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304722
2017-06-05 16:45:32 +00:00
Adam Nemet 4ef096b0c2 Handle non-unique edges in edge-dominance
This removes a quadratic behavior in assert-enabled builds.

GVN propagates the equivalence from a condition into the blocks guarded by the
condition.  E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7.  It does this by replacing all the uses of 'a' that are dominated by
the true edge.

For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'.  Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.

The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.

This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.

Differential Revision: https://reviews.llvm.org/D33584

llvm-svn: 304721
2017-06-05 16:27:09 +00:00
Frederich Munch ad12580012 Close DynamicLibraries in reverse order they were opened.
Summary: Matches C++ destruction ordering better and fixes possible problems of loaded libraries having inter-dependencies.

Reviewers: efriedma, v.g.vassilev, chapuni

Reviewed By: efriedma

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33652

llvm-svn: 304720
2017-06-05 16:26:58 +00:00
Dmitry Mikulin db3b87b2c0 Symbols re-defined with -wrap and -defsym need to be excluded from inter-
procedural optimizations to prevent dropping symbols and allow the linker
to process re-directs.

PR33145: --wrap doesn't work with lto.
Differential Revision: https://reviews.llvm.org/D33621

llvm-svn: 304719
2017-06-05 16:24:25 +00:00
Simon Pilgrim b2ef948628 [X86][AVX1] Split 256-bit vector non-temporal loads to keep it non-temporal (PR32744)
Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304718
2017-06-05 16:02:01 +00:00
Simon Pilgrim a25bf0b6b9 [X86][SSE] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)
Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304717
2017-06-05 15:43:03 +00:00
Diana Picus 0091cc3528 [ARM] GlobalISel: Constrain callee register on indirect calls
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.

Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.

Fixes (part of) PR32146.

llvm-svn: 304712
2017-06-05 12:54:53 +00:00
whitequark f6059fdc54 [LLVM-C] [OCaml] Expose Type::subtypes.
The C functions added are LLVMGetNumContainedTypes and
LLVMGetSubtypes.

The OCaml function added is Llvm.subtypes.

Patch by Ekaterina Vaartis.

Differential Revision: https://reviews.llvm.org/D33677

llvm-svn: 304709
2017-06-05 11:49:52 +00:00
Dimitry Andric f5d486f43d Fix building DynamicLibrary.cpp with musl libc
Summary:
The workaround added in rL301240 for stderr/out/in symbols being both
macros and globals is only necessary for glibc, and it does not compile
with musl libc. Alpine Linux has had the following fix for it:

https://git.alpinelinux.org/cgit/aports/plain/main/llvm4/llvm-fix-DynamicLibrary-to-build-with-musl-libc.patch

Adapt the fix in our DynamicLibrary.inc for Unix.

Reviewers: marsupial, chandlerc, krytarowski

Reviewed By: krytarowski

Subscribers: srhines, krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33883

llvm-svn: 304707
2017-06-05 11:22:18 +00:00
Javed Absar b16d146838 Add support for #pragma clang section
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""

Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413

llvm-svn: 304704
2017-06-05 10:09:13 +00:00
Peter Smith adde667007 [ARM] Support fixup for Thumb2 modified immediate
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
 sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
    
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
    
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
    
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
    
Fixes PR28647

Differential Revision: https://reviews.llvm.org/D33492

llvm-svn: 304702
2017-06-05 09:37:12 +00:00
Sven van Haastregt 78819e0fd4 [InstCombine] Fix extractelement use before def
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.

Patch by Karl Hylen.

Differential Revision: https://reviews.llvm.org/D33449

llvm-svn: 304701
2017-06-05 09:18:10 +00:00
Renato Golin cdf840fd38 Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet."
This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days.

llvm-svn: 304698
2017-06-05 07:35:52 +00:00
Stanislav Mekhanoshin 286a4225b9 [AMDGPU] Fix SIFoldOperands crash with clamp
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.

Differential Revision: https://reviews.llvm.org/D33884

llvm-svn: 304696
2017-06-05 01:03:04 +00:00
Craig Topper da8037f299 [InstSimplify] Use llvm::all_of instead of a manual loop. NFC
llvm-svn: 304692
2017-06-04 22:41:56 +00:00
Peter Collingbourne 2b9e9e474c IR: When creating a global variable, assert that its type is valid.
llvm-svn: 304690
2017-06-04 22:12:03 +00:00
Simon Pilgrim 46dd55f1e1 [X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==>    <3, 2, 1, 0>

The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.

Instead, this patch unpacks progressively larger sequential vector elements together:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==>    <3, 2, 1, 0>

This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.

Differential Revision: https://reviews.llvm.org/D33864

llvm-svn: 304688
2017-06-04 20:12:04 +00:00
Ayal Zaks ab32aff838 [LV] Make scalarizeInstruction() non-virtual. NFC.
Following the request made in https://reviews.llvm.org/D32871,
scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is
hereby made non-virtual in InnerLoopVectorizer.

Should have been part of r297580 originally.

llvm-svn: 304685
2017-06-04 13:29:51 +00:00
Craig Topper d470d73c2d [ConstantFolding] Combine an if statement into an earlier one that checked the same condition. NFC
llvm-svn: 304681
2017-06-04 08:21:53 +00:00
Craig Topper 0dd29e2256 [ConstantFolding][X86] Replace an LLVM_FALLTHROUGH with a break because it really shouldn't fallthrough.
This is actually NFC because the next case starts with the same if statement as this case did. So the result will be the same and it will fallthrough to the end of the switch. But there's no reason to rely on that so we should just break.

llvm-svn: 304680
2017-06-04 08:21:51 +00:00
Craig Topper fe9ad82e44 [ConstantFolding] Properly support constant folding of vector powi intrinsic. The second argument is not a vector so needs special treatment.
llvm-svn: 304679
2017-06-04 07:30:28 +00:00
Davide Italiano be1b6a963e [PM] Add GVNSink to the pipeline.
With this, the two pipelines should be in sync again (modulo
LoopUnswitch, but Chandler is actively working on that).

Differential Revision:  https://reviews.llvm.org/D33810

llvm-svn: 304671
2017-06-03 23:18:29 +00:00
Saleem Abdulrasool ac76ec7ce8 ADT: handle special case of ARM environment for SUSE
SUSE treats "gnueabi" as "gnueabihf" so make sure that we normalise the
environment.

llvm-svn: 304670
2017-06-03 22:31:06 +00:00
Craig Topper 0799ff9e64 [InstCombine] Add support for simplifying ctlz/cttz intrinsics based on known bits.
llvm-svn: 304669
2017-06-03 18:50:32 +00:00
Craig Topper 7c553edced [ConstantFolding] Fix constant folding for vector cttz and ctlz intrinsics to understand that the second argument is still a scalar.
llvm-svn: 304668
2017-06-03 18:50:29 +00:00
Stanislav Mekhanoshin 0330660403 [AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.

Also added handling to preserve original src modifiers.

Differential Revision: https://reviews.llvm.org/D33860

llvm-svn: 304665
2017-06-03 17:39:47 +00:00
Simon Pilgrim f93debb40c [X86][SSE] Add SCALAR_TO_VECTOR(PEXTRW/PEXTRB) support to faux shuffle combining
Generalized existing SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT) code to support AssertZext + PEXTRW/PEXTRB cases as well. 

llvm-svn: 304659
2017-06-03 11:12:57 +00:00
Craig Topper a803d5b8b0 [LazyValueInfo] Use Type::getIntegerBitWidth instead of casting to IntegerType to call getBitWidth. NFC
llvm-svn: 304656
2017-06-03 07:47:14 +00:00
Craig Topper 0e5f1093ee [LazyValueInfo] Make solveBlockValueCast take a CastInst* instead of Instruction*. Makes getOpcode return the appropriate enum without a cast. NFC
llvm-svn: 304655
2017-06-03 07:47:08 +00:00
Galina Kistanova e9cacb6ae8 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304638
2017-06-03 05:19:32 +00:00
Galina Kistanova 55344aba7e Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304637
2017-06-03 05:19:10 +00:00
Galina Kistanova 96d51f5bcb Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304636
2017-06-03 05:18:46 +00:00
Galina Kistanova bd79f73f02 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304635
2017-06-03 05:11:14 +00:00
Sam Clegg 9e15f3592a [WebAssembly] Refactor WasmObjectWriter::writeObject
The size of this function was getting a little out of.
control.  Split code for writing each section type into
seperate functions.

Differential Revision: https://reviews.llvm.org/D33792

llvm-svn: 304634
2017-06-03 02:01:24 +00:00
Kostya Serebryany f7db346cdf [sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet.
llvm-svn: 304630
2017-06-03 01:35:47 +00:00
Tom Stellard e042412ef1 AMDGPU/GlobalISel: Mark 1-bit integer constants as legal
Summary:
These are mostly legal, but will probably need special lowering for some
cases.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D33791

llvm-svn: 304628
2017-06-03 01:13:33 +00:00
Eugene Zelenko c85638b29d [CodeGen] Fix Windows builds which treat warnings as errors, broken in r304621.
llvm-svn: 304627
2017-06-03 01:04:06 +00:00
Evgeniy Stepanov 704003ea3d Revert "[CFI] Remove LinkerSubsectionsViaSymbols."
This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin.

llvm-svn: 304626
2017-06-03 00:46:27 +00:00
Stanislav Mekhanoshin f154b4f52c [AMDGPU] Preserve operand order in SIFoldOperands
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.

Differential Revision: https://reviews.llvm.org/D33802

llvm-svn: 304625
2017-06-03 00:41:52 +00:00
Zachary Turner 5b74ff33e7 [PDB] Fix use after free.
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary.  This
way it could still return the user a contiguous buffer of the
requested size.  However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.

Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view.  And this buffer would get destroyed
as soon as the strema was closed.

The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.

Differential Revision: https://reviews.llvm.org/D33858

llvm-svn: 304623
2017-06-03 00:33:35 +00:00
Matthias Braun 4e8624d138 LiveRegUnits: Port recent LivePhysRegs bugfixes
Adjust code to look more like the code in LivePhysRegs and port over the
fix for LivePhysRegs from r304001 and adapt to the new CSR management in
MachineRegisterInfo.

llvm-svn: 304622
2017-06-03 00:26:35 +00:00
Eugene Zelenko 167595ab51 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304621
2017-06-03 00:22:41 +00:00
Stanislav Mekhanoshin ca5d2efe5a [AMDGPU] V_DIV_FIXUP_F16 is not a commutable operation
Differential Revision: https://reviews.llvm.org/D33808

llvm-svn: 304619
2017-06-03 00:16:44 +00:00
Alexey Bataev e4e5923ef1 [SLP] Improve comments and naming of functions/variables/members, NFC.
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.

Differential revision: https://reviews.llvm.org/D33320

llvm-svn: 304616
2017-06-03 00:08:21 +00:00
Sanjay Patel e737cf8500 [x86] simplify code for vector icmp pred transforms; NFCI
Organizing by transform is smaller and easier to read than a squashed switch with fall-throughs.

llvm-svn: 304611
2017-06-02 23:21:53 +00:00
Kostya Serebryany aed6ba770c [sanitizer-coverage] refactor the code to make it easier to add more sections in future. NFC
llvm-svn: 304610
2017-06-02 23:13:44 +00:00
Alexey Bataev 03ca396b95 Revert "[SLP] Improve comments and naming of functions/variables/members, NFC."
This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a.

llvm-svn: 304609
2017-06-02 23:09:15 +00:00
Philip Reames b70cecd60a [Statepoint] Be consistent about using deopt naming [NFCI]
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag.  Fix a few cases we'd missed.

llvm-svn: 304607
2017-06-02 23:03:26 +00:00
Matthias Braun 0021d46a1c RegisterScavenging: Add ScavengerTest pass
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.

Also adds some basic register scavenging tests.

llvm-svn: 304606
2017-06-02 23:01:42 +00:00
Quentin Colombet 2145cf3f07 [RABasic] Properly update the LiveRegMatrix when LR splitting occur
Prior to this patch we used to not touch the LiveRegMatrix while doing
live-range splitting. In other words, when live-range splitting was
occurring, the LiveRegMatrix was not reflecting the changes.
This is generally fine because it means the query to the LiveRegMatrix
will be conservately correct. However, when decisions are taken based on
what is going to happen on the interferences (e.g., when we spill a
register and know that it is going to be available for another one), we
might hit an assertion that the color used for the assignment is still
in use.

This patch makes sure the changes on the live-ranges are properly
reflected in the LiveRegMatrix, so the assertions don't break.
An alternative could have been to remove the assertion, but it would
make the invariants of the code and the general reasoning more
complicated in my opnion.

http://llvm.org/PR33057

llvm-svn: 304603
2017-06-02 22:46:31 +00:00
Quentin Colombet ebbaed6d3c [RABasic] Properly initialize the pass
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.

llvm-svn: 304602
2017-06-02 22:46:26 +00:00
Xinliang David Li 5fdc75aea1 Fix debug build test failure
llvm-svn: 304600
2017-06-02 22:38:48 +00:00
Xinliang David Li 0b7d858fa3 [PartialInlining] Minor cost anaysis tuning
Also added a test option and 2 cost analysis related tests.

llvm-svn: 304599
2017-06-02 22:08:04 +00:00
David Blaikie 6aeacaa527 FunctionAttrs: Skip it if the effective SCC (ignoring optnone functions) is empty
Minor optimization but mostly simplifies my debugging so I'm not dealing
with empty SCCNodeSets while investigating issues in this optimization.

llvm-svn: 304597
2017-06-02 21:24:17 +00:00
Matthias Braun dfa892139c RegisterScavenging: Move scavenging logic from PEI to RegisterScavenging; NFC
These parts do not depend on any PrologEpilogInserter logic and
therefore better fits RegisterScaveging.cpp.

llvm-svn: 304596
2017-06-02 21:02:03 +00:00
Zachary Turner 64726f2269 Fix build error on gcc.
llvm-svn: 304595
2017-06-02 21:00:22 +00:00
Jun Bum Lim 2960d41e68 [InlineCost] Enable the new switch cost heuristic
Summary:
This is to enable the new switch inline cost heuristic (r301649) by removing the
old heuristic as well as the flag itself.
In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and
8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64.
No significant code size / performance regression was found in O3/O2/Os. No
significant complain was reported from the llvm-dev thread.

Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo

Reviewed By: echristo

Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini

Differential Revision: https://reviews.llvm.org/D32653

llvm-svn: 304594
2017-06-02 20:42:54 +00:00
Alexey Bataev 2c08fde9e5 [SLP] Improve comments and naming of functions/variables/members, NFC.
Summary:
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.

Reviewers: anemet

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33320

llvm-svn: 304593
2017-06-02 20:39:27 +00:00
Ahmed Bougacha 018a68f9e4 [X86] Correctly broadcast NaN-like integers as float on AVX.
Since r288804, we try to lower build_vectors on AVX using broadcasts of
float/double.  However, when we broadcast integer values that happen to
have a NaN float bitpattern, we lose the NaN payload, thereby changing
the integer value being broadcast.

This is caused by ConstantFP::get, to which we pass the splat i32 as
a float (by bitcasting it using bitsToFloat).  ConstantFP::get takes
a double parameter, so we end up lossily converting a single-precision
NaN to double-precision.

Instead, avoid any kinds of conversions by directly building an APFloat
from the splatted APInt.

Note that this also fixes another piece of code (broadcast of
subvectors), that currently isn't susceptible to the same problem.

Also note that we could really just use APInt and ConstantInt
throughout: the constant pool type doesn't matter much.  Still, for
consistency, use the appropriate type.

llvm-svn: 304590
2017-06-02 20:02:59 +00:00
Zachary Turner 4bedb5fd00 Fix build error with clang and gcc.
llvm-svn: 304589
2017-06-02 20:00:10 +00:00
Zachary Turner 92dcdda623 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

llvm-svn: 304588
2017-06-02 19:49:14 +00:00
Keno Fischer 514a6a54e7 [SROA] Fix crash due to bad bitcast
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.

Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D32593

llvm-svn: 304585
2017-06-02 19:04:17 +00:00
Evgeniy Stepanov 63f056327d [CFI] Remove LinkerSubsectionsViaSymbols.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

llvm-svn: 304582
2017-06-02 18:45:14 +00:00
David Blaikie 358c012db2 BitcodeWriter: Removing unnecessary std::function in favor of template
More cleanup from post-commit discussion on r304516

llvm-svn: 304579
2017-06-02 18:25:29 +00:00
Evgeniy Stepanov b933ad3a77 Skip CFI for dead functions.
Differential Revision: https://reviews.llvm.org/D33805

llvm-svn: 304578
2017-06-02 18:24:23 +00:00
Evgeniy Stepanov 659b3bc77d Move summary dead stripping before regular LTO.
This way dead stripping results are recorded in combined summary and
can be used in regular LTO passes.

Differential Revision: https://reviews.llvm.org/D33615

llvm-svn: 304577
2017-06-02 18:24:17 +00:00
Sanjay Patel 469014ada4 [x86] fix formatting; NFCI
llvm-svn: 304576
2017-06-02 18:14:31 +00:00
Matt Arsenault 746e065716 AMDGPU: Register AMDGPUAlwaysInline
llvm-svn: 304574
2017-06-02 18:02:42 +00:00
Reid Kleckner 146eb7a65f Re-land "COFF: migrate def parser from LLD to LLVM"
This reverts commit r304561 and re-lands r303490 & co.

The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:

@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
   std::vector<COFFShortExport> Exports;
   for (Export &E1 : Config->Exports) {
     COFFShortExport E2;
-    E2.Name = E1.Name;
+    // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+    E2.Name = E1.SymbolName;
     E2.ExtName = E1.ExtName;
     E2.Ordinal = E1.Ordinal;
     E2.Noname = E1.Noname;

llvm-svn: 304573
2017-06-02 17:53:06 +00:00
Konstantin Zhuravlyov be6c0ca5e2 AMDGPU: Make auto waitcnt before barrier a feature
Differential Revision: https://reviews.llvm.org/D33793

llvm-svn: 304571
2017-06-02 17:40:26 +00:00
Sanjay Patel cdb5dad4cc [TargetLowering] fix formatting; NFC
llvm-svn: 304569
2017-06-02 17:35:02 +00:00
Craig Topper 9277a86f03 [LazyValueInfo] Fix formatting NFC.
llvm-svn: 304567
2017-06-02 17:28:12 +00:00
David Blaikie b6b42e018a Tidy up a bit of r304516, use SmallVector::assign rather than for loop
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).

The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and
don't have it - so, workaround it for now rather than bothering with the
added complexity.

(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)

llvm-svn: 304566
2017-06-02 17:24:26 +00:00
Philip Reames 0f02bbc6f4 Verify a couple more fields in STATEPOINT instructions
While doing so, clarify the comments and update them to reflect current reality.

Note: I'm going to let this sit for a week or so before adding further verification.  I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
llvm-svn: 304565
2017-06-02 17:02:33 +00:00
Philip Reames 94cc4a29ed Add placeholder for more extensive verification of psuedo ops
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.

For those curious, the more complicated version of this patch already found one very suspicious thing.

Differential Revision: https://reviews.llvm.org/D33819

llvm-svn: 304564
2017-06-02 16:36:37 +00:00
Craig Topper 3778c8943b [LazyValueInfo] Make solveBlockValueBinaryOp take a BinaryOperator* instead of Instruction*. This removes a cast of getOpcode to BinaryOps.
llvm-svn: 304563
2017-06-02 16:33:13 +00:00
Sanjay Patel ce241f48c5 [InstCombine] fix icmp with not op and constant to work with splat vector constant
llvm-svn: 304562
2017-06-02 16:29:41 +00:00
Reid Kleckner d249e4a188 Revert "COFF: migrate def parser from LLD to LLVM"
This reverts commits r303490, r303491, r303493, and r303494.

This caused http://crbug.com/728726. Essentially, exporting stdcall
functions doesn't appear to work after this change. Reduced test case
soon.

llvm-svn: 304561
2017-06-02 16:26:24 +00:00
Craig Topper 84a9f168f1 [LazyValueInfo] Fix typo in comment. NFC
llvm-svn: 304560
2017-06-02 16:21:13 +00:00
Craig Topper b23e7c78a5 [InstSimplify][ConstantFolding] Teach constant folding how to handle icmp null, (inttoptr x) as well as it handles icmp (inttoptr x), null
Summary:
The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify.

This patch adds support to ConstantFolding to detect the reversed case.

Reviewers: spatel, dberlin, majnemer, davide, joey

Reviewed By: joey

Subscribers: joey, llvm-commits

Differential Revision: https://reviews.llvm.org/D33801

llvm-svn: 304559
2017-06-02 16:17:32 +00:00
Sanjay Patel 4dc85eb75a [InstCombine] improve perf by not creating a known non-canonical instruction
Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp
an extra time to canonicalize it:

INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp sgt i8 -43, %x
IC: ADD:   %cmp = icmp sgt i8 -43, %x
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 -43, %x
IC: Mod =   %cmp = icmp sgt i8 -43, %x
    New =   %cmp = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp

If we create the swapped ICmp directly, we go faster:

INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp

llvm-svn: 304558
2017-06-02 16:11:14 +00:00
Alexander Timofeev 3f70b619a9 AMDGPUAnnotateUniformValue should always treat volatile loads as divergent
llvm-svn: 304554
2017-06-02 15:25:52 +00:00
Geoff Berry 57d8a417e7 [AArch64][Falkor] Model immediate forwarding.
llvm-svn: 304552
2017-06-02 14:27:41 +00:00
Mark Searles 70359ac60d [AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests.
-enable-si-insert-waitcnts=1 becomes the default
-enable-si-insert-waitcnts=0 to use old pass

Differential Revision: https://reviews.llvm.org/D33730

llvm-svn: 304551
2017-06-02 14:19:25 +00:00
Zoran Jovanovic 2aae0649a1 [mips][microMIPS] Extending size reduction pass with LBU16, LHU16, SB16 and SH16
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
LBU instruction is transformed into 16-bit instruction LBU16
LHU instruction is transformed into 16-bit instruction LHU16
SB instruction is transformed into 16-bit instruction SB16
SH instruction is transformed into 16-bit instruction SH16
Differential Revision: https://reviews.llvm.org/D33091

llvm-svn: 304550
2017-06-02 14:14:21 +00:00
Krzysztof Parzyszek 066e8b56a0 [Hexagon] Return 0 from getDotNewPredOp when .new opcode does not exist
This allows using this function to test if an instruction can be converted
to a .new form.

llvm-svn: 304549
2017-06-02 14:07:06 +00:00
Benjamin Kramer c1f5ae236c [OrderedBasicBlock] Return false for comesBefore(A, A)
So far it would return true for the first uncached query, then cached
queries return false.

llvm-svn: 304545
2017-06-02 13:10:31 +00:00
John Brawn 6671616cde [GlobalMerge] Don't merge globals that may be preempted
When a global may be preempted it needs to be accessed directly, instead of
indirectly through a MergedGlobals symbol, for the preemption to work.

This fixes PR33136.

Differential Revision: https://reviews.llvm.org/D33727

llvm-svn: 304537
2017-06-02 10:24:14 +00:00
Diana Picus e7aa90987d [ARM] GlobalISel: Support struct params/returns
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.

The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.

llvm-svn: 304536
2017-06-02 10:16:48 +00:00
Amaury Sechet 437f7060fe nits in TargetLowering.cpp . NFC
llvm-svn: 304532
2017-06-02 09:18:18 +00:00
Javed Absar 4ae7e81233 [ARM] Cortex-A57 scheduling model for ARM backend (AArch32)
This patch implements the Cortex-A57 scheduling model.
The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td.
Small changes in cpp,.h files to support required scheduling predicates.

Scheduling model implemented according to:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.

Patch by : Andrew Zhogin (submitted on his behalf, as requested).
Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.
Differential Revision: https://reviews.llvm.org/D28152

llvm-svn: 304530
2017-06-02 08:53:19 +00:00
Max Kazantsev 4d8748a987 [SelectionDAG] Get rid of recursion in findNonImmUse
The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D33775

llvm-svn: 304522
2017-06-02 07:11:00 +00:00
Gor Nishanov 053d2d24f7 [coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplit
Summary:
Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned.
Make sure we clean up orphaned coro.saves.  The bug manifested with a crash similar to this:

```
    llvm_unreachable("Unknown type!");
    llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false)
    llvm::EVT::getEVT
    llvm::TargetLoweringBase::getValueType
    llvm::ComputeValueVTs
    llvm::SelectionDAGBuilder::visitTargetIntrinsic
```

Reviewers: GorNishanov

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33817

llvm-svn: 304518
2017-06-02 02:18:36 +00:00
Xinliang David Li 621e8dcf1f [Profile] Enhance expect lowering to handle correlated branches
builtin_expect applied on && or || expressions were not
handled properly before. With this patch, the problem is fixed.

Differential Revision: http://reviews.llvm.org/D33164

llvm-svn: 304517
2017-06-02 02:09:31 +00:00
Teresa Johnson 7a27b132a8 [ThinLTO] Efficiency improvement when writing module path string table
Summary:
When writing the combined index, we are walking the entire module
path StringMap in the full index, and checking whether each one should be
included in the index being written. For distributed backends, where we
write an individual combined index for each file, each with only a few
module paths, this is incredibly inefficient. Add a method that takes
a callback and hides the details of whether we are writing the full
combined index, or just a slice, and in the latter case it walks the set
of modules to include instead of the entire index.

For a huge application with around 23K files (i.e. where we were iterating
through the 23K-entry modulePath StringMap 23K times), this change improved
the thin link time by a whopping 48%.

Reviewers: pcc

Subscribers: Prazek, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33813

llvm-svn: 304516
2017-06-02 01:56:02 +00:00
Philip Reames ae80045deb [RS4GC] Comment clarification
llvm-svn: 304514
2017-06-02 01:52:06 +00:00
Jacob Gravelle 26115924a2 Revert r304117 - WebAssembly object format isn't ready to be the default
Summary: Wasm object format has some functionality regressions from the ELF format, and doesn't play nicely with the rest of the toolchain. It should eventually be the default, but not yet.

Reviewers: sunfish, sbc100

Subscribers: jfb, dschuff, llvm-commits

Differential Revision: https://reviews.llvm.org/D33811

llvm-svn: 304512
2017-06-02 01:26:17 +00:00
Sam Clegg c38e947e50 [WebAssembly] MC: Fix references to undefined externals in data section
Undefined externals don't need to have a size or an offset.
This was broken by r303915.  Added a test for this case.

This fixes the "Compile LLVM Torture (o)" step on the wasm
waterfall.

Differential Revision: https://reviews.llvm.org/D33803

llvm-svn: 304505
2017-06-02 01:05:24 +00:00
Davide Italiano 1dd5558e52 [PM] GVNSink is off by default, fix an obvious typo.
llvm-svn: 304497
2017-06-01 23:47:53 +00:00
Eugene Zelenko 7ea692373c [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304495
2017-06-01 23:25:02 +00:00
Zachary Turner afb81a83a9 Fix 2 more -Wreorder warnings.
llvm-svn: 304494
2017-06-01 23:24:50 +00:00
Tim Shen 4e912aa5af [ThinLTO] Move -lto-use-new-pm to llvm-lto2, and change it to -use-new-pm.
Summary:
As we teach Clang to use ThinkLTO + new PM, it's good for the users to
inject through Config, instead of setting a flag in the LTOBackend
library. Move the flag to llvm-lto2.

As it moves to llvm-lto2, a new name -use-new-pm seems simpler and as
clear.

Reviewers: davide, tejohnson

Subscribers: mehdi_amini, Prazek, inglorion, eraman, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D33799

llvm-svn: 304492
2017-06-01 23:13:44 +00:00
Davide Italiano c368831580 Move GVNHoist to the right position in the new pass manager pipeline.
GVNHoist was moved as part of simplification passes for the current
pass manager (but not for the new), so they're out-of-sync.

Differential Revision:  https://reviews.llvm.org/D33806

llvm-svn: 304490
2017-06-01 23:08:14 +00:00
Xinliang David Li d6cfba2a02 Fix compiler_rt buildbot failure
llvm-svn: 304489
2017-06-01 23:05:11 +00:00
Keno Fischer fa635d730f Reapply "[Cloning] Take another pass at properly cloning debug info"
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.

llvm-svn: 304488
2017-06-01 23:02:12 +00:00
Zachary Turner ebd3ae8371 [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

llvm-svn: 304484
2017-06-01 21:52:41 +00:00
Yaxun Liu a618acf923 [AMDGPU] Fix kernel arg segment size for amdgizcl
Differential Revision: https://reviews.llvm.org/D33307

llvm-svn: 304482
2017-06-01 21:31:53 +00:00
Eli Friedman 0d823d610d Add opt-bisect support for region passes.
This is necessary to get opt-bisect working with polly.

Differential Revision: https://reviews.llvm.org/D33751

llvm-svn: 304476
2017-06-01 21:22:26 +00:00
Adrian Prantl d9cd4d52e3 DbgValueHistoryCalculator: Ignore call instructions that claim to clobber SP.
The AArch64 backend marks calls that involve aggregate function
arguments as having an implicit def of SP. We already have the same
workaround in LiveDebugValues and in DbgValueHistoryCalculator for SP
clobbers in register masks. This adds register defs to the list.

Fixes rdar://problem/30361929 and Swift SR-3851.

llvm-svn: 304471
2017-06-01 21:14:58 +00:00
Teresa Johnson 596b2e7ab2 [PGO] Adjust indirect call promotion threshold
Summary:
Reduce min percent required for indirect call promotion from 33% to 30%,
which matches gcc's threshold and catches the same hot opportunities.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33798

llvm-svn: 304469
2017-06-01 21:10:10 +00:00
Keno Fischer 3cdd4935cd [DIBuilder] Add a more fine-grained finalization method
Summary:
Clang wants to clone a function before it is done building the entire
compilation unit. As of now, there is no good way to do that, because
CloneFunction doesn't like dealing with temporary metadata. However,
as long as clang doesn't want to add any variables to this SP, it
should be fine to just prematurely finalize it. Add an API to allow this.

This is done in preparation of a clang commit to fix the assertion that
necessitated the revert of D33655.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33704

llvm-svn: 304467
2017-06-01 20:42:44 +00:00
Evgeniy Stepanov 56584bbf16 (NFC) Track global summary liveness in GVFlags.
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of
all the DeadSymbols sets. This is refactoring in order to make
liveness information available in the RegularLTO pipeline.

llvm-svn: 304466
2017-06-01 20:30:06 +00:00
Nirav Dave 4952871630 [SDAG] Fix CombineTo ordering in visitZERO_EXTEND and visitSIGN_EXTEND
Reorder CombineTo Calls to prevent references to stale/deleted SDNodes which caused undue assertions.

Reviewers: dbabokin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31625

llvm-svn: 304460
2017-06-01 19:33:50 +00:00
Xinliang David Li ee8d6acb1f [Profile] Fix builtin_expect lowering bug
The lowerer wrongly assumes the ICMP instruction 
 1) always has a constant operand;
 2) the operand has value 0.

It also assumes the expected value can only be one, thus
other values other than one will be considered 'zero'.

This leads to wrong profile annotation when other integer values
are used other than 0, 1 in the comparison or in the expect intrinsic.

Also missing is handling of equal predicate.

This patch fixes all the above problems.

Differential Revision: http://reviews.llvm.org/D33757

llvm-svn: 304453
2017-06-01 19:05:55 +00:00
Xinliang David Li 0a0acbcf78 [PartialInlining] Emit branch info and profile data as remarks
This allows us to collect profile statistics to tune static 
branch prediction.

Differential Revision: http://reviews.llvm.org/D33746

llvm-svn: 304452
2017-06-01 18:58:50 +00:00
Mandeep Singh Grang 33a1b73600 [PredicateInfo] Fix non-determinism in codegen uncovered by reverse iterating SmallPtrSet
Summary:
Sort OpsToRename before iterating to make iteration order deterministic.

Thanks to Daniel Berlin for the sorting logic.

Reviewers: dberlin, RKSimon, efriedma, davide

Reviewed By: dberlin, davide

Subscribers: sanjoy, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33265

llvm-svn: 304447
2017-06-01 18:36:24 +00:00
Adrian Prantl f4bc1f77b7 [DWARF] Introduce Dump Options
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.

Patch by Spyridoula Gravani!

Differential Revision: https://reviews.llvm.org/D33749

llvm-svn: 304446
2017-06-01 18:18:23 +00:00
Krzysztof Parzyszek 3cf16576d5 [Hexagon] Fix dependence check in the packetizer
An incorrect check in the packetizer lead to an attempt to convert
an unconditional branch to a .new (conditional) form.

llvm-svn: 304442
2017-06-01 18:02:40 +00:00
Krzysztof Parzyszek 51fd5405d5 [Hexagon] Handle long-running simplification loop in idiom recognition
The initial assumption was that the simplification would converge to a
fixed point relatvely quickly. Turns out that there are legitimate situa-
tions where the complexity of the code causes it to take a large number
of iterations.

Two main changes:
- Instead of aborting upon hitting the limit, simply return nullptr.
- Reduce the limit to 10,000 from 100,000.

llvm-svn: 304441
2017-06-01 18:00:47 +00:00
Amaury Sechet 2adb7bdbca Remove ADDC, ADDE, SUBC, SUBE and SETCCE support from the X86 backend, use the CARRY ops instead.
Summary:
As per title. This cleanup some technical debt.

Depends on D33374

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33390

llvm-svn: 304435
2017-06-01 16:33:08 +00:00
Matt Arsenault 3416b8c874 AMDGPU: Remove error on call in AsmPrinter
Partial revert of r301938 which is making it harder
to split patches up.

llvm-svn: 304418
2017-06-01 15:05:15 +00:00
Matt Arsenault b083570532 DAG: Remove pointless type check
These are only integer operations.

llvm-svn: 304417
2017-06-01 14:49:46 +00:00
Matt Arsenault 50f43e4168 AMDGPU: Set high getCSRFirstUseCost
llvm-svn: 304416
2017-06-01 14:38:02 +00:00
Florian Hahn fca7b8348f [ARM] Create relocations for Thumb functions calling ARM fns in ELF.
Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.

Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls

Reviewed By: peter.smith

Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33436

llvm-svn: 304413
2017-06-01 13:50:57 +00:00
Kamil Rytarowski 07c81b1856 [Solaris] Fix PR33228 - llvm::sys::fs::is_local_impl done right
Summary:
Solaris-specific implementation for llvm::sys::fs::is_local_impl.
FStype pattern matching might be a bit unreliable, but at least it fixes the build failure.



Reviewers: mgorny, nlopes, llvm-commits, krytarowski

Reviewed By: krytarowski

Subscribers: voskresensky.vladimir, krytarowski

Differential Revision: https://reviews.llvm.org/D33695

llvm-svn: 304412
2017-06-01 12:57:00 +00:00
Amaury Sechet c84cc230b3 Only generate addcarry node when it is legal.
Summary:
This is a problem uncovered by stage2 testing. ADDCARRY end up being generated on target that do not support it.

The patch that introduced the problem has other patches layed on top of it, so we want to fix the issue rather than revert it to avoid creating a lor of churn.

A regression test will be added shortly, but this is committed as this in order to get the build back to green promptly.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33770

llvm-svn: 304409
2017-06-01 12:03:16 +00:00
Chandler Carruth 8b3be4e59d [PM/ThinLTO] Port the ThinLTO pipeline (both components) to the new PM.
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.

This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.

I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:

1) I sunk the PGO indirect call promotion to only be run when we have
   PGO enabled (or as part of the special ThinLTO pipeline).

2) I made the extra GlobalOpt run in ThinLTO just happen all the time
   and at a slightly more powerful place (before we remove available
   externaly functions). This seems like general goodness and not a big
   compile time sink, so it didn't make sense to *only* use it in
   ThinLTO. Fewer differences in the pipeline makes everything simpler
   IMO.

3) I hoisted the ThinLTO stop point pre-link above the the RPO function
   attr inference. The RPO inference won't infer anything terribly
   meaningful pre-link (recursiveness?) so it didn't make a lot of
   sense. But if the placement of RPO inference starts to matter, we
   should move it to the canonicalization phase anyways which seems like
   a better place for it (and there is a FIXME to this effect!). But
   that seemed a bridge too far for this patch.

If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.

I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.

Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.

Differential Revision: https://reviews.llvm.org/D33540

llvm-svn: 304407
2017-06-01 11:39:39 +00:00
Zvi Rackover 7693733e80 [X86] Match bitcast of vxi1 to pmovmsk
Summary:
Add an early combine to match patterns such as:
  (i16 bitcast (v16i1 x))
  ->
  (i16 movmsk (v16i8 sext (v16i1 x)))

This combine needs to happen early enough before
type-legalization scalarizes the result of the setcc.

Reviewers: igorb, craig.topper, RKSimon

Subscribers: delena, llvm-commits

Differential Revision: https://reviews.llvm.org/D33311

llvm-svn: 304406
2017-06-01 11:27:57 +00:00
Amaury Sechet 251ea8a4f8 Do not legalize large setcc with setcce, introduce setcccarry and do it with usubo/setcccarry.
Summary:
This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process.

This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33374

llvm-svn: 304404
2017-06-01 11:14:17 +00:00
Amaury Sechet 6506a90a70 Remove ISD::SETCC match from combineX86ADD. It's done improperly and doesn't work.
llvm-svn: 304403
2017-06-01 11:13:10 +00:00
Amaury Sechet 9c5d1e966b [DAGCombine] Refactor common addcarry pattern.
Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32756

llvm-svn: 304402
2017-06-01 10:48:04 +00:00
Amaury Sechet 2e43cb6d03 [DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry)
Summary:
This enables further transforms.

Depends on D32916

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32925

llvm-svn: 304401
2017-06-01 10:42:39 +00:00
Craig Topper f441226085 [TableGen] Remove RecordVal constructor that takes a StringRef and Record::setName(StringRef). Leave just the versions that take an Init.
They weren't used often enough to justify having two different interfaces. Push the responsiblity of creating a StringInit up to the caller.

llvm-svn: 304388
2017-06-01 06:56:16 +00:00
Tim Shen 6b41141863 [ThinLTO] Migrate ThinLTOBitcodeWriter to the new PM.
Summary: Also see D33429 for other ThinLTO + New PM related changes.

Reviewers: davide, chandlerc, tejohnson

Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D33525

llvm-svn: 304378
2017-06-01 01:02:12 +00:00
Xinliang David Li 32c5e809be [PartialInlining] Reduce outlining overhead by removing unneeded live-out(s)
Differential Revision: http://reviews.llvm.org/D33694

llvm-svn: 304375
2017-06-01 00:12:41 +00:00
Dehao Chen 6b737ddce7 Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

llvm-svn: 304371
2017-05-31 23:25:25 +00:00
Reid Kleckner fc7ba565ed [EH] Recognize __(gxx|gcc)_personality_seh0 as the GNU EH personalities
These are no-ops when there are no invokes. We don't need to emit LSDAs
for them.

Fixes PR33220.

llvm-svn: 304367
2017-05-31 22:35:52 +00:00
Matthias Braun 605f779516 ImplicitNullChecks: Clear kill/dead flags when moving instructions around
The values are marked as livein in the successor blocks so marking them
as killed or dead was wrong.

llvm-svn: 304366
2017-05-31 22:23:08 +00:00
Reid Kleckner 57ac61e005 Check hasPersonalityFn before calling getPersonalityFn
llvm-svn: 304365
2017-05-31 22:21:20 +00:00
Reid Kleckner c2f1bbfe4f [EH] Fix the LSDA that we emit for unknown EH personalities
We should have a single call site entry with no landing pad. This
indicates that no EH action should be taken and the unwinder should
unwind to the next frame.

We currently don't recognize __gxx_personality_seh0 as a known
personality, so we forcibly emit a table, and that table was wrong. This
was filed as PR33220. Now we emit a correct table for that personality.
The next step is to recognize that we can completely skip the table for
this personality.

llvm-svn: 304363
2017-05-31 22:18:49 +00:00
Steven Wu 97e2cf87e1 [MachOObject] Fix bind opcode parser error on valid opcode sequence
BIND_OPCODE_SET_DYLIB_SPECIAL_IMM(0) is a valid way to setp library
ordinal. MachOObject should set LibraryOrdinalSet even when IMM is zero.

llvm-svn: 304362
2017-05-31 22:17:43 +00:00
Galina Kistanova 244621faad Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304361
2017-05-31 22:16:24 +00:00
Galina Kistanova 8514dd540d Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304358
2017-05-31 22:09:46 +00:00
Galina Kistanova 0b69e363f6 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304356
2017-05-31 22:02:05 +00:00
Galina Kistanova c752c4bf56 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304355
2017-05-31 21:50:45 +00:00
Wei Mi 0bd3f41588 Revert rL304050. It may break sanitizer bootstrap. Revert it for now while investigating.
llvm-svn: 304350
2017-05-31 21:29:33 +00:00
Matthias Braun e2e65911a2 Try to fix buildbots
It seems not all of our bots have a std::vector::erase() taking a
const_iterator (even though that seems to be part of C++11) attempt to
workaround.

llvm-svn: 304349
2017-05-31 21:25:03 +00:00
Matthias Braun ac4beccaca X86FloatingPoint: Fix livein lists
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
  we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
  uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
   the FP registers still being present in the livein list.)

llvm-svn: 304342
2017-05-31 20:30:22 +00:00
Matthias Braun 43692a2245 X86FloatingPoint: Add some static assert, cleanup; NFC
llvm-svn: 304341
2017-05-31 20:30:17 +00:00
Galina Kistanova c2b642d009 Added missing break; added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304340
2017-05-31 20:25:13 +00:00
Kostya Serebryany 2e98c045cb [libFuzzer] fix a test to match the new sanitizer run-time
llvm-svn: 304333
2017-05-31 19:47:11 +00:00
Galina Kistanova b2c0116e71 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304332
2017-05-31 19:41:33 +00:00
Reid Kleckner 5fbdd17714 [IR] Add additional addParamAttr/removeParamAttr to AttributeList API
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern to access these attributes to within the
AttributeList.

Patch by Daniel Neilson!

Reviewers: rnk, chandlerc, pete, javed.absar, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33355

llvm-svn: 304329
2017-05-31 19:23:09 +00:00
Craig Topper 2b8419a22d [TableGen] Make Record::getValueAsString and getValueAsListOfStrings return StringRefs instead of std::string
Internally both these methods just return the result of getValue on either a StringInit or a CodeInit object. In both cases this returns a StringRef pointing to a string allocated in the BumpPtrAllocator so its not going anywhere. So we can just pass that StringRef along.

This is a fairly naive patch that targets just the build failures caused by this change. There's additional work that can be done to avoid creating std::string at call sites that still think getValueAsString returns a std::string. I'll try to clean those up in future patches.

Differential Revision: https://reviews.llvm.org/D33710

llvm-svn: 304325
2017-05-31 19:01:11 +00:00
Craig Topper fa5dc09292 [BPF] Correct the file name of the -gen-asm-matcher output file to not start with X86.
llvm-svn: 304324
2017-05-31 19:01:05 +00:00
Teresa Johnson a6a3fb57a1 [ThinLTO] Reduce unnecessary map lookups during combined summary write
Summary:
Don't assign values to undefined references, simply don't emit those
reference edges as they are not useful (we were already not emitting
call edges to undefined refs).

Also, streamline the later lookup of value ids when writing the
summaries, by combining the check for value id existence with the access
of that value id.

Reviewers: pcc

Subscribers: Prazek, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D33634

llvm-svn: 304323
2017-05-31 18:58:11 +00:00
Nirav Dave 3424373f30 [ScheduleDAG] Deal with already scheduled loads in ScheduleDAG.
Summary:
If we attempt to unfold an SUnit in ScheduleDAG that results in
finding an already scheduled load, we must should abort the
unfold as it will not improve scheduling.

This fixes PR32610.

Reviewers: jmolloy, sunfish, bogner, spatel

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D32911

llvm-svn: 304321
2017-05-31 18:43:17 +00:00
Matthias Braun d6a36ae282 TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Kostya Serebryany 53b34c8443 [sanitizer-coverage] remove stale code (old coverage); llvm part
llvm-svn: 304319
2017-05-31 18:27:33 +00:00
Sean Fertile 457ddd311a [PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.

Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656

llvm-svn: 304317
2017-05-31 18:20:17 +00:00
Anna Thomas 777bb90bdc Revert "[Atomics][LoopIdiom] Recognize unordered atomic memcpy"
This reverts commit r304310.

It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.

llvm-svn: 304315
2017-05-31 17:20:51 +00:00
Zaara Syeda 3a7578c658 [PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
2017-05-31 17:12:38 +00:00
Galina Kistanova 6ad77845e2 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304312
2017-05-31 17:10:03 +00:00
Mark Searles 11d0a04050 [AMDGPU] Fix bugs in new waitcnt pass. Add test.
- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
- fix handling of PERMUTE ops
- fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass )
- add new test

  Differential Revision: https://reviews.llvm.org/D33114

llvm-svn: 304311
2017-05-31 16:44:23 +00:00
Anna Thomas 056c009f1b [Atomics][LoopIdiom] Recognize unordered atomic memcpy
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background:  http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

Patch by Daniel Neilson!

Reviewers: reames, anna, skatkov

Reviewed By: reames

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33243

llvm-svn: 304310
2017-05-31 16:39:52 +00:00
Dmitry Preobrazhensky 793c592652 [AMDGPU][MC] New syntax for ds_swizzle_b32 offset
See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33542

llvm-svn: 304309
2017-05-31 16:26:47 +00:00
Florian Hahn ff25b6d8f6 [AArch64] Enable FeatureFuseAES on Cortex-A53.
It improves performance on Cortex-A53.

llvm-svn: 304307
2017-05-31 15:50:03 +00:00
Florian Hahn 064a2f9222 [AArch64] Enable FeatureFuseAES on Cortex-A73.
It improves performance on Cortex-A73.

llvm-svn: 304304
2017-05-31 15:25:25 +00:00
Reid Kleckner 1d7cbdfc3d Fix assertion when merging multiple empty AttributeLists
Patch by Nicholas Wilson!

Differential Revision: https://reviews.llvm.org/D33627

llvm-svn: 304300
2017-05-31 14:24:06 +00:00
Nirav Dave 7c70fddba6 [DAG] Avoid use of stale store.
Correct references to alignment of store which may be deleted in a
previous iteration of merge. Instead use first store that would be
merged.

Corrects pr33172's use-after-poison caught by ASan.

Reviewers: spatel, hfinkel, RKSimon

Reviewed By: RKSimon

Subscribers: thegameg, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33686

llvm-svn: 304299
2017-05-31 13:36:17 +00:00
Tony Jiang 60c247de18 [PowerPC] Fix a performance bug for PPC::XXPERMDI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI
Instruction, this patch recognizes them and does the selection to improve
the PPC performance.

Differential Revision: https://reviews.llvm.org/D33404

llvm-svn: 304298
2017-05-31 13:09:57 +00:00
Nemanja Ivanovic accab033c9 [PowerPC] Eliminate integer compare instructions - vol. 3
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for the 64-bit SETEQ patterns.

Differential Revision: https://reviews.llvm.org/D33369

llvm-svn: 304286
2017-05-31 08:04:07 +00:00
Dylan McKay 043fa4b3d6 [AVR] Fix a big in shift operator lowering; Authored by Dr. Gergo Erdi
When generating code for a shift loop, check the shift
 amount against the literal value 0, not R0

llvm-svn: 304284
2017-05-31 06:27:46 +00:00
Dylan McKay 48614d4a2c [AVR] CPIRdK can only work with r16..r31; Authored by Dr. Gergo Erdi
(https://github.com/avr-rust/rust/issues/50)

llvm-svn: 304283
2017-05-31 06:10:59 +00:00
Nemanja Ivanovic e597bd8230 [PowerPC] Eliminate integer compare instructions - vol. 2
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for bitwise logical operations in general purpose registers.
The idea is to keep the values in GPRs as long as possible - only
extracting them to a condition register bit when no further operations
are to be done.

Differential Revision: https://reviews.llvm.org/D31851

llvm-svn: 304282
2017-05-31 05:40:25 +00:00
Craig Topper 01197f686f [TableGen] Make one of RecordVal's constructors delegate to the other to reduce duplicate code.
llvm-svn: 304280
2017-05-31 05:12:33 +00:00
Zachary Turner 1b88f4f33a [ObjectYAML] Split CodeViewYAML into 3 pieces.
The code was a mess and disorganized due to the sheer amount
of it being in one file.  So I'm splitting this into three files.
One for CodeView types, one for CodeView symbols, and one for
CodeView debug subsections.  NFC.

llvm-svn: 304278
2017-05-31 04:17:13 +00:00
Gor Nishanov 2bc782d8da [coroutines] Call initializePass in coroutine pass constructors
Summary:

Fixes: https://bugs.llvm.org/show_bug.cgi?id=33226

Reviewers: chandlerc, davide, majnemer, dblaikie

Reviewed By: chandlerc

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33701

llvm-svn: 304277
2017-05-31 03:12:42 +00:00
George Burgess IV 0a7b989036 [CFLAA] Add missing break; note things are broken.
Thanks to Galina Kistanova for finding the missing break!

When trying to make a test for this, I realized our logic for handling
extractvalue/insertvalue/... is somewhat broken. This makes constructing
a test-case for this missing break nontrivial.

llvm-svn: 304275
2017-05-31 02:35:26 +00:00
Matthias Braun bcd4c68233 X86FrameLowering: No need to mark FP as live-in everywhere
The frame pointer (when used as frame pointer) is a reserved register.
We do not track liveness of reserved registers and hence do not need to
add them to the basic block livein lists.

llvm-svn: 304274
2017-05-31 02:11:10 +00:00
Daniel Berlin be3e7ba45e NewGVN: Fix PR 33185 by checking whether we need to recursively
generate a phi of ops, which we don't currently support.

llvm-svn: 304272
2017-05-31 01:47:32 +00:00
Daniel Berlin 71ff663e1b InstructionSimplify: Remove now-redundant reachability tests, as dominates() already does them
llvm-svn: 304270
2017-05-31 01:47:24 +00:00
Matthias Braun 05eeadbfd1 ARM: Fix cmpxchg O0 expansion
This is the equivalent of r304048 for ARM:

- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304267
2017-05-31 01:21:35 +00:00
Matthias Braun 0dba4e3509 ARM: Do not add reserved registers to block livein lists; NFC
llvm-svn: 304266
2017-05-31 01:21:30 +00:00
Eugene Zelenko 4e9736b1c9 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304265
2017-05-31 01:10:10 +00:00
Zachary Turner 083342bd34 [ObjectYAML] Clean up the CodeView headers a bit.
CodeViewYAML.h attempts to hide the details of many of the
CodeView yaml structures and types, but at the same time it
exposes the mapping traits for them to external users of the
header.

This patch just hides these in the implementation files so that
the interface is kept as simple as possible.

llvm-svn: 304263
2017-05-31 01:08:36 +00:00
Abderrazek Zaafrani 855411566b Add latency info for Exynos interleaved Load/Store instructions.
llvm-svn: 304259
2017-05-31 00:20:55 +00:00
Zachary Turner 7a75bc05b7 Try to fix build again.
llvm-svn: 304257
2017-05-30 23:57:46 +00:00
Zachary Turner 1e4d3693c4 [CodeView] Move CodeView symbol yaml logic to ObjectYAML.
This continues the effort to get the CodeView YAML parsing logic
into ObjectYAML.  After this patch, the only missing piece will
be the CodeView debug symbol subsections.

llvm-svn: 304256
2017-05-30 23:50:44 +00:00
Eric Beckmann 025e82bac1 Fix bug on Big-Endian system, due to reference to vector out of scope.
llvm-svn: 304255
2017-05-30 23:10:57 +00:00
Matthias Braun bc09894d6a MachineInstr: Do not skip dead def operands when printing.
This was introduced a long time ago in r86583 when regmask operands
didn't exist. Nowadays the behavior hurts more than it helps. This
removes it.

llvm-svn: 304254
2017-05-30 23:09:21 +00:00
Eric Beckmann ba395ef491 This patch should fix various clang warnings and a use of to_string
which isn't support before c++11.

llvm-svn: 304252
2017-05-30 22:29:06 +00:00
Tim Shen 0bd0aa8f07 [AntiDepBreaker] Revert r299124 and add a test.
Summary:
AntiDepBreaker intends to add all live-outs, including the implicit
CSRs, in StartBlock. r299124 was done without understanding that
intention.

Now with the live-ins propagated correctly (D32464), we can revert this change.

Reviewers: MatzeB, qcolombet

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33697

llvm-svn: 304251
2017-05-30 22:26:52 +00:00
Zachary Turner 9c1ba225a9 Try to fix build.
llvm-svn: 304249
2017-05-30 22:00:37 +00:00
Zachary Turner d427383cb8 [CodeView] Move CodeView YAML code to ObjectYAML.
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files.  Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.

By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.

This patch only adds support for CodeView type information
to ObjectYAML.  Subsequent patches will add support for
CodeView symbol information.

llvm-svn: 304248
2017-05-30 21:53:05 +00:00
Matthias Braun 5e394c3d6f TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.

While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.

llvm-svn: 304247
2017-05-30 21:36:41 +00:00
Tim Northover fb26d9a286 MIR: remove explicit "noVRegs" property.
We can infer this from the incoming MIR, so there's no reason to
represent it with a special flag.

llvm-svn: 304246
2017-05-30 21:28:57 +00:00
Xinliang David Li 74480adafd [PartialInlining] Shrinkwrap allocas with live range contained in outline region.
Differential Revision: http://reviews.llvm.org/D33618

llvm-svn: 304245
2017-05-30 21:22:18 +00:00
Quentin Colombet 73141d5b4b [Localizer] Don't trick to be smart for the insertion point
There is no guarantee that the first use of a constant that is traversed
is actually the first in the related basic block. Thus, if we use that
as the insertion point we may end up with definitions that don't
dominate there use.

llvm-svn: 304244
2017-05-30 20:53:06 +00:00
Matthew Simpson 646475a9bc [LV] Reapply r303763 with fix for PR33193
r303763 caused build failures in some out-of-tree tests due to an assertion in
TTI. The original patch updated cost estimates for induction variable update
instructions marked for scalarization. However, it didn't consider that the
incoming value of an induction variable phi node could be a cast instruction.
This caused queries for cast instruction costs with a mix of vector and scalar
types. This patch includes a fix for cast instructions and the test case from
PR33193.

The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>.

Reference: https://bugs.llvm.org/show_bug.cgi?id=33193
Original Differential Revision: https://reviews.llvm.org/D33457

llvm-svn: 304235
2017-05-30 19:55:57 +00:00
Benjamin Kramer c69fe9cc62 [Object] Remove unused field + constructor.
llvm-svn: 304233
2017-05-30 19:37:02 +00:00
Benjamin Kramer 14ea122e6e [Object] Fix pessimizing move.
Returning the Error by value triggers copy elision, the move is more
expensive. Clang rightfully warns about it.

llvm-svn: 304232
2017-05-30 19:36:58 +00:00
Vedant Kumar 87aefe9042 Revert "This patch closes PR28513: an optimization of multiplication by different constants. It's implemented on DAG combiner level."
This reverts commit r304209.

I think this change is responsible for a tablgen failure in stage2 builds:

  http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/2171/

I reproduced the failure locally (without ThinLTO), reverted the commit, rebuilt the stage1 clang, rebuilt the stage2 llvm-tblgen tool, and found that the crash disappears when the commit is reverted. Here is the stack trace:

FAILED: lib/Target/ARM/ARMGenRegisterBank.inc.tmp
cd /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM && /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes
/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
0  llvm-tblgen              0x0000000106fc9568 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  llvm-tblgen              0x0000000106fc9be6 SignalHandler(int) + 422
2  libsystem_platform.dylib 0x00000001076a7fba _sigtramp + 26
3  libsystem_platform.dylib 0x00007fff58deb468 _sigtramp + 1366570184
4  llvm-tblgen              0x0000000106e89cc7 llvm::CodeGenRegBank::getCompositeSubRegIndex(llvm::CodeGenSubRegIndex*, llvm::CodeGenSubRegIndex*) + 615
5  llvm-tblgen              0x0000000106e88be6 llvm::CodeGenRegister::computeSubRegs(llvm::CodeGenRegBank&) + 2182
6  llvm-tblgen              0x0000000106e8e9f0 llvm::CodeGenRegBank::CodeGenRegBank(llvm::RecordKeeper&) + 2192
7  llvm-tblgen              0x0000000106f384a1 llvm::EmitRegisterBank(llvm::RecordKeeper&, llvm::raw_ostream&) + 65
8  llvm-tblgen              0x0000000106f72c64 (anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&) + 1172
9  llvm-tblgen              0x0000000106fcb15f llvm::TableGenMain(char*, bool (*)(llvm::raw_ostream&, llvm::RecordKeeper&)) + 3599
10 llvm-tblgen              0x0000000106f727a6 main + 134
11 libdyld.dylib            0x000000010733c6a5 start + 1
Stack dump:
0.      Program arguments: /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
/bin/sh: line 1: 41986 Segmentation fault: 11  /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz
-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp

llvm-svn: 304231
2017-05-30 19:25:22 +00:00
Galina Kistanova 8c1e2f9108 Added missing break.
llvm-svn: 304230
2017-05-30 19:02:49 +00:00
Keno Fischer 3fa5db4c04 Revert "[Cloning] Take another pass at properly cloning debug info"
At least one build bot is complaining. Will investigate after lunch.

llvm-svn: 304228
2017-05-30 18:56:26 +00:00
Matthias Braun 700603555a ARM: Add missing flags to TBB_[JH]T pseudo instructions
NFC except for calming down the machine verifier in some cases.

llvm-svn: 304227
2017-05-30 18:52:33 +00:00
Keno Fischer 945dc1d2d1 [Cloning] Take another pass at properly cloning debug info
Summary:
In rL302576, DISubprograms gained the constraint that a !dbg attachments to functions must
have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support
was adjusted to attempt to enforce this invariant during cloning. However, there
were several problems with the implementation. Part of these were fixed in rL304079.
However, there was a more fundamental problem with these changes, namely that it
bypasses the matadata value map, causing the cloned metadata to be a mix of metadata
pointing to the new suprogram (where manual code was added to fix those up) and the
old suprogram (where this was not the case). This mismatch could cause a number of
different assertion failures in the DWARF emitter. Some of these are given at
https://github.com/JuliaLang/julia/issues/22069, but some others have been observed
as well. Attempt to rectify this by partially reverting the manual DI metadata fixup,
and instead using the standard value map approach. To retain the desired semantics
of not duplicating the compilation unit and inlined subprograms, explicitly freeze
these in the value map.

Reviewers: dblaikie, aprantl, GorNishanov, echristo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33655

llvm-svn: 304226
2017-05-30 18:28:30 +00:00
Eric Beckmann 72fb6a87fb Adding parsing ability for .res file.
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33566

llvm-svn: 304225
2017-05-30 18:19:06 +00:00
Krzysztof Parzyszek ef58017b35 [Hexagon] Improve code generation for 32x32-bit multiplication
For multiplications of 64-bit values (giving 64-bit result), detect
cases where the arguments are sign-extended 32-bit values, on a per-
operand basis. This will allow few patterns to match a wider variety
of combinations in which extensions can occur.

llvm-svn: 304223
2017-05-30 17:47:51 +00:00
Zachary Turner 591312c5c1 [CodeView] Add more DebugSubsection implementations.
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.

llvm-svn: 304222
2017-05-30 17:13:33 +00:00
Craig Topper 5fd588be34 [SelectionDAG] Remove special case for ISD::FPOWI from the strict FP intrinsic handling.
This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary.

llvm-svn: 304221
2017-05-30 17:12:18 +00:00
Stanislav Mekhanoshin 56ea488d8b [AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.

Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.

Differential Revision: https://reviews.llvm.org/D33583

llvm-svn: 304219
2017-05-30 16:49:24 +00:00
Zachary Turner 8c099fe06e [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

llvm-svn: 304218
2017-05-30 16:36:15 +00:00
Mark Searles 00ce96f6ee [AMDGPU] Require waitcnt before barrier for all targets; adjust tests.
Differential Revision: https://reviews.llvm.org/D33576

llvm-svn: 304217
2017-05-30 16:22:43 +00:00
Craig Topper f6d4dc5b4a [SelectionDAG] Set ISD::FPOWI to Expand by default
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
2017-05-30 15:27:55 +00:00
Andrew V. Tischenko 8b04826663 This patch closes PR28513: an optimization of multiplication by different constants.
It's implemented on DAG combiner level.

llvm-svn: 304209
2017-05-30 13:00:44 +00:00
Max Kazantsev d8fe3eb9cb [SCEV][NFC] Remove redundant params from isAvailableAtLoopEntry
Params DT and LI are redundant, because these values are contained in fields anyways.

Differential Revision: https://reviews.llvm.org/D33668

llvm-svn: 304204
2017-05-30 10:54:58 +00:00
Ulrich Weigand 3f484e68cc [SystemZ] Add decimal floating-point instructions
This adds assembler / disassembler support for the decimal
floating-point instructions.  Since LLVM does not yet have
support for decimal float types, these cannot be used for
codegen at this point.

llvm-svn: 304203
2017-05-30 10:15:16 +00:00
Ulrich Weigand f32adf6944 [SystemZ] Add hexadecimal floating-point instructions
This adds assembler / disassembler support for the hexadecimal
floating-point instructions.  Since the Linux ABI does not use
any hex float data types, these are not useful for codegen.

llvm-svn: 304202
2017-05-30 10:13:23 +00:00
Zoran Jovanovic 375b60de74 [mips] Expansion of LI.S and LI.D
Author: smaksimovic
Reviewers: dsanders sdardis
Introduces LI.S and LI.D pseudo instructions with floating point operands.
Differential Revision: https://reviews.llvm.org/D14390

llvm-svn: 304198
2017-05-30 09:33:43 +00:00
Kristof Beyls 2af1e90eb2 Fix PR33031: correct the estimate of maximum offset for instructions spilling/filling the stack.
llvm-svn: 304196
2017-05-30 06:58:41 +00:00
Daniel Berlin 2aa5dc1589 NewGVN: Compute hash value of expression on demand and use it in inequality testing.
llvm-svn: 304195
2017-05-30 06:58:18 +00:00
Daniel Berlin c8ed40400c NewGVN: Fix PR33194, memory corruption by putting temporary instructions in tables sometimes.
llvm-svn: 304194
2017-05-30 06:42:29 +00:00
Galina Kistanova 5c4f1a9b02 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304187
2017-05-30 03:30:34 +00:00
Joerg Sonnenberger 9375a25342 Revert r303763, results in asserts i.e. while building Ruby.
llvm-svn: 304179
2017-05-29 22:52:17 +00:00
Craig Topper 638b1021bf [TableGen] Use StringMap instead of DenseMap<StringRef> to unique CodeInit and StringInit objects. Override the allocator to keep using the BumpPtrAllocator. NFCI
StringMap is better suited to mapping strings than a DenseMap.

llvm-svn: 304178
2017-05-29 21:49:37 +00:00
Craig Topper 481ff7087f [TableGen] Introduce DagInit::getArgs that returns an ArrayRef. Use it to fix 80 column violations in arg_begin/arg_end. Remove DagInit::args and use getArgs instead. NFC
llvm-svn: 304177
2017-05-29 21:49:34 +00:00
Benjamin Kramer 74de08031f [ManagedStatic] Avoid putting function pointers in template args.
This is super awkward, but GCC doesn't let us have template visible when
an argument is an inline function and -fvisibility-inlines-hidden is
used.

llvm-svn: 304175
2017-05-29 20:56:27 +00:00
Davide Italiano af66659d6b [GlobalIsel] Fix a warning with GCC 7 -Wpedantic. NFCI.
llvm-svn: 304174
2017-05-29 20:13:22 +00:00
Benjamin Kramer 2a441a52df Try to work around MSVC being buggy. Attempt #1.
error C2971: 'llvm::ManagedStatic': template parameter 'Creator': 'CreateDefaultTimerGroup': a variable with non-static storage duration cannot be used as a non-type argument

llvm-svn: 304157
2017-05-29 14:28:04 +00:00
Benjamin Kramer 351779e972 [Timer] Move DefaultTimerGroup into a ManagedStatic.
This used to be just leaked. r295370 made it use magic statics. This adds
a global destructor, which is something we'd like to avoid. It also creates
a weird situation where the mutex used by TimerGroup is re-created during
global shutdown and leaked.

Using a ManagedStatic here is also subtle as it relies on the mutex
inside of ManagedStatic to be recursive. I've added a test for that
in a previous change.

llvm-svn: 304156
2017-05-29 14:05:29 +00:00
Sanjay Patel 51152a3727 [DAGCombiner] fix load narrowing transform to exclude loads with extension
The extending load possibility was missed in:
https://reviews.llvm.org/rL304072

We might want to handle this cases as a follow-up, but bailing out for now
to avoid miscompiling.

llvm-svn: 304153
2017-05-29 13:24:58 +00:00
Jonas Paulsson fe0c0935c8 [SystemZ] Improve buildVector() in SystemZISelLowering.cpp.
Use VLREP when inserting one or more loads into a vector. This is more
efficient than to first load and then use a VLVGP.

Review: Ulrich Weigand
llvm-svn: 304152
2017-05-29 13:22:23 +00:00
Nikolai Bozhenov 82f0801c1b [Nios2] Target registration
Reviewers: craig.topper, hfinkel, joerg, lattner, zvi

Reviewed By: craig.topper

Subscribers: oren_ben_simhon, igorb, belickim, tvvikram, mgorny, llvm-commits, pavel.v.chupin, DavidKreitzer

Differential Revision: https://reviews.llvm.org/D32669
Patch by AndreiGrischenko <andrei.l.grischenko@intel.com>

llvm-svn: 304144
2017-05-29 09:48:30 +00:00
Diana Picus 0c05cce4e0 [ARM] GlobalISel: Extract helper. NFCI.
Create a helper to deal with the common code for merging incoming values
together after they've been split during call lowering. There's likely
more stuff that can be commoned up here, but we'll leave that for later.

llvm-svn: 304143
2017-05-29 09:09:54 +00:00
Hiroshi Inoue ac9cd3080d [trivial] fix a typo in comment, NFC
llvm-svn: 304139
2017-05-29 08:37:42 +00:00
Diana Picus bf4aed2c38 [ARM] GlobalISel: Support array returns
These are a bit rare in practice, but they don't require anything
special compared to array parameters, so support them as well.

llvm-svn: 304137
2017-05-29 08:19:19 +00:00
Hiroshi Inoue e3c14ebbfa [PPC] Fix assertion failure during binary encoding with -mcpu=pwr9
Summary
clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding.
The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate.

This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test).
The fix is from Nemanja Ivanovic @nemanjai.

Differential Revision: https://reviews.llvm.org/D33482

llvm-svn: 304133
2017-05-29 07:12:39 +00:00
Diana Picus 8cca8cb0ce [ARM] GlobalISel: Support array parameters/arguments
Clang coerces structs into arrays, so it's a good idea to support them.
Most of the support boils down to getting the splitToValueTypes helper
to actually split types. We then use G_INSERT/G_EXTRACT to deal with the
parts.

llvm-svn: 304132
2017-05-29 07:01:52 +00:00
Mehdi Amini 96ab48f9da DebugInfo: Include .dwo file name when hashing multiple CUs in a single file
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.

An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.

But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)

Reapply of r304119 after adding a triple to the test and moving it
to the X86 directory.

llvm-svn: 304130
2017-05-29 06:32:34 +00:00
Mehdi Amini 4181205563 DebugInfo: Omit an empty CU when a subprogram was moved into its use
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.

Reapply of r304111 after adding a triple to the test and moving it
to the X86 directory.

llvm-svn: 304129
2017-05-29 06:25:30 +00:00
Tobias Grosser 8cf785f6b1 Revert "[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle"
The reverted change introdued assertions ala:

"MachineBasicBlock::succ_iterator
llvm::MachineBasicBlock::removeSuccessor(succ_iterator, bool): Assertion
`I != Successors.end() && "Not a current successor!"'

Mikael, the original committer, wrote me that he is working on a fix, but that
it likely will take some time to get this resolved. As this bug is one of the
last two issues that keep the AOSP buildbot from turning green, I revert the
original commit r302876.

I am looking forward to see this recommitted after the assertion has been
resolved.

llvm-svn: 304128
2017-05-29 06:12:18 +00:00
Mehdi Amini e161ced16a Revert "DebugInfo: Omit an empty CU when a subprogram was moved into its use"
This reverts commit r304111.
GreenDragon is broken.

llvm-svn: 304126
2017-05-29 05:17:57 +00:00
Mehdi Amini d8056bb7d8 Revert "DebugInfo: Include .dwo file name when hashing multiple CUs in a single file"
This reverts commit r304119 and r304118. GreenDragon is broken.

llvm-svn: 304125
2017-05-29 05:17:54 +00:00
Zachary Turner df1832cf86 Resubmit "[X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables."
This was reverted due to buildbot breakages and I was not familiar
with this code to investigate it.  But while trying to get a
useful backtrace for the author, it turns out the fix was very
obvious.  Resubmitting this patch as is, and will submit the
fix in a followup so that the fix is not hidden in the larger
CL.

llvm-svn: 304122
2017-05-29 02:19:37 +00:00
Zachary Turner 5b199be769 Revert "[X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables."
This reverts commit 28cb1003507f287726f43c771024a1dc102c45fe as well
as all subsequent followups.  llvm-tblgen currently segfaults with
this change, and it seems it has been broken on the bots all
day with no fixes in preparation.  See, for example:

http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/

llvm-svn: 304121
2017-05-29 01:48:53 +00:00
Galina Kistanova 229c9c1159 Disabled implicit-fallthrough warnings for ConvertUTF.cpp.
ConvertUTF.cpp has a little dependency on LLVM, and since the code extensively uses fall-through switches,
I prefer disabling the warning for the whole file, rather than adding attributes for each case.

llvm-svn: 304120
2017-05-29 01:34:26 +00:00
David Blaikie ce0c205813 DebugInfo: Include .dwo file name when hashing multiple CUs in a single file
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.

An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.

But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)

llvm-svn: 304119
2017-05-29 00:48:45 +00:00
Saleem Abdulrasool f122423ace Support: adjust the default obj format for wasm
WebAssemly uses a custom object file format.  For the wasm targets,
default to the `Wasm` object file format.

llvm-svn: 304117
2017-05-29 00:14:57 +00:00
Dylan McKay 74fc1ce0c2 [AVR] Remove SREG from CPI's Uses; authored by Florian Zeitz
Summary: CPI does not read the status register, but only writes it.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33223

llvm-svn: 304116
2017-05-29 00:10:14 +00:00
Erik Pilkington de83eea576 [ItaniumDemangle] Fix a exponential string copying bug
This is a port of libcxxabi's r304113.

llvm-svn: 304114
2017-05-28 23:24:52 +00:00
NAKAMURA Takumi a288ec412f Prune trailing whitespace. (To regenerate makefiles)
llvm-svn: 304112
2017-05-28 22:54:25 +00:00
David Blaikie f2f898a044 DebugInfo: Omit an empty CU when a subprogram was moved into its use
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.

llvm-svn: 304111
2017-05-28 22:51:37 +00:00
Geoff Berry 2739ebafb6 [AArch64][Falkor] Combine sched details files into one. NFC.
llvm-svn: 304109
2017-05-28 22:20:44 +00:00
Geoff Berry b542fb3817 [AArch64][Falkor] Fix some sched details.
- Remove all uses of base sched model entries and set them all to
  Unsupported so all the opcodes are described in
  AArch64SchedFalkorDetails.td.
- Remove entries for unsupported half-float opcodes.
- Remove entries for unsupported LSE extension opcodes.
- Add entry for MOVbaseTLS (and set Sched in base td file entry to
  WriteSys) and a few other pseudo ops.
- Fix a few FP load/store with reg offset entries to use the LSLfast
  predicates.
- Add Q size BIF/BIT/BSL entries.
- Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entires.
- Fix pre/post increment address register latency (this operand is
  always dest 0).
- Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries.
- Fix XYZ resource over usage on LD[1-4] opcodes.

llvm-svn: 304108
2017-05-28 21:48:31 +00:00
Benjamin Kramer 9d8ed2653f [InstrProf] Use more ArrayRef/StringRef.
No functional change intended.

llvm-svn: 304089
2017-05-28 13:23:02 +00:00
Ayman Musa d9f1fe43a8 [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables.
X86 backend holds huge tables in order to map between the register and memory forms of each instruction.
This TableGen Backend automatically generated all these tables with the appropriate flags for each entry.

Differential Revision: https://reviews.llvm.org/D32684

llvm-svn: 304088
2017-05-28 12:55:36 +00:00
Ayman Musa 0b4f97d5e9 [X86] Adding FoldGenRegForm helper field (for memory folding tables tableGen backend) to X86Inst class and set its value for the relevant instructions.
Some register-register instructions can be encoded in 2 different ways, this happens when 2 register operands can be folded (separately). 
For example if we look at the MOV8rr and MOV8rr_REV, both instructions perform exactly the same operation, but are encoded differently. Here is the relevant information about these instructions from Intel's 64-ia-32-architectures-software-developer-manual:

Opcode  Instruction  Op/En  64-Bit Mode  Compat/Leg Mode  Description
8A /r   MOV r8,r/m8  RM     Valid        Valid            Move r/m8 to r8.
88 /r   MOV r/m8,r8  MR     Valid        Valid            Move r8 to r/m8.
Here we can see that in order to enable the folding of the output and input registers, we had to define 2 "encodings", and as a result we got 2 move 8-bit register-register instructions.

In the X86 backend, we define both of these instructions, usually one has a regular name (MOV8rr) while the other has "_REV" suffix (MOV8rr_REV), must be marked with isCodeGenOnly flag and is not emitted from CodeGen.

Automatically generating the memory folding tables relies on matching encodings of instructions, but in these cases where we want to map both memory forms of the mov 8-bit (MOV8rm & MOV8mr) to MOV8rr (not to MOV8rr_REV) we have to somehow point from the MOV8rr_REV to the "regular" appropriate instruction which in this case is MOV8rr.

This field enable this "pointing" mechanism - which is used in the TableGen backend for generating memory folding tables.

Differential Revision: https://reviews.llvm.org/D32683

llvm-svn: 304087
2017-05-28 12:39:37 +00:00
Oren Ben Simhon f3aab2fa33 [X86] Fixing VPOPCNTDQ feature set lookup.
llvm-svn: 304086
2017-05-28 11:26:11 +00:00
Gor Nishanov ffbeb22b6f Cloning: Fix debug info cloning
Summary:
I believe https://reviews.llvm.org/rL302576 introduced two bugs:

1) it produces duplicate distinct variables for every: dbg.value describing the same variable.
    To fix the problme I switched form getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable *Var) {
    return DILocalVariable::getDistinct(

2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in:

 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
#
(Edit: reproducer added)

Here how https://reviews.llvm.org/rL302576 broke coroutine debug info.
Coroutine body of the original function is split into several parts by cloning and removing unneeded code.
All parts describe the original function and variables present in the original function.

For a simple case, prior to Split, original function has these two blocks:

```
PostSpill:                                        ; preds = %AllocaSpillBB
  call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13
  store i32 %x, i32* %x.addr, align 4
  ...
and

sw.epilog:                                        ; preds = %sw.bb
  %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20
  %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20
  call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13

!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

```

Note that in two blocks different expression represent the same original user variable X.

Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in:

```
define i8* @f(i32 %x) #0 !dbg !6 {
  ...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
...
!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2)
!28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11)
```
After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable.
This was causing asserts in VerifyDebugInfo and AssemblyPrinter.

Example:

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
!29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
```

Second problem:

Prior to rL302576, all clones were described by DISubprogram referring to original function.

```
define i8* @f(i32 %x) #0 !dbg !6 {
...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
```

After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone.

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
```

I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not:

```
 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
```
After this fix, behavior (with respect to coroutines) reverts to exactly as it was before and therefore making them debuggable again, or even more importantly, compilable, with "-g"

Reviewers: dblaikie, echristo, aprantl

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33614

llvm-svn: 304079
2017-05-27 19:41:09 +00:00
George Rimar a25d329b33 Recommit "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of uninitialized variable.

Original commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304078
2017-05-27 18:10:23 +00:00
Craig Topper a568c72b7d [TableGen] Prevent DagInit from leaking its Args and ArgNames when they exceed the size of the SmallVector.
DagInits are allocated in a BumpPtrAllocator so they are never destructed. This means the destructor for the SmallVector never runs.

To fix this we now allocate the vectors in the BumpPtrAllocator too using TrailingObjects.

llvm-svn: 304077
2017-05-27 17:36:50 +00:00
Tobias Grosser e3684d0b84 [SCEV] Assume parameters coming from function calls contain IVs
The optimistic delinearization implemented in LLVM detects array sizes by
looking for non-linear products between parameters and induction variables.
In OpenCL code, such products often look like:

  A[get_global_id(0) * N + get_global_id(1)]

Hence, the IV is hidden in the get_global_id() call and consequently
delinearization would fail as no induction variable is available that helps
us to identify N as array size parameter.

We now use a very simple heuristic to change this. We assume that each parameter
that comes directly from a function call is a hidden induction variable. As
a result, we can delinearize the access above to:

  A[get_global_id(0)][get_global_id(1]

llvm-svn: 304073
2017-05-27 15:17:49 +00:00
Sanjay Patel 33f4a97287 [DAGCombiner] use narrow load to avoid vector extract
If we have (extract_subvector(load wide vector)) with no other users, 
that can just be (load narrow vector). This is intentionally conservative.
Follow-ups may loosen the one-use constraint to account for the extract cost
or just remove the one-use check.

The memop chain updating is based on code that already exists multiple times
in x86 lowering, so that should be pulled into a helper function as a follow-up.

Background: this is a potential improvement noticed via regressions caused by
making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see 
comments in D33137).

Differential Revision: https://reviews.llvm.org/D33578

llvm-svn: 304072
2017-05-27 14:07:03 +00:00
Craig Topper b8ff353fc6 [TableGen] Remove all the static vectors named TheActualPool.
These used to hold std::unique_ptrs that managed the allocation for the various *Init object so that they would be deleted on exit. Everything is allocated in a BumpPtrAllocator name so there is no reason for these to still exist.

llvm-svn: 304066
2017-05-27 06:14:12 +00:00
Gor Nishanov 9c6ac6138d [coroutines] Define getPassName() for coroutine passes
Reviewers: GorNishanov

Reviewed By: GorNishanov

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33622

llvm-svn: 304065
2017-05-27 05:54:30 +00:00
Vitaly Buka a637489ef1 [PartialInlining] Replace delete with unique_ptr in computeCallsiteToProfCountMap
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D33220

llvm-svn: 304064
2017-05-27 05:32:09 +00:00
Matthias Braun 88c8c9847d AArch64/PEI: Do not add reserved regs to liveins
We do not track liveness for reserved registers. It is unnecessary to
add them to block livein lists.

llvm-svn: 304059
2017-05-27 03:38:02 +00:00
Keno Fischer 090f1959c1 [SCEVExpander] Try harder to avoid introducing inttoptr
Summary:
This fixes introduction of an incorrect inttoptr/ptrtoint pair in
the included test case which makes use of non-integral pointers. I
suspect there are more cases like this left, but this takes care of
the one I was seeing at the moment.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D33129

llvm-svn: 304058
2017-05-27 03:22:55 +00:00
Matthias Braun 868bbd4022 ScheduleDAGInstrs: Fix fixupKills()
Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code
and fixes a bug where the CSR registers in return blocks where missed
leading to invalid kill flags. Also remove the unnecessary rule that we
wouldn't set kill flags on tied operands.

No tests as I have an upcoming commit improving MachineVerifier checks
to catch these cases in multiple existing lit tests.

llvm-svn: 304055
2017-05-27 02:50:50 +00:00
Erik Pilkington cbc82b3ca9 [Demangler] copy changes made in libcxxabi's r303718 to ItaniumDemangle
llvm-svn: 304053
2017-05-27 01:48:34 +00:00
Quentin Colombet 7a43eddf28 [AArch64][GlobalISel] Add the Localizer pass for the O0 pipeline
This should fix most of the issue we have right now with constants being
spilled all over the place.

llvm-svn: 304052
2017-05-27 01:34:07 +00:00
Quentin Colombet bece442bd8 [GlobalISel] Add a localizer pass for target to use
This reverts commit r299287 plus clean-ups.

The localizer pass is a helper pass that could be run at O0 in the GISel
pipeline to work around the deficiency of the fast register allocator.
It basically shortens the live-ranges of the constants so that the
allocator does not spill all over the place.

Long term fix would be to make the greedy allocator fast.

llvm-svn: 304051
2017-05-27 01:34:00 +00:00
Wei Mi 5bbb5aafc1 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit is to fix a bug about ExtractValue and InsertValue ops. For those
ops, some varargs inside GVN::Expression are not value numbers but raw index
numbers. It is wrong to do phi-translate for raw index numbers, and the fix is
to stop doing that.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {
  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 304050
2017-05-27 00:54:19 +00:00
Matthias Braun 24dc63a9b9 BranchRelaxation: computeLiveIns() after creating new block
One case in BranchRelaxation did not compute liveins after creating a
new block. This is catched by existing tests with an upcoming commit
that will improve MachineVerifier checking of livein lists.

llvm-svn: 304049
2017-05-27 00:53:48 +00:00
Matthias Braun b4f74224ff AArch64: Fix cmpxchg O0 expansion
- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304048
2017-05-26 23:48:59 +00:00
Peter Collingbourne 2c26a18501 Bitcode: Remove some dead code. Spotted by Teresa.
Differential Revision: https://reviews.llvm.org/D33609

llvm-svn: 304046
2017-05-26 23:21:40 +00:00
Craig Topper 348314dfb8 [InstSimplify] Push commuted op checks for and/or of icmp further down to avoid duplicate work
Previously, we called simplifyPossiblyCastedAndOrOfICmps twice with the operands commuted, but the call to simplifyAndOrOfICmpsWithConstants further down already handles commuting and doesn't need to be called both ways.

This patch pushes double calls further down to just the individual routines that need to be called twice.

Differential Revision: https://reviews.llvm.org/D33603

llvm-svn: 304044
2017-05-26 22:42:34 +00:00
Alexei Starovoitov 3c585d3a8f [bpf] disallow global_addr+off folding
Wrong assembly code is generated for a simple program with
clang. If clang only produces IR and llc is used
for IR lowering and optimization, correct assembly
code is generated.

The main reason is that clang feeds default Reloc::Static
to llvm and llc feeds no RelocMode to llvm, where
for llc case, BPF backend picks up Reloc::PIC_ mode.
This leads different IR lowering behavior and clang
permits global_addr+off folding while llc doesn't.

This patch introduces isOffsetFoldingLegal function into
BPF backend and the function always return false.
This will make clang and llc behave the same for
the lowering.

Bug https://bugs.llvm.org//show_bug.cgi?id=33183
has more detailed explanation.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 304043
2017-05-26 22:32:41 +00:00
Davide Italiano ef9bfe9531 [Mips] Placate GCC's -Wmisleading-indentation. NFCI.
llvm-svn: 304041
2017-05-26 21:56:19 +00:00
Davide Italiano d4db116af8 [lib/LTO] Don't reinvent the code for switching linkage.
Differential Revision:  https://reviews.llvm.org/D33582

llvm-svn: 304040
2017-05-26 21:56:14 +00:00
Matthias Braun ac4307c41e LivePhysRegs: Rework constructor + documentation; NFC
- Take reference instead of pointer to a TRI that cannot be nullptr.
- Improve documentation comments.

llvm-svn: 304038
2017-05-26 21:51:00 +00:00
Matthias Braun 61cf1a9e85 LivePhysRegs: Add default for removeRegsInMask(Clobbers); NFC
llvm-svn: 304036
2017-05-26 21:50:51 +00:00
Matthias Braun d8f4e99933 MachineVerifier: Remove unused set; NFC
llvm-svn: 304035
2017-05-26 21:50:48 +00:00
Sumanth Gundapaneni a6cf2fd5ec [Hexagon] Cleanup of unused function isCalleeSaveReg (NFC)
llvm-svn: 304034
2017-05-26 21:09:54 +00:00
Konstantin Zhuravlyov b2ff8dfea0 Resubmit r303859 with test fixed.
[AMDGPU] add intrinsic for s_getpc

Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.

Patch by Tim Corringham

llvm-svn: 304031
2017-05-26 20:38:26 +00:00
Benjamin Kramer debb3c35e0 Make helper functions static. NFC.
llvm-svn: 304029
2017-05-26 20:09:00 +00:00
Frederich Munch 8c3735e597 Fix the ManagedStatic list ordering when using DynamicLibrary::addPermanentLibrary.
Summary:
r295737 included a fix for leaking libraries loaded via. DynamicLibrary::addPermanentLibrary.
This created a problem where static constructors in a library could insert llvm::ManagedStatic objects before DynamicLibrary would register it's own ManagedStatic, meaning a crash could occur at shutdown.

r301562 exasperated this problem by cleaning up the DynamicLibrary ManagedStatic during llvm_shutdown.

Reviewers: v.g.vassilev, lhames, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33581

llvm-svn: 304027
2017-05-26 19:43:23 +00:00
Craig Topper 9bce1ad232 [InstSimplify] Move a variable declaration to make simplifyAndOfICmps look more like simplifyOrOfICmps. NFC
llvm-svn: 304023
2017-05-26 19:04:02 +00:00
Craig Topper c8bebb1e84 [InstSimplify] Use commutable matchers to shorten some code
This code was replicated two additional times to handle commuted cases, but I think a commutable matcher can take care of it.

Differential Revision: https://reviews.llvm.org/D33585

llvm-svn: 304022
2017-05-26 19:03:59 +00:00
Craig Topper 1da22c3244 [InstSimplify] Use m_APInt instead of m_ConstantInt in ((V + N) & C1) | (V & C2) handling in order to support splat vectors.
The tests here are have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations

Differential Revision: https://reviews.llvm.org/D33599

llvm-svn: 304021
2017-05-26 19:03:53 +00:00
David Blaikie 07963bd1d1 DebugInfo: Do not emit empty CUs
Consistent with GCC and addresses a shortcoming with ThinLTO where many
imported CUs may end up being empty (because the functions imported from
them either ended up not being used (and were then discarded, since
they're imported as available_externally) or optimized away entirely).

Test cases previously testing empty CUs (either intentionally, or
because they didn't need anything more complicated) had a trivial 'int'
or similar basic type added to their retained types list.

This is a first order approximation - a deeper implementation could do
things like:

1) Be more lazy about construction of the CU - for example if two CUs
containing a single identical retained type are linked together, with
this change one of the two CUs will be produced but empty (since a
duplicate type won't be produced).

2) Go further and invert all the CU links the same way the subprogram
link is inverted - keep named CU lists of retained types, macros, etc,
and have those link back to the CU. Then if they're emitted, the CU is
emitted, but never otherwise - this would allow the metadata itself to
be dropped earlier too, though it seems unlikely that's an important
optimization as there shouldn't be many CUs relative to the number of
other entities.

llvm-svn: 304020
2017-05-26 18:52:56 +00:00
Peter Collingbourne 7730b24448 PMB: Run the whole-program-devirt pass during LTO at --lto-O0.
The whole-program-devirt pass needs to run at -O0 because only it
knows about the llvm.type.checked.load intrinsic: it needs to both
lower the intrinsic itself and handle it in the summary.

Differential Revision: https://reviews.llvm.org/D33571

llvm-svn: 304019
2017-05-26 18:27:13 +00:00
Craig Topper d45185f231 [InstCombine] Pass the DominatorTree, AssumptionCache, and context instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero
Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent.

Differential Revision: https://reviews.llvm.org/D33567

llvm-svn: 304018
2017-05-26 18:23:57 +00:00
Dmitry Preobrazhensky 6a2431df0b [AMDGPU][MC][GFX9] Corrected encoding of flat_scratch* for SDWA opcodes
See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171

Reviewers: Sam Kolton

Differential Revision: https://reviews.llvm.org/D33553

llvm-svn: 304015
2017-05-26 18:01:29 +00:00
George Rimar 1f9cab6b1c Revert r304002 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Revert it again. Now another bot unhappy: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/8750

llvm-svn: 304011
2017-05-26 17:36:23 +00:00
David Blaikie 7f2b717b52 DebugInfo: Don't include locations for debug-having code inlined into nodebug functions
This produced 'strange' DWARF anyway - the CU would have no ranges (or
at least not a range including the inlined code) nor any subprogram or
inlined_subroutine - yet the line table would have entries for these
instructions.

(this actually becomes more relevant with changes coming after this,
where a CU without any contents will be omitted entirely - so there
would be no line table to put this on anyway)

llvm-svn: 304004
2017-05-26 17:05:15 +00:00
Tom Stellard dde28a8c92 AMDGPU/GlobalISel: Mark 32-bit float constants as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33212

llvm-svn: 304003
2017-05-26 16:40:03 +00:00
George Rimar bc223c63cc [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304002
2017-05-26 16:26:18 +00:00
Matthias Braun eec1f3672a LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI
Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal
addPristines() function must be called on an empty set or it may
accidentally reset saved registers.

- addLiveOutsNoPristines() needs to add callee saved registers that are
  actually saved and restored somewhere to the set (they are not
  pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().

This fixes the problem from D32156.

Differential Revision: https://reviews.llvm.org/D32464

llvm-svn: 304001
2017-05-26 16:23:08 +00:00
Sam Kolton 363f47a2c7 [AMDGPU] SDWA: add disassembler support for GFX9
Summary: Added decoder methods and tests

Reviewers: vpykhtin, artem.tamazov, dp

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D33545

llvm-svn: 303999
2017-05-26 15:52:00 +00:00
Sanjay Patel ec13ebf2c8 [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790)
In the best case:
extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN
...we kill all of the extract/concat and just have narrow binops remaining.

If only one of the binop operands is amenable, this transform is still
worthwhile because we kill some of the extract/concat.

Optional bitcasting makes the code more complicated, but there doesn't
seem to be a way to avoid that.

The TODO about extending to more than bitwise logic is there because we really
will regress several x86 tests including madd, psad, and even a plain
integer-multiply-by-2 or shift-left-by-1. I don't think there's anything
fundamentally wrong with this patch that would cause those regressions; those
folds are just missing or brittle.

If we extend to more binops, I found that this patch will fire on at least one
non-x86 regression test. There's an ARM NEON test in
test/CodeGen/ARM/coalesce-subregs.ll with a pattern like:

            t5: v2f32 = vector_shuffle<0,3> t2, t4
          t6: v1i64 = bitcast t5
          t8: v1i64 = BUILD_VECTOR Constant:i64<0>
        t9: v2i64 = concat_vectors t6, t8
      t10: v4f32 = bitcast t9
    t12: v4f32 = fmul t11, t10
  t13: v2i64 = bitcast t12
t16: v1i64 = extract_subvector t13, Constant:i32<0>

There was no functional change in the codegen from this transform from what I
could see though.

For the x86 test changes:

1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case,
   but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops,
   there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op.
   SSE/AVX2/AXV512 are not affected which is expected because only AVX1 has the extract/concat
   ops to match the pattern.
2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract.
   Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026
3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT.
4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction
   count by one in each case by eliminating two insert/extract while adding one narrower logic op.

https://bugs.llvm.org/show_bug.cgi?id=32790

Differential Revision: https://reviews.llvm.org/D33137

llvm-svn: 303997
2017-05-26 15:33:18 +00:00
Nirav Dave 689709c928 [DAG] Move legal type checks in store merge to be checked only
on non-legal cases. NFC.

llvm-svn: 303994
2017-05-26 14:37:27 +00:00
John Brawn 9009d2905d [ARM] Fix lowering of misaligned memcpy/memset
Currently getOptimalMemOpType returns i32 for large enough sizes without
checking for alignment, leading to poor code generation when misaligned accesses
aren't permitted as we generate a word store then later split it up into byte
stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for
memset we splat the memset value into a word then immediately split it up
again.

Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
to use, but also fix a bug there where it wasn't correctly checking if
misaligned memory accesses are allowed.

Differential Revision: https://reviews.llvm.org/D33442

llvm-svn: 303990
2017-05-26 13:59:12 +00:00
Andrew V. Tischenko fdb264e263 The fix for PR22004: X86AsmParser.cpp asserts: OperandStack.size() > 1 && "Too few operands."
llvm-svn: 303985
2017-05-26 13:23:34 +00:00
George Rimar a8403a64ea Revert "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Broked BB again:

TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED
...
LLVM ERROR: Section was outside of section table.

llvm-svn: 303984
2017-05-26 13:20:09 +00:00
George Rimar 655b7b63f6 Recommit r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of test compilation.

Initial commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303983
2017-05-26 13:13:50 +00:00
George Rimar 7d5f12185a Revert r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
It failed BB.

llvm-svn: 303981
2017-05-26 12:53:41 +00:00
Nirav Dave 6ff50bf242 Fix signedness of constant. NFC.
llvm-svn: 303980
2017-05-26 12:53:10 +00:00
George Rimar 732f268aa0 [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303978
2017-05-26 12:46:41 +00:00
Max Kazantsev 41450329f7 Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on
many non-X86 configs with this patch. The reason of this is that the patch
makes a correctless fix that changes optimizer's behavior for this test.
Without the change, LSR was making an overconfident simplification basing on a
wrong SCEV. Apparently it did not need the IV analysis to do this. With the
change, it chose a different way to simplify (that wasn't so confident), and
this way required the IV analysis. Now, following the right execution path,
LSR tries to make a transformation relying on IV Users analysis. This analysis
is target-dependent due to this code:

  // LSR is not APInt clean, do not touch integers bigger than 64-bits.
  // Also avoid creating IVs of non-native types. For example, we don't want a
  // 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
  uint64_t Width = SE->getTypeSizeInBits(I->getType());
  if (Width > 64 || !DL.isLegalInteger(Width))
    return false;

To make a proper transformation in this test case, the type i32 needs to be
legal for the specified data layout. When the test runs on some non-X86
configuration (e.g. pure ARM 64), opt gets confused by the specified target
and does not use it, rejecting the specified data layout as well. Instead,
it uses some default layout that does not treat i32 as a legal type
(currently the layout that is used when it is not specified does not have
legal types at all). As result, the transformation we expect to happen does
not happen for this test.

This re-enabling patch does not have any source code changes compared to the
original patch rL303730. The only difference is that the failing test is
moved to X86 directory and now has requirement of running on x86 only to comply
with the specified target triple and data layout.

Differential Revision: https://reviews.llvm.org/D33543

llvm-svn: 303971
2017-05-26 06:47:04 +00:00
Matthias Braun e51c435c07 LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI
Re-commit r303937 + r303949 as they were not the cause for the build
failures.

We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.

llvm-svn: 303970
2017-05-26 06:32:31 +00:00
Wei Mi 3250ae3f7c Revert rL303923 since it broke the sanitizer bootstrap build bot.
llvm-svn: 303969
2017-05-26 05:42:50 +00:00
Craig Topper 25d9ba9a12 [InstSimplify] Use APInt::isMask isntead of manually implementing it. NFC
llvm-svn: 303968
2017-05-26 05:16:22 +00:00
Craig Topper 50500d5054 [InstSimplify] Use m_ConstantInt matchers to short some code. NFC
llvm-svn: 303967
2017-05-26 05:16:20 +00:00
Chandler Carruth 8fa1e37342 [IR] Add an iterator and range accessor for the PHI nodes of a basic
block.

This allows writing much more natural and readable range based for loops
directly over the PHI nodes. It also takes advantage of the same tricks
for terminating the sequence as the hand coded versions.

I've replaced one example of this mostly to showcase the difference and
I've added a unit test to make sure the facilities really work the way
they're intended. I want to use this inside of SimpleLoopUnswitch but it
seems generally nice.

Differential Revision: https://reviews.llvm.org/D33533

llvm-svn: 303964
2017-05-26 03:10:00 +00:00
Matthias Braun c93c063993 Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"
Tentatively revert this to see if it fixes the buildbot stage2
breakages.

This reverts commit r303938.
This reverts commit r303954.

llvm-svn: 303960
2017-05-26 02:25:20 +00:00
Matthias Braun f56a6d84b6 Revert "LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI"
Tentatively revert, suspecting that it caused breakage in stage2
buildbots.

This reverts commit r303949.
This reverts commit r303937.

llvm-svn: 303955
2017-05-26 01:29:32 +00:00
Chandler Carruth 86248d5632 [PM] Enable the new simple loop unswitch pass in the new pass manager
(where it is the only realistic option).

This passes the LLVM test suite for me, but I'm clearly still hammering
on this.

llvm-svn: 303952
2017-05-26 01:24:11 +00:00
Matthias Braun daea6f1e84 LivePhysRegs: Follow-up to r303937
We may have situations in which a superregister is reserved and not
added to liveins, so we have to add the subregisters.

llvm-svn: 303949
2017-05-26 00:54:24 +00:00
Zachary Turner f2110283c6 Remove unused member.
llvm-svn: 303942
2017-05-25 23:47:56 +00:00
Tim Shen a76f20c364 [PPC] Add text for assert.
llvm-svn: 303940
2017-05-25 23:40:46 +00:00
Peter Collingbourne f87197ad91 LTO: Do summary-based prevailing symbol resolution at --lto-O0.
Prevailing symbol resolution is necessary for correctness. Without
this we can end up dropping a referenced linkonce symbol from the link.

Differential Revision: https://reviews.llvm.org/D33570

llvm-svn: 303939
2017-05-25 23:40:11 +00:00
Matthias Braun e2133d5b42 LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI
- addLiveOutsNoPristines() needs to add callee saved registers that are
  actually saved and restored somewhere to the set (they are not
  pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().

This fixes the problem from D32156.

Differential Revision: https://reviews.llvm.org/D32464

llvm-svn: 303938
2017-05-25 23:39:40 +00:00
Matthias Braun 9512dd5ffd LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.

llvm-svn: 303937
2017-05-25 23:39:33 +00:00
Zachary Turner fed467eefb [CV Type Merging] Find nested type indices faster.
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible.  The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.

This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level.  This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out".  Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps.  We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.

This is worth a 50-60% performance increase.  On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb

MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s

So this is a huge performance win.

Differential Revision: https://reviews.llvm.org/D33564

llvm-svn: 303935
2017-05-25 23:36:16 +00:00
David Blaikie 2c78f183fe DebugInfo: Simplify scopes+subprogram handling since the subprogram<>cu link inversion
Previously this code was defensive to the situation in which the debug
info scopes would lead to a different subprogram from the subprogram in
the CU's subprogram list (this could've happened with linkonce
functions, etc as per the comment being removed). Since the CU<>SP link
reversal this is no longer possible.

llvm-svn: 303933
2017-05-25 23:11:28 +00:00
Tim Shen a2b85da879 [PPC] Fix atomics lowering in DAG lowering.
I forgot to forward the chain, causing some missing instruction
dependencies. The test crashes the compiler without this patch.

Inspired by the test case, D33519 also tries to remove the extra sync.

Differential Revision: https://reviews.llvm.org/D33573

llvm-svn: 303931
2017-05-25 22:58:35 +00:00
Craig Topper d4039f7283 [InstCombine] Add an InstCombine specific wrapper around isKnownToBeAPowerOfTwo to shorten code. NFC
We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo.

llvm-svn: 303924
2017-05-25 21:51:12 +00:00
Wei Mi fd257fa7bf [GVN] Add phi-translate support in scalarpre.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

  long a[100], b[100], g1, g2, g3;
  __attribute__((pure)) long goo();

  void foo(long a, long b, long c, long d) {
    g1 = a * b;
    if (__builtin_expect(g2 > 3, 0)) {
      a = c;
      b = d;
      g2 = a * b;
    }
    g3 = a * b;      // fully redundant.
  }

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 303923
2017-05-25 21:49:02 +00:00
Andrew Kaylor f466001eef Add constrained intrinsics for some libm-equivalent operations
Differential revision: https://reviews.llvm.org/D32319

llvm-svn: 303922
2017-05-25 21:31:00 +00:00
Matthias Braun 1527baab0c CodeGen: Rename DEBUG_TYPE to match passnames
Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.

llvm-svn: 303921
2017-05-25 21:26:32 +00:00
Zachary Turner 2897e0306e [lld] Fix a bug where we continually re-follow type servers.
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.

Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.

This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.

llvm-svn: 303920
2017-05-25 21:16:03 +00:00
Zachary Turner 7f97c362a4 [CodeView Type Merging] Don't keep re-allocating temp serializer.
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.

This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.

Differential Revision: https://reviews.llvm.org/D33506

llvm-svn: 303919
2017-05-25 21:15:37 +00:00
Zachary Turner 95c625ecc9 Make BinaryStreamReader::readCString a bit faster.
Previously it would do a character by character search for a null
terminator, to account for the fact that an arbitrary stream need not
store its data contiguously so you couldn't just do a memchr. However, the
stream API has a function which will return the longest contiguous chunk
without doing a copy, and by using this function we can do a memchr on the
individual chunks. For certain types of streams like data from object
files etc, this is guaranteed to find the null terminator with only a
single memchr, but even with discontiguous streams such as
MappedBlockStream, it's rare that any given string will cross a block
boundary, so even those will almost always be satisfied with a single
memchr.

This optimization is worth a 10-12% reduction in link time (4.2 seconds ->
3.75 seconds)

Differential Revision: https://reviews.llvm.org/D33503

llvm-svn: 303918
2017-05-25 21:12:27 +00:00
Bob Haarman 55256ada25 [pdb] pad source file name buffer at the end instead of the beginning
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.

Reviewers: amccarth, rnk, zturner

Reviewed By: amccarth, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33475

llvm-svn: 303917
2017-05-25 21:12:15 +00:00
Zachary Turner c4e4b7e31e Fix a bug in MappedBlockStream.
It was using the number of blocks of the entire PDB file as the number
of blocks of each stream that was created.  This was only an issue in
the readLongestContiguousChunk function, which  was never called prior.
This bug surfaced when I updated an algorithm to use this function and
the algorithm broke.

llvm-svn: 303916
2017-05-25 21:12:00 +00:00
Sam Clegg 1c154a6107 [WebAssembly] MC: Include unnamed data when writing wasm files
Also, include global entries for all data symbols, not
just external ones, since these are referenced by the
relocation records.

Add a test case that includes unnamed data.

Differential Revision: https://reviews.llvm.org/D33079

llvm-svn: 303915
2017-05-25 21:08:07 +00:00
Zachary Turner dda25b128c [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

llvm-svn: 303914
2017-05-25 21:06:28 +00:00
Kyle Butt 13379d7c99 PPC: Correct Size for GETtlsADDR
PPC::GETtlsADDR is lowered to a branch and a nop, by the assembly
printer. Its size was incorrectly marked as 4, correct it to 8. The
incorrect size can cause incorrect branch relaxation in
PPCBranchSelector under the right conditions.

llvm-svn: 303904
2017-05-25 19:37:41 +00:00
Nico Weber b3d83a092a Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.
llvm-svn: 303902
2017-05-25 19:19:29 +00:00
Manoj Gupta d536180fdc [AArch64]: add 'a' inline asm operand modifier.
Summary:
This is used in the Linux kernel, and effectively just means "print an
address". This brings back r193593.

Reviewed by: Renato Golin

Reviewers: t.p.northover, rengolin, richard.barton.arm, kristof.beyls

Subscribers: aemerson, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D33558

llvm-svn: 303901
2017-05-25 19:07:57 +00:00
Adrian Prantl f062192632 Fix SelectionDAGBuilder::getDbgValue to not expect DW_OP_deref on FI vars
This fixes an oversight in r300522, which changed alloca
dbg.values to no longer emit a DW_OP_deref.

The array.ll testcase was regenerated from source.

Fixes PR33166:
https://bugs.llvm.org/show_bug.cgi?id=33166

llvm-svn: 303897
2017-05-25 18:54:10 +00:00
David Blaikie b3cee2fb42 DebugInfo: Produce debug_{gnu_}pub{names,types} entries when explicitly requested, even in -gmlt or when empty
Turns out gold doesn't use the DW_AT_GNU_pubnames to decide whether to
parse the rest of the DIEs when building gdb-index. This causes gold to
trip over LLVM's output when there are DW_FORM_ref_addr present.

Gold does use the presence of a debug_gnu_pub{names,types} entry for the
CU to skip parsing the debug_info portion, so make sure that's included
even when empty (technically, when empty there couldn't be any ref_addr
anyway - it only came up when gmlt didn't produce any (even non-empty)
pubnames - but given what that reveals about gold's implementation, this
seems like a good thing to do for consistency).

llvm-svn: 303894
2017-05-25 18:50:28 +00:00
Daniel Berlin e67c322260 NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling
Fix PR33120 and others by eliminating self-cycles a different way.

llvm-svn: 303875
2017-05-25 15:44:20 +00:00
Artur Pilipenko 315eafc339 [InstCombine] Teach isAllocSiteRemovable to look through addrspacecasts
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D28565

llvm-svn: 303870
2017-05-25 15:14:48 +00:00
Sanjay Patel 5150612012 [InstCombine] make icmp-mul fold more efficient
There's probably a lot more like this (see also comments in D33338 about responsibility), 
but I suspect we don't usually get a visible manifestation.

Given the recent interest in improving InstCombine efficiency, another potential micro-opt
that could be repeated several times in this function: morph the existing icmp pred/operands
instead of creating a new instruction.

llvm-svn: 303860
2017-05-25 14:13:57 +00:00
Tim Corringham 32d0d38679 [AMDGPU] add intrinsic for s_getpc
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D32862

llvm-svn: 303859
2017-05-25 14:04:14 +00:00
Oren Ben Simhon 7bf27f03f2 [X86] Adding vpopcntd and vpopcntq instructions
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq).

Differential Revision: https://reviews.llvm.org/D33169

llvm-svn: 303858
2017-05-25 13:45:23 +00:00
James Molloy dc2d64bc35 [GVNSink] Pacify MSVC
Don't convert an unsigned to a pointer for a sentinel, use a size_t instead.

llvm-svn: 303855
2017-05-25 13:14:10 +00:00
James Molloy 2a237f19f1 [GVNSink] Don't define operator<< in NDEBUG
Without debug macros enabled, the raw_ostream operator<< overload
is unused.

llvm-svn: 303852
2017-05-25 13:11:18 +00:00
James Molloy a929063233 [GVNSink] GVNSink pass
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts.

This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:

[ %a1 = add i32 %b, 1  ]   [ %c1 = add i32 %d, 1  ]
[ %a2 = xor i32 %a1, 1 ]   [ %c2 = xor i32 %c1, 1 ]
                 \           /
           [ %e = phi i32 %a2, %c2 ]
           [ add i32 %e, 4         ]

GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.

What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.

This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG.

Differential revision: https://reviews.llvm.org/D24805

llvm-svn: 303850
2017-05-25 12:51:11 +00:00
Chandler Carruth f4d62c480c [PM] Teach the PGO instrumentation pasess to run GlobalDCE before
instrumenting code.

This is important in the new pass manager. The old pass manager's
inliner has a small DCE routine embedded within it. The new pass manager
relies on the actual GlobalDCE pass for this.

Without this patch, instrumentation profiling with the new PM results in
massive code bloat in the object files because the instrumentation
itself ends up preventing DCE from working to remove the code.

We should probably change the instrumentation (and/or DCE) so that we
can eliminate dead code even if instrumented, but we shouldn't even
spend the time generating instrumentation for that code so this still
seems like a good patch.

Differential Revision: https://reviews.llvm.org/D33535

llvm-svn: 303845
2017-05-25 07:15:09 +00:00
Chandler Carruth dd2e275a47 [PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch
pass.

The original logic only considered direct successors of the hoisted
domtree nodes, but that isn't really enough. If there are other basic
blocks that are completely within the subtree, their successors could
just as easily be impacted by the hoisting.

The more I think about it, the more I think the correct update here is
to hoist every block on the dominance frontier which has an idom in the
chain we hoist across. However, this is subtle enough that I'd
definitely appreciate some more eyes on it.

Sadly, if this is the correct algorithm, it requires computing a (highly
localized) dominance frontier. I've done this in the simplest (IE, least
code) way I could come up with, but that may be too naive. Suggestions
welcome here, dominance update algorithms are not an area I've studied
much, so I don't have strong opinions.

In good news, with this patch, turning on simple unswitch passes the
LLVM test suite for me with asserts enabled.

Differential Revision: https://reviews.llvm.org/D32740

llvm-svn: 303843
2017-05-25 06:33:36 +00:00
Chandler Carruth 29c22d2835 [LegacyPM] Make the 'addLoop' method accept a loop to add rather than
having it internally allocate the loop.

This is a much more flexible API and necessary in the new loop unswitch
to reasonably support both new and old PMs in common code. It also just
seems like a cleaner separation of concerns.

NFC, this should just be a pure refactoring.

Differential Revision: https://reviews.llvm.org/D33528

llvm-svn: 303834
2017-05-25 03:01:31 +00:00
Vitaly Buka bf40f1b6dd [libFuzzer] Don't replace custom signal handlers.
Summary:
This allows to keep handlers installed by sanitizers.
In other cases third-party code can replace handlers after libFuzzer
initialization anyway.

Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33522

llvm-svn: 303828
2017-05-25 01:43:13 +00:00
George Karpenkov a1c532784d Fix coverage check for full post-dominator basic blocks.
Coverage instrumentation which does not instrument full post-dominators
and full-dominators may skip valid paths, as the reasoning for skipping
blocks may become circular.
This patch fixes that, by only skipping
full post-dominators with multiple predecessors, as such predecessors by
definition can not be full-dominators.

llvm-svn: 303827
2017-05-25 01:41:46 +00:00
Gor Nishanov 1fbc01f70f [coroutines] CoroFrame.cpp conform to coding convention (s/repeat/Repeat) (NFC)
llvm-svn: 303826
2017-05-25 01:07:10 +00:00
Gor Nishanov 0ea1863b27 [coroutines] Relocate instructions that maybe spilled after coro.begin
Summary:
Frontend generates store instructions after allocas, for example:

```
define i8* @f(i64 %this) "coroutine.presplit"="1" personality i32 0 {
entry:
  %this.addr = alloca i64
  store i64 %this, i64* %this.addr
  ..
  %hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc)

```
Such instructions may require spilling into coro.frame, but, coro-frame address is only available after coro.begin and thus needs to be moved after coro.begin.
The only instructions that should not be moved are the arguments of coro.begin and all of their operands.

Reviewers: GorNishanov, majnemer

Reviewed By: GorNishanov

Subscribers: llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D33527

llvm-svn: 303825
2017-05-25 00:46:20 +00:00
Tony Jiang 0a429f040e [PowerPC] Fix a performance bug for PPC::XXSLDWI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI
instruction, this patch recognizes them and does the selection to improve the
PPC performance.

llvm-svn: 303822
2017-05-24 23:48:29 +00:00
Eugene Zelenko 75480cce12 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 303820
2017-05-24 23:10:29 +00:00
Gor Nishanov 1f72d75714 [coroutines] Allow rematerialization upto 4 times. Remove incorrect assert
Reviewers: majnemer

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33524

llvm-svn: 303819
2017-05-24 23:01:02 +00:00
Sanjay Patel 07b1ba54b5 [InstCombine] use m_APInt to allow icmp-mul-mul vector fold
The swapped operands in the first test is a manifestation of an 
inefficiency for vectors that doesn't exist for scalars because 
the IRBuilder checks for an all-ones mask for scalars, but not 
vectors.

llvm-svn: 303818
2017-05-24 22:58:17 +00:00
Nirav Dave 7a8717d216 [DAG] Prevent crashes when merging constant stores with high-bit set. NFC.
llvm-svn: 303802
2017-05-24 19:56:39 +00:00
Nirav Dave bb20b5d5c3 [AArch64] Prevent nested ADDs from address calc in splitStoreSplat. NFC
In preparation for late-stage store merging.

llvm-svn: 303800
2017-05-24 19:55:49 +00:00
Craig Topper 2f9c6dafe3 [InstCombine] Merge together the SimplifyDemandedUseBits implementations for ZExt and Trunc. NFC
While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths.

With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zero of the upper bits for ZExt. This is similar to how its done in computeKnownBits in ValueTracking.

llvm-svn: 303791
2017-05-24 18:40:25 +00:00
Teresa Johnson cd2aa0d2e4 Fix a couple of typos in memory intrinsic optimization output (NFC)
s/instrinsic/intrinsic

llvm-svn: 303782
2017-05-24 17:55:25 +00:00
Zaara Syeda 932978315b P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248
llvm-svn: 303780
2017-05-24 17:50:37 +00:00
Craig Topper 1c660dbea6 [InstCombine] Use less bitwise operations to handle Instruction::SExt in SimplifyDemandedUseBits. Other improvements.
The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits.

We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better.

Differential Revision: https://reviews.llvm.org/D32098

llvm-svn: 303779
2017-05-24 17:33:30 +00:00
Craig Topper 77e07cc010 [InstSimplify] Simplify uadd/sadd/umul/smul with overflow intrinsics when the Zero or Undef is on the LHS.
Summary: This code was migrated from InstCombine a few years ago. InstCombine had nearby code that would move Constants to the RHS for these, but InstSimplify doesn't have such code on this path.

Reviewers: spatel, majnemer, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33473

llvm-svn: 303774
2017-05-24 17:05:28 +00:00
Craig Topper 8205a1a9b6 [ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.

Differential Revision: https://reviews.llvm.org/D33431

llvm-svn: 303773
2017-05-24 16:53:07 +00:00
Craig Topper a2025eaaef [ValueTracking] Add OptimizationRemarkEmitter to the other signature for commuteKnownBits.
This is needed for an upcoming patch.

llvm-svn: 303772
2017-05-24 16:53:03 +00:00
Matthew Simpson 6349380fa4 Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor
The default vector insert/extract cost is more profitable on Falkor than the
reduced cost.

llvm-svn: 303771
2017-05-24 16:48:39 +00:00
Nirav Dave d20066cbad [AMDGPU] Prevent too large store merges in AMDGPU Subtargets. NFCI.
Various address spaces on the SI and R600 subtargets have stricter
limits on memory access size that other address spaces. Use
canMergeStoresTo predicate to prevent the DAGCombiner from creating
these stores as they will be split up during legalization.

llvm-svn: 303767
2017-05-24 15:59:09 +00:00
Matthew Simpson d6f179cad6 [LV] Update type in cost model for scalarization
For non-uniform instructions marked for scalarization, we should update
`VectorTy` when computing instruction costs to reflect the scalar type. In
addition to determining instruction costs, this type is also used to signal
that all instructions in the loop will be scalarized. This currently affects
memory instructions and non-pointer induction variables and their updates. (We
also mark GEPs scalar after vectorization, but their cost is computed together
with memory instructions.) For scalarized induction updates, this patch also
scales the scalar cost by the vectorization factor, corresponding to each
induction step.

llvm-svn: 303763
2017-05-24 15:26:15 +00:00
Vadzim Dambrouski b07351f4f8 [MSP430] Fix PR33050: Don't use ADD16ri to lower FrameIndex.
Use ADDframe pseudo instruction instead.
This will fix machine verifier error, and will help to fix PR32146.

Differential Revision: https://reviews.llvm.org/D33452

llvm-svn: 303758
2017-05-24 15:08:30 +00:00
Marek Olsak 8973a0a22c Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"
This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa.

It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of
the patterns, so it was putting 32-bit literals into the 8-bit field.

llvm-svn: 303754
2017-05-24 14:53:50 +00:00
Diana Picus 183863fc3b Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
This reverts commit r303730 because it broke all the buildbots.

llvm-svn: 303747
2017-05-24 14:16:04 +00:00
Krzysztof Parzyszek e3ec97b031 [Hexagon] Fix comment in HexagonPacketizer::runOnMachineFunction
Patch by Wei-Ren Chen.

Differential Revision: https://reviews.llvm.org/D33439

llvm-svn: 303745
2017-05-24 13:43:42 +00:00
Jonas Paulsson 8624b7e1ce [LoopVectorizer] Let target prefer scalar addressing computations.
The loop vectorizer usually vectorizes any instruction it can and then
extracts the elements for a scalarized use. On SystemZ, all elements
containing addresses must be extracted into address registers (GRs). Since
this extraction is not free, it is better to have the address in a suitable
register to begin with. By forcing address arithmetic instructions and loads
of addresses to be scalar after vectorization, two benefits result:

* No need to extract the register
* LSR optimizations trigger (LSR isn't handling vector addresses currently)

Benchmarking show improvements on SystemZ with this new behaviour.

Any other target could try this by returning false in the new hook
prefersVectorizedAddressing().

Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand
https://reviews.llvm.org/D32422

llvm-svn: 303744
2017-05-24 13:42:56 +00:00
Jonas Paulsson 081b5a1e9d [SystemZ] Fix register modelling in expandLoadStackGuard()
EXPENSIVE_CHECKS found this bug (https://bugs.llvm.org/show_bug.cgi?id=33047), which
this patch fixes. The EAR instruction defines a GR32, not a GR64.

Review: Ulrich Weigand
llvm-svn: 303743
2017-05-24 13:15:48 +00:00
Tamas Berghammer e3852fa405 Demangler: Fix constructor cv qualifier handling
Previously if we parsed a constructor then we set parsed_ctor_dtor_cv
to true and never reseted it. This causes issue when a template argument
references a constructor (e.g. type of lambda defined inside a
constructor) as we will have the parsed_ctor_dtor_cv flag set what will
cause issues when parsing later arguments.

Differential Revision: https://reviews.llvm.org/D33385
libcxxabi change: https://reviews.llvm.org/rL303737

llvm-svn: 303738
2017-05-24 11:29:02 +00:00
Simon Pilgrim 9f46d1d479 Strip trailing whitespace. NFCI.
llvm-svn: 303736
2017-05-24 11:02:27 +00:00
Florian Hahn d211fe7c26 [ARM] Remove ThumbTargetMachines. (NFC)
Summary:
Thumb code generation is controlled by ARMSubtarget and the concrete
ThumbLETargetMachine and ThumbBETargetMachine are not needed.

Eric Christopher suggested removing the unneeded target machines in
https://reviews.llvm.org/D33287.

I think it still makes sense to keep separate TargetMachines for big and
little endian as we probably do not want to have different endianess for
difference functions in a single compilation unit. The MIPS backend has
two separate TargetMachines for big and little endian as well. 

Reviewers: echristo, rengolin, kristof.beyls, t.p.northover

Reviewed By: echristo

Subscribers: aemerson, javed.absar, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33318

llvm-svn: 303733
2017-05-24 10:18:57 +00:00
Mikael Holmen 2676f8269a MachineCSE: Respect interblock physreg liveness
Summary:
This is a fix for PR32538. MachineCSE first looks at MO.isDead(), but
if it is not marked dead, MachineCSE still wants to do its own check
to see if it is trivially dead. This check for the trivial case
assumed that physical registers cannot be live out of a block.

Patch by Mattias Eriksson.

Reviewers: qcolombet, jbhateja

Reviewed By: qcolombet, jbhateja

Subscribers: jbhateja, llvm-commits

Differential Revision: https://reviews.llvm.org/D33408

llvm-svn: 303731
2017-05-24 09:35:23 +00:00
Max Kazantsev 13e016bf48 [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrency is the bottom-lost in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.

Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike
other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:

for (...) { // loop
  phi = {A,+,B}
}
X = load ...
Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may be existant.

This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.

llvm-svn: 303730
2017-05-24 08:52:18 +00:00
Craig Topper e6a2318573 [APInt] Use std::end to avoid mentioning the size of a local buffer repeatedly.
llvm-svn: 303726
2017-05-24 07:00:55 +00:00
Javed Absar a32e3a1acf [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI
This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.
Reviewed by: Renato Golin, Diana Picus
Differential Revision: https://reviews.llvm.org/D33120

llvm-svn: 303717
2017-05-24 05:32:48 +00:00
Davide Italiano fd9100e056 [NewGVN] Update additionalUsers when we simplify to a value.
Otherwise we don't revisit an instruction that could be simplified,
and when we verify, we discover there's something that changed, i.e.
what we had wasn't a maximal fixpoint.

Fixes PR32836.

llvm-svn: 303715
2017-05-24 02:30:24 +00:00
George Karpenkov 018472c34a Revert "Disable coverage opt-out for strong postdominator blocks."
This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef.
Buildbots do not like this on Linux.

llvm-svn: 303710
2017-05-24 00:29:12 +00:00
Zachary Turner bb64231d2d Don't do a full scan of the type stream before processing records.
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types.  In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted.  Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.

llvm-svn: 303707
2017-05-24 00:26:27 +00:00
Davide Italiano c4861adad9 [SCCP] Use the `hasAddressTaken()` version defined in `Function`.
Instead of using the SCCP homegrown one. We should eventually
make the private SCCP version disappear, but that wont' be today.
PR33143 tracks this issue.

Add braces for consistency while here. No functional change intended.

llvm-svn: 303706
2017-05-23 23:59:23 +00:00
Davide Italiano 7bf95b964f [LIR] Use the newly `getRecurrenceVar()` helper. NFCI.
llvm-svn: 303704
2017-05-23 23:51:54 +00:00
Davide Italiano 4bc91190ea [LIR] Strengthen the check for recurrence variable in popcnt/CTLZ.
Fixes PR33114.
Differential Revision:  https://reviews.llvm.org/D33420

llvm-svn: 303700
2017-05-23 22:32:56 +00:00
George Karpenkov 9017ca290a Disable coverage opt-out for strong postdominator blocks.
Coverage instrumentation has an optimization not to instrument extra
blocks, if the pass is already "accounted for" by a
successor/predecessor basic block.
However (https://github.com/google/sanitizers/issues/783) this
reasoning may become circular, which stops valid paths from having
coverage.
In the worst case this can cause fuzzing to stop working entirely.

This change simplifies logic to something which trivially can not have
such circular reasoning, as losing valid paths does not seem like a
good trade-off for a ~15% decrease in the # of instrumented basic blocks.

llvm-svn: 303698
2017-05-23 21:58:54 +00:00
Tim Northover 8c605c0eda Revert LLVM changes for "Sema: allow imaginary constants via GNU extension if UDL overloads not present."
The changes accidentally crept into a Clang commit I was making.

llvm-svn: 303697
2017-05-23 21:53:11 +00:00
Vadzim Dambrouski 49dd6e68c2 [MSP430] Add subtarget features for hardware multiplier.
Also add more processors to make -mcpu option behave similar to gcc.

Differential Revision: https://reviews.llvm.org/D33335

llvm-svn: 303695
2017-05-23 21:49:42 +00:00
Tim Northover 6b5eceac2e Sema: allow imaginary constants via GNU extension if UDL overloads not present.
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.

This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.

llvm-svn: 303694
2017-05-23 21:41:49 +00:00
Reid Kleckner 26450bf579 Silence MSVC warning about unsigned integer overflow, which has defined behavior
llvm-svn: 303693
2017-05-23 21:35:32 +00:00
Simon Pilgrim c910a70b21 [AMDGPU] Add INDIRECT_BASE_ADDR to R600_Reg32 class (PR33045)
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045

Differential Revision: https://reviews.llvm.org/D33451

llvm-svn: 303691
2017-05-23 21:27:15 +00:00
Francis Visoiu Mistrih 1c98701e57 AsmPrinter: mark the beginning and the end of a function in verbose mode
llvm-svn: 303690
2017-05-23 21:22:16 +00:00
Changpeng Fang 1dbace195d AMDGPU/SI: Move the local memory usage related checking after calling convention checking in PromoteAlloca
Summary:
  Promoting Alloca to Vector and Promoting Alloca to LDS are two independent handling of Alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage
related checking out and replace it after the calling convention checking.

Reviewer:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D33139

llvm-svn: 303684
2017-05-23 20:25:41 +00:00
Geoff Berry d6ac96f953 [AArch64][Falkor] Refine sched details for LSLfast/ASRfast.
llvm-svn: 303682
2017-05-23 19:57:45 +00:00
Stanislav Mekhanoshin 53a21292f8 [AMDGPU] Combine and (srl) into shl (bfe)
Perform DAG combine:
and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
Where nb is a number of trailing zeroes in mask.

It replaces two instructions with two and BFE is generally a more
expensive one. However this is only done if we are selecting a byte
or word at an aligned boundary which results in a proper SDWA
operand pattern. It is only done if SDWA is supported.

TODO: improve SDWA pass to actually convert this pattern. It is not
done now because we have an immediate in the instruction, which has
be moved into a VGPR.

Differential Revision: https://reviews.llvm.org/D33455

llvm-svn: 303681
2017-05-23 19:54:48 +00:00
Geoff Berry e6366f505f [AArch64][Falkor] Fix sched details for FMOV of WZR/XZR.
llvm-svn: 303680
2017-05-23 19:54:28 +00:00
Oleg Ranevskyy 09df0020fc [ARM] Temporarily disable globals promotion to constant pools to prevent miscompilation
Summary:
A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries.

The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed.

Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1.

Reviewers: efriedma, jmolloy

Reviewed By: efriedma

Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar

Differential Revision: https://reviews.llvm.org/D33446

llvm-svn: 303679
2017-05-23 19:38:37 +00:00
Zachary Turner 7daf62e743 [CodeView] Eliminate redundant hashes and allocations.
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever.  Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.

These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other.  So all of this hashing was
unnecessary.

To solve this, two fixes are made:

1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one.  When it's finished, all of
its memory is freed.

2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed.  This way the hash table will not grow as other
records from the same type stream get inserted.  Further improvements
could eliminate hashing entirely from this codepath.

This reduces the link time by 85% in my test, from 1 minute to 9
seconds.

llvm-svn: 303676
2017-05-23 18:56:23 +00:00
Nirav Dave 6c910c0dd8 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
Yuka Takahashi c8068dbb07 [GSoC] Shell autocompletion for clang
Summary:
This is a first patch for GSoC project, bash-completion for clang.
To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.

Simple flag completion and path completion is available in this patch.

Reviewers: teemperor, v.g.vassilev, ruiu, Bigcheese, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33237

llvm-svn: 303670
2017-05-23 18:39:08 +00:00
David Blaikie 7b0a6aa642 Fix DIEHash refactoring that dropped the DW_AT_name from the hash
llvm-svn: 303669
2017-05-23 18:36:07 +00:00
Nirav Dave 3b4f7cc0b3 [DAG] Add canMergeStoresTo predicate checks. NFCI.
Propagate canMergeStoresTo checks to missing cases in StoreMerge.

llvm-svn: 303668
2017-05-23 18:33:09 +00:00
Reid Kleckner 36238b15d7 Speculative build fix for non-Windows
llvm-svn: 303667
2017-05-23 18:28:13 +00:00
David Blaikie 74fa80399a Refactor DWARF hashing to use a .def file to avoid repetition
llvm-svn: 303666
2017-05-23 18:27:09 +00:00
Reid Kleckner ded38803c5 [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

llvm-svn: 303665
2017-05-23 18:23:59 +00:00
Sanjay Patel d3106add77 [InstCombine] allow icmp-xor folds for vectors (PR33138)
This fixes the first part of:
https://bugs.llvm.org/show_bug.cgi?id=33138

More work is needed for the bitcasted variant.

llvm-svn: 303660
2017-05-23 17:29:58 +00:00
Marek Olsak 7dadd86a35 AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.

Reviewers: arsenm, nhaehnle, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28994

llvm-svn: 303658
2017-05-23 17:14:34 +00:00
Reid Kleckner 545aa4f4dd Commit AttributeList change that was supposed to be part of r303654
llvm-svn: 303656
2017-05-23 17:03:28 +00:00
Ulrich Weigand fe9fcb8af3 [RuntimeDyld, PowerPC] Fix regression from r303637
Actually, to identify external symbols, we need to check for
*either* non-null Value.SymbolName *or* a SymType of
Symbol::ST_Unknown.

The former may happen for symbols not known to the JIT at all
(e.g. defined in a native library), while the latter happens
for symbols known to the JIT, but defined in a different module.

Fixed several regressions on big-endian ppc64.

llvm-svn: 303655
2017-05-23 17:03:23 +00:00
Reid Kleckner 8bf67fe98f [IR] Switch AttributeList to use an array for O(1) access
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.

This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995

Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
    13,274,181,184      cycles                    #    3.047 GHz                      ( +-  0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
    12,906,927,263      cycles                    #    3.043 GHz                      ( +-  0.51% )
```

This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.

Reviewers: pete, chandlerc

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D32819

llvm-svn: 303654
2017-05-23 17:01:48 +00:00
Stanislav Mekhanoshin a96ec3f360 [AMDGPU] Convert shl (add) into add (shl)
shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1)
This allows to fold a constant into an address in some cases as
well as to eliminate second shift if the expression is used as
an address and second shift is a result of a GEP.

Differential Revision: https://reviews.llvm.org/D33432

llvm-svn: 303641
2017-05-23 15:59:58 +00:00
Zachary Turner bf35e6ab2a Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

llvm-svn: 303640
2017-05-23 15:50:37 +00:00
Simon Atanasyan 57253043a4 [mips] Remove unused class field. NFC
llvm-svn: 303639
2017-05-23 15:00:30 +00:00
Simon Atanasyan 039b02ec78 [mips] Change type of MipsSubtarget ctor arguments s/std::string/StringRef/. NFC
llvm-svn: 303638
2017-05-23 15:00:26 +00:00
Ulrich Weigand 7f02d67fce [RuntimeDyld, PowerPC] Fix check for external symbols when detecting reloction overflow
The PowerPC part of processRelocationRef currently assumes that external
symbols can be identified by checking for SymType == SymbolRef::ST_Unknown.
This is actually incorrect in some cases, causing relocation overflows to
be mis-detected. The correct check is to test whether Value.SymbolName
is null.

Includes test case. Note that it is a bit tricky to replicate the exact
condition that triggers the bug in a test case. The one included here
seems to fail reliably (before the fix) across different operating
system versions on Power, but it still makes a few assumptions (called
out in the test case comments).

Also add ppc64le platform name to the supported list in the lit.local.cfg
files for the MCJIT and OrcMCJIT directories, since those tests were
currently not run at all.

Fixes PR32650.

Reviewer: hfinkel

Differential Revision: https://reviews.llvm.org/D33402

llvm-svn: 303637
2017-05-23 14:51:18 +00:00
Anna Thomas c07d5544dd [JumpThreading] Safely replace uses of condition
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.

We should not blindly RAUW the condition if experimental.guard or assume
is a use of that
condition. This is because LVI may have used the guard/assume to
identify the
value of the condition, and RUAWing will fold the guard/assume and uses
before the guards/assumes.

Reviewers: sanjoy, reames, trentxintong, mkazantsev

Reviewed by: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33257

llvm-svn: 303633
2017-05-23 13:36:25 +00:00
Ulrich Weigand b6d40cceee [RuntimeDyld, PowerPC] Fix relocation detection overflow
Code in RuntimeDyldELF currently uses 32-bit temporaries to detect
whether a PPC64 relocation target is out of range. This is incorrect,
and can mis-detect overflow where the distance between relocation site
and target is close to a multiple of 4GB. Fixed by using 64-bit
temporaries.

Noticed while debugging PR32650.

Reviewer: hfinkel

Differential Revision: https://reviews.llvm.org/D33403

llvm-svn: 303632
2017-05-23 12:43:57 +00:00
Sam Kolton f7659d71eb [AMDGPU] SDWA: Add assembler support for GFX9
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493

Reviewers: vpykhtin, artem.tamazov

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D33132

llvm-svn: 303620
2017-05-23 10:08:55 +00:00
Florian Hahn abb4218b98 [AArch64] Make instruction fusion more aggressive.
Summary:
This patch makes instruction fusion more aggressive by
* adding artificial edges between the successors of FirstSU and
  SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps.
* updating PostGenericScheduler::tryCandidate to keep clusters together,
   similar to GenericScheduler::tryCandidate.

This change increases the number of AES instruction pairs generated on
 Cortex-A57 and Cortex-A72. This doesn't change code at all in
 most benchmarks or general code, but we've seen improvement on kernels
 using AESE/AESMC and AESD/AESIMC. 

Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB

Reviewed By: evandro

Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33230

llvm-svn: 303618
2017-05-23 09:33:34 +00:00
Igor Breger 617be6e475 [GlobalISel][X86] G_LOAD/G_STORE vec256/512 support
Summary: mark G_LOAD/G_STORE vec256/512 legal for AVX/AVX512. Implement instruction selection.

Reviewers: zvi, guyblank

Reviewed By: zvi

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33268

llvm-svn: 303617
2017-05-23 08:23:51 +00:00
Craig Topper 7e0aeeb884 [KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or similar. NFC
llvm-svn: 303614
2017-05-23 07:18:37 +00:00
Ayal Zaks 589e1d9610 [LV] Report multiple reasons for not vectorizing under allowExtraAnalysis
The default behavior of -Rpass-analysis=loop-vectorizer is to report only the
first reason encountered for not vectorizing, if one is found, at which time the
vectorizer aborts its handling of the loop. This patch allows multiple reasons
for not vectorizing to be identified and reported, at the potential expense of
additional compile-time, under allowExtraAnalysis which can currently be turned
on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed.

Removed from LoopVectorizationLegality::canVectorize() the redundant checking
and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also
does that. This redundancy is caught by a lit test once multiple reasons are
reported.

Patch initially developed by Dror Barak.

Differential Revision: https://reviews.llvm.org/D33396

llvm-svn: 303613
2017-05-23 07:08:02 +00:00
David Blaikie 15d85fc537 libDebugInfo: Support symbolizing using DWP files
llvm-svn: 303609
2017-05-23 06:48:53 +00:00
Akira Hatanaka e8ae3346a3 [AArch64] Fix PRR33100.
This commit fixes a bug introduced in r301019 where optimizeLogicalImm
would replace a logical node's immediate operand that was CSE'd and
was also an operand of another node.

This commit fixes the bug by replacing the logical node instead of its
immediate operand.

rdar://problem/32295276

llvm-svn: 303607
2017-05-23 06:08:37 +00:00
Galina Kistanova 5e6c542ae3 Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303597
2017-05-23 01:20:52 +00:00
Galina Kistanova fb9476ee6c Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303595
2017-05-23 01:07:19 +00:00
David Blaikie 37d1cff491 FIX: Remove debugging assert left in previous commit
Sorry for the bot noise.

llvm-svn: 303592
2017-05-23 00:31:24 +00:00
David Blaikie f9803fb4bb libDebugInfo: Avoid independently parsing the same .dwo file for two separate CUs residing there
NFC, just an optimization. Will be building on this for DWP support
shortly.

llvm-svn: 303591
2017-05-23 00:30:42 +00:00
Teresa Johnson 2db1369c1f Support for taking the max of module flags when linking, use for PIE/PIC
Summary:
Add Max ModFlagBehavior, which can be used to take the max of two
module flag values when merging modules. Use it for the PIE and PIC
levels.

This avoids an error when we try to import from a module built -fpic
into a module built -fPIC, for example. For both PIE and PIC levels,
this will be legal, since the code generation gets more conservative
as the level is increased. Therefore we can take the max instead of
somehow trying to block importing between modules compiled with
different levels.

Reviewers: tmsriram, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33418

llvm-svn: 303590
2017-05-23 00:08:00 +00:00
Davide Italiano 8e7d11ab2b [NewPM] Fix an innocent but silly typo. Reported by Craig Topper.
llvm-svn: 303587
2017-05-22 23:47:11 +00:00
Davide Italiano 8a09b8eba9 [NewPM] Add a temporary cl::opt() to test NewGVN.
llvm-svn: 303586
2017-05-22 23:41:40 +00:00
Galina Kistanova 6fa60f5e8b Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303585
2017-05-22 22:46:31 +00:00
Vitaly Buka b238cb8fbc [CodeGen] Fix uninitialized variables exposed by r303084
All other calls of analyzeBranch reset PredTBB and PredFBB, so I assume it's
expected behavior.

llvm-svn: 303581
2017-05-22 21:33:54 +00:00
Tim Northover 997f5f10c6 InstructionSimplify: don't speculate about Constants changing.
When presented with an icmp/select pair, we can end up asking what would happen
if we replaced one constant with another in an instruction. This is a mistake,
while non-constant Values could become a constant, constants cannot change and
trying to do so can lead to completely invalid IR (a GEP referencing a
non-existant field in the original case).

llvm-svn: 303580
2017-05-22 21:28:08 +00:00
Evgeniy Stepanov b9f1b014e1 Infer relocation model from module flags in relocatable LTO link.
Fix for PR33096.

llvm-svn: 303578
2017-05-22 21:11:35 +00:00
Zachary Turner d4136e945e Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

llvm-svn: 303577
2017-05-22 21:07:43 +00:00
Zachary Turner 12f8c31c04 Make TypeSerializer's StringMap use the same allocator.
llvm-svn: 303576
2017-05-22 21:07:14 +00:00
Adrian Prantl fb31da1306 Don't generate line&scope debug info for meta-instructions.
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.

Fixes PR33107.

https://bugs.llvm.org/show_bug.cgi?id=33107

This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.

llvm-svn: 303575
2017-05-22 20:47:09 +00:00
Teresa Johnson 525dcb617b Fix update VP metadata after inlining for instrumentation PGO
Summary:
With instrumentation profiling, when updating the VP metadata after
an inline, VP metadata on the inlined copy was inadvertantly having
all counts zeroed out. This was causing indirect calls from code inlined
during the call step to be marked as cold in the ThinLTO summaries and
not imported.

The CallerBFI needs to be passed down so that the CallSiteCount can be
computed from the profile summary info. With Sample PGO this was working
since the count is extracted from the branch weight metadata on the
call being inlined (even before we stopped looking at metadata for
non-sample PGO in r302844 this largely wasn't working for instrumentation
PGO since only promoted indirect calls would be getting inlined and have
the metadata).

Added an instrumentation PGO test and renamed the sample PGO test.

Reviewers: danielcdh, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D33389

llvm-svn: 303574
2017-05-22 20:28:18 +00:00
Krzysztof Parzyszek 9a23d40ee8 [Hexagon] Fix definitions of vector predicate loads and stores
This fixes http://llvm.org/PR33048.

llvm-svn: 303572
2017-05-22 20:02:53 +00:00
Craig Topper 64a65ec4fd [DataLayout] Add llvm_unreachable to the default of a nested switch statement that covers all values given to it by the outer switch. NFC
llvm-svn: 303571
2017-05-22 19:28:36 +00:00
Adrian Prantl 334a130a6f Revert "Don't generate line&scope debug info for meta-instructions."
This reverts commit r303566 while investigating a stage2 buildbot failure.

llvm-svn: 303570
2017-05-22 18:50:12 +00:00
Stanislav Mekhanoshin 5fa289f0d8 [AMDGPU] Narrow lshl from 64 to 32 bit if possible
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)

Differential Revision: https://reviews.llvm.org/D33367

llvm-svn: 303569
2017-05-22 16:58:10 +00:00
Xinliang David Li 126157c3b4 [PartialInlining] Add internal options to enable partial inlining in pass pipeline (off by default)
1. Legacy: -mllvm -enable-partial-inlining
2. New:  -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager

Differential Revision: http://reviews.llvm.org/D33382

llvm-svn: 303567
2017-05-22 16:41:57 +00:00
Adrian Prantl 4c047f8931 Don't generate line&scope debug info for meta-instructions.
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.

Fixes PR33107.

https://bugs.llvm.org/show_bug.cgi?id=33107

llvm-svn: 303566
2017-05-22 16:21:02 +00:00
Nirav Dave e00da22ef3 [DAG] Rework store merge to loop on load candidates. NFCI.
Continue to consider remaining candidate merges until all possible
merges have been considered.

llvm-svn: 303560
2017-05-22 15:33:47 +00:00
Valery Pykhtin 74cb9c8831 [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker
Differential revision: https://reviews.llvm.org/D33289

llvm-svn: 303548
2017-05-22 13:09:40 +00:00
Simon Atanasyan e0b726f2fa [mips] Support micromips attribute passed by front-end
This patch adds handling of the `micromips` and `nomicromips` attributes
passed by front-end. The patch depends on D33363.

Differential revision: https://reviews.llvm.org/D33364

llvm-svn: 303545
2017-05-22 12:47:41 +00:00
Artur Pilipenko edee25152b [LoopPredication] NFC. Add extra debug output in case we fail to parse the range check
llvm-svn: 303544
2017-05-22 12:06:57 +00:00
Artur Pilipenko c488dfabac [LoopPredication] NFC. Move a nested struct declaration before the fields, clang-format a bit
This will simplify the diff for an upcoming review.

llvm-svn: 303543
2017-05-22 12:01:32 +00:00
James Molloy 6110be9759 Re-apply r302416: [ARM] Clear the constant pool cache on explicit .ltorg directives
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.

Original commit message:
    Multiple ldr pseudoinstructions with the same constant value will
    reuse the same constant pool entry. However, if the constant pool
    is explicitly flushed with a .ltorg directive, we should not try
    to reference constants in the previous pool any longer, since they
    may be out of range.

    This fixes assembling hand-written assembler source which repeatedly
    loads the same constant value, across a binary size larger than the
    pc-relative fixup range for ldr instructions (4096 bytes). Such
    assembler source already uses explicit .ltorg instructions to emit
    constant pools with regular intervals. However if we try to reuse
    constants emitted in earlier pools, they end up out of range.

    This makes the output of the testcase match what binutils gas does
    (prior to this patch, it would fail to assemble).

    Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 303540
2017-05-22 09:42:07 +00:00
James Molloy 5193c80830 Re-apply r286006: Fix 24560: assembler does not share constant pool for same constants
Re-applying now that the open bug on this commit, PR32825, is known to be fixed.

Original commit message:
    Summary: This patch returns the same label if the CP entry with the same value has been created.

    Reviewers: eli.friedman, rengolin, jmolloy

    Subscribers: majnemer, jmolloy, llvm-commits

    Differential Revision: https://reviews.llvm.org/D25804

llvm-svn: 303539
2017-05-22 09:42:01 +00:00
Strahinja Petrovic ab9573f37c [MIPS] Add support to match more patterns for DINS instruction
This patch adds support for recognizing patterns to match
DINS instruction.

Differential Revision: https://reviews.llvm.org/D31465

llvm-svn: 303537
2017-05-22 09:06:44 +00:00
James Molloy 5cc75ae8f9 Revert "[ARM] Clear the constant pool cache on explicit .ltorg directives"
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).

This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.

llvm-svn: 303536
2017-05-22 08:49:28 +00:00
James Molloy 5a9cf2e22d Revert "Fix 24560: assembler does not share constant pool for same constants"
This reverts commit r286006. It caused PR32825 and wasn't fixed.

llvm-svn: 303535
2017-05-22 08:42:47 +00:00
David Blaikie d2f3a941e0 libDebugInfo/DWARF: Apply relocations for debug_addr addresses in object files
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.

Fix that so object files can be symbolized the same as executables.

llvm-svn: 303532
2017-05-22 07:02:47 +00:00
Sanjoy Das 036dda25a5 [SCEV] Clarify behavior around max backedge taken count
This is a re-application of a r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).

This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303531
2017-05-22 06:46:04 +00:00
Craig Topper 2b1fc32f22 [InstCombine] Cleanup the interface for overflow checks
Summary:
Fix naming conventions and const correctness.
This completes the changes made in rL303029.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33377

llvm-svn: 303529
2017-05-22 06:25:31 +00:00
Craig Topper e777fed152 [SimplifyCFG] Prevent a few APInt copies on method calls that return const reference. NFCI
llvm-svn: 303523
2017-05-22 00:49:35 +00:00
Craig Topper aaef41f71b [KnownBits] Use isNegative/isNonNegative to shorten some code. NFC
llvm-svn: 303522
2017-05-22 00:49:33 +00:00
Daniel Berlin d130b6c27d NewGVN: Fix PR 33116, the memoryphi version of bug 32838.
llvm-svn: 303521
2017-05-21 23:41:58 +00:00
Daniel Berlin 0207cca8e0 NewGVN: Cleanup some repeated code using some templated helpers
llvm-svn: 303520
2017-05-21 23:41:56 +00:00
Daniel Berlin 0193997b7e NewGVN: Fix printing of simplified expression
llvm-svn: 303519
2017-05-21 23:41:53 +00:00
Davide Italiano 21a49dcdf1 [InstCombine] Take in account the size in sext->lshr->trunc patterns.
Otherwise we end up miscompiling, transforming:

define i8 @tinky() {
  %sext = sext i1 1 to i16
  %hibit = lshr i16 %sext, 15
  %tr = trunc i16 %hibit to i8
  ret i8 %tr
}

into:

  %sext = sext i1 1 to i8
  ret i8 %sext

and the first get folded to ret i8 1, while the second gets folded
to ret i8 -1.

Eventually we should get rid of this transform entirely, but for now,
this at least fixes a know correctness bug.

Differential Revision:  https://reviews.llvm.org/D33338

llvm-svn: 303513
2017-05-21 20:30:27 +00:00
Igor Breger 014fc566e7 [GlobalISel][X86] Fix G_TRUNC instruction selection.
Updated tests with -verify-machineinstrs flag.
It fixes 3 tests failed with machine verifier enabled and listed
in PR27481

llvm-svn: 303502
2017-05-21 11:13:56 +00:00
Hiroshi Inoue 37e63b1b21 Summary
PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass.
This patch improves this optimization to eliminate more compare instructions in two types of common case.


- comparison against a constant 1 or -1

The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant.
This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0".
With this patch, compare can be eliminated in the following code sequence, as an example.

uint64_t a, b;
if ((a | b) & 0x8000000000000000ull) { ... }
else { ... }


- andi for 32-bit comparison on PPC64

Since record-form instructions execute 64-bit signed comparison and so we have limitation in eliminating 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check.

%1 = and i32 %a, 10
%2 = icmp ne i32 %1, 0
br i1 %2, label %foo, label %bar

In this simple example, LLVM generates andi. + cmplwi + beq on PPC64.
This patch make it possible to eliminate the cmplwi for this case.
I added andi. for optimization targets if it is safe to do so.

Differential Revision: https://reviews.llvm.org/D30081

llvm-svn: 303500
2017-05-21 06:00:05 +00:00
Sanjoy Das 8963650cfa Revert "[SCEV] Clarify behavior around max backedge taken count"
This reverts commit r303497 since it breaks the msan bootstrap bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/

llvm-svn: 303498
2017-05-21 05:02:12 +00:00
Sanjoy Das 5207168383 [SCEV] Clarify behavior around max backedge taken count
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303497
2017-05-21 01:47:50 +00:00
Xin Tong 9fbfeefadf Revert "Add pthread_self function prototype and make it speculatable."
This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21.

Build breaking

llvm-svn: 303496
2017-05-21 00:37:55 +00:00
Xin Tong 75af3af957 Add pthread_self function prototype and make it speculatable.
Summary: This allows pthread_self to be pulled out of a loop by LICM.

Reviewers: hfinkel, arsenm, davide

Reviewed By: davide

Subscribers: davide, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D32782

llvm-svn: 303495
2017-05-20 22:40:25 +00:00
Martell Malone 36af8f4d42 COFF: Fix another StringRef return error
This should appease the lld build bot regression
Following up on rL303493

llvm-svn: 303494
2017-05-20 21:54:15 +00:00
Martell Malone d1a5d9eee5 COFF: Fix single StringRef return error
This should appease the lld build bot regression
Intrroduced by rL303490

llvm-svn: 303493
2017-05-20 21:00:36 +00:00
Martell Malone 375dc90ebf COFF: migrate def parser from LLD to LLVM [1/2]
This is split up into two commits.
The will create the DEF parser in LLVM.
Check the following commit to see the removal from LLD

Reviewers: ruiu

Differential Revision: https://reviews.llvm.org/D32689

llvm-svn: 303490
2017-05-20 19:56:29 +00:00
David Blaikie f1c3beecb2 Fix -Wunneeded-internal-declaration by removing constant arrays only used in sizeof expressions, in favor of constants containing the size directly
llvm-svn: 303483
2017-05-20 03:32:51 +00:00
David Blaikie 8d039d40c5 llvm-symbolizer: Support multiple CUs in a single DWO file
llvm-svn: 303482
2017-05-20 03:32:49 +00:00
Eric Beckmann a6bdf751a2 Add functionality to cvtres to parse all entries in res file.
Summary: Added the new modules in the Object/ folder.  Updated the
llvm-cvtres interface as well, and added additional tests.

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33180

llvm-svn: 303480
2017-05-20 01:49:19 +00:00
Matthias Braun 57fd12db0c Fix breakage after r303461
- Improve wchar_t size predicitions based on target triple.
- Be less strict in wchar_t size verifier.

llvm-svn: 303477
2017-05-20 01:28:52 +00:00
Davide Italiano 9a0f542db6 [NewGVN] Create a StoreExpression instead of a VariableExpression.
In the case where we have an operand defined by a lod of the
same memory location. Historically this was a VariableExpression
because we wanted to make sure they ended up in the same class,
but if we create the right expression, they end up in the same
class anyway.

Fixes PR32897. Thanks to Dan for the detailed discussion and the
fix suggestion.

llvm-svn: 303475
2017-05-20 00:46:54 +00:00
Davide Italiano 888965c8a2 [NewGVN] Get rid of an assertion.
This was here because we don't want to switch leaders too much,
in order to avoid fixpoint(ing) issue, but it's not sure if it
matters in practice.

A first step towards fixing PR32897.

llvm-svn: 303473
2017-05-20 00:24:04 +00:00
Adrian Prantl 981a799896 Revert "Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator.""
This reapplies commit r303438 modified to not verify cross-imported
bitcode in FunctionImporter.

rdar://problem/31233625

Differential Revision: https://reviews.llvm.org/D33370

llvm-svn: 303470
2017-05-20 00:00:08 +00:00
Adrian Prantl 660437975b Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator."
This reverts commit r303438 while deliberating buildbot breakage.

llvm-svn: 303467
2017-05-19 23:32:21 +00:00
Matthias Braun 50ec0b5dce SimplifyLibCalls: Optimize wcslen
Refactor the strlen optimization code to work for both strlen and wcslen.

This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.

This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.

Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.

The refactoring "accidentally" fixes http://llvm.org/PR32124

Differential Revision: https://reviews.llvm.org/D32839

llvm-svn: 303461
2017-05-19 22:37:09 +00:00
Matthias Braun 89f3bcf0b5 Verifier: Check wchar_size module flag.
Differential Revision: https://reviews.llvm.org/D32974

llvm-svn: 303460
2017-05-19 22:37:01 +00:00
Reid Kleckner bf6b3b1564 Fix off-by-one bug in AttributeList::addAttributes index handling
getParamAlignment expects an argument number, not an AttributeList
index.

Johan Englan, who works on LDC, found this bug and told me about it off
list.

llvm-svn: 303458
2017-05-19 22:23:47 +00:00
Galina Kistanova 78706a3dae Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303457
2017-05-19 21:08:28 +00:00
Evgeniy Stepanov 2acea2786b [safestack] Disable stack coloring by default.
Workaround for apparent miscompilation of PR32143.

llvm-svn: 303456
2017-05-19 20:58:48 +00:00
Galina Kistanova f525c76ba1 Added missing break.
llvm-svn: 303454
2017-05-19 20:31:51 +00:00
Daniel Berlin e021d2d629 NewGVN: Fix PR32838.
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argumnet?

llvm-svn: 303453
2017-05-19 20:22:20 +00:00
Zachary Turner 526f4f2aa8 Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

llvm-svn: 303446
2017-05-19 19:26:58 +00:00
Daniel Berlin b527b2cf13 Last of the major pieces to NewGVN - yay!
Summary:
NewGVN: Handle equivalence between phi of ops and op of phis.

This makes our GVN mostly-complete. It would be complete, modulo some
deliberate choices we make.  This means it detects roughly all herband
equivalences in polynomial time, including cases notoriously hard for
other GVN's to detect.  It also detects a very large swath of the
cases we currently rely on instcombine to detect that involve folding
upwards through phis.

Fixes PR 31125, 31463, PR 31868

Reviewers: davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32151

llvm-svn: 303444
2017-05-19 19:01:27 +00:00
Daniel Berlin ff15200b1d NewGVN: Get rid of most dominating leader check
llvm-svn: 303443
2017-05-19 19:01:24 +00:00
Daniel Berlin a5130bbd12 BasicAA: Uninserted instructions have no parent, and notDifferentParent explicitly allows for this case, but getParent crashes when handed one.
llvm-svn: 303442
2017-05-19 19:01:21 +00:00
Amaury Sechet 77cfb4a85f [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X)
Summary:
While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation.

As per discussion in D32756 .

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32916

llvm-svn: 303441
2017-05-19 18:20:44 +00:00
Anna Thomas ae3f752f36 [NFC][loopIdiom] Clang format change rL303434
llvm-svn: 303439
2017-05-19 18:00:30 +00:00
Adrian Prantl f9ab9bfc39 ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator.
rdar://problem/31233625

Differential Revision: https://reviews.llvm.org/D33151

llvm-svn: 303438
2017-05-19 17:55:02 +00:00
Anna Thomas 5ecb8f7593 [LoopIdiom] Refactor return value of isLegalStore [NFC]
Summary:

This NFC simply refactors the return value of LoopIdiomRecognize::isLegalStore() from bool to an enumeration, and
removes the return-through-parameter mechanism that the function was using. This function is constructed such that it will
only ever recognize a single store idiom (memset, memset_pattern, or memcpy), and never a combination of these. As such it
makes much more sense for the return value to be the single idiom that the store matches, rather than
having a separate argument-return for each idiom -- it's cleaner, and makes it clearer that
only a single idiom can be matched.

Patch by Daniel Neilson!

Reviewers: anna, sanjoy, davide, haicheng

Reviewed By: anna, haicheng

Subscribers: haicheng, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D33359

llvm-svn: 303434
2017-05-19 17:05:36 +00:00
Craig Topper 9c913bfd49 [InstSimplify] Fix 80 column violation. NFC
llvm-svn: 303433
2017-05-19 16:56:53 +00:00
Craig Topper 8885f933b2 [APInt] Add support for dividing or remainder by a uint64_t or int64_t.
Summary:
This patch adds udiv/sdiv/urem/srem/udivrem/sdivrem methods that can divide by a uint64_t. This makes division consistent with all the other arithmetic operations.

This modifies the interface of the divide helper method to work on raw arrays instead of APInts. This way we can pass the uint64_t in for the RHS without wrapping it in an APInt. This required moving all the Quotient and Remainder allocation handling up to the callers. For udiv/urem this was as simple as just creating the Quotient/Remainder with the right size when they were declared. For udivrem we have to rely on reallocate not changing the contents of the variable LHS or RHS is aliased with the Quotient or Remainder APInts. We also have to zero the upper bits of Remainder and Quotient that divide doesn't write to if lhsWords/rhsWords is smaller than the width.

I've update the toString method to use the new udivrem.

Reviewers: hans, dblaikie, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33310

llvm-svn: 303431
2017-05-19 16:43:54 +00:00
Dmitry Preobrazhensky ce941c9c38 [AMDGPU][MC] Corrected disassembler to decode instructions with 2 literals
See bug 32922: https://bugs.llvm.org//show_bug.cgi?id=32922

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D32912

llvm-svn: 303428
2017-05-19 14:27:52 +00:00
Artur Pilipenko a6c278049a [LoopPredication] NFC. Extract LoopICmp struct and parseLoopICmp helper
llvm-svn: 303427
2017-05-19 14:02:46 +00:00
Artur Pilipenko 6780ba65b9 [LoopPredication] NFC. Extract LoopPredication::expandCheck helper
llvm-svn: 303426
2017-05-19 14:00:58 +00:00
Artur Pilipenko aab28666bc [LoopPredication] NFC. Extract CanExpand helper lambda
llvm-svn: 303425
2017-05-19 14:00:04 +00:00
Artur Pilipenko 46c4e0a4bf [LoopPredication] NFC. Add an early exit if there is no guards in the loop
llvm-svn: 303424
2017-05-19 13:59:34 +00:00
Dmitry Preobrazhensky 9321e8fcec [AMDGPU][MC] Fixed bugs in export instruction
See Bugs 33019, 33056:
  https://bugs.llvm.org//show_bug.cgi?id=33019
  https://bugs.llvm.org//show_bug.cgi?id=33056

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33288

llvm-svn: 303423
2017-05-19 13:36:09 +00:00
Guy Blank 548e22a1a7 [X86][AVX512] Make i1 illegal in the CodeGen
This patch defines the i1 type as illegal in the X86 backend for AVX512.
For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended.
This should produce better scalar code for i1 types since GPRs will be used instead of mask registers.

Differential Revision: https://reviews.llvm.org/D32273

llvm-svn: 303421
2017-05-19 12:35:15 +00:00
Daniel Sanders a1b2db7919 [globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to per-function predicates.
Summary:
This causes them to be re-computed more often than necessary but resolves
objections that were raised post-commit on r301750.

Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls

Reviewed By: qcolombet

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32861

llvm-svn: 303418
2017-05-19 11:08:33 +00:00
Amara Emerson 4d33c86359 Fix vector pass-through value being unused in IRBuilder::CreateMaskedGather
Also s/0/nullptr in the call site in LV.

llvm-svn: 303416
2017-05-19 10:40:18 +00:00
Volkan Keles 6a36c64720 [GlobalISel] IRTranslator: Translate ConstantStruct
Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33317

llvm-svn: 303412
2017-05-19 09:47:02 +00:00
Zachary Turner 1dfcf8d92c Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

llvm-svn: 303409
2017-05-19 05:57:45 +00:00
Zachary Turner 47fdc73771 Don't crash if someone tries to visit an empty type stream.
llvm-svn: 303408
2017-05-19 05:18:09 +00:00
Zachary Turner 59ab6a3816 [CodeView] Reduce memory usage in TypeSerializer.
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table.  If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary.  At the cost of an extra hash
computation, check first if it exists, and only if it does do
we allocate and insert.

llvm-svn: 303407
2017-05-19 04:56:48 +00:00
Davide Italiano ee49f4943c [NewGVN] Delete the old store when we find congruent to a load.
(or non-store, more in general). Fixes PR33086. Caught by the
store verifier.

llvm-svn: 303406
2017-05-19 04:06:10 +00:00
Zachary Turner 8f1d87a79a Fix crasher in CodeView test.
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid.  But this exposed a real bug, which is that
we weren't looking in the right place.  So fix that, and also
fix this crash at the same time.

llvm-svn: 303397
2017-05-19 00:56:39 +00:00
Matthias Braun d6e75ed93e LiveIntervalAnalysis: Fix missing case in pruneSubRegValues()
pruneSubRegValues() needs to remove subregister ranges starting at
instructions that later get removed by eraseInstrs(). It missed to check
one case in which eraseInstrs() would remove an instruction.

Fixes http://llvm.org/PR32688

llvm-svn: 303396
2017-05-19 00:18:03 +00:00
Zachary Turner 613c29e45f Fix another warning.
llvm-svn: 303394
2017-05-18 23:30:51 +00:00
Davide Italiano eab0de2b82 [NewGVN] Break infinite recursion in singleReachablePHIPath().
We can have cycles between PHIs and this causes singleReachablePhi()
to call itself indefintely (until we run out of stack). The proper
solution would be that of computing SCCs, but it's not worth for
now, so just keep a visited set and give up when we find a cycle.
Thanks to Dan for the discussion/help with this.

Fixes PR33014.

llvm-svn: 303393
2017-05-18 23:22:44 +00:00
Zachary Turner 7b62d7ccc0 Fix some build errors and warnings.
llvm-svn: 303391
2017-05-18 23:12:42 +00:00
Zachary Turner b32ec02b80 [CodeView] Raise the source to ID map out of the TypeStreamMerger.
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.

llvm-svn: 303390
2017-05-18 23:04:08 +00:00
Zachary Turner 8fb441ab9c [llvm-pdbdump] Add the ability to merge PDBs.
Merging PDBs is a feature that will be used heavily by
the linker.  The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools.  This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump.  It takes arbitrarily
many PDBs and outputs a single PDB.

Using this new functionality, a test is added for merging
type records.  Future patches will add the ability to merge
symbol records, module information, etc.

llvm-svn: 303389
2017-05-18 23:03:41 +00:00
Zachary Turner 0c60f269fc [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Davide Italiano a76e5fa111 [NewGVN] Replace predicate info leftovers.
This time with an additional fix, i.e. we remove the dead
@llvm.ssa.copy instruction.

llvm-svn: 303385
2017-05-18 21:43:23 +00:00
Sanjay Patel 5e456b943a [InstCombine] add helper to foldXorOfICmps(); NFCI
Also, fix the old-style capitalization of the related functions
and move them to the 'private' section of the class since they
are just helpers of the visit* functions.

As shown in the post-commit comments for D32143, we are missing
folds for xor-of-icmps. 

llvm-svn: 303381
2017-05-18 20:53:16 +00:00
Rui Ueyama d66d976efd Revert r303375 "LLVM_FALLTHROUGH instead of fall-through comment."
This reverts commit r303375 since it didn't compile.

llvm-svn: 303377
2017-05-18 20:18:24 +00:00
Galina Kistanova 871417446b LLVM_FALLTHROUGH instead of fall-through comment.
llvm-svn: 303375
2017-05-18 20:01:52 +00:00
Hans Wennborg b00ffd8cb7 Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB."
This also reverts follow-ups r303292 and r303298.

It broke some Chromium tests under MSan, and apparently also internal
tests at Google.

llvm-svn: 303369
2017-05-18 18:50:05 +00:00
Galina Kistanova f8355ccb77 Reduce gcc-7 warnings by fall-through comments.
llvm-svn: 303365
2017-05-18 17:53:47 +00:00
Reid Kleckner 96ab8726a3 [IR] De-virtualize ~Value to save a vptr
Summary:
Implements PR889

Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:

https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal.  However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue.  I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.

I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.

Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv

Reviewed By: chandlerc

Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31261

llvm-svn: 303362
2017-05-18 17:24:10 +00:00
Wei Mi 8848c1e3c7 [LSR] Call canonicalize after we generate a new Formula in GenerateTruncates. Fix PR33077.
The testcase in PR33077 generates a LSR Use Formula with two SCEVAddRecExprs for the same
loop. Such uncommon formula will become non-canonical after GenerateTruncates adds sign
extension to the ScaledReg of the Formula, and it will break the assertion that every
Formula to be inserted is canonical.

The fix is to call canonicalize for the raw Formula generated by GenerateTruncates
before inserting it.

llvm-svn: 303361
2017-05-18 17:21:22 +00:00
Francis Visoiu Mistrih 8b61764cbb [LegacyPassManager] Remove TargetMachine constructors
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.

The patterns replaced here are:

* Passes handling a null TargetMachine call
  `getAnalysisIfAvailable<TargetPassConfig>`.

* Passes not handling a null TargetMachine
  `addRequired<TargetPassConfig>` and call
  `getAnalysis<TargetPassConfig>`.

* MachineFunctionPasses now use MF.getTarget().

* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.

This fixes a crash when running `llc -start-before prologepilog`.

PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.

Related to PR30324.

Differential Revision: https://reviews.llvm.org/D33222

llvm-svn: 303360
2017-05-18 17:21:13 +00:00
Zachary Turner 5a83fb153f Fix some minor issues in PDB parsing library.
1) Until now I'd never seen a valid PDB where the DBI stream and
   the PDB Stream disagreed on the "Age" field.  Because of that,
   we had code to assert that they matched.  Recently though I was
   given a PDB where they disagreed, so this assumption has proven
   to be incorrect.  Remove this check.

2) We were walking the entire list of hash values for types up front
   and then throwing away the values.  For large PDBs this was a
   significant slow down.  Remove this.

With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.

llvm-svn: 303351
2017-05-18 15:14:44 +00:00
Anna Thomas 7bca59152a [JumpThreading] Dont RAUW condition incorrectly
Summary:
We have a bug when RAUWing the condition if experimental.guard or assumes is a use of that
condition. This is because LazyValueInfo may have used the guards/assumes to identify the
value of the condition at the end of the block. RAUW replaces the uses
at the guard/assume as well as uses before the guard/assume. Both of
these are incorrect.
For now, disable RAUW for conditions and fix the logic as a next
step: https://reviews.llvm.org/D33257

Reviewers: sanjoy, reames, trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33279

llvm-svn: 303349
2017-05-18 13:12:18 +00:00
Sam Kolton ebfdaf7394 [AMDGPU] SDWA operands should not intersect with potential MIs
Summary:
There should be no intesection between SDWA operands and potential MIs. E.g.:
```
v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0
v_add_u32 v3, v4, v2
```
In that example it is possible that we would fold 2nd instruction into 3rd (v_add_u32_sdwa) and then try to fold 1st instruction into 2nd (that was already destroyed). So if SDWAOperand is also a potential MI then do not apply it.

Reviewers: vpykhtin, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D32804

llvm-svn: 303347
2017-05-18 12:12:03 +00:00
Guy Blank d19632fa16 [MVT] add v1i1 MVT
Adds the v1i1 MVT as a preparation for another commit (https://reviews.llvm.org/D32273)

Differential Revision: https://reviews.llvm.org/D32540

llvm-svn: 303346
2017-05-18 11:29:41 +00:00
Igor Breger 842b5b36ba [GlobalISel][X86] G_ADD/G_SUB vector legalizer/selector support.
Summary: G_ADD/G_SUB vector legalizer/selector support.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33232

llvm-svn: 303345
2017-05-18 11:10:56 +00:00
Simon Pilgrim 6bba6068be [X86][AVX512] Add 512-bit vector ctpop costs + tests
llvm-svn: 303342
2017-05-18 10:42:34 +00:00
Daniel Sanders 89e9308623 Re-commit: [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.

Depends on D32275

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32278

The previous commit failed on test-suite/Bitcode/simd_ops/AArch64_halide_runtime.bc
because isImmOperandEqual() assumed MO was a register operand and that's not
always true.

llvm-svn: 303341
2017-05-18 10:33:36 +00:00
Max Kazantsev 627ad0fec3 [SCEV][NFC] Remove duplication of isLoopInvariant code
Replace two places that duplicate the code of isLoopInvariant method with
the invocation of this method.

Differential Revision: https://reviews.llvm.org/D33313

llvm-svn: 303336
2017-05-18 08:26:41 +00:00
George Rimar 47f84b1a3c [DWARF] - Simplify RelocVisitor implementation.
We do not need to store relocation width field.
Patch removes relative code, that simplifies implementation.

Differential revision: https://reviews.llvm.org/D33274

llvm-svn: 303335
2017-05-18 08:25:11 +00:00
Lama Saba 2ea271b54a [X86] Replace slow LEA instructions in X86
According to Intel's Optimization Reference Manual for SNB+:
  " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
    dispatch via port 1:
  - LEA that has all three source operands: base, index, and offset
  - LEA that uses base and index registers where the base is EBP, RBP,or R13
  - LEA that uses RIP relative addressing mode
  - LEA that uses 16-bit addressing mode "
  This patch currently handles the first 2 cases only.
 
Differential Revision: https://reviews.llvm.org/D32277

llvm-svn: 303333
2017-05-18 08:11:50 +00:00
George Rimar f98b9ac5da [lib/Object] - Minor API update for llvm::Decompressor.
I revisited Decompressor API (issue with it was triggered during D32865 review)
and found it is probably provides more then we really need.

Issue was about next method's signature:

Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible because sticks to SmallVector.

During reviews was suggested to use templating to simplify code. Patch do that.

Differential revision: https://reviews.llvm.org/D33200

llvm-svn: 303331
2017-05-18 08:00:01 +00:00
Serguei Katkov ba831f78fd [BPI] Reduce the probability of unreachable edge to minimal value greater than 0
The probability of edge coming to unreachable block should be as low as possible.
The change reduces the probability to minimal value greater than zero.

The bug https://bugs.llvm.org/show_bug.cgi?id=32214 show the example when
the probability of edge coming to unreachable block is greater than for edge
coming to out of the loop and it causes incorrect loop rotation.

Please note that with this change the behavior of unreachable heuristic is a bit different
than others. Specifically, before this change the sum of probabilities
coming to unreachable blocks have the same weight for all branches
(it was just split over all edges of this block coming to unreachable blocks).
With this change it might be slightly different but not to much due to probability of
taken branch to unreachable block is really small.

Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith
Reviewed By: chandlerc, dexonsmith
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D30633

llvm-svn: 303327
2017-05-18 06:11:56 +00:00
Akira Hatanaka b10bff1183 [ThinLTO] Do not assert when adding a module with a different but
compatible target triple

Currently, an assertion fails in ThinLTOCodeGenerator::addModule when
the target triple of the module being added doesn't match that of the
one stored in TMBuilder. This patch relaxes the constraint and makes
changes to allow target triples that only differ in their version
numbers on Apple platforms, similarly to what r228999 did.

rdar://problem/30133904

Differential Revision: https://reviews.llvm.org/D33291

llvm-svn: 303326
2017-05-18 03:52:29 +00:00
Davide Italiano 9ae69a75ec [Target/X86] Remove unneeded return. NFCI.
llvm-svn: 303323
2017-05-18 02:36:42 +00:00
Craig Topper 8a950275f7 [Statistics] Add a method to atomically update a statistic that contains a maximum
Summary:
There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways:

  MaxNumFoo = std::max(MaxNumFoo, NumFoo);

or

  MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo;

The first version reads from MaxNumFoo one time and uncontionally rwrites to it. The second version possibly reads it twice depending on the result of the first compare.  But we have no way of knowing if the value was changed by another thread between the reads and the writes.

This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again.

This spawned from an audit I'm trying to do of all places we uses the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ

Reviewers: dberlin, chandlerc, hfinkel, dblaikie

Reviewed By: chandlerc

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D33301

llvm-svn: 303318
2017-05-18 00:51:39 +00:00
Kyle Butt 0cf5b2f88a CodeGen: BlockPlacement: Add Message strings to asserts. NFC
Add message strings to all the unlabeled asserts in the file.

Differential Revision: https://reviews.llvm.org/D33078

llvm-svn: 303316
2017-05-17 23:44:41 +00:00
Craig Topper 48187cffe2 [Statistics] Use Statistic::operator+= instead of adding and assigning separately.
I believe this technically fixes a multithreaded race condition in this code. But my primary concern was as part of looking at removing the ability to treat Statistics like a plain unsigned. There are many weird operations on Statistics in the codebase.

llvm-svn: 303314
2017-05-17 23:22:10 +00:00
Sanjay Patel ba212c241a [InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern
The missing optimization for xor-of-icmps still needs to be added, but by
being more efficient (not generating unnecessary logic ops with constants)
we avoid the bug.

See discussion in post-commit comments:
https://reviews.llvm.org/D32143

llvm-svn: 303312
2017-05-17 22:29:40 +00:00
Sanjay Patel e5747e3cbd [InstCombine] move icmp bool canonicalizations to helper; NFC
As noted in the post-commit comments in D32143, we should be
catching the constant operand cases sooner to be more efficient
and less likely to expose a missing fold.

llvm-svn: 303309
2017-05-17 22:15:07 +00:00
Matt Arsenault 2b1f9aa577 AMDGPU: Start defining a calling convention
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.

llvm-svn: 303308
2017-05-17 21:56:25 +00:00
Kyle Butt f6c61ef64d CodeGen: Power: Add lowering for shifts of v1i128.
When legalizing vector operations on vNi128, they will be split to v1i128
because that is a legal type on ppc64, but then the compiler will crash in
selection dag because it fails to select for these operations. This patch fixes
shift operations. Logical shift right and left shift can be performed in the
vector unit, but algebraic shift right requires being split.

Differential Revision: https://reviews.llvm.org/D32774

llvm-svn: 303307
2017-05-17 21:54:41 +00:00
Michael Liao ab12984634 Fix PR33028
- '-verify-mahcineinstrs' starts to complain allocatable live-in physical
  registers on non-entry or non-landing-pad basic blocks.
- Refactor the XBEGIN translation to define EAX on a dedicated fallback code
  path due to XABORT. Add a pseudo instruction to define EAX explicitly to
  avoid add physical register live-in.

Differential Revision: https://reviews.llvm.org/D33168

llvm-svn: 303306
2017-05-17 21:48:00 +00:00
Matt Arsenault 2525e4e4c2 AMDGPU: Expand frame indexes to be relative to scratch wave offset
In order for an arbitrary callee to access an object
in a caller's stack frame, the 32-bit offset used as
the private pointer needs to be relative to the kernel's
scratch wave offset register.

Convert to this by finding the difference from the current
stack frame and scaling by the wavefront size.

llvm-svn: 303303
2017-05-17 21:23:14 +00:00
Matt Arsenault 156d3ae0b6 AMDGPU: Change mubuf soffset register when SP relative
Check the MachinePointerInfo for whether the access is
supposed to be relative to the stack pointer.

No tests because this is used in later commits implementing
calls.

llvm-svn: 303301
2017-05-17 21:02:58 +00:00
Simon Pilgrim 23ef26728a [X86][AVX512] Add 512-bit vector ctlz costs + tests
llvm-svn: 303300
2017-05-17 21:02:18 +00:00
Bob Haarman de33a63784 [llvm-pdbdump] in yaml2pdb, generate default output filename if none given
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.

Reviewers: amccarth, zturner

Reviewed By: zturner

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D33296

llvm-svn: 303299
2017-05-17 20:46:48 +00:00
Zachary Turner d2b418bfe2 Add some helpers for manipulating BinaryStreamRefs.
llvm-svn: 303297
2017-05-17 20:42:52 +00:00
Matt Arsenault 98f2946ab3 AMDGPU: Make better use of op_sel with high components
Handle more general swizzles.

llvm-svn: 303296
2017-05-17 20:30:58 +00:00
Sanjay Patel e2787b9a35 [InstSimplify] handle all icmp i1 X, C in one place; NFCI
We already handled all of the new tests identically, but several
of those went through a lot of unnecessary processing before
getting folded.

Another motivation for grouping these cases together is that
InstCombine needs a similar fold. Currently, it handles the
'not' cases inefficiently which can lead to bugs as described
in the post-commit comments of:
https://reviews.llvm.org/D32143 

llvm-svn: 303295
2017-05-17 20:27:55 +00:00
Zachary Turner d9a626332e [BinaryStream] Reduce the amount of boiler plate needed to use.
Often you have an array and you just want to use it.  With the current
design, you have to first construct a `BinaryByteStream`, and then create
a `BinaryStreamRef` from it.  Worse, the `BinaryStreamRef` holds a pointer
to the `BinaryByteStream`, so you can't just create a temporary one to
appease the compiler, you have to actually hold onto both the `ArrayRef`
as well as the `BinaryByteStream` *AND* the `BinaryStreamReader` on top of
that.  This makes for very cumbersome code, often requiring one to store a
`BinaryByteStream` in a class just to circumvent this.

At the cost of some added complexity (not exposed to users, but internal
to the library), we can do better than this.  This patch allows us to
construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from
source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc).  Not only
does this reduce the amount of code you have to type and make it more
obvious how to use it, but it solves real lifetime issues when it's
inconvenient to hold onto a `BinaryByteStream` for a long time.

The additional complexity is in the form of an added layer of indirection.
Whereas before we simply stored a `BinaryStream*` in the ref, we now store
both a `BinaryStream*` **and** a `std::shared_ptr<BinaryStream>`.  When
the user wants to construct a `BinaryStreamRef` directly from an
`ArrayRef` etc, we allocate an internal object that holds ownership over a
`BinaryByteStream` and forwards all calls, and store this in the
`shared_ptr<>`.  This also maintains the ref semantics, as you can copy it
by value and references refer to the same underlying stream -- the one
being held in the object stored in the `shared_ptr`.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303294
2017-05-17 20:23:31 +00:00
Simon Pilgrim d0365967c4 [X86][AVX512] Add 512-bit vector cttz costs + tests
llvm-svn: 303293
2017-05-17 20:22:54 +00:00
Dehao Chen 02828a93e8 Only enable LiveRangeShrink for x86.
Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for archtectures with great register pressure.

Reviewers: MatzeB, qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, jyknight, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33294

llvm-svn: 303292
2017-05-17 20:18:13 +00:00
Matt Arsenault 786eeea23e AMDGPU: Try to use op_sel when selecting packed instructions
Avoids instructions to pack a vector when the source is really
a scalar being broadcast.

Also be smarter and look for per-component fneg.

Doesn't yet handle scalar from upper half of register
or other swizzles.

llvm-svn: 303291
2017-05-17 20:00:00 +00:00
Jacob Gravelle c63fb00f13 [WebAssembly][NFC] Update expected testsuite failures for newly passing tests
Summary: r303050 fixes crashes when calling scalarizeMaskedMemIntrin pass from WebAssembly backend. This updates expected test failures for that.

Reviewers: sbc100

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: https://reviews.llvm.org/D33295

llvm-svn: 303288
2017-05-17 19:45:22 +00:00
Matt Arsenault ea8a4ed588 AMDGPU: Use appropriate soffset for spilling
This needs to be the frame offset register, and not the global
scratch wave offset register. For kernels, these are the same.

llvm-svn: 303287
2017-05-17 19:37:57 +00:00
Dimitry Andric ebc8779301 Revert r303015, because it has the unintended side effect of breaking
driver-mode recognition in clang (this is because the sysctl method
always returns one and only one executable path, even for an executable
with multiple links):

Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD

Summary:

After rL301562, on FreeBSD the DynamicLibrary unittests fail, because
the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since
the path does not contain any slashes, retrieving the main executable
will not work.

Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3),
which is more reliable than fiddling with relative or absolute paths.

Also add retrieval of the original argv[] from the GoogleTest framework,
to use as a fallback for other OSes.

Reviewers: emaste, marsupial, hans, krytarowski

Reviewed By: krytarowski

Subscribers: krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33171

llvm-svn: 303285
2017-05-17 19:33:10 +00:00
Matt Arsenault ee324ffc1f AMDGPU: Fix min3/max3 combines for f16/i16
Fix missing instruction definitions for min3/max3.

llvm-svn: 303284
2017-05-17 19:25:06 +00:00
Simon Pilgrim a9a92a1a6a [X86][AVX512] Add 512-bit vector bitreverse costs + tests
llvm-svn: 303283
2017-05-17 19:20:20 +00:00
Reid Kleckner 710c1cebb4 Re-land r303274: "[CrashRecovery] Use SEH __try instead of VEH when available"
We have to check gCrashRecoveryEnabled before using __try.

In other words, SEH works too well and we ended up recovering from
crashes in implicit module builds that we weren't supposed to. Only
libclang is supposed to enable CrashRecoveryContext to allow implicit
module builds to crash.

llvm-svn: 303279
2017-05-17 18:16:17 +00:00
Aditya Nandakumar be92993710 [GISel]: Fix undefined behavior in IRTranslator
Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't
outlive the DILocation. Clear it at the end of
IRTranslator::runOnMachineFunction

llvm-svn: 303277
2017-05-17 17:41:55 +00:00
Reid Kleckner 6f6f7d19f0 Revert "[CrashRecovery] Use SEH __try instead of VEH when available"
This reverts commit r303274, it appears to break some clang tests.

llvm-svn: 303275
2017-05-17 17:15:00 +00:00
Reid Kleckner 91fea018ee [CrashRecovery] Use SEH __try instead of VEH when available
Summary:
It avoids problems when other libraries raise exceptions. In particular,
OutputDebugString raises an exception that the debugger is supposed to
catch and suppress. VEH kicks in first right now, and that is entirely
incorrect.

Unfortunately, GCC does not support SEH, so I've kept the old buggy VEH
codepath around. We could fix it with SetUnhandledExceptionFilter, but
that is not per-thread, so a well-behaved library shouldn't set it.

Reviewers: zturner

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33261

llvm-svn: 303274
2017-05-17 17:02:16 +00:00
Zachary Turner 0daa7074bf Workaround for incorrect Win32 header on GCC.
llvm-svn: 303272
2017-05-17 16:39:33 +00:00
Zachary Turner 1d795c451e [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

llvm-svn: 303271
2017-05-17 16:39:06 +00:00
Sanjay Patel b2e7003103 [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI
There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code.

The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring
should make it easier if we want to change any of that behavior.

1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to 
   rL32751 from what I can tell, but I'm not sure if there's a justification for that rule.
2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches 
   for selects?
3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. 
   Should one or both be changed? 

No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates.

Differential Revision: https://reviews.llvm.org/D33247

llvm-svn: 303261
2017-05-17 14:21:19 +00:00
Krzysztof Parzyszek 2b0533126e [PPC] Properly update register save area offsets
The variables MinGPR/MinG8R were not updated properly when resetting the
offsets, which in the included testcase lead to saving the CR register
in the same location as R30.

This fixes another issue reported in PR26519.

Differential Revision: https://reviews.llvm.org/D33017

llvm-svn: 303257
2017-05-17 13:25:09 +00:00
Igor Breger 28f290fab8 [GlobalISel][X86] Support add i64 in IA32.
Summary: support G_UADDE instruction selection.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D33096

llvm-svn: 303255
2017-05-17 12:48:08 +00:00
Jonas Paulsson 8722ade770 [SystemZ] Modelling of costs of divisions with a constant power of 2.
Such divisions will eventually be implemented with shifts which should
be reflected in the cost function.

Review: Ulrich Weigand
llvm-svn: 303254
2017-05-17 12:46:26 +00:00
Diana Picus eafa4aa910 Reland r303247: [ARM] GlobalISel: Remove dead instruction selection code
It only failed on llvm-clang-x86_64-expensive-checks-win, probably
because the TableGen stuff hasn't been regenerated.
Requires a clean build.

llvm-svn: 303252
2017-05-17 12:42:52 +00:00
George Rimar fed9f09f48 [DWARF] - Cleanup relocations proccessing.
RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), 
and width field was never used in code.

Relocations proccessing loop had checks for width field. Does not look like DWARF parser
should do that. There is probably no much sense to validate relocations during proccessing 
them in parser.

Patch removes relocation's width relative code from DWARFContext.

Differential revision: https://reviews.llvm.org/D33194

llvm-svn: 303251
2017-05-17 12:10:51 +00:00
Diana Picus 36e4ba0f6e Revert "[ARM] GlobalISel: Remove dead instruction selection code"
This reverts commit r303247 because the tests are failing on some bots.
Sorry!

llvm-svn: 303249
2017-05-17 11:56:07 +00:00
Diana Picus 68d21c864e [ARM] GlobalISel: Remove dead instruction selection code
We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove
the hand-written versions.

llvm-svn: 303247
2017-05-17 11:39:26 +00:00
Daniel Cederman 4af795b499 [Sparc] Remove execute permissions from non-executable text files
Reviewers: jyknight, lero_chris, venkatra

Reviewed By: jyknight

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27127

llvm-svn: 303245
2017-05-17 11:05:20 +00:00
Pavel Labath 859d302349 [RuntimeDyld] Fix debug section relocation (pr20457)
Summary:
Debug info sections, (or non-SHF_ALLOC sections in general) should be
linked as if their load address was zero to emulate the behavior of the
static linker.

This bug was discovered because it was breaking lldb expression evaluation on
linux.

Reviewers: lhames

Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D32899

llvm-svn: 303239
2017-05-17 08:47:28 +00:00
Jonas Paulsson 0f8678016f Make sure -optimize-regalloc=false is used correctly by user.
Don't allow -optimize-regalloc=false with -regalloc given for anything other
than 'fast'. The other register allocators depend on the supporting passes
added by addOptimizedRegAlloc().

Reviewers: Quentin Colombet, Matthias Braun
https://reviews.llvm.org/D33181

llvm-svn: 303238
2017-05-17 07:36:03 +00:00
Max Kazantsev 4c7f293d24 [SCEV] Always sort AddRecExprs from different loops by dominance
Sorting of AddRecExprs by loop nesting does not make sense since we only invoke
the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This
guarantees that there is always a dominance relationship between them. This
patch removes the sorting by nesting which is a dead code in current usage of
this function.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D33228

llvm-svn: 303235
2017-05-17 04:09:14 +00:00
Max Kazantsev b67d344850 [SCEV][NFC] Replace redundant dyn_cast with cast in getAddExpr
Replace dyn_cast which is ensured by isa just one line above with cast.

Differential Revision: https://reviews.llvm.org/D33231

llvm-svn: 303234
2017-05-17 03:58:42 +00:00
Gor Nishanov db38485588 [coroutines] Handle spills before catchswitch
If we need to spill the result of the PHI instruction, we insert the spill after
all of the PHIs and EHPads, however, in a catchswitch block there is no
room to insert the spill. Make room by splitting away catchswitch into a separate
block.

Before the fix:

    catch.dispatch:
       %val = phi i32 [ 1, %if.then ], [ 2, %if.else ]
       %switch = catchswitch within none [label %catch] unwind label %cleanuppad

After:

    catch.dispatch:
       %val = phi i32 [ 1, %if.then ], [ 2, %if.else ]
       %tok = cleanuppad within none []
       ; spill goes here
       cleanupret from %tok unwind label %catch.dispatch.switch
    catch.dispatch.switch:
       %switch = catchswitch within none [label %catch] unwind label %cleanuppad

https://reviews.llvm.org/D31846

llvm-svn: 303232
2017-05-17 03:09:22 +00:00
Francis Visoiu Mistrih b52e036600 BitVector: add iterators for set bits
Differential revision: https://reviews.llvm.org/D32060

llvm-svn: 303227
2017-05-17 01:07:53 +00:00
Eugene Zelenko a369a45746 [ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 303221
2017-05-16 23:10:25 +00:00
Zachary Turner c9c39291c7 Fix for compilers with older CRT header libraries.
llvm-svn: 303220
2017-05-16 22:59:34 +00:00
Zachary Turner 13e87f43d9 [Support] Ignore OutputDebugString exceptions in our crash recovery.
Since we use AddVectoredExceptionHandler, we get notified of
every exception that gets raised by a program.  Sometimes these
are not necessarily errors though, and this can be especially
true when linking against a library that we have no control
over, and may raise an exception internally which it intends
to catch.

In particular, the Windows API OutputDebugString does exactly
this.  It raises an exception inside of a __try / __except,
giving the debugger a chance to handle the exception to print
the message to the debug console.

But this doesn't interoperate nicely with our vectored exception
handler, which just sees another exception and decides that we
need to terminate the program.

Add a special case for this so that we ignore ODS exceptions
and continue normally.

Note that a better fix is to simply not use vectored exception
handlers and use SEH instead, but given that MinGW doesn't support
SEH, this is the only solution for MinGW.

Differential Revision: https://reviews.llvm.org/D33260

llvm-svn: 303219
2017-05-16 22:50:32 +00:00
Davide Italiano 79eb3b0366 [IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity.
llvm-svn: 303218
2017-05-16 22:38:40 +00:00
Sanjay Patel 877364ff99 [InstSimplify] add folds for constant mask of value shifted by constant
We would eventually catch these via demanded bits and computing known bits in InstCombine,
but I think it's better to handle the simple cases as soon as possible as a matter of efficiency.

This fold allows further simplifications based on distributed ops transforms. eg:
  %a = lshr i8 %x, 7
  %b = or i8 %a, 2
  %c = and i8 %b, 1

InstSimplify can directly fold this now:
  %a = lshr i8 %x, 7

Differential Revision: https://reviews.llvm.org/D33221

llvm-svn: 303213
2017-05-16 21:51:04 +00:00
Evgeny Stupachenko cc19560253 The patch exclude a case from zero check skip in
CTLZ idiom recognition (r303102).

Summary:

The following case:
i = 1;
if(n)
  while (n >>= 1)
    i++;
use(i);

Was converted to:

i = 1;
if(n)
  i += builtin_ctlz(n >> 1, false);
use(i);

Which is not correct. The patch make it:

i = 1;
if(n)
  i += builtin_ctlz(n >> 1, true);
use(i);

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303212
2017-05-16 21:44:59 +00:00
Amara Emerson c9916d7e97 Re-commit r302678, fixing PR33053.
The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions
which didn't have a lowering.

llvm-svn: 303211
2017-05-16 21:29:22 +00:00
Easwaran Raman 3cd1479c3f [Inliner] Do not mix callsite and callee hotness based updates.
Update threshold based on callee's hotness only when BFI is not available.
Otherwise use only callsite's hotness. This makes it easier to reason about
hotness related threshold updates.

Differential revision: https://reviews.llvm.org/D33157

llvm-svn: 303210
2017-05-16 21:18:09 +00:00
Tim Shen 3bef27cc6f [PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.
Summary:
This fixes pr32392.

The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.

The reason why expandPostRAPseudo is chosen is because previous passes
are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne-
7, .+4 (some branch pass(s)).

Differential Revision: https://reviews.llvm.org/D32763

llvm-svn: 303205
2017-05-16 20:18:06 +00:00
Easwaran Raman dadc0f11ad Add hasProfileSummary and has{Sample|Instrumentation}Profile methods
ProfileSummaryInfo already checks whether the module has sample profile
in determining profile counts. This will also be useful in inliner to
clean up threshold updates.

llvm-svn: 303204
2017-05-16 20:14:39 +00:00
Dmitry Mikulin fce148c568 In debug builds non-trivial amount of time is spent in InstCombine processing
@llvm.dbg.* calls in visitCallInst(). They can be safely ignored.

llvm-svn: 303202
2017-05-16 20:08:49 +00:00
Daniel Berlin 6c66e9a22a NewGVN: Only do something in verifyStoreExpressions if assertions are enabled, to avoid unused code warnings.
llvm-svn: 303201
2017-05-16 20:02:45 +00:00
Daniel Berlin 4540357240 NewGVN: Fix PR 33051 by making sure we remove old store expressions
from the ExpressionToClass mapping.

llvm-svn: 303200
2017-05-16 19:58:47 +00:00
Reid Kleckner 0ad69fc89f Revert "[X86] Replace slow LEA instructions in X86"
This reverts commit r303183, it broke various buildbots and introduced
sanitizer errors.

llvm-svn: 303199
2017-05-16 19:55:03 +00:00
Nirav Dave da8f221273 Elide stores which are overwritten without being observed.
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.

Test notes:

* Many testcases overwrite store addresses multiple times and needed
  minor changes, mainly making stores volatile to prevent the
  optimization from optimizing the test away.

* Many X86 test cases optimized out instructions associated with
  associated with va_start.

* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
  dependencies to check and can probably be removed and potentially
  replaced with another test.

Reviewers: rnk, john.brawn

Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33206

llvm-svn: 303198
2017-05-16 19:43:56 +00:00
Matthias Braun d625bedb40 ShrinkWrap: Add skipFunction() call
ShrinkWrapping is a performance optimization that can safely be skipped,
so we can add `if (!skipFunction()) return;`

llvm-svn: 303197
2017-05-16 18:43:30 +00:00
Davide Italiano 56a08b40d2 [MetadataLoader] Remove unused Vector. NFCI.
llvm-svn: 303196
2017-05-16 18:41:46 +00:00
Renato Golin d69570e017 Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
Revert "[ARM] Mark LEApcrel as not having side effects"

This reverts commit r303054 and r303053, as they broke the ARM
self-hosting buildbots:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845

Offline investigation on course.

llvm-svn: 303193
2017-05-16 17:59:07 +00:00
Stanislav Mekhanoshin acca0f5c02 [AMDGPU] Use GCNRPTracker dumper methods in scheduler
Differential Revision: https://reviews.llvm.org/D33244

llvm-svn: 303186
2017-05-16 16:31:45 +00:00
Stanislav Mekhanoshin b10860788f [AMDGPU] Cache live-ins and register pressure in scheduler
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:

1. Caches the info for the second stage when we schedule with
   decreased target occupancy.
2. Tracks the basic block from top to bottom thus eliminating the
   need to scan whole register file liveness at every region split
   in the middle of the block.

The scheduling is now done in 3 stages instead of two, with the first
one being really a no-op and only used to collect scheduling regions
as sent by the scheduler driver.

There is no functional change to the current behavior, only compilation
speed is affected. In general computeBlockPressure() could be simplified
if we switch to backward RP tracker, because scheduler sends regions
within a block starting from the last upward. We could use a natural
order of upward tracker to seamlessly change between regions of the same
block, since live reg set of a previous tracked region would become a
live-out of the next region. That however requires fixing upward tracker
to properly account defs and uses of the same instruction as both are
contributing to the current pressure. When we converge on the produced
pressure we should be able to switch between them back and forth. In
addition, backward tracker is less expensive as it uses LIS in recede
less often than forward uses it in advance.

At the moment the worst known case compilation time has improved from 26
minutes to 8.5.

Differential Revision: https://reviews.llvm.org/D33117

llvm-svn: 303184
2017-05-16 16:11:26 +00:00
Lama Saba 52e892577d [X86] Replace slow LEA instructions in X86
According to Intel's Optimization Reference Manual for SNB+:
  " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
    dispatch via port 1:
  - LEA that has all three source operands: base, index, and offset
  - LEA that uses base and index registers where the base is EBP, RBP,or R13
  - LEA that uses RIP relative addressing mode
  - LEA that uses 16-bit addressing mode "
  This patch currently handles the first 2 cases only.
 
Differential Revision: https://reviews.llvm.org/D32277

llvm-svn: 303183
2017-05-16 16:01:36 +00:00
Matthew Simpson af60af1ed5 Revert 303174, 303176, and 303178
These commits are breaking the bots. Reverting to investigate.

llvm-svn: 303182
2017-05-16 15:50:30 +00:00
Nirav Dave cfd357a61a [DAG] Prune deleted nodes in TokenFactor
Fix visitTokenFactor to correctly remove deleted nodes. NFC.

llvm-svn: 303181
2017-05-16 15:49:02 +00:00
Stanislav Mekhanoshin 464cecf81e [AMDGPU] Turn register pressure estimation into forward tracker
This factors register pressure estimation mechanism from the
GCNSchedStrategy into the forward tracker to unify interface
with other strategies and expose it to other interested phases.

Differential Revision: https://reviews.llvm.org/D33105

llvm-svn: 303179
2017-05-16 15:43:52 +00:00
Matthew Simpson b7b5d55c38 [LV] Avoid potentential division by zero when selecting IC
llvm-svn: 303174
2017-05-16 14:43:55 +00:00
Gor Nishanov 23453c11ff [coroutines] Handle unwind edge splitting
Summary:
RewritePHIs algorithm used in building of CoroFrame inserts a placeholder
```
%placeholder = phi [%val]
```
on every edge leading to a block starting with PHI node with multiple incoming edges,
so that if one of the incoming values was spilled and need to be reloaded, we have a
place to insert a reload. We use SplitEdge helper function to split the incoming edge.

SplitEdge function does not deal with unwind edges comping into a block with an EHPad.

This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge.

For landing pads, we clone the landing pad into every edge block and replace the original
landing pad with a PHI collection the values from all incoming landing pads.

For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleapret in the
edge blocks.

Reviewers: majnemer, rnk

Reviewed By: majnemer

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D31845

llvm-svn: 303172
2017-05-16 14:11:39 +00:00
George Rimar 41e656768d [DWARF] - Add RelocAddrEntry for cleanup. NFCi.
Was mentioned as possible cleanup during review of D33184.

llvm-svn: 303171
2017-05-16 14:05:45 +00:00
Chad Rosier 8b12a03215 Fix an improperly placed curly bracket. NFC.
llvm-svn: 303165
2017-05-16 12:43:23 +00:00
George Rimar 4671f2e08c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.

Suggested during review of D33184.

llvm-svn: 303163
2017-05-16 12:30:59 +00:00
George Rimar 3824cca7b3 Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector."
Something went wrong, it broke BB.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c

llvm-svn: 303162
2017-05-16 12:05:03 +00:00
George Rimar 8680b6ee9c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Suggested during review of D33184.

llvm-svn: 303159
2017-05-16 11:54:19 +00:00
James Henderson 852f6fde01 [LTO] Print time-passes information at conclusion of LTO codegen
The information collected when requested by -time-passes is only printed when
llvm_shutdown is called at the moment. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and thin LTO to explicitly print and
reset the timing information.

I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.

Reviewed by: mehdi_amini

Differential Revision: https://reviews.llvm.org/D32803

llvm-svn: 303152
2017-05-16 09:43:21 +00:00
Max Kazantsev b09b5db793 [SCEV] Fix sorting order for AddRecExprs
The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:

for (...) { // loop1
  op1 = {A,+,B}
}
for (...) { // loop2
  op2 = {A,+,B}
  S = add op1, op2
}

In this case there is no guarantee that in operand list of S the op2 comes
before op1 (loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.

This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrency
in the operands order, it will be the bottom-most in terms of domination tree.
The attached test set includes some tests that produce incorrect SCEV
estimations and crashes with oldlogic.

Reviewers: sanjoy, reames, apilipenko, anna

Reviewed By: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33121

llvm-svn: 303148
2017-05-16 07:27:06 +00:00
Craig Topper 064adc6bfa [CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC
llvm-svn: 303147
2017-05-16 07:05:38 +00:00
Daniel Berlin 629e1ff6e6 NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created
llvm-svn: 303144
2017-05-16 06:06:15 +00:00
Daniel Berlin abd632dfeb NewGVN: Formatting fixes
llvm-svn: 303143
2017-05-16 06:06:12 +00:00
Davide Italiano a641842845 Revert "[NewGVN] Replace predicate info leftovers."
It's breaking the bots.

llvm-svn: 303142
2017-05-16 05:51:21 +00:00
Davide Italiano 331058fcc4 [NewGVN] Replace predicate info leftovers.
Fixes PR32945.

Differential Revision:  https://reviews.llvm.org/D33226

llvm-svn: 303141
2017-05-16 05:23:23 +00:00
NAKAMURA Takumi 994a43d27a AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable]
llvm-svn: 303137
2017-05-16 04:01:23 +00:00
Peter Collingbourne 6f0ecca3b5 IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.

Differential Revision: https://reviews.llvm.org/D33162

llvm-svn: 303134
2017-05-16 00:39:01 +00:00
Francis Visoiu Mistrih ebbc7159e9 [ShrinkWrapping] Handle restores on no-return paths
Shrink-wrapping uses post-dominators to find a restore point that
post-dominates all the uses of CSR / stack.

The way dominator trees are modeled in LLVM today is that unreachable
blocks are not present in a generic dominator tree, so, an unreachable node is
dominated by anything: include/llvm/Support/GenericDomTree.h:467.

Since for post-dominators, a no-return block is considered
"unreachable", calling findNearestCommonDominator on an unreachable node
A and a non-unreachable node B, will return B, which can be false. If we
find such node, we bail out since there is no good restore point
available.

rdar://problem/30186931

llvm-svn: 303130
2017-05-15 23:13:35 +00:00
Kostya Serebryany cf50d43be9 [libFuzzer] fix tests on Windows
llvm-svn: 303128
2017-05-15 22:55:00 +00:00
Xinliang David Li 8726d91d29 Fix memory leak
llvm-svn: 303126
2017-05-15 22:43:52 +00:00
Kostya Serebryany 87813b1bf8 [libFuzzer] improve the afl driver and it's tests. Make it possible to run individual inputs with afl driver
llvm-svn: 303125
2017-05-15 22:38:29 +00:00
Davide Italiano 60d36c7506 [AMDGPU] Kill now unused phiInfoElementGetDebugLoc(). NFCI.
llvm-svn: 303122
2017-05-15 22:10:15 +00:00
Craig Topper 6a1d02024e [APInt] Simplify a for loop initialization based on the fact that 'n' is known to be 1 by an earlier 'if'.
llvm-svn: 303120
2017-05-15 22:01:03 +00:00
Eugene Zelenko d761e2c264 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 303119
2017-05-15 21:57:41 +00:00
Tim Northover 203c6f055d AArch64: use linker-private symbols for globals in MachO.
We don't use section-relative relocations on AArch64, so all symbols must be at
least visible to the linker (i.e. properly global or l_whatever, but not
L_whatever).

llvm-svn: 303118
2017-05-15 21:51:38 +00:00
David Blaikie 441cfee780 PR32288: Describe a bool parameter's DWARF location with a simple register
There's no need (& a bit incorrect) to mask off the high bits of the
register reference when describing a simple bool value.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D31062

llvm-svn: 303117
2017-05-15 21:34:01 +00:00
Adam Nemet e29686e5c1 [SLP] Enable 64-bit wide vectorization on AArch64
ARM Neon has native support for half-sized vector registers (64 bits).  This
is beneficial for example for 2D and 3D graphics.  This patch adds the option
to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer.

*** Performance Analysis

This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.

The results are with -O3 and PGO.  A negative percentage is an improvement.
The testsuite was run with a sample size of 4.

** SPEC

* CFP2006/482.sphinx3  -3.34%

A pretty hot loop is SLP vectorized resulting in nice instruction reduction.
This used to be a +22% regression before rL299482.

* CFP2000/177.mesa     -3.34%
* CINT2000/256.bzip2   +6.97%

My current plan is to extend the fix in rL299482 to i16 which brings the
regression down to +2.5%.  There are also other problems with the codegen in
this loop so there is further room for improvement.

** LLVM testsuite

* SingleSource/Benchmarks/Misc/ReedSolomon               -10.75%

There are multiple small SLP vectorizations outside the hot code.  It's a bit
surprising that it adds up to 10%.  Some of this may be code-layout noise.

* MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40%

The opt-viewer screenshot can be seen at F3218284.  We start at a colder store
but the tree leads us into the hottest loop.

* MultiSource/Applications/lambda-0.1.3/lambda            -2.68%
* MultiSource/Benchmarks/Bullet/bullet                    -2.18%

This is using 3D vectors.

* SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67%

Noise, binary is unchanged.

* MultiSource/Benchmarks/Ptrdist/anagram/anagram          +4.90%

There is an additional SLP in the cold code.  The test runs for ~1sec and
prints out over 2000 lines. This is most likely noise.

* MultiSource/Applications/aha/aha                        +1.63%
* MultiSource/Applications/JM/lencod/lencod               +1.41%
* SingleSource/Benchmarks/Misc/richards_benchmark         +1.15%

Differential Revision: https://reviews.llvm.org/D31965

llvm-svn: 303116
2017-05-15 21:15:01 +00:00
Hans Wennborg bd6e9e77a7 Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.

Original commit message:

> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 303115
2017-05-15 20:59:32 +00:00
Evgeniy Stepanov b56012b548 [asan] Better workaround for gold PR19002.
See the comment for more details. Test in a follow-up CFE commit.

llvm-svn: 303113
2017-05-15 20:43:42 +00:00
Jan Sjodin a06bfe054e Re-submit AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303111
2017-05-15 20:18:37 +00:00
Tim Northover 8b96c7e9b5 AArch64: diagnose unrecognized features in .cpu directive.
We were silently ignoring any features we couldn't match up, which led to
errors in an inline asm block missing the conventional "\n\t".

llvm-svn: 303108
2017-05-15 19:42:15 +00:00
Davide Italiano cff8a34716 [NewGVN] Remove unused setDefiningExpr(). NFCI.
llvm-svn: 303107
2017-05-15 19:35:40 +00:00
Sanjay Patel 878715f978 [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949)
This is the InstCombine counterpart to D32954. 
I added some comments about the code duplication in:
rL302436

Alive-based verification:
http://rise4fun.com/Alive/dPw

This is a 2nd fix for the problem reported in:
https://bugs.llvm.org/show_bug.cgi?id=32949

Differential Revision: https://reviews.llvm.org/D32970

llvm-svn: 303105
2017-05-15 19:27:53 +00:00
Sanjay Patel a23b141cd2 [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)
These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving:
https://bugs.llvm.org/show_bug.cgi?id=9343

As shown here:
http://rise4fun.com/Alive/C8
...however, the sdiv exact case needs a stronger predicate.

I opted for duplicated code instead of adding another fallthrough because I think that's 
easier to read (and edit in case we need/want to restrict/loosen the predicates any more).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=32949
https://bugs.llvm.org/show_bug.cgi?id=32948

Differential Revision: https://reviews.llvm.org/D32954

llvm-svn: 303104
2017-05-15 19:16:49 +00:00
Evgeny Stupachenko 2fecd38ab8 The patch adds CTLZ idiom recognition.
Summary:

The following loops should be recognized:
i = 0;
while (n) {
  n = n >> 1;
  i++;
  body();
}
use(i);

And replaced with builtin_ctlz(n) if body() is empty or
for CPUs that have CTLZ instruction converted to countable:

for (j = 0; j < builtin_ctlz(n); j++) {
  n = n >> 1;
  i++;
  body();
}
use(builtin_ctlz(n));

Reviewers: rengolin, joerg

Differential Revision: http://reviews.llvm.org/D32605

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303102
2017-05-15 19:08:56 +00:00
Davide Italiano 6e7a212748 [NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency().
verifyMemoryCongruency() filters out trivially dead MemoryDef(s),
as we find them immediately dead, before moving from TOP to a new
congruence class.
This fixes the same problem for PHI(s) skipping MemoryPhis if all
the operands are dead.

Differential Revision:  https://reviews.llvm.org/D33044

llvm-svn: 303100
2017-05-15 18:50:53 +00:00
Geoff Berry e369653bf3 [AArch64][Falkor] Fix sched details for FMOV
llvm-svn: 303099
2017-05-15 18:50:22 +00:00
Jan Sjodin 0e289822fa Revert 303091.
llvm-svn: 303098
2017-05-15 18:39:47 +00:00
Teresa Johnson 41db92f9ae Add support for handling ifuncs to GlobalValue::getBaseObject
Summary:
All GlobalIndirectSymbol types (not just GlobalAlias) should return
their base object.

Without this patch LTO would warn "Unable to determine comdat of
alias!" for an ifunc.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33202

llvm-svn: 303096
2017-05-15 18:28:29 +00:00
Craig Topper 716cad8bb7 [SCEV] Use copy initialization of APInts instead of direct initialization.
This is based on post commit feed back from r302769.

llvm-svn: 303092
2017-05-15 18:14:16 +00:00
Jan Sjodin e9d2ddc9dd Add AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303091
2017-05-15 18:13:56 +00:00
Sanjay Patel 941e8dfcbf [InstCombine] use m_OneUse to reduce code; NFCI
llvm-svn: 303090
2017-05-15 18:08:17 +00:00
Kostya Serebryany e8a49b3850 [libFuzzer] fix a warning from Wunreachable-code-loop-increment reported by Christian Holler. This also fixes a logical bug, which however does not affect the libFuzzer's ability too much (I wasn't able to create a differentiating test)
llvm-svn: 303087
2017-05-15 17:39:42 +00:00
Kyle Butt 7d531daece CodeGen: BlockPlacement: Increase tail duplication size for O3.
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.

Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon

2% slowdown in lpbench. Seems related, but couldn't completely diagnose.

Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.

Differential-Revision: https://reviews.llvm.org/D32324
llvm-svn: 303084
2017-05-15 17:30:47 +00:00
Simon Pilgrim 55ff57861a [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)
Follow up to D33147

NVPTXTargetLowering::LowerCall was trusting the default argument values.

Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146.

Differential Revision: https://reviews.llvm.org/D33189

llvm-svn: 303082
2017-05-15 17:17:44 +00:00
Florian Hahn af91e7e6d2 [AArch64] Enable FeatureFuseAES on Cortex-A72.
This patch enables fusing dependent AESE/AESMC and AESD/AESIMC
instruction pairs on Cortex-A72, as recommended in the Software
Optimization Guide, section 4.10.

llvm-svn: 303073
2017-05-15 15:15:22 +00:00
Dmitry Preobrazhensky 167f8b69e3 [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33123

llvm-svn: 303070
2017-05-15 14:28:23 +00:00
Dmitry Preobrazhensky 03852a9dca [AMDGPU][MC] Removed V_MQSAD_U16_U8
This instruction does not really exist

See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018

Reviewers: vpykhtin, artem.tamazov

Differential Revision: https://reviews.llvm.org/D33126

llvm-svn: 303055
2017-05-15 12:37:03 +00:00
John Brawn 9486becf09 [ARM] Mark LEApcrel instructions as isAsCheapAsAMove
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.

Differential Revision: https://reviews.llvm.org/D32858

llvm-svn: 303054
2017-05-15 11:57:54 +00:00
John Brawn 43132c46a6 [ARM] Mark LEApcrel as not having side effects
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable the same as the thumb1 and thumb2 counterparts.

It looks like it being marked as such was just a mistake, as the commit that
made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the
LEApcrelJT instructions were marked as having side-effects, so it looks like
the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was
accidentally marked as such also.

Differential Revision: https://reviews.llvm.org/D32857

llvm-svn: 303053
2017-05-15 11:50:21 +00:00
George Rimar 958b01aa69 [DWARF] - Speedup handling of relocations in DWARFContextInMemory.
I am working on a speedup of building .gdb_index in LLD and 
noticed that relocations that are proccessed in DWARFContextInMemory often uses
the same symbol in a row. This patch introduces caching to reduce the relocations
proccessing time.

For benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked it using LLD.

Link time without --gdb-index is about 4,45s.
Link time with --gdb-index: a) Without patch: 19,16s b) With patch: 15,52s
That means time spent on --gdb-index in this configuration is 
19,16s - 4,45s = 14,71s (without patch) vs 15,52s - 4,45s = 11,07s (with patch).

Differential revision: https://reviews.llvm.org/D31136

llvm-svn: 303051
2017-05-15 11:45:28 +00:00
Ayman Musa c5490e5a29 [X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional).

CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).

Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation.

Differential Revision: https://reviews.llvm.org/D32487

llvm-svn: 303050
2017-05-15 11:30:54 +00:00
Simon Pilgrim f8389656e3 [NVPTX] Don't rely on default arguments to SelectionDAG::getMemIntrinsicNode. NFC.
NFC followup to D33147, this explicitly sets all the arguments (instead of relying on the defaults) to SelectionDAG::getMemIntrinsicNode to help identify -verify-machineinstrs issues.

llvm-svn: 303047
2017-05-15 10:47:48 +00:00
Tom Stellard 049e7e0791 [RegisterBankInfo] Remove overly-agressive asserts
Summary:
We were asserting in RegisterBankInfo if RBI.copyCost() returns
UINT_MAX.  This is OK for RegBankSelect::Mode::Fast since we only
try one instruction mapping and can't recover from this, but for
RegBankSelect::Mode::Greedy we will be considering multiple
instruction mappings, so we can recover if we see a UNIT_MAX copy
cost.

The copy cost for one pair of register banks in the AMDGPU backend
will be UNIT_MAX, so this patch will prevent AMDGPU tests from
breaking.

Reviewers: ab, qcolombet, t.p.northover, dsanders

Reviewed By: qcolombet

Subscribers: tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D33144

llvm-svn: 303043
2017-05-15 09:52:33 +00:00
Arnaud A. de Grandmaison 6d2417924c MCObjectStreamer : fail with a diagnostic when emitting an out of range value.
We were previously silently emitting bogus data in release mode,
making it very hard to diagnose the error, or crashing with an
assert in debug mode. A proper diagnostic is now always emitted
when the value to be emitted is out of range.

llvm-svn: 303041
2017-05-15 08:43:27 +00:00
Craig Topper 1a36b7d836 [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.
This patch finishes off the conversion of ComputeSignBit to computeKnownBits.

Differential Revision: https://reviews.llvm.org/D33166

llvm-svn: 303035
2017-05-15 06:39:41 +00:00
Sanjoy Das f6f6fb903e Move some code into ScalarEvolution.cpp; NFC
I need to add some asserts to these constructors that are easier to
add once they're in the .cpp file.

llvm-svn: 303032
2017-05-15 04:22:09 +00:00
Craig Topper bb9737247a [InstCombine] Merge duplicate functionality between InstCombine and ValueTracking
Summary:
Merge overflow computation for signed add,
appearing both in InstCombine and ValueTracking.

As part of the merge,
cleanup the interface for overflow checks in InstCombine.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper, majnemer

Reviewed By: craig.topper

Subscribers: takuto.ikuta, llvm-commits

Differential Revision: https://reviews.llvm.org/D32946

llvm-svn: 303029
2017-05-15 02:44:08 +00:00
Craig Topper 26c4159956 [InstCombine] Remove 'return' of a called function that also returned void. NFC
llvm-svn: 303028
2017-05-15 02:30:27 +00:00
Zvi Rackover e6b278bc65 [X86] Utilize SelectionDAG::getSelect(). NFC.
Replace SelectionDAG::getNode(ISD::SELECT, ...)
and SelectionDAG::getNode(ISD::VSELECT, ...)
with SelectionDAG::getSelect(...)
Saves a few lines of code and in some cases saves the need to explicitly
check the type of the desired node.

llvm-svn: 303024
2017-05-14 21:30:38 +00:00
Simon Pilgrim d0ef9d8e93 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts
llvm-svn: 303023
2017-05-14 20:52:11 +00:00
Simon Pilgrim f96b4ab92d [X86][AVX2] Fix costs for v4i64 ashr by splat
llvm-svn: 303022
2017-05-14 20:25:42 +00:00
Simon Pilgrim de4467b182 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
llvm-svn: 303021
2017-05-14 20:02:34 +00:00
Craig Topper ceea1a76a1 [X86] Remove unused value from IntrinsicType enum. NFC
llvm-svn: 303018
2017-05-14 19:38:06 +00:00
Simon Pilgrim d3f0d03cc5 [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences
llvm-svn: 303017
2017-05-14 18:52:15 +00:00
Dimitry Andric 4043373e84 Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD
Summary:

After rL301562, on FreeBSD the DynamicLibrary unittests fail, because
the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since
the path does not contain any slashes, retrieving the main executable
will not work.

Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3),
which is more reliable than fiddling with relative or absolute paths.

Also add retrieval of the original argv[] from the GoogleTest framework,
to use as a fallback for other OSes.

Reviewers: emaste, marsupial, hans, krytarowski

Reviewed By: krytarowski

Subscribers: krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33171

llvm-svn: 303015
2017-05-14 18:35:38 +00:00
Shoaib Meenai ee97c5f012 [COFF] Gracefully handle empty .drectve sections
Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error:

    Invalid data was encountered while parsing the file

This happened because some of the object files in the archive have empty
`.drectve` sections. These empty sections result in a `parse_failed` error being
returned from `COFFObjectFile::getSectionContents()`, which in turn caused
`llvm-readobj` to stop. With this change, `getSectionContents` now returns
success, and like before the resulting array is empty.

Patch by Dave Lee.

Differential Revision: https://reviews.llvm.org/D32652

llvm-svn: 303014
2017-05-14 18:34:56 +00:00
Simon Pilgrim 5bef9c627e [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask.
Tweak cost model to match what lowering actually does.

llvm-svn: 303013
2017-05-14 17:59:46 +00:00
Simon Pilgrim aa8dffb69b [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
llvm-svn: 303012
2017-05-14 17:36:07 +00:00
Simon Pilgrim 4599eaa09a [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
llvm-svn: 303010
2017-05-14 13:38:53 +00:00
Simon Pilgrim f3ee9c6997 [X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant splats for vXi64 shifts.
llvm-svn: 303009
2017-05-14 11:46:26 +00:00
Craig Topper 479daaf74c [InstSimplify] Add patterns for folding (A & B) | (~A ^ B) -> (~A ^ B) and its commuted variants.
We already had (A & ~B) | (A ^ B), but we missed the cases where the not was part of the xor.

llvm-svn: 303004
2017-05-14 07:54:43 +00:00
Craig Topper dfc8955ee6 [BasicAA] Alphabetize includes. NFC
llvm-svn: 303002
2017-05-14 06:18:34 +00:00
Xinliang David Li 392e975693 Fix test failure on windows -- do not return deleted func
llvm-svn: 302999
2017-05-14 02:54:02 +00:00
Simon Pilgrim 754c1618ec [SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits
llvm-svn: 302997
2017-05-13 22:10:58 +00:00
Peter Collingbourne d891d89ce8 Add missing files
llvm-svn: 302996
2017-05-13 22:10:13 +00:00
Peter Collingbourne c6f07c423d Move lib/LibDriver -> lib/ToolDrivers/llvm-lib. NFCI.
This reorganisation prevents us from cluttering up the top-level lib directory
with more driver libraries such as llvm-dlltool (see D29892).

llvm-svn: 302995
2017-05-13 22:06:46 +00:00
Simon Pilgrim 7666afd042 [SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits
llvm-svn: 302993
2017-05-13 19:57:10 +00:00
Craig Topper 9fe357971c [ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC
llvm-svn: 302991
2017-05-13 17:22:16 +00:00
Simon Pilgrim ef46c2762a [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)
Further perf tests on Jaguar indicate that:

vxorps  %ymm0, %ymm0, %ymm0
vcmpps  $15, %ymm0, %ymm0, %ymm0

is consistently faster (by about 9%) than:

vpcmpeqd  %xmm0, %xmm0, %xmm0
vinsertf128  $1, %xmm0, %ymm0, %ymm0

Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well.

Committed on behalf of @dtemirbulatov

Differential Revision: https://reviews.llvm.org/D32416

llvm-svn: 302989
2017-05-13 13:42:35 +00:00
Simon Pilgrim 7d62e4b455 [LoopOptimizer][Fix]PR32859, PR24738
The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form.
Here it is:

before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]

and after this change we have:

%e.0.ph = phi i32 [ 0, %for.inc.2.i ]
%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]

Committed on behalf of @dtemirbulatov

Differential Revision: https://reviews.llvm.org/D33055

llvm-svn: 302988
2017-05-13 13:25:57 +00:00
Vivek Pandya 1d12790f37 This reverts r302984
llvm-svn: 302985
2017-05-13 10:59:05 +00:00
Vivek Pandya d20de87fd5 Simplify MIR Output used for Codegen Testing
- MIRYamlMapping: Default value provided for fields which have optional
mappings. Implemented == operators for required classes. When a field's value is
same as default value specified YAML IO class will not print it.

- MIRPrinter: Above mentioned behaviour is not on by default. If -simplify-mir
option not specified, then make yaml::Output to print fields with default values
too.

Differential Revision: https://reviews.llvm.org/D32304

llvm-svn: 302984
2017-05-13 08:55:43 +00:00
Craig Topper 2c9a70661c [APInt] Use Lo_32/Hi_32/Make_64 in a few more places in the divide code. NFCI
llvm-svn: 302983
2017-05-13 07:14:17 +00:00
Craig Topper 935f7b050f [InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation
Summary:
If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop.

This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now.

This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shoudn't make any visible changes. Just wasted computation.

I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this.

This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that.

Reviewers: davide, majnemer, spatel, chandlerc

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32453

llvm-svn: 302982
2017-05-13 06:56:04 +00:00
Craig Topper 4b83b4d560 [APInt] Fix typo in comment. NFC
llvm-svn: 302974
2017-05-13 00:35:30 +00:00
Dylan McKay 0c4debc123 [AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot
Contributed by Dr. Gergő Érdi.

Fixes a bug.

Raised from (https://github.com/avr-rust/rust/issues/49).

llvm-svn: 302973
2017-05-13 00:22:34 +00:00
Dylan McKay 0c707da6ac [AVR] Remove an unused variable
llvm-svn: 302970
2017-05-13 00:00:26 +00:00
Xinliang David Li 66bdfca77a [PartialInlining] Profile based cost analysis
Implemented frequency based cost/saving analysis
and related options.

The pass is now in a state ready to be turne on
in the pipeline (in follow up).

Differential Revision: http://reviews.llvm.org/D32783

llvm-svn: 302967
2017-05-12 23:41:43 +00:00
Aditya Nandakumar 2a735421d1 [GISel]: Add a getConstantFPVRegVal utility
This might be useful across various GISel Passes

https://reviews.llvm.org/D33051

llvm-svn: 302964
2017-05-12 22:54:52 +00:00
Aditya Nandakumar 479ddd20fc [GISel]: Fix undefined behavior while accessing DefaultAction map
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case

llvm-svn: 302963
2017-05-12 22:43:58 +00:00
Eugene Zelenko 0cd7948876 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 302961
2017-05-12 22:25:07 +00:00
Andrew Kaylor b01e94ee8d [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31789

llvm-svn: 302957
2017-05-12 22:11:26 +00:00
Andrew Kaylor f7c864f89c [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31788

llvm-svn: 302956
2017-05-12 22:11:20 +00:00
Andrew Kaylor 3cd8c16d7f [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31787

llvm-svn: 302955
2017-05-12 22:11:12 +00:00
Craig Topper b1a71cac4b [APInt] Add early outs for a division by 1 to udiv/urem/udivrem
We already counted the number of bits in the RHS so its pretty cheap to just check if the RHS is 1.

Differential Revision: https://reviews.llvm.org/D33154

llvm-svn: 302953
2017-05-12 21:45:50 +00:00
Craig Topper 2579c7c69f [APInt] In udivrem, remember the bit width in a local variable so we don't reread it from the LHS which might be aliased with Quotient or Remainder.
This helped the compiler generate better code for the single word case. It was able to remember that the bit width was still a single word when it created the Remainder APInt and not create code for it possibly being multiword.

llvm-svn: 302952
2017-05-12 21:45:44 +00:00
Adrian Prantl 1fa362f811 LTO: Don't verify modules twice in verifyMergedModuleOnce
Differential Revision: https://reviews.llvm.org/D33140

llvm-svn: 302951
2017-05-12 21:38:32 +00:00
Changpeng Fang 161e8c39af AMDGPU/SI: Don't promote to vector if the load/store is volatile.
Summary:
  We should not change volatile loads/stores in promoting alloca to vector.

Reviewers:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D33107

llvm-svn: 302943
2017-05-12 20:31:12 +00:00
Simon Pilgrim a1978aaefd [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)
This fixes 47 of the 75 NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146.

Differential Revision: https://reviews.llvm.org/D33147

llvm-svn: 302942
2017-05-12 19:56:43 +00:00
Teresa Johnson 4cd12ce9a0 Remove ignore-empty-index-file option
Summary:
As discussed in the D32195 review thread and on IRC, remove this option
and replace with parameter, which will be set to true when invoked
from clang in the context of a ThinLTO distributed backend.

Reviewers: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D33133

llvm-svn: 302939
2017-05-12 19:32:11 +00:00
Dehao Chen 65dd23e273 Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

llvm-svn: 302938
2017-05-12 19:29:27 +00:00
Tim Shen 10c64e6aea [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC.
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.

However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.

The tests in shift_mask.ll still pass.

Reviewers: echristo, hfinkel, efriedma, iteratee

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33076

llvm-svn: 302937
2017-05-12 19:25:37 +00:00
Zachary Turner dd3a739d52 [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

llvm-svn: 302936
2017-05-12 19:18:12 +00:00
Geoff Berry ddbbf6416c [AArch64][Falkor] Refine modeling of multiply accumulate forwarding.
llvm-svn: 302933
2017-05-12 18:57:10 +00:00
Craig Topper 4bdd621e93 [APInt] Add an assert to check for divide by zero in udivrem. NFC
udiv and urem already had the same assert.

llvm-svn: 302931
2017-05-12 18:19:01 +00:00
Craig Topper 06da0816fd [APInt] Remove unnecessary checks of rhsWords==1 with lhsWords==1 from udiv and udivrem. NFC
At this point in the code rhsWords is guaranteed to be non-zero and less than or equal to lhsWords. So if lhsWords is 1, rhsWords must also be 1. urem alread had the check removed so this makes all 3 consistent.

llvm-svn: 302930
2017-05-12 18:18:57 +00:00
Simon Pilgrim b146e61828 Strip trailing whitespace. NFCI.
llvm-svn: 302927
2017-05-12 17:42:36 +00:00