Commit Graph

28836 Commits

Author SHA1 Message Date
Zachary Turner cef001aaaa Make LLVM_PRETTY_FUNCTION support __func__.
In case there are compilers that support neither __FUNCSIG__ or
__PRETTY_FUNCTION__, we fall back to __func__ as a last resort,
which should be guaranteed by C++11 and C99.

llvm-svn: 278176
2016-08-09 23:03:55 +00:00
Zachary Turner e7c2875dc3 Add a platform independent version of __PRETTY_FUNCTION__.
MSVC doesn't have this, it only has __FUNCSIG__.  So this adds
a new macro called LLVM_PRETTY_FUNCTION which evaluates to the
right thing on any platform.

llvm-svn: 278170
2016-08-09 22:03:45 +00:00
Tim Northover 5ed648e509 GlobalISel: first translation support for Constants.
For now put them all in the entry block. This should be correct but may give
poor runtime performance. Hopefully MachineSinking combined with
isReMaterializable can solve those issues, but if not the interface is sound
enough to support alternatives.

llvm-svn: 278168
2016-08-09 21:28:04 +00:00
Wei Mi 575435012c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Wei Mi 785858cf6c Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The fix for PR28705 will be committed consecutively.

In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 278160
2016-08-09 20:37:50 +00:00
Chris Dewhurst 822d4a09e3 Without explicitly including <string>, I'm getting an error on the new code in this file. Won't present an issue for anyone that isn't having the same trouble as me.
llvm-svn: 278159
2016-08-09 20:32:59 +00:00
Tim Shen 75c1656afb [ADT] Change iterator_adaptor_base's default template arguments to forward more underlying typedefs
Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23217

llvm-svn: 278157
2016-08-09 20:23:13 +00:00
Kuba Brecka 65966c8bde Add `#ifdef __cplusplus` around `extern "C"` in Compiler.h. NFC.
llvm-svn: 278119
2016-08-09 12:12:15 +00:00
Vassil Vassilev e01ffee570 [modules]Add missing include.
llvm-svn: 278108
2016-08-09 09:46:11 +00:00
Sean Silva 0746f3bfa4 Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079
2016-08-09 00:28:52 +00:00
Sean Silva fd03ac6a0c Consistently use ModuleAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278078
2016-08-09 00:28:38 +00:00
Sean Silva 36e0d01e13 Consistently use FunctionAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278077
2016-08-09 00:28:15 +00:00
Saleem Abdulrasool 015280211b CodeView: extract the OMF Directory Header
The DebugDirectory contains a pointer to the CodeView info structure which is a
derivative of the OMF debug directory.  The structure has evolved a bit over
time, and PDB 2.0 used a slightly different definition from PDB 7.0.  Both of
these are specific to CodeView and not COFF.  Reflect this by moving the
structure definitions into the DebugInfo/CodeView headers.  Define a generic
DebugInfo union type that can be used to pass around a reference to the
DebugInfo irrespective of the versioning.  NFC.

llvm-svn: 278075
2016-08-09 00:25:12 +00:00
Charles Davis e9c32c7ed3 Revert "[X86] Support the "ms-hotpatch" attribute."
This reverts commit r278048. Something changed between the last time I
built this--it takes awhile on my ridiculously slow and ancient
computer--and now that broke this.

llvm-svn: 278053
2016-08-08 21:20:15 +00:00
Charles Davis 0822aa118e [X86] Support the "ms-hotpatch" attribute.
Summary:
Based on two patches by Michael Mueller.

This is a target attribute that causes a function marked with it to be
emitted as "hotpatchable". This particular mechanism was originally
devised by Microsoft for patching their binaries (which they are
constantly updating to stay ahead of crackers, script kiddies, and other
ne'er-do-wells on the Internet), but is now commonly abused by Windows
programs to hook API functions.

This mechanism is target-specific. For x86, a two-byte no-op instruction
is emitted at the function's entry point; the entry point must be
immediately preceded by 64 (32-bit) or 128 (64-bit) bytes of padding.
This padding is where the patch code is written. The two byte no-op is
then overwritten with a short jump into this code. The no-op is usually
a `movl %edi, %edi` instruction; this is used as a magic value
indicating that this is a hotpatchable function.

Reviewers: majnemer, sanjoy, rnk

Subscribers: dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D19908

llvm-svn: 278048
2016-08-08 21:01:39 +00:00
Geoff Berry 75331f7f2e [MemorySSA] Fix windows build breakage caused by r278028 (take 2)
r278028: [MemorySSA] Ensure address stability of MemorySSA object.
llvm-svn: 278041
2016-08-08 19:33:27 +00:00
Geoff Berry 290a13e7c7 [MemorySSA] Fix windows build breakage caused by r278028
r278028: [MemorySSA] Ensure address stability of MemorySSA object.
llvm-svn: 278035
2016-08-08 18:27:22 +00:00
Geoff Berry cdf5333f6f [MemorySSA] Ensure address stability of MemorySSA object.
Summary:
Ensure that the MemorySSA object never changes address when using the
new pass manager since the walkers contained by MemorySSA cache pointers
to it at construction time.  This is achieved by wrapping the
MemorySSAAnalysis result in a unique_ptr.  Also add some asserts that
check for this bug.

Reviewers: george.burgess.iv, dberlin

Subscribers: mcrosier, hfinkel, chandlerc, silvas, llvm-commits

Differential Revision: https://reviews.llvm.org/D23171

llvm-svn: 278028
2016-08-08 17:52:01 +00:00
Oliver Stannard 8331aaee8f [ARM] Add support for embedded position-independent code
This patch adds support for some new relocation models to the ARM
backend:

* Read-only position independence (ROPI): Code and read-only data is accessed
  PC-relative. The offsets between all code and RO data sections are known at
  static link time. This does not affect read-write data.
* Read-write position independence (RWPI): Read-write data is accessed relative
  to the static base register (r9). The offsets between all writeable data
  sections are known at static link time. This does not affect read-only data.

These two modes are independent (they specify how different objects
should be addressed), so they can be used individually or together. They
are otherwise the same as the "static" relocation model, and are not
compatible with SysV-style PIC using a global offset table.

These modes are normally used by bare-metal systems or systems with
small real-time operating systems. They are designed to avoid the need
for a dynamic linker, the only initialisation required is setting r9 to
an appropriate value for RWPI code.

I have only added support to SelectionDAG, not FastISel, because
FastISel is currently disabled for bare-metal targets where these modes
would be used.

Differential Revision: https://reviews.llvm.org/D23195

llvm-svn: 278015
2016-08-08 15:28:31 +00:00
Nico Weber eb912b9dd3 Revert r2277979.
For some reason, MSVC2013's cl.exe crashes with
  fatal error C1001: An internal error has occurred in the compiler
with this when compiling e.g. LoopDistribute.cpp.

llvm-svn: 278011
2016-08-08 14:51:53 +00:00
Daniel Sanders 3feeb9c851 Re-commit r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.).
Hopefully with the MSVC builds fixed. I've added a missing '#include <tuple>'
that gcc and clang don't seem to need.

llvm-svn: 277995
2016-08-08 11:50:25 +00:00
Simon Pilgrim 4981ec9a56 Fix Wdocumentation unknown parameter warning
llvm-svn: 277994
2016-08-08 11:49:24 +00:00
Daniel Sanders cae9aeed39 Revert r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.).
It seems that MSVC doesn't like std::tie().

llvm-svn: 277990
2016-08-08 09:33:14 +00:00
Daniel Sanders 2ab623b5a3 [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.).
Summary:
They are now lexed as a single token on targets where
MCAsmInfo::HasMipsExpressions is true and then parsed in a similar way to
the '~' operator as part of MCExpr::parseExpression.

As a result:
* expressions and immediates no longer have different parsing rules. The
  difference is now solely down to whether evaluateAsAbsolute() succeeds.
* %hi(%neg(%gp_rel(x))) are no longer parsed as a single operator and
  decomposed into the three MipsMCExpr nodes. They are parsed directly as
  three MipsMCExpr nodes.
  * parseMemOperand no longer needs to eat all the surrounding parenthesis
    to get at the outermost operator to make this work
* %hi(%neg(%gp_rel(x))) and %lo(%neg(%gp_rel(x))) are no longer the only
  3-in-1 relocs that parse for N64. They're still the only combinations that
  are permitted in relocatable expressions though. Fixing that should be a
  later patch.
* We no longer need to list all the tokens that can occur as the first token of
  an expression or immediate.

test/MC/Mips/expr1.s:
    This change also prevents the incorrect lowering of %lo(2*4)+foo to
    %lo(8+foo) which is not an equivalent expression (the difference is
    whether foo is truncated to 16-bit or not) and the test has been
    updated to account for the macro expansion the correct expression requires.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D23110

llvm-svn: 277988
2016-08-08 09:20:52 +00:00
Sean Silva 6e1fed0ae5 [PM] BasicAA needs to be invalidated since it holds pointers to other stuff.
llvm-svn: 277981
2016-08-08 05:38:03 +00:00
Sean Silva 571906247e [PM] Function-level TLI is also immutable.
llvm-svn: 277979
2016-08-08 05:37:58 +00:00
Eli Friedman 02419a9849 [JumpThreading] Fix handling of aliasing metadata.
Summary:
The correctness fix here is that when we CSE a load with another load,
we need to combine the metadata on the two loads. This matches the
behavior of other passes, like instcombine and GVN.

There's also a minor optimization improvement here: for load PRE, the
aliasing metadata on the inserted load should be the same as the
metadata on the original load. Not sure why the old code was throwing
it away.

Issue found by inspection.

Differential Revision: http://reviews.llvm.org/D21460

llvm-svn: 277977
2016-08-08 04:10:22 +00:00
Davide Italiano 151e5be5ea [MC] Delete use of *structors_used.
Jim Grosbach and Kevin Enderby think those are not used anymore.
Originally submitted by: Rafael Espindola

llvm-svn: 277973
2016-08-08 03:30:01 +00:00
Lang Hames 73976f622d [ORC] Re-apply r277896, removing bogus triples and datalayouts that broke tests
on linux last time.

llvm-svn: 277942
2016-08-06 22:36:26 +00:00
Gor Nishanov 2ed6e788a8 [Coroutines] Part 5: Add CGSCC restart trigger
Summary:
CoroSplit pass processes the coroutine twice. First, it lets it go through
complete IPO optimization pipeline as a single function. It forces restart
of the pipeline by inserting an indirect call to an empty function "coro.devirt.trigger"
which is devirtualized by CoroElide pass that triggers a restart of the pipeline by CGPassManager.
(In later patches, when CoroSplit pass sees the same coroutine the second time, it splits it up,
adds coroutine subfunctions to the SCC to be processed by IPO pipeline.)

Documentation and overview is here: http://llvm.org/docs/Coroutines.html.

Upstreaming sequence (rough plan)
1.Add documentation. (https://reviews.llvm.org/D22603)
2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659)
3.Add empty coroutine passes. (https://reviews.llvm.org/D22847)
4.Add coroutine devirtualization + tests.
ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998)
c) Do devirtualization (https://reviews.llvm.org/D23229)
5.Add CGSCC restart trigger + tests. <= we are here
6.Add coroutine heap elision + tests.
7.Add the rest of the logic (split into more patches)

Reviewers: mehdi_amini, majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23234

llvm-svn: 277936
2016-08-06 20:44:39 +00:00
David Majnemer 1665d8635e [CallGraphSCCPass] Use an ArrayRef instead of a pair of iterators
No functional change is intended.

llvm-svn: 277913
2016-08-06 06:21:02 +00:00
Sanjoy Das ba04d3a620 [InstCombine] Don't coerce non-integral pointers to integers
Reviewers: majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23231

llvm-svn: 277910
2016-08-06 02:58:48 +00:00
Gor Nishanov 31d8c9af89 Part 4c: Coroutine Devirtualization: Devirtualize coro.resume and coro.destroy.
Summary:
This is the 4c patch of the coroutine series. CoroElide pass now checks if PostSplit coro.begin
is referenced by coro.subfn.addr intrinsics. If so replace coro.subfn.addrs with an appropriate coroutine
subfunction associated with that coro.begin.

Documentation and overview is here: http://llvm.org/docs/Coroutines.html.

Upstreaming sequence (rough plan)
1.Add documentation. (https://reviews.llvm.org/D22603)
2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659)
3.Add empty coroutine passes. (https://reviews.llvm.org/D22847)
4.Add coroutine devirtualization + tests.
ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998)
c) Do devirtualization <= we are here
5.Add CGSCC restart trigger + tests.
6.Add coroutine heap elision + tests.
7.Add the rest of the logic (split into more patches)

Reviewers: majnemer

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23229

llvm-svn: 277908
2016-08-06 02:16:35 +00:00
Nico Weber c893e603ab Revert r277896.
It breaks ExecutionEngine/OrcLazy/weak-function.ll on most bots.

Script:
--
...
--
Exit Code: 1

Command Output (stderr):
--
Could not find main function.

llvm-svn: 277907
2016-08-06 02:00:45 +00:00
Lang Hames 62a459603c [ORC] Add (partial) weak symbol support to the CompileOnDemand layer.
This adds partial support for weak functions to the CompileOnDemandLayer by
modifying the addLogicalModule method to check for existing stub definitions
before building a new stub for a weak function. This scheme is sufficient to
support ODR definitions, but fails for general weak definitions if strong
definition is encountered after the first weak definition. (A more extensive
refactor will be required to fully support weak symbols).

This patch does *not* add weak symbol support to RuntimeDyld: I hope to add
that in the near future.

llvm-svn: 277896
2016-08-06 00:54:43 +00:00
Zachary Turner 83816cea35 Fix a -Wunused-const-variable due to a bug in clang.
llvm-svn: 277893
2016-08-06 00:13:32 +00:00
Zachary Turner 9e91c28b71 Resubmit "Make YAML support SmallVector"
This resubmits a3770391c5fb64108d565e12f61dd77ce71b5b4f,
which was reverted due to breakages on non-Windows machines.

Due to differences in template instantiation rules on Microsoft
and non-Microsoft platforms, a member access restriction was
triggering on non-Microsoft compilers.  Previously, a friend
declaration for std::vector<> had been introduced into the
DebugMap class to make the member access restriction pass,
but the introduction of support for SmallVector<> meant that
an additional friend declaration would need to be added.

This didn't really make a lot of sense since the user of the
macro is probably only using one type (SmallVector<>, vector<>,
etc) and we could in theory add support for even more types
to this macro in the future (e.g. std::deque), so rather than
add another friend declaration, I just made the type being
referenced a public nested typedef instead of a private nested
typedef.

llvm-svn: 277888
2016-08-05 23:12:31 +00:00
Justin Bogner 1219a60e26 Revert "Make YAML support SmallVector"
This breaks building dsymutil, causing my local build and many bots to
fail.

This reverts r277870.

llvm-svn: 277881
2016-08-05 22:32:33 +00:00
Daniel Berlin 2919b1c41b Rewrite domination verifier to handle local domination as well.
Summary:
Rewrite domination verifier to handle local domination as well.
This catches a bug Geoff Berry noticed.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23184

llvm-svn: 277872
2016-08-05 21:46:52 +00:00
Zachary Turner 5e3e4bb26b [CodeView] Decouple record deserialization from visitor dispatch.
Until now, our use case for the visitor has been to take a stream of bytes
representing a type stream, deserialize the records in sequence, and do
something with them, where "something" is determined by how the user
implements a particular set of callbacks on an abstract class.

For actually writing PDBs, however, we want to do the reverse. We have
some kind of description of the list of records in their in-memory format,
and we want to process each one. Perhaps by serializing them to a byte
stream, or perhaps by converting them from one description format (Yaml)
to another (in-memory representation).

This was difficult in the current model because deserialization and
invoking the callbacks were tightly coupled.

With this patch we change this so that TypeDeserializer is itself an
implementation of the particular set of callbacks. This decouples
deserialization from the iteration over a list of records and invocation
of the callbacks.  TypeDeserializer is initialized with another
implementation of the callback interface, so that upon deserialization it
can pass the deserialized record through to the next set of callbacks. In
a sense this is like an implementation of the Decorator design pattern,
where the Deserializer is a decorator.

This will be useful for writing Pdbs from yaml, where we have a
description of the type records in Yaml format. In this case, the visitor
implementation would have each visitation callback method implemented in
such a way as to extract the proper set of fields from the Yaml, and it
could maintain state that builds up a list of these records. Finally at
the end we can pass this information through to another set of callbacks
which serializes them into a byte stream.

Reviewed By: majnemer, ruiu, rnk
Differential Revision: https://reviews.llvm.org/D23177

llvm-svn: 277871
2016-08-05 21:45:34 +00:00
Zachary Turner 9c3dac8efd Make YAML support SmallVector
Currently YAML sequences require std::vectors. All of the methods that the
YAML parser accesses though are present in SmallVector, so there's no
reason we can't support SmallVector inherently. This patch does that.

Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D23213

llvm-svn: 277870
2016-08-05 21:45:19 +00:00
Mehdi Amini 0dd5b79e18 Update outdated comments in the new PM internals (NFC)
The analysis manager was made not optional and turned into a
reference instead of a pointer in r272978. Some comments were
still refering to the previous behavior.

llvm-svn: 277857
2016-08-05 19:51:00 +00:00
Sanjay Patel 344e25f13b fix documentation comments; NFC
llvm-svn: 277853
2016-08-05 19:09:25 +00:00
Lang Hames 2a04a99ce6 [ORC] Change LogicalDylib::LogicalModuleHandle from an iterator to an index.
This prevents handles from being invalidated (through iterator invalidation)
when new modules are added.

No test-case yet: This bug was uncovered during work on an upcoming patch for
weak symbol support and the testcase for that feature will implicitly test for
correct behavior here.

llvm-svn: 277847
2016-08-05 18:26:56 +00:00
Tim Northover 97d0cb3165 GlobalISel: IRTranslate PHI instructions
llvm-svn: 277835
2016-08-05 17:16:40 +00:00
John Brawn 75127944b6 Add a missing backslash to my previous commit
llvm-svn: 277809
2016-08-05 11:17:43 +00:00
John Brawn 4d79ec7fe8 Reapply r276973 "Adjust Registry interface to not require plugins to export a registry"
This differs from the previous version by being more careful about template
instantiation/specialization in order to prevent errors when building with
clang -Werror. Specifically:
 * begin is not defined in the template and is instead instantiated when Head
   is. I think the warning when we don't do that is wrong (PR28815) but for now
   at least do it this way to avoid the warning.
 * Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY
   instead provide a template definition then do explicit instantiation. No
   compiler I've tried has problems with doing it the other way, but strictly
   speaking it's not permitted by the C++ standard so better safe than sorry.

Original commit message:

Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.

This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.

llvm-svn: 277806
2016-08-05 11:01:08 +00:00
Justin Bogner 19dd0da153 IR: Provide an IRBuilder Inserter that calls a callback after insertion
Add a generalized IRBuilderCallbackInserter, which is just given a
callback to execute after insertion. This can be used to get rid of
the custom inserter in InstCombine, which will in turn allow me to add
target specific InstCombineCalls API for intrinsics without horrible
layering violations.

llvm-svn: 277784
2016-08-04 23:41:01 +00:00
Tim Northover 1cfa919b3d GlobalISel: add support for G_MUL
llvm-svn: 277774
2016-08-04 21:39:44 +00:00
Chris Bieneman 17e42a0980 [Mach0YAML] Change n_type from uint8_t to llvm::yaml::Hex8
Since this field is generally masked, it is way easier to understand it as a Hex value than decimal.

llvm-svn: 277770
2016-08-04 21:07:39 +00:00
Tim Northover 9656f1476c GlobalISel: implement narrowing for G_ADD.
llvm-svn: 277769
2016-08-04 20:54:13 +00:00
Tim Northover 404f1b7db5 GlobalISel: refuse to halve size of 1-byte & odd-sized LLTs.
llvm-svn: 277768
2016-08-04 20:54:05 +00:00
Lang Hames aac59a26a5 [ExecutionEngine] Refactor - Roll JITSymbolFlags functionality into JITSymbol.h
and remove the JITSymbolFlags header.

llvm-svn: 277766
2016-08-04 20:32:37 +00:00
David Majnemer f93082e71a [coroutines] Part 4[ab]: Coroutine Devirtualization: Lower coro.resume and coro.destroy.
This is the forth patch in the coroutine series. CoroEaly pass now lowers coro.resume
and coro.destroy intrinsics by replacing them with an indirect call to an address
returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes
devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate
function address.

Patch by Gor Nishanov!

Differential Revision: https://reviews.llvm.org/D22998

llvm-svn: 277765
2016-08-04 20:30:07 +00:00
Zachary Turner 660230eba4 [CodeView] Use llvm::Error instead of std::error_code.
This eliminates the remnants of std::error_code from the
DebugInfo libraries.

llvm-svn: 277758
2016-08-04 19:39:55 +00:00
Tim Northover 323358184e GlobalISel: add code to widen scalar G_ADD
llvm-svn: 277747
2016-08-04 18:35:11 +00:00
Alina Sbirlea 6f937b1144 LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
  considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
  for a maximum size allowed and relies on the next condition checking for %4 for correctness.
  This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).

Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.

Reviewers: arsenm, jlebar, tstellarAMD

Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23068

llvm-svn: 277735
2016-08-04 16:38:44 +00:00
Nikolai Bozhenov f679530ba1 [X86] Heuristic to selectively build Newton-Raphson SQRT estimation
On modern Intel processors hardware SQRT in many cases is faster than RSQRT
followed by Newton-Raphson refinement. The patch introduces a simple heuristic
to choose between hardware SQRT instruction and Newton-Raphson software
estimation.

The patch treats scalars and vectors differently. The heuristic is that for
scalars the compiler should optimize for latency while for vectors it should
optimize for throughput. It is based on the assumption that throughput bound
code is likely to be vectorized.

Basically, the patch disables scalar NR for big cores and disables NR completely
for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores.
Secondly, vector SQRT has been greatly improved in Skylake and has better
throughput compared to NR.

Differential Revision: https://reviews.llvm.org/D21379

llvm-svn: 277725
2016-08-04 12:47:28 +00:00
Chandler Carruth a053a88df5 [PM] Change the name of the repeating utility to something less
overloaded (and simpler).

Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.

llvm-svn: 277689
2016-08-04 03:52:53 +00:00
Rui Ueyama d1d8c8312a pdbdump: Fix crash bug.
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` initializes
`Header` member, `Header` remained nullptr which caused a crash bug.

Differential Revision: https://reviews.llvm.org/D23143

llvm-svn: 277681
2016-08-03 23:43:23 +00:00
Kevin Enderby 27e85bd0a6 Clean up of libObject/Archive interfaces and change the last three uses of ErrorOr<>
changing them to Expected<> to allow them to pass through llvm Errors.
No functional change.

This commit by itself will break the next lld builds.  I’ll be committing the
matching change for lld immediately next.

llvm-svn: 277656
2016-08-03 21:57:47 +00:00
Sebastian Pop 031b1bc06f Pass EphValues by const-ref as it is not modified in the callee
Patch by Aditya Kumar.

Differential Revision: https://reviews.llvm.org/D22967

llvm-svn: 277634
2016-08-03 19:13:50 +00:00
Vedant Kumar 4031d9f80e Reapply "More fixes to get good error messages for bad archives."
This reverts commit the revert commit r277627. The build errors
mentioned in r277627 were likely caused by an unclean build directory.
Sorry for the noise.

llvm-svn: 277630
2016-08-03 19:02:50 +00:00
Vedant Kumar bfb6072d84 Revert "More fixes to get good error messages for bad archives."
This reverts commit r277540. It breaks the build with:

../lib/Object/Archive.cpp:264:41: error: return type of out-of-line definition of 'llvm::object::ArchiveMemberHeader::getUID' differs from that in the declaration
Expected<unsigned> ArchiveMemberHeader::getUID() const {
~~~~~~~~~~~~~~~~~~                      ^
include/llvm/Object/Archive.h:53:12: note: previous declaration is here
  unsigned getUID() const;
  ~~~~~~~~ ^

llvm-svn: 277627
2016-08-03 18:44:32 +00:00
Zachary Turner 8cf51c340d [msf] Make FPM reader use MappedBlockStream.
MappedBlockSTream can work with any sequence of block data where
the ordering is specified by a list of block numbers.  So rather
than manually stitch them together in the case of the FPM, reuse
this functionality so that we can treat the FPM as if it were
contiguous.

Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23066

llvm-svn: 277609
2016-08-03 16:53:21 +00:00
Chandler Carruth 1af98245f4 [PM] Add the explicit copy, move, swap, and assignment boilerplate
required by MSVC 2013.

This also makes the repeating pass wrapper assignable. Mildly
unfortunate as it means we can't use a const member for the int, but
that is a really minor invariant to try to preserve at the cost of loss
of regularity of the type. Yet another annoyance of the particular C++
object / move semantic model.

llvm-svn: 277582
2016-08-03 08:16:08 +00:00
Chandler Carruth 241bf2456f [PM] Add a generic 'repeat N times' pass wrapper to the new pass
manager.

While this has some utility for debugging and testing on its own, it is
primarily intended to demonstrate the technique for adding custom
wrappers that can provide more interesting interation behavior in
a nice, orthogonal, and composable layer.

Being able to write these kinds of very dynamic and customized controls
for running passes was one of the motivating use cases of the new pass
manager design, and this gives a hint at how they might look. The actual
logic is tiny here, and most of this is just wiring in the pipeline
parsing so that this can be widely used.

I'm adding this now to show the wiring without a lot of business logic.
This is a precursor patch for showing how a "iterate up to N times as
long as we devirtualize a call" utility can be added as a separable and
composable component along side the CGSCC pass management.

Differential Revision: https://reviews.llvm.org/D22405

llvm-svn: 277581
2016-08-03 07:44:48 +00:00
Chandler Carruth 6cb2ab2c60 [PM] Significantly refactor the pass pipeline parsing to be easier to
reason about and less error prone.

The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.

Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.

Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.

One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.

The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]

Thanks to Sean for the review!

Differential Revision: https://reviews.llvm.org/D22724

llvm-svn: 277561
2016-08-03 03:21:41 +00:00
Matthias Braun 9dbdc18523 CommandFlags.h/llc: Move StopAfter/StartBefore options to llc.
Move those two options to llc:

The options in CommandFlags.h are shared by dsymutil, gold, llc,
llvm-dwp, llvm-lto, llvm-mc, lto, opt.

-stop-after/-start-after only affect codegen passes however only gold and llc
actually create codegen passes and I believe these flags to be only
useful for users of llc. For the other tools they are just highly
confusing: -stop-after claims to "Stop compilation after a specific
pass" which is not true in the context of the "opt" tool.

Differential Revision: https://reviews.llvm.org/D23050

llvm-svn: 277551
2016-08-02 23:36:06 +00:00
Kevin Enderby 395cc09444 More fixes to get good error messages for bad archives.
Fixed the last incorrect uses of llvm_unreachable() in the code
which were actually just cases of errors in the input Archives.

llvm-svn: 277540
2016-08-02 22:58:55 +00:00
Piotr Padlewski 47509f6185 Imported statistics types changes
Reviewers: tejohnson, eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22980

llvm-svn: 277534
2016-08-02 22:18:47 +00:00
Ahmed Bougacha bfaddd999a [GlobalISel] Set the Selected MF property.
None of GlobalISel requires the property, but this lets us use the
verifier instead of rolling our own "all instructions selected" check.

llvm-svn: 277484
2016-08-02 16:49:25 +00:00
Ahmed Bougacha b109d51865 [GlobalISel] Add Selected MachineFunction property.
Selected: the InstructionSelect pass ran and all pre-isel generic
instructions have been eliminated; i.e., all instructions are now
target-specific or non-pre-isel generic instructions (e.g., COPY).

Since only pre-isel generic instructions can have generic virtual register
operands, this also means that all generic virtual registers have been
constrained to virtual registers (assigned to register classes) and that
all sizes attached to them have been eliminated.

This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.

llvm-svn: 277482
2016-08-02 16:49:19 +00:00
Daniel Berlin c43aa5a5b6 Rewrite the use optimizer to be less memory intensive and 50% faster.
Fixes PR28670

Summary:
Rewrite the use optimizer to be less memory intensive and 50% faster.
Fixes PR28670

The new use optimizer works like a standard SSA renaming pass, storing
all possible versions a MemorySSA use could get in a stack, and just
tracking indexes into the stack.
This uses much less memory than caching N^2 alias query results.
It's also a lot faster.

The current version defers phi node walking to the normal walker.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23032

llvm-svn: 277480
2016-08-02 16:24:03 +00:00
Ahmed Bougacha 4628e37e7f [GlobalISel] Set and require RegBankSelected MF property.
The InstructionSelect pass assumes that RegBankSelect ran; set the
property on all tests (thereby verifying the test inputs) and require
it in the pass.

llvm-svn: 277477
2016-08-02 16:17:18 +00:00
Ahmed Bougacha 2471265508 [GlobalISel] Add RegBankSelected MachineFunction property.
RegBankSelected: the RegBankSelect pass ran and all generic virtual
registers have been assigned to a register bank.

This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.

llvm-svn: 277475
2016-08-02 16:17:10 +00:00
Ahmed Bougacha 24d0d4d2ec [GlobalISel] Set, require, and verify Legalized MF property.
RegBankSelect and InstructionSelect run after the legalizer and
require a Legalized function: check that all instructions are legal.

Note that this should be in the MachineVerifier, but it can't use the
MachineLegalizer as it's currently in the separate GlobalISel library.
Note that the RegBankSelect verifier checks have the same layering
problem, but we only use inline methods so end up not needing to link
against the GlobalISel library.

llvm-svn: 277472
2016-08-02 15:10:32 +00:00
Ahmed Bougacha 0d7b0cb865 [GlobalISel] Add Legalized MachineFunction property.
Legalized: The MachineLegalizer ran; all pre-isel generic instructions
have been legalized, i.e., all instructions are now one of:
  - generic and always legal (e.g., COPY)
  - target-specific
  - legal pre-isel generic instructions.

This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.

llvm-svn: 277470
2016-08-02 15:10:25 +00:00
Ahmed Bougacha ed581eac6a [GlobalISel] Require isSSA in GISel passes.
The GISel passes don't make sense on non-SSA functions.
All GISel tests already set isSSA. Enforce that.

llvm-svn: 277464
2016-08-02 14:42:55 +00:00
Ahmed Bougacha 45eb3b94d4 [GlobalISel] Don't RegBankSelect target-specific instructions.
They don't have types and should be using register classes.

llvm-svn: 277447
2016-08-02 11:41:16 +00:00
Ahmed Bougacha f49ab9af2c [GlobalISel] Const-ify MachineInstrs passed to MachineLegalizer.
llvm-svn: 277445
2016-08-02 11:41:03 +00:00
Chandler Carruth feb095598c [Inliner] Clean up doxygen comments to match modern style.
llvm-svn: 277417
2016-08-02 05:49:32 +00:00
Sean Silva f801575fd0 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277411
2016-08-02 02:15:45 +00:00
Tim Shen b44909eccb [ADT] NFC: Generalize GraphTraits requirement of "NodeType *" in interfaces to "NodeRef", and migrate SCCIterator.h to use NodeRef
Summary: By generalize the interface, users are able to inject more flexible Node token into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's a NFC.

Reviewers: dblaikie, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22937

llvm-svn: 277399
2016-08-01 22:32:20 +00:00
Lang Hames 7643d98d86 [Orc] Fix common symbol support in ORC.
Common symbol support in ORC was broken in r270716 when the symbol resolution
rules in RuntimeDyld were changed. With the switch to lazily materialized
symbols in r277386, common symbols can be supported by having
RuntimeDyld::emitCommonSymbols search for (but not materialize!) definitions
elsewhere in the logical dylib.

This patch adds the 'Common' flag to JITSymbolFlags, and the necessary check
to RuntimeDyld::emitCommonSymbols.

llvm-svn: 277397
2016-08-01 22:23:24 +00:00
Michael Kuperstein c40618610f [PM] Port SpeculativeExecution to the new PM
Differential Revision: https://reviews.llvm.org/D23033

llvm-svn: 277393
2016-08-01 21:48:33 +00:00
Zachary Turner d3c7b8e303 [msf] Teach LLVM to parse a split Fpm.
The FPM is split at regular intervals across the MSF file, as the MS code
suggests. It turns out that the value of the interval is precisely the
block size. If the block size is 4096, then there are two Fpm pages every
4096 blocks.

So here we teach the PDBFile class to parse a split FPM, and also add more
options when dumping the FPM to display some additional information such
as orphaned pages (pages which the FPM says are allocated, but which
nothing appears to use), use after free pages (pages which the FPM says
are not allocated, but which are referenced by a stream), and multiple use
pages (pages which the FPM says are allocated but are used more than
once).

Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23022

llvm-svn: 277388
2016-08-01 21:19:45 +00:00
Lang Hames ad4a911fea [ExecutionEngine][MCJIT][Orc] Replace RuntimeDyld::SymbolInfo with JITSymbol.
This patch replaces RuntimeDyld::SymbolInfo with JITSymbol: A symbol class
that is capable of lazy materialization (i.e. the symbol definition needn't be
emitted until the address is requested). This can be used to support common
and weak symbols in the JIT (though this is not implemented in this patch).

For consistency, RuntimeDyld::SymbolResolver is renamed to JITSymbolResolver.

For space efficiency a new class, JITEvaluatedSymbol, is introduced that
behaves like the old RuntimeDyld::SymbolInfo - i.e. it is just a pair of an
address and symbol flags. Instances of JITEvaluatedSymbol can be used in
symbol-tables to avoid paying the space cost of the materializer.

llvm-svn: 277386
2016-08-01 20:49:11 +00:00
Michael Kuperstein c97da7f3a4 [DAGCombine] Make sext(setcc) combine respect getBooleanContents
We used to combine "sext(setcc x, y, cc) -> (select (setcc x, y, cc), -1, 0)"
Instead, we should combine to (select (setcc x, y, cc), T, 0) where the value
of T is 1 or -1, depending on the type of the setcc, and getBooleanContents()
for the type if it is not i1.

This fixes PR28504.

llvm-svn: 277371
2016-08-01 19:39:49 +00:00
George Burgess IV 5f0e76dca6 [CFLAA] Remove modref queries from CFLAA.
As it turns out, modref queries are broken with CFLAA. Specifically,
the data source we were using for determining modref behaviors
explicitly ignores operations on non-pointer values. So, it wouldn't
note e.g. storing an i32 to an i32* (or loading an i64 from an i64*).
It also ignores external function calls, rather than acting
conservatively for them.

(N.B. These operations, where necessary, *are* tracked by CFLAA; we just
use a different mechanism to do so. Said mechanism is relatively
imprecise, so it's unlikely that we can provide reasonably good modref
answers with it as implemented.)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22978

llvm-svn: 277366
2016-08-01 18:47:28 +00:00
Evandro Menezes 82e245a202 [AArch64] Add support for Samsung Exynos M2 (NFC).
llvm-svn: 277364
2016-08-01 18:39:45 +00:00
Krzysztof Parzyszek 8fb181ca5b Replace MachineInstr* with MachineInstr& in TargetInstrInfo, NFC
There were a few cases introduced with the modulo scheduler.

llvm-svn: 277358
2016-08-01 17:55:48 +00:00
Sean Silva 423c7149dc Revert r277313 and r277314.
They seem to trigger an LSan failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio

Revert "Add the tests for r277313"

This reverts commit r277314.

Revert "CodeExtractor : Add ability to preserve profile data."

This reverts commit r277313.

llvm-svn: 277317
2016-08-01 04:16:09 +00:00
Sean Silva 6208924323 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277313
2016-08-01 02:59:26 +00:00
Daniel Berlin 5130cc831a Fix the MemorySSA updating API to enable people to create memory accesses before removing old ones
llvm-svn: 277309
2016-07-31 21:08:20 +00:00
Daniel Berlin cdda3ce478 Comment fixes to MemorySSA.h
llvm-svn: 277308
2016-07-31 21:08:10 +00:00
David Majnemer 6004952661 [COFF] Expose iterators for ImportAddressTableRVA
Patch by Bandzi Michal!

llvm-svn: 277298
2016-07-31 19:40:02 +00:00
David Majnemer 1c0aa04e7e [COFF] Remove a duplicate import_directory_table_entry definition
We had import_directory_table_entry and
coff_import_directory_table_entry, remove one.  Also, factor out the
logic which determins if a descriptor is a terminator.

llvm-svn: 277296
2016-07-31 19:25:21 +00:00
Chandler Carruth 974c67e7c6 [ADT] Add 'consume_front' and 'consume_back' methods to StringRef which
are very handy when parsing text.

They are essentially a combination of startswith and a self-modifying
drop_front, or endswith and drop_back respectively.

Differential Revision: https://reviews.llvm.org/D22723

llvm-svn: 277288
2016-07-31 02:19:13 +00:00
Lang Hames 6c64c2c86a [Support] Add doxygen @code tags to example code in Error comments.
llvm-svn: 277282
2016-07-30 21:34:04 +00:00
Hubert Tong 01a2cb55f1 TrailingObjects::FixedSizeStorage constexpr fixes + tests
Summary:
This change fixes issues with `LLVM_CONSTEXPR` functions and
`TrailingObjects::FixedSizeStorage`. In particular, some of the
functions marked `LLVM_CONSTEXPR` used by `FixedSizeStorage` were not
implemented such that they evaluate successfully as part of a constant
expression despite constant arguments.

This change also implements a more traditional template-meta path to
accommodate MSVC, and adds unit tests for `FixedSizeStorage`.

Drive-by fix: the access control for members of `TrailingObjectsImpl` is
tightened.

Reviewers: faisalv, rsmith, aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D22668

llvm-svn: 277270
2016-07-30 14:01:00 +00:00
Hubert Tong a216643cd3 MathExtras.h: add LLVM_CONSTEXPR where simple
Summary:
This change adds `LLVM_CONSTEXPR` to functions selected as follows:
- the body is already valid under C++11 for a `constexpr` function,
- the evaluation of the function, given constant arguments, will not
  fail during the evaluation of a constant expression, and
- the above properties are easily verifiable at a glance.

Note: the evaluation of the function cannot fail if the instantiation
triggers a static assertion failure.

Reviewers: faisalv, rsmith, aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D22824

llvm-svn: 277269
2016-07-30 13:38:51 +00:00
Benjamin Kramer 96cb6bfa27 Update modulemap for Msf -> MSF rename.
llvm-svn: 277267
2016-07-30 12:05:17 +00:00
Lang Hames c31d594228 [Orc] Add support for updating stub targets to CompileOnDemandLayer.
This makes it possible to implement re-optimization on top of the
CompileOnDemandLayer.

Test case to come in a future patch: This will need an execution test, and
execution tests require a full working stack. The best option is to plumb this
API up to the C Bindings stack and add a C bindings test for this.

Patch by Sean Ogden. Thanks Sean!

llvm-svn: 277257
2016-07-30 00:57:54 +00:00
Lang Hames 6ecbd16f00 [Support] Add storage specifier for MachO::NListType.
This should fix UB warnings from the sanitizer bots: LLD performs bit
manipulations on enums of this type, and these are UB if the underlying
storage type isn't specified.

llvm-svn: 277251
2016-07-29 23:17:53 +00:00
Tim Northover 5fb414d870 GlobalISel: support translation of intrinsic calls.
These come in two variants for now: G_INTRINSIC and G_INTRINSIC_W_SIDE_EFFECTS.
We may decide to split the latter up with finer-grained restrictions later, if
necessary.

llvm-svn: 277224
2016-07-29 22:32:36 +00:00
Rui Ueyama 7a5cdc6225 pdbdump: Dump Free Page Map contents.
Differential Revision: https://reviews.llvm.org/D22974

llvm-svn: 277216
2016-07-29 21:38:00 +00:00
Zachary Turner a3225b0451 [msf] Resubmit "Rename Msf -> MSF".
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.

I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.

llvm-svn: 277213
2016-07-29 20:56:36 +00:00
Tim Northover 6b3bd61283 CodeGen: add new "intrinsic" MachineOperand kind.
This will be used during GlobalISel, where we need a more robust and readable
way to write tests than a simple immediate ID.

llvm-svn: 277209
2016-07-29 20:32:59 +00:00
Adam Nemet 12937c361f [LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

         Dominator Tree Construction
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Rotate Loops
           Loop Invariant Code Motion
           Unswitch loops
         Simplify the CFG
         Dominator Tree Construction
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Combine redundant instructions
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Induction Variable Simplification
           Recognize loop idioms
           Delete dead loops
           Unroll loops
...

llvm-svn: 277203
2016-07-29 19:29:47 +00:00
Zachary Turner 334aec4dd2 Revert "[msf] Rename Msf to MSF."
This reverts commit 4d1557ffac41e079bcb1abbcf04f512474dcd6fe.

llvm-svn: 277194
2016-07-29 18:38:47 +00:00
Piotr Padlewski 5312b6f10c Fixing broken MSVS builds
llvm-svn: 277191
2016-07-29 18:28:07 +00:00
Zachary Turner a010f5cef0 [msf] Rename Msf to MSF.
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.

llvm-svn: 277190
2016-07-29 18:24:26 +00:00
Andrew Kaylor b99d1cc7ed Recommitting r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Third try, now following fixes to MSan to handle mempcy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!)

llvm-svn: 277189
2016-07-29 18:23:18 +00:00
Tim Northover 0d56e05a12 GlobalISel: make translate* functions take the most specialized class possible.
NFC.

llvm-svn: 277188
2016-07-29 18:11:21 +00:00
Tim Northover 69c2ba546f GlobalISel: add generic conditional branch.
Just the basic equivalent to DAG's condbr for now, we'll get to things like
br_cc when we start doing more legalization.

llvm-svn: 277184
2016-07-29 17:58:00 +00:00
Kevin Enderby f4586039f6 The next step along the way to getting good error messages for bad archives.
As mentioned in commit log for r276686 this next step is adding a new
method in the ArchiveMemberHeader class to get the full name that
does proper error checking, and can be use for error messages.

To do this the name of ArchiveMemberHeader::getName() is changed to
ArchiveMemberHeader::getRawName() to be consistent with
Archive::Child::getRawName().  Then the “new” method is the addition
of a new implementation of ArchiveMemberHeader::getName() which gets
the full name and provides proper error checking.  Which is mostly a rewrite
of what was Archive::Child::getName() and cleaning up incorrect uses of
llvm_unreachable() in the code which were actually just cases of errors
in the input Archives.

Then Archive::Child::getName() is changed to return Expected<> and use
the new implementation of ArchiveMemberHeader::getName() .

Also needed to change Archive::getMemoryBufferRef() with these
changes to return Expected<> as well to propagate Errors up.
As well as changing Archive::isThinMember() to return Expected<> .

llvm-svn: 277177
2016-07-29 17:44:13 +00:00
Tim Northover a51575ffa2 CodeGen: improve MachineInstrBuilder & MachineIRBuilder interface
For MachineInstrBuilder, having to manually use RegState::Define is ugly and
makes register definitions clunkier than they need to be, so this adds two
convenience functions: addDef and addUse.

For MachineIRBuilder, we want to avoid BuildMI's first-reg-is-def rule because
it's hidden away and causes bugs. So this patch switches buildInstr to
returning a MachineInstrBuilder and adding *all* operands via addDef/addUse.

NFC.

llvm-svn: 277176
2016-07-29 17:43:52 +00:00
Ahmed Bougacha 784e3423e6 [GlobalISel] Add G_XOR.
llvm-svn: 277172
2016-07-29 16:56:20 +00:00
Ahmed Bougacha 5c98b60ecc [GlobalISel] Add LLT raw_ostream operator<< overload.
Helpful when debugging; will be used in the following commit.

llvm-svn: 277170
2016-07-29 16:56:12 +00:00
Brendon Cahoon 254f889dc5 MachinePipeliner pass that implements Swing Modulo Scheduling
Software pipelining is an optimization for improving ILP by
overlapping loop iterations. Swing Modulo Scheduling (SMS) is
an implementation of software pipelining that attempts to
reduce register pressure and generate efficient pipelines with
a low compile-time cost.

This implementaion of SMS is a target-independent back-end pass.
When enabled, the pass should run just prior to the register
allocation pass, while the machine IR is in SSA form. If the pass
is successful, then the original loop is replaced by the optimized
loop. The optimized loop contains one or more prolog blocks, the
pipelined kernel, and one or more epilog blocks.

This pass is enabled for Hexagon only. To enable for other targets,
a couple of target specific hooks must be implemented, and the
pass needs to be called from the target's TargetMachine
implementation.

Differential Review: http://reviews.llvm.org/D16829

llvm-svn: 277169
2016-07-29 16:44:44 +00:00
Matt Masten a6669a1e05 Initial support for vectorization using svml (short vector math library).
Differential Revision: https://reviews.llvm.org/D19544

llvm-svn: 277166
2016-07-29 16:42:44 +00:00
Ahmed Bougacha 5831cc28b5 [GlobalISel] Auto-brief LowLevelType. NFC.
llvm-svn: 277163
2016-07-29 16:11:06 +00:00
Ahmed Bougacha 9d95557128 [GlobalISel] Add LLT::operator!=().
llvm-svn: 277162
2016-07-29 16:11:04 +00:00
Ahmed Bougacha 8292bdf735 [GlobalISel] Fix LLT::unsized to match LLT(LabelTy).
When coming from an IR label type, we set a 0 NumElements, but not
when constructing an LLT using unsized(), causing comparisons to fail.

Pick one variant and fix the other.

llvm-svn: 277161
2016-07-29 16:11:02 +00:00
Ahmed Bougacha 9f986bf3a9 [GlobalISel] Add unittests for LowLevelType.
llvm-svn: 277160
2016-07-29 16:10:57 +00:00
Sjoerd Meijer 0eb96ed0de TargetInstrInfo: add virtual function getInstSizeInBytes
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.

Differential Revision: https://reviews.llvm.org/D22885

llvm-svn: 277126
2016-07-29 08:16:16 +00:00
David Majnemer a926b3e71b [ConstantFolding] Remove an unused ConstantFoldInstOperands overload
No functional change is intended.

llvm-svn: 277101
2016-07-29 03:27:33 +00:00
David Majnemer d536f2328e [ConstnatFolding] Teach the folder how to fold ConstantVector
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.

Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.

llvm-svn: 277099
2016-07-29 03:27:26 +00:00
Piotr Padlewski c7a151464f Fixed comment
llvm-svn: 277091
2016-07-29 00:30:07 +00:00
Piotr Padlewski 84abc74f2c Added ThinLTO inlining statistics
Summary:
copypasta doc of ImportedFunctionsInliningStatistics class
 \brief Calculate and dump ThinLTO specific inliner stats.
 The main statistics are:
 (1) Number of inlined imported functions,
 (2) Number of imported functions inlined into importing module (indirect),
 (3) Number of non imported functions inlined into importing module
 (indirect).
 The difference between first and the second is that first stat counts
 all performed inlines on imported functions, but the second one only the
 functions that have been eventually inlined to a function in the importing
 module (by a chain of inlines). Because llvm uses bottom-up inliner, it is
 possible to e.g. import function `A`, `B` and then inline `B` to `A`,
 and after this `A` might be too big to be inlined into some other function
 that calls it. It calculates this statistic by building graph, where
 the nodes are functions, and edges are performed inlines and then by marking
 the edges starting from not imported function.

 If `Verbose` is set to true, then it also dumps statistics
 per each inlined function, sorted by the greatest inlines count like
 - number of performed inlines
 - number of performed inlines to importing module

Reviewers: eraman, tejohnson, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D22491

llvm-svn: 277089
2016-07-29 00:27:16 +00:00
Justin Lebar 9cbc301035 Revert "Don't invoke getName() from Function::isIntrinsic().", rL276942.
This broke some out-of-tree AMDGPU tests that relied on the old behavior
wherein isIntrinsic() would return true for any function that starts
with "llvm.".  And in general that change will not play nicely with
out-of-tree backends.

llvm-svn: 277087
2016-07-28 23:58:15 +00:00
Sanjoy Das c6af5ead86 [IR] Introduce a non-integral pointer type
Summary:
This change adds a `ni` specifier in the `datalayout` string to denote
pointers in some given address spaces as "non-integral", and adds some
typing rules around these special pointers.

Reviewers: majnemer, chandlerc, atrick, dberlin, eli.friedman, tstellarAMD, arsenm

Subscribers: arsenm, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22488

llvm-svn: 277085
2016-07-28 23:43:38 +00:00
Adam Nemet aa3506c5f0 [BPI] Add new LazyBPI analysis
Summary:
The motivation is the same as in D22141: In order to add the hotness
attribute to optimization remarks we need BFI to be available in all
passes that emit optimization remarks.  BFI depends on BPI so unless we
make this lazy as well we would still compute BPI unconditionally.

The solution is to use the new LazyBPI pass in LazyBFI and only compute
BPI when computation of BFI is requested by the client.

I extended the laziness test using a LoopDistribute test to also cover
BPI.

Reviewers: hfinkel, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22835

llvm-svn: 277083
2016-07-28 23:31:12 +00:00
Michael Kuperstein e45d4d9b35 [PM] Port LowerGuardIntrinsic to the new PM.
llvm-svn: 277057
2016-07-28 22:08:41 +00:00
David Majnemer 3d32b7ed0d [coroutines] Part 3 of N: Adding Boilerplate for Coroutine Passes
This adds boilerplate code for all coroutine passes,
the passes are no-ops for now.
Also, a small test has been added to verify that passes execute in
the expected order or not at all if coroutine support is disabled.

Patch by Gor Nishanov!

Differential Revision: https://reviews.llvm.org/D22847

llvm-svn: 277033
2016-07-28 21:04:31 +00:00
Zachary Turner e98137c47f [pdb] Fix some warnings that break -Werror builds.
llvm-svn: 277021
2016-07-28 19:18:02 +00:00
Zachary Turner d66889cbae [pdb] Refactor library to more clearly separate reading/writing
Reviewed By: amccarth, ruiu
Differential Revision: https://reviews.llvm.org/D22693

llvm-svn: 277019
2016-07-28 19:12:28 +00:00
Zachary Turner 199f48a5f0 Get rid of IMsfStreamData class.
This was a pure virtual base class whose purpose was to abstract
away the notion of how you retrieve the layout of a discontiguous
stream of blocks in an Msf file.  This led to too many layers of
abstraction making it difficult to figure out what was going on
and extend things.  Ultimately, a stream's layout is decided by
its length and the array of block numbers that it lives on.  So
rather than have an abstract base class which can return this in
any number of ways, it's more straightforward to simply store them
as fields of a trivial struct, and also to give a more appropriate
name.

This patch does that.  It renames IMsfStreamData to MsfStreamLayout,
and deletes the 2 concrete implementations, DirectoryStreamData
and IndexedStreamData.  MsfStreamLayout is a trivial struct
with the necessary data.

llvm-svn: 277018
2016-07-28 19:11:09 +00:00
Matthias Braun 941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
John Brawn 2853269224 Revert r276973 "Adjust Registry interface to not require plugins to export a registry"
Buildbot failures when building with clang -Werror. Reverting while I try to
figure this out.

llvm-svn: 277008
2016-07-28 17:17:22 +00:00
Ahmed Bougacha 46c05fc861 [GlobalISel] Remove types on selected insts instead of using LLT().
LLT() has a particular meaning: it's one invalid type. But we really
want selected instructions to have no type whatsoever.

Also verify that types don't linger after ISel, and enable the verifier
on the AArch64 select test.

llvm-svn: 277001
2016-07-28 16:58:27 +00:00
Wei Ding 07e03712d3 AMDGPU : Add intrinsics for compare with the full wavefront result
Differential Revision: http://reviews.llvm.org/D22482

llvm-svn: 276998
2016-07-28 16:42:13 +00:00
John Brawn 778c3c6c61 Reapply r276856 "Adjust Registry interface to not require plugins to export a registry"
This version has two fixes compared to the original:
 * In Registry.h the template static members are instantiated before they are
   used, as clang gives an error if you do it the other way around.
 * The use of the Registry template in clang-tidy is updated in the same way as
   has been done everywhere else.

Original commit message:

Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.

This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.

llvm-svn: 276973
2016-07-28 12:48:17 +00:00
Zijiao Ma e56a53a9b3 Add unittests to {ARM | AArch64}TargetParser.
Add unittest to {ARM | AArch64}TargetParser,and by the way correct problems as below:
1.Correct a incorrect indexing problem in AArch64TargetParser. The architecture enumeration
 is shared across ARM and AArch64 in original implementation.But In the code,I just used the
 index which was offset by the ARM, and this would index into the array incorrectly. To make
 AArch64 has its own arch enum,or we will do a lot of slowly iterating.
2.Correct a spelling error. The parameter of llvm::AArch64::getArchExtName.
3.Correct a writing mistake, in llvm::ARM::parseArchISA.

Differential Revision: https://reviews.llvm.org/D21785

llvm-svn: 276957
2016-07-28 06:11:18 +00:00
David Majnemer 6e9b47bc8a Add EP_CGSCCOptimizerLate extension point to PassManagerBuilder
The EP_CGSCCOptimizerLate extension point allows adding CallGraphSCC
passes at the end of the main CallGraphSCC passes and before any
function simplification passes run by CGPassManager.

Patch by Gor Nishanov!

Differential Revision: https://reviews.llvm.org/D22897

llvm-svn: 276953
2016-07-28 03:28:43 +00:00
Justin Lebar 45bcdcbefb Don't invoke getName() from Function::isIntrinsic().
Summary:
getName() involves a hashtable lookup, so is expensive given how
frequently isIntrinsic() is called.  (In particular, many users cast to
IntrinsicInstr or one of its subclasses before calling
getIntrinsicID().)

This has an incidental functional change: Before, isIntrinsic() would
return true for any function whose name started with "llvm.", even if it
wasn't properly an intrinsic.  The new behavior seems more correct to
me, because it's strange to say that isIntrinsic() is true, but
getIntrinsicId() returns "not an intrinsic".

Some callers want the old behavior -- they want to know whether the
caller is a recognized intrinsic, or might be one in some other version
of LLVM.  For them, we added Function::hasLLVMReservedName(), which
checks whether the name starts with "llvm.".

This change is good for a 1.5% e2e speedup compiling a large Eigen
benchmark.

Reviewers: bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22065

llvm-svn: 276942
2016-07-27 23:46:57 +00:00
George Burgess IV dbd35c44d4 [CFLAA] Add getModRefBehavior to CFLAnders.
This patch lets CFLAnders respond to mod-ref queries. It also includes
a small bugfix to CFLSteens.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22823

llvm-svn: 276939
2016-07-27 23:07:07 +00:00
Duncan P. N. Exon Smith 1723821d17 CodeGen: Make iterator-to-pointer conversion explicit, NFC
Remove the implicit conversion from MachineInstrBundleIterator to
MachineInstr*, leaving behind an explicit conversion.

I *think* this is the last ilist_iterator-related implicit conversion to
ilist_node subclass.  If I'm right, I can finally dig in and fix the UB
in ilist that these conversions were relying on.

Note that the implicit users of this conversion have already been
removed.  If you have out-of-tree code that doesn't update, you might be
able to buy some time by temporarily reverting this commit.

llvm-svn: 276902
2016-07-27 18:45:18 +00:00
David Majnemer 22039e6858 Fix the build for libstdc++ 4.7
libstdc++ 4.7 doesn't have emplace.  Use std::map::insert instead.

llvm-svn: 276901
2016-07-27 18:25:12 +00:00
Reid Kleckner 46cb48c74a Remove MCAsmInfo.h include from TargetOptions.h
TargetOptions wants the ExceptionHandling enum. Move that to
MCTargetOptions.h to avoid transitively including Dwarf.h everywhere in
clang. Now you can add a DWARF tag without a full rebuild of clang
semantic analysis.

llvm-svn: 276883
2016-07-27 16:03:57 +00:00
Ahmed Bougacha 6756a2c953 [GlobalISel] Introduce an instruction selector.
And implement it for AArch64, supporting x/w ADD/OR.

Differential Revision: https://reviews.llvm.org/D22373

llvm-svn: 276875
2016-07-27 14:31:55 +00:00
Tim Northover 8658b3c393 GlobalISel: remove variable_ops from output list.
The instance in the input operand list allows both inputs and outputs,
but the one in (outs) is not treated specially which leads to the
MachineVerifier invoking UB (looking at an invalid MCInstrDesc field).

No functional change except in UBSan builds (maybe, who knows!), where
it fixes the legalize-add.mir test.

llvm-svn: 276872
2016-07-27 14:30:49 +00:00
Daniel Sanders c5537427c2 [mips][ias] Check '$rs = $rd' constraints when both registers are in AsmText.
Summary:
This is one possible solution to the problem of ignoring constraints that Simon
raised in D21473 but it's a bit of a hack.

The integrated assembler currently ignores violations of the tied register
constraints when the operands involved in a tie are both present in the AsmText.
For example, 'dati $rs, $rt, $imm' with the '$rs = $rt' will silently replace
$rt with $rs. So 'dati $2, $3, 1' is processed as if the user provided
'dati $2, $2, 1' without any diagnostic being emitted.

This is difficult to solve properly because there are multiple parts of the
matcher that are silently forcing these constraints to be met. Tied operands are
rendered to instructions by cloning previously rendered operands but this is
unnecessary because the matcher was already instructed to render the operand it
would have cloned. This is also unnecessary because earlier code has already
replaced the MCParsedOperand with the one it was tied to (so the parsed input
is matched as if it were 'dati <RegIdx 2>, <RegIdx 2>, <Imm 1>'). As a result,
it looks like fixing this properly amounts to a rewrite of the tied operand
handling which affects all targets.

This patch however, merely inserts a checking hook just before the
substitution of MCParsedOperands and the Mips target overrides it. It's not
possible to accurately check the registers are the same this early (because
numeric registers haven't been bound to a register class yet) so it cheats a
bit and checks that the tokens that produced the operand are lexically
identical. This works because tied registers need to have the same register
class but it does have a flaw. It will reject 'dati $4, $a0, 1' for violating
the constraint even though $a0 ends up as the same register as $4.

Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: https://reviews.llvm.org/D21994

llvm-svn: 276867
2016-07-27 13:49:44 +00:00
John Brawn 3839263204 Revert r276856 "Adjust Registry interface to not require plugins to export a registry"
This is causing a huge pile of buildbot failures.

llvm-svn: 276857
2016-07-27 11:41:18 +00:00
John Brawn 63aff61019 Adjust Registry interface to not require plugins to export a registry
Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.

This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.

Differential Revision: http://reviews.llvm.org/D21385

llvm-svn: 276856
2016-07-27 11:18:38 +00:00
Simon Pilgrim 470b81ca69 Removed unusued template function declaration that has no definition - fixes MSVC warning.
llvm-svn: 276852
2016-07-27 10:11:05 +00:00
Sean Silva 285e0974f0 Refactor - CodeExtractor : Move check for valid block to static utility
This lets you actually check to see if a block is valid before trying to
extract.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22699

llvm-svn: 276846
2016-07-27 08:02:46 +00:00
David Majnemer 7855719c10 [coroutines] Part 2 of N: Adding Coroutine Intrinsics
This is the second patch in the coroutine series. It adds coroutine
intrinsics and updates intrinsic cost in TargetTransformInfoImpl.h.

Patch by Gor Nishanov!

Differential Revision: https://reviews.llvm.org/D22659

llvm-svn: 276839
2016-07-27 05:12:35 +00:00
Sebastian Pop 18c964d7a4 add a verbose mode to Loop->print() to print all the basic blocks of a loop
Differential Revision: https://reviews.llvm.org/D22817

llvm-svn: 276838
2016-07-27 05:02:17 +00:00
Sebastian Pop 9570bfd349 add function isLoopLatch
Differential Revision: https://reviews.llvm.org/D22817

llvm-svn: 276837
2016-07-27 05:02:15 +00:00
Sebastian Pop 57a127a7d1 refactor code in verifyLoop: NFC.
Use std::any_of as requested in https://reviews.llvm.org/D22816

llvm-svn: 276835
2016-07-27 04:36:06 +00:00
Sebastian Pop e5730a27f2 Move assert as early as possible
Patch written by Aditya Kumar.

Differential Revision: https://reviews.llvm.org/D22816

llvm-svn: 276830
2016-07-27 03:30:11 +00:00
Andrew Kaylor f990fa5f7b Reverting r276771 due to MSan failures.
llvm-svn: 276824
2016-07-27 01:19:24 +00:00
Hans Wennborg 685e8ff953 Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion."
It causes Clang tests to fail after Windows self-host (PR28705).

(Also reverts follow-up r276139.)

llvm-svn: 276822
2016-07-26 23:25:13 +00:00
Tim Northover ad2b717f2c GlobalISel: add generic load and store instructions.
Pretty straightforward, the only oddity is the MachineMemOperand (which it's
surprisingly difficult to share code for).

llvm-svn: 276799
2016-07-26 20:23:26 +00:00
David Majnemer 6774d612d4 [InstSimplify] Cast folding can be made more generic
Use isEliminableCastPair to determine if a pair of casts are foldable.

llvm-svn: 276777
2016-07-26 17:58:05 +00:00
Andrew Kaylor 3104a6bad0 Re-committing r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 276771
2016-07-26 17:23:13 +00:00
Matt Arsenault 32fc527c65 AMDGPU: Add fp legacy instruction intrinsics
This could use some additional optimization work
to use mad/mac legacy.

llvm-svn: 276764
2016-07-26 16:45:45 +00:00
Tim Northover 756eca35cf GlobalISel: add specialized buildCopy function to MachineInstrBuilder.
NFC.

llvm-svn: 276763
2016-07-26 16:45:30 +00:00
Tim Northover cc5f76226b GlobalISel: give MachineInstrBuilder a uniform interface. NFC.
Instead of an ad-hoc collection of "buildInstr" functions with varying numbers
of registers, this uses variadic templates to provide for as many regs as
needed!

Also make IRtranslator use new "buildBr" function instead of some weird generic
one that no-one else would really use.

llvm-svn: 276762
2016-07-26 16:45:26 +00:00
Oliver Stannard 2171828a49 [ARM] Implement -mimplicit-it assembler option
This option, compatible with gas's -mimplicit-it, controls the
generation/checking of implicit IT blocks in ARM/Thumb assembly.

This option allows two behaviours that were not possible before:
- When in ARM mode, emit a warning when assembling a conditional
  instruction that is not in an IT block. This is enabled with
  -mimplicit-it=never and -mimplicit-it=thumb.
- When in Thumb mode, automatically generate IT instructions when an
  instruction with a condition code appears outside of an IT block. This
  is enabled with -mimplicit-it=thumb and -mimplicit-it=always.

The default option is -mimplicit-it=arm, which matches the existing
behaviour (allow conditional ARM instructions outside IT blocks without
warning, and error if a conditional Thumb instruction is outside an IT
block).

The general strategy for generating IT blocks in Thumb mode is to keep a
small list of instructions which should be in the IT block, and only
emit them when we encounter something in the input which means we cannot
continue the block.  This could be caused by:
- A non-predicable instruction
- An instruction with a condition not compatible with the IT block
- The IT block already contains 4 instructions
- A branch-like instruction (including ALU instructions with the PC as
  the destination), which cannot appear in the middle of an IT block
- A label (branching into an IT block is not legal)
- A change of section, architecture, ISA, etc
- The end of the assembly file.

Some of these, such as change of section and end of file, are parsed
outside of the ARM asm parser, so I've added a new virtual function to
AsmParser to ensure any previously-parsed instructions have been
emitted. The ARM implementation of this flushes the currently pending IT
block.

We now have to try instruction matching up to 3 times, because we cannot
know if the current IT block is valid before matching, and instruction
matching changes depending on the IT block state (due to the 16-bit ALU
instructions, which set the flags iff not in an IT block). In the common
case of not having an open implicit IT block and the instruction being
matched not needing one, we still only have to run the matcher once.

I've removed the ITState.FirstCond variable, because it does not store
any information that isn't already represented by CurPosition. I've also
updated the comment on CurPosition to accurately describe it's meaning
(which this patch doesn't change).

Differential Revision: https://reviews.llvm.org/D22760

llvm-svn: 276747
2016-07-26 14:19:47 +00:00
David Majnemer a90a621d1e Reapply: [InstSimplify] Add support for bitcasts"
This reverts commit r276700 and reapplies r276698.
The relevant clang tests have been updated.

llvm-svn: 276727
2016-07-26 05:52:29 +00:00
Amaury Sechet 06ac2f4a7e Propery format doccomment in lto.h . NFC
llvm-svn: 276725
2016-07-26 04:20:30 +00:00
Michael Kuperstein d5617b5545 Attempt to pacify windows bots, again.
llvm-svn: 276703
2016-07-25 22:29:04 +00:00
David Majnemer 6e06b577cc Revert "[InstSimplify] Add support for bitcasts"
This reverts commit r276698.  Clang has tests which rely on the
optimizer :(

llvm-svn: 276700
2016-07-25 22:24:59 +00:00
David Majnemer 62611fd3f7 [InstSimplify] Add support for bitcasts
BitCasts of BitCasts can be folded away as can BitCasts which don't
change the type of the operand.

llvm-svn: 276698
2016-07-25 22:04:58 +00:00
Tim Northover 7c9eba90ff GlobalISel: add generic casts to IRTranslator
This adds LLVM's 3 main cast instructions (inttoptr, ptrtoint, bitcast) to the
IRTranslator. The first two are direct translations (with 2 MachineInstr types
each). Since LLT discards information, a bitcast might become trivial and we
emit a COPY in those cases instead.

llvm-svn: 276690
2016-07-25 21:01:29 +00:00
Michael Kuperstein 39feb6290c [PM] Port SymbolRewriter to the new PM
Differential Revision: https://reviews.llvm.org/D22703

llvm-svn: 276687
2016-07-25 20:52:00 +00:00
Kevin Enderby 95b0842e64 Next step along the way to getting good error messages for bad archives.
I consulted with Lang Hames on this work, and the goal was to add a bit
of "where" in the archive the error occurred along with what the error was.

So this step changes ArchiveMemberHeader into a class with a pointer
to the archive header and the parent archive.  Which allows the methods
in the ArchiveMemberHeader to determine which member the header is
for to include that information in the error message.

For this first step the "where" is just the offset to the member in the
archive.  The next step will be a new method on ArchiveMemberHeader
to get the full name, if possible, to be use in the error message.  Which
will now be possible as ArchiveMemberHeader contains a pointer to
the Archive with its string table and its size, etc. so the full name can
be determined from the header if it is valid.

Also this change adds the missing checks the archive header is actually
contained in the buffer and is not truncated, as well as if the terminating
characters are correct in the header.

And changes one error message in Archive::Child::getNext() where the
name or offset to member is now added.

llvm-svn: 276686
2016-07-25 20:36:36 +00:00
Jordan Rose 10697a7c34 Fix r276671 to not use a defaulted move constructor.
MSVC won't provide the body of this move constructor and assignment
operator, possibly because the copy constructor is banned. Just write
it manually.

llvm-svn: 276685
2016-07-25 20:34:25 +00:00
Jan Vesely b64c8925e9 AMDGPU: Remove read_workdim intrinsic
Differential revision: https://reviews.llvm.org/D22732

llvm-svn: 276682
2016-07-25 20:17:02 +00:00
Matt Arsenault df3e224632 LiveIntervals: Return index from replaceMachineInstrInMaps
Fixes weird asymmetry with insertion

llvm-svn: 276678
2016-07-25 19:39:04 +00:00
Jordan Rose f85a95fdcb StringSwitch cannot be copied (take 2).
This prevents StringSwitch from being used with 'auto', which is
important because the inferred type is StringSwitch rather than the
result type. This is a problem because StringSwitch stores addresses
of temporary values rather than copying or moving the value into its
own storage.

This is a compromise that still allows wrapping StringSwitch in other
temporary structures, which (unlike StringSwitch) may be non-trivial
to set up and therefore want to at least be movable. (For an example,
see QueryParser.cpp in clang-tools-extra.)

Changing this uncovered the bug in PassBuilder, also in this patch.
Clang doesn't seem to have any occurrences of the issue.

Re-commit of r276652.

llvm-svn: 276671
2016-07-25 18:34:51 +00:00
Zachary Turner 336e919b01 Add a modulemap for LLVMDebugInfoMsf.
Differential Revision: https://reviews.llvm.org/D22769

llvm-svn: 276669
2016-07-25 18:18:59 +00:00
Michael Kuperstein 8f8e1d1bf6 Don't use iplist in SymbolRewriter. NFC.
There didn't appear to be a good reason to use iplist in this case, a regular
list of unique_ptr works just as well.
Change made in preparation to a new PM port (since iplist is not moveable).

llvm-svn: 276668
2016-07-25 18:10:54 +00:00
Jordan Rose 9978dec4c2 Revert "StringSwitch cannot be copied or moved."
This reverts commit r276652. The clang-query tool is currently
relying on this behavior. I'll try again later.

llvm-svn: 276661
2016-07-25 17:28:33 +00:00
Joel Jones 373d7d30dd MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.

The corresponding change to clang is in: http://reviews.llvm.org/D16538

Patch by: Joel Jones

Differential Revision: https://reviews.llvm.org/D16213

llvm-svn: 276654
2016-07-25 17:18:28 +00:00
Jordan Rose 0cdbe7a572 StringSwitch cannot be copied or moved.
...but most importantly, it cannot be used well with 'auto', because
the inferred type is StringSwitch rather than the result type. This
is a problem because StringSwitch stores addresses of temporary
values rather than copying or moving the value into its own storage.

Changing this uncovered the bug in PassBuilder, also in this patch.
Clang doesn't seem to have any occurrences of the issue.

llvm-svn: 276652
2016-07-25 17:08:24 +00:00
Sean Silva fe5abd5e0c Fix : Partial Inliner requires AssumptionCacheTracker
The public InlineFunction utility assumes that the passed in
InlineFunctionInfo has a valid AssumptionCacheTracker.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22706

llvm-svn: 276609
2016-07-25 05:00:00 +00:00
Elena Demikhovsky 376a18bd92 [Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
  A[i] = x;
  x -= fp_inc;
}

The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.

Differential Revision: https://reviews.llvm.org/D21330

llvm-svn: 276554
2016-07-24 07:24:54 +00:00
Xinliang David Li 9239245401 [Profile] Use explicit flag to enable IR PGO
Patch by Jake VanAdrighem

Differential Revision: http://reviews.llvm.org/D22607

llvm-svn: 276516
2016-07-23 04:28:52 +00:00
Sean Silva ab6a683765 Avoid using a raw AssumptionCacheTracker in various inliner functions.
This unblocks the new PM part of River's patch in
https://reviews.llvm.org/D22706

Conveniently, this same change was needed for D21921 and so these
changes are just spun out from there.

llvm-svn: 276515
2016-07-23 04:22:50 +00:00
Sanjoy Das a7d9ec8751 [SCEV] Make isImpliedCondOperandsViaRanges smarter
This change lets us prove things like

  "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow"

It does this by replacing replacing getConstantDifference by
computeConstantDifference (which is smarter) in
isImpliedCondOperandsViaRanges.

llvm-svn: 276505
2016-07-23 00:54:36 +00:00
Sanjoy Das 0b1af85cc2 [SCEV] Change the interface of computeConstantDifference; NFC
This is in preparation of
s/getConstantDifference/computeConstantDifference/ in a later change.

llvm-svn: 276503
2016-07-23 00:28:56 +00:00
Adam Nemet 9e6e63fba2 [LoopDataPrefetch] Include hotness of region in opt remark
llvm-svn: 276488
2016-07-22 22:53:17 +00:00
Tim Northover 98a56eb7f4 GlobalISel: allow multiple types on MachineInstrs.
llvm-svn: 276481
2016-07-22 22:13:36 +00:00
Vedant Kumar c9d3a173fb [Coverage] Mark more methods const (NFC)
llvm-svn: 276474
2016-07-22 21:11:55 +00:00
Anna Thomas 58d1192a22 Add invariant start call creation in IRBuilder.NFC
Differential Revision: https://reviews.llvm.org/D22700

llvm-svn: 276471
2016-07-22 20:57:23 +00:00
Pete Cooper fea2139740 Use RValue refs in APInt add/sub methods.
This adds versions of operator + and - which are optimized for the LHS/RHS of the
operator being RValue's.  When an RValue is available, we can use its storage space
instead of allocating new space.

On code such as ConstantRange which makes heavy use of APInt's over 64-bits in size,
this results in significant numbers of saved allocations.

Thanks to David Blaikie for all the review and most of the code here.

llvm-svn: 276470
2016-07-22 20:55:46 +00:00
Sanjoy Das 095f5b204f [SCEV] Extract out a helper function; NFC
The helper will get smarter in a later change, but right now this is
just code reorganization.

llvm-svn: 276467
2016-07-22 20:47:55 +00:00
Tim Northover 33b07d6725 GlobalISel: implement legalization pass, with just one transformation.
This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.

llvm-svn: 276461
2016-07-22 20:03:43 +00:00
Zachary Turner e4a4f33daf Make PDBFile store an msf::Layout.
Previously it was storing all the fields of an msf::Layout as
separate members.  This is a trivial cleanup to make it store
an msf::Layout directly.  This makes the code more readable
since it becomes clear which fields of PDBFile are actually the
msf specific layout information in a sea of other bookkeeping
fields.

llvm-svn: 276460
2016-07-22 19:56:33 +00:00
Zachary Turner e109dc63f9 [pdb] Have builders share a single BumpPtrAllocator.
This makes it easier to have the writable and readable PDB
interfaces share code since the read/write and write-only
interfaces now share a single allocator, you don't have to worry
about a builder building a read only interface and then having
the read-only interface's data become corrupt when the builder
goes out of scope.  Now the allocator is specified explicitly
to all constructors, so all interfaces can share a single allocator
that is scoped appropriately.

llvm-svn: 276459
2016-07-22 19:56:26 +00:00
Zachary Turner bac69d33d0 [msf] Create LLVMDebugInfoMsf
This provides a better layering of responsibilities among different
aspects of PDB writing code.  Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this.  Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF.  So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.

llvm-svn: 276458
2016-07-22 19:56:05 +00:00
Wei Mi e04d0eff29 [PM] Port BreakCriticalEdges to the new PM.
Differential Revision: https://reviews.llvm.org/D22688

llvm-svn: 276449
2016-07-22 18:04:25 +00:00
Anna Thomas 0be4a0e6a4 Invariant start/end intrinsics overloaded for address space
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address
space.

With this change, these intrinsics are overloaded for any adddress space
for memory objects
and we can use these llvm invariant intrinsics in non-default address
spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant
memory in managed languages.

Reviewers: apilipenko, reames

Subscribers: llvm-commits
llvm-svn: 276447
2016-07-22 17:49:40 +00:00
Matt Arsenault 8d718dcfda AMDGPU: Add HSA dispatch id intrinsic
llvm-svn: 276437
2016-07-22 17:01:30 +00:00
Tim Northover bd5054602e GlobalISel: implement alloca instruction
llvm-svn: 276433
2016-07-22 16:59:52 +00:00
Xinliang David Li c27f1b7182 [Profile] Cleanup: remove unused interface
llvm-svn: 276431
2016-07-22 16:11:56 +00:00
Lang Hames 5e51a2e31a [Support] Make ErrorAsOutParameter take an Error* rather than an Error&.
This allows ErrorAsOutParameter to work better with "optional" errors. For
example, consider a function where for certain input values it is known that
the function can't fail. This can now be written as:

Result foo(Arg X, Error *Err) {
  ErrorAsOutParameter EAO(Err);

  if (<Error Condition>) {
    if (Err)
      *Err = <report error>;
    else
      llvm_unreachable("Unexpected failure!");
  }
}

Rather than having to construct an ErrorAsOutParameter under every conditional
where Err is known to be non-null.

llvm-svn: 276430
2016-07-22 16:11:25 +00:00
Zachary Turner b383d628df [pdb] Move file layout header structs to RawTypes.h
This facilitates code reuse between the builder classes and the
"frozen" read only versions of the classes used for parsing
existing PDB files.

llvm-svn: 276427
2016-07-22 15:46:46 +00:00
Zachary Turner d218c26124 [pdb] Round-trip module & file info to/from YAML.
This implements support for writing compiland and compiland source
file info to a binary PDB.  This is tested by adding support for
dumping these fields from an existing PDB to yaml, reading them
back in, and dumping them again and verifying the values are as
expected.

llvm-svn: 276426
2016-07-22 15:46:37 +00:00
Simon Pilgrim ea0d4f9962 [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied)
As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.

This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.

We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).

Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly).

Differential Revision: https://reviews.llvm.org/D22460

llvm-svn: 276416
2016-07-22 13:58:44 +00:00
Xinliang David Li d382e9d82b Sync up InstrProfData.inc with compiler-rt with fixes to references
llvm-svn: 276388
2016-07-22 04:46:56 +00:00
Xinliang David Li 6e9dee999d Revert 276386
llvm-svn: 276387
2016-07-22 04:31:26 +00:00
Xinliang David Li a182e58265 Sync up InstrProfData.inc with compiler-rt
llvm-svn: 276386
2016-07-22 04:18:17 +00:00
Pete Cooper b2ba776aed Avoid dsymutil calls to getFileNameByIndex.
This change adds a hasFileAtIndex method. getChildDeclContext can first call this method, and if it returns true it knows it can then lookup the resolved path cache for the given file index. If we hit that cache then we don't even have to call getFileNameByIndex.

Running dsymutil against the swift executable built from github gives a 20% performance improvement without any change in the binary.

Differential Revision: https://reviews.llvm.org/D22655

Reviewed by friss.

llvm-svn: 276380
2016-07-22 01:41:32 +00:00
Xinliang David Li 6f8c504f10 [Profile] deprecate __llvm_profile_override_default_filename
This eliminates unncessary calls and init functions.

Differential Revision: http://reviews.llvm.org/D22613

llvm-svn: 276354
2016-07-21 23:19:10 +00:00
Wei Mi 1cf58f8996 [PM] Port NaryReassociate to the new PM
Differential Revision: https://reviews.llvm.org/D22648

llvm-svn: 276349
2016-07-21 22:28:52 +00:00
Rong Xu 97b68c5ebe [PGO] Make needsComdatForCounter() available (NFC)
Move needsComdatForCounter() to lib/ProfileData/InstrProf.cpp from
lib/Transforms/Instrumentation/InstrProfiling.cpp to make is available for
other files.

Differential Revision: https://reviews.llvm.org/D22643

llvm-svn: 276330
2016-07-21 20:50:02 +00:00
Anna Thomas c858faa244 Revert "Invariant start/end intrinsics overloaded for address space"
This reverts commit r276316.

llvm-svn: 276320
2016-07-21 19:06:28 +00:00
Anna Thomas 29b24dfe44 Invariant start/end intrinsics overloaded for address space
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.

With this change, these intrinsics are overloaded for any adddress space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant memory in managed languages.

Reviewers: tstellarAMD, reames, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22519

llvm-svn: 276316
2016-07-21 18:41:44 +00:00
Quentin Colombet 2b59eab79f [IRTranslator] Add G_SUB opcode.
This commit adds a generic SUB opcode to global-isel.

llvm-svn: 276308
2016-07-21 17:26:50 +00:00
Quentin Colombet 7bcc921dd8 [IRTranslator] Add G_AND opcode.
This commit adds a generic AND opcode to global-isel.

llvm-svn: 276297
2016-07-21 15:50:42 +00:00
Konstantin Zhuravlyov 155626238b AMDGPU/SI: Add support for R_AMDGPU_ABS32
Differential Revision: https://reviews.llvm.org/D21646

llvm-svn: 276294
2016-07-21 15:29:19 +00:00
Benjamin Kramer 3f1edd7e0c Weaken ThreadSafeRefCountedBase atomics.
Doesn't make a difference on x86, but avoids memory barriers on
weakly-ordered archs like PowerPC and ARM.

llvm-svn: 276291
2016-07-21 15:06:50 +00:00
Benjamin Kramer 857754a1cb [DenseMap] Add a C++17-style try_emplace method.
This provides an elegant pattern to solve the "construct if not in map
already" problem we have many times in LLVM. Without try_emplace we
either have to rely on a sentinel value (nullptr) or do two lookups.

llvm-svn: 276277
2016-07-21 13:37:53 +00:00
Benjamin Kramer eab3d36753 Rename StringMap::emplace_second to try_emplace.
Coincidentally this function maps to the C++17 try_emplace. Rename it
for consistentcy with C++17 std::map. NFC.

llvm-svn: 276276
2016-07-21 13:37:48 +00:00
Amaury Sechet 17b67cd1ad Expose AttributeSetNode, use it to provide aggregate getter for attribute in the C API.
Summary: See D19181 for context.

Reviewers: whitequark, Wallbraker, jyknight, echristo, bkramer, void

Subscribers: mehdi_amini

Differential Revision: http://reviews.llvm.org/D21265

llvm-svn: 276236
2016-07-21 04:25:06 +00:00
Adam Nemet cbe2a9b213 [OptDiag] Missed these when making the IR Value a const pointer
llvm-svn: 276224
2016-07-21 01:11:12 +00:00
Adam Nemet 7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Adam Nemet 0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Tim Northover cffc0d20fb GlobalISel: Remove explicit enumerator values from .def file.
They were all auto-incremented from 0 anyway, and I'm getting really annoying
conflicts and runtime failures when different people add more for GlobalISel
(and even when I'm refactoring my own patches).

NFC.

llvm-svn: 276204
2016-07-20 22:58:01 +00:00
Adam Nemet 5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
Adam Nemet 6100d16e7d [OptDiag] Take the IR Value as a const pointer
This helps because LoopAccessReport is passed around as a const
reference and we derive the basic block passed as the Value parameter
from the instruction in LoopAccessReport.

llvm-svn: 276191
2016-07-20 21:44:22 +00:00
Tim Northover 75ad077330 GlobalISel: implement Legalization querying framework.
This adds an (incomplete, inefficient) framework for deciding what to do with
some operation on a given type.

llvm-svn: 276184
2016-07-20 21:13:29 +00:00
George Burgess IV 400ae40348 [MSSA] Add an overload for getClobberingMemoryAccess.
A seemingly common use for the walker's getClobberingMemoryAccess
function is:

```
MemoryAccess *getClobber(MemorySSAWalker *W, MemoryUseOrDef *MUD) {
  const Instruction *I = MUD->getMemoryInst();
  return W->getClobberingMemoryAccess(I);
}
```

Which is kind of redundant, since walkers will ultimately query MSSA to
find out which MemoryAccess `I` maps to (...which is always `MUD`).

So, this patch adds an overload of getClobberingMemoryAccess that
accepts MemoryAccesses directly. As a result, the Instruction overload
of getClobberingMemoryAccess becomes a lightweight wrapper around our
new overload.

Additionally, this patch un`virtual`izes the Instruction overload of
getClobberingMemoryAccess, since there doesn't seem to be a walker that
benefits from that being virtual, and I can't think of how else one
would implement it. Happy to make it virtual again if we would benefit
from doing so.

llvm-svn: 276169
2016-07-20 19:51:34 +00:00
Tim Northover d3f047a38f GlobalISel: properly conditionalize LLT use.
We can't guard the include of LowLevelType.h because getType and setType are
(trivial) functions even when GlobalISel isn't built.

llvm-svn: 276160
2016-07-20 19:17:29 +00:00
Tim Northover 62ae568bbb GlobalISel: implement low-level type with just size & vector lanes.
This should be all the low-level instruction selection needs to determine how
to implement an operation, with the remaining context taken from the opcode
(e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math).

llvm-svn: 276158
2016-07-20 19:09:30 +00:00
Adam Nemet 546675cc7f [OptDiag] Fix function comment
Function is not passed unlike in the original of this
(llvm::emitOptimizationRemarkMissed).

llvm-svn: 276150
2016-07-20 18:16:45 +00:00
Sanjay Patel 683170bf56 move decomposeBitTestICmp() to Transforms/Utils; NFC
As noted in https://reviews.llvm.org/D22537 , we can use this functionality in 
visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated
code.

llvm-svn: 276140
2016-07-20 17:18:45 +00:00
Wei Mi db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Sanjay Patel be53c65fab fix documentation comments; NFC
llvm-svn: 276135
2016-07-20 16:30:55 +00:00
Adam Nemet 67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Matthias Braun 5b9722d6c7 Revert "RegScavenging: Add scavengeRegisterBackwards()"
Reverting this commit for now as it seems to be causing failures on
test-suite tests on the clang-ppc64le-linux-lnt bot.

This reverts commit r276044.

llvm-svn: 276068
2016-07-20 00:21:32 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Kyle Butt 9e52c064c2 Codegen: Factor out canTailDuplicate
canTailDuplicate accepts two blocks and returns true if the first can be
duplicated into the second successfully. Use this function to
encapsulate the heuristic.

llvm-svn: 276062
2016-07-19 23:54:21 +00:00
Justin Lebar 7ab570ec3a [ADT] Warn on unused results from ArrayRef and StringRef functions that read like they might mutate.
Summary:
Functions like "slice" and "drop_front" sound like they might mutate the
underlying object, but they don't.  Warning on unused results would have
saved me an hour yesterday, and I'm sure I'm not the only one.

LLVM and Clang are clean wrt this warning after D22540.

Reviewers: majnemer

Subscribers: sanjoy, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D22541

llvm-svn: 276058
2016-07-19 23:19:25 +00:00
Daniel Berlin 5c46b943db Make MemorySSA::dominates/locallydominates constant time
Summary: Make MemorySSA::dominates/locallydominates constant time

Reviewers: george.burgess.iv, gberry

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22527

llvm-svn: 276046
2016-07-19 22:49:43 +00:00
Chandler Carruth 2aff750cb8 Add AIX support to Path.inc, Host.h, and CMake.
Patch by Andrew Paprocki!

Differential Revision: https://reviews.llvm.org/D18359

llvm-svn: 276045
2016-07-19 22:46:39 +00:00
Matthias Braun 84fd4bee6c RegScavenging: Add scavengeRegisterBackwards()
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 276044
2016-07-19 22:37:09 +00:00
Matthias Braun 4cb68e1048 RegisterScavenger: Introduce backward() mode.
This adds two pieces:
- RegisterScavenger:::enterBasicBlockEnd() which behaves similar to
  enterBasicBlock() but starts tracking at the end of the basic block.
- A RegisterScavenger::backward() method. It is subtly different
  from the existing unprocess() method which only considers uses with
  the kill flag set: If a value is dead at the end of a basic block with
  a last use inside the basic block, unprocess() will fail to mark it as
  live. However we cannot change/fix this behaviour because unprocess()
  needs to perform the exact reverse operation of forward().

Differential Revision: http://reviews.llvm.org/D21873

llvm-svn: 276043
2016-07-19 22:37:02 +00:00
Kevin Enderby 6524bd8c00 Next step along the way to getting good error messages for bad archives.
This step builds on Lang Hames work to change Archive::child_iterator
for better interoperation with Error/Expected.  Building on that it is now
possible to return an error message when the size field of an archive
contains non-decimal characters.

llvm-svn: 276025
2016-07-19 20:47:07 +00:00
Rafael Espindola 3816c53f04 Use posix_fallocate instead of ftruncate.
This makes sure that space is actually available. With this change
running lld on a full file system causes it to exit with

failed to open foo: No space left on device

instead of crashing with a sigbus.

llvm-svn: 276017
2016-07-19 20:19:56 +00:00
David Majnemer 938a6c7ce0 [RegionInfo] Some cleanups
- Use unique_ptr instead of managing a container of new'd pointers.
- Use range based for loops.

No functional change is intended.

llvm-svn: 276001
2016-07-19 17:50:30 +00:00
Simon Pilgrim 0ea8d275cc [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

A companion clang patch is at D22105

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981
2016-07-19 15:07:43 +00:00
Simon Pilgrim 766345e331 Get rid of VS2015 operator precedence warning. NFCI.
llvm-svn: 275971
2016-07-19 12:26:51 +00:00
Daniel Sanders 2cb55d7dfd [mips] Recognise the triple used by Debian stretch for mips64el.
Summary:
The triple used for this distribution is mips64el-linux-gnuabi64.

Reviewers: sdardis

Subscribers: sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D22406

llvm-svn: 275966
2016-07-19 10:22:19 +00:00
Tobias Grosser 3a49a8e13c Style: drop some unnecessary ';' [NFC]
llvm-svn: 275963
2016-07-19 09:01:46 +00:00
George Burgess IV 5f30897b7b [MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.

Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.

An iteratively expanded DFS lets us potentially quit earlier (because we
don't assume that we can optimize above all phis) than our old walker.
Additionally, because we don't cache as we go, we can now optimize above
loops.

As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.

Differential Revision: https://reviews.llvm.org/D21777

llvm-svn: 275940
2016-07-19 01:29:15 +00:00
Vedant Kumar e3a0bf5048 Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Changes since the initial commit:

  - When handling odd-length inputs, call ThreadPool::wait() before merging the
    last profile. Should fix a race/off-by-one (see r275937).

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275938
2016-07-19 01:17:20 +00:00
Vedant Kumar 21ab20e005 Revert "[llvm-profdata] Speed up merging by using a thread pool"
This reverts commit r275921. It broke the ppc64be bot:

  http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537

I'm not sure why it broke, but based on the output, it looks like an
off-by-one (one profile left un-merged).

llvm-svn: 275937
2016-07-19 00:57:09 +00:00
Matt Arsenault 4cb438b93c TableGen: Allow custom register operand decoder method
This is for a situation where the encoding for a register may be
different depending on the specific operand. For some instructions,
we want to apply additional restrictions beyond the encoding's
constraints.

In AMDGPU some operands are VSrc_32, using the VS_32 pseudo register
class which accept VGPRs, SGPRs, or immediates in the encoding.
Some specific instructions with the same encoding operand do not want
to allow immediates or SGPRs, but the encoding format is different
in this case than a regular VGPR_32 operand.

This allows specifying the encoding should be treated the same
without introducing yet another dummy register class.

llvm-svn: 275929
2016-07-18 23:20:46 +00:00
Vedant Kumar 0bd9907581 [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads
used. Auto-detect NumThreads when it isn't specified, and avoid spawning
threads when they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275921
2016-07-18 22:02:39 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Mehdi Amini 4d74631ea4 Update doxygen description for `WriteBitcodeToFile()` API (NFC)
llvm-svn: 275917
2016-07-18 21:29:24 +00:00
Teresa Johnson 2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Justin Lebar 4133584504 Write isUInt using template specializations to work around an incorrect MSVC warning.
Summary:
Per D22441, MSVC warns on our old implementation of isUInt<64>.  It sees
uint64_t(1) << 64 and doesn't realize that it's not going to be
executed.  Writing as a template specialization is ugly, but prevents
the warning.

Reviewers: RKSimon

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22472

llvm-svn: 275909
2016-07-18 20:40:35 +00:00
Matt Arsenault c96e1deffa AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32
llvm-svn: 275871
2016-07-18 18:35:05 +00:00
Matt Arsenault 4c519d3518 AMDGPU/R600: Replace barrier intrinsics
llvm-svn: 275870
2016-07-18 18:34:59 +00:00
David Majnemer a2a218fbd4 [MathExtras] Fix UB in minIntN
We negated a value with a signed type which invited problems when that
value was the most negative signed number.  Use an unsigned type
for the value instead.  It will compute the same twos complement
result without the UB.

llvm-svn: 275815
2016-07-18 17:03:09 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Simon Dardis d32a2d30cb [inlineasm] Propagate operand constraints to the backend
When SelectionDAGISel transforms a node representing an inline asm
block, memory constraint information is not preserved. This can cause
constraints to be broken when a memory offset is of the form:

offset + frame index

when the frame is resolved.

By propagating the constraints all the way to the backend, targets can
enforce memory operands of inline assembly to conform to their constraints.

For MIPSR6, some instructions had their offsets reduced to 9 bits from
16 bits such as ll/sc. This becomes problematic when using inline assembly
to perform atomic operations, as an offset can generated that is too big to
encode in the instruction.

Reviewers: dsanders, vkalintris

Differential Review: https://reviews.llvm.org/D21615

llvm-svn: 275786
2016-07-18 13:17:31 +00:00
Diana Picus 774d157a5d [ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl
At higher optimization levels, we generate the libcall for DIVREM_Ix, which is
fine: aeabi_{u|i}divmod. At -O0 we generate the one for REM_Ix, which is the
default {u}mod{q|h|s|d}i3.

This commit makes sure that we don't generate REM_Ix calls for ABIs that
don't support them (i.e. where we need to use DIVREM_Ix instead). This is
achieved by bailing out of FastISel, which can't handle non-double multi-reg
returns, and letting the legalization infrastructure expand the REM_Ix calls.

It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some
Windows checks to it to make sure we don't break things for it.

Fixes PR27068

Differential Revision: https://reviews.llvm.org/D21926

llvm-svn: 275773
2016-07-18 06:48:25 +00:00
Justin Lebar b59c1dd5cf Avoid UB in maxIntN(64).
Summary:
Previously we were relying on 2's complement underflow in an int64_t.
Now we cast to a uint64_t so we explicitly get the behavior we want.

Reviewers: rnk

Subscribers: dylanmckay, llvm-commits

Differential Revision: https://reviews.llvm.org/D22445

llvm-svn: 275722
2016-07-17 18:19:26 +00:00
Justin Lebar 6df6bde694 Clean up some comments in MathExtras.h.
Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22444

llvm-svn: 275721
2016-07-17 18:19:25 +00:00
Justin Lebar ab549c8187 Add assertions checking SignExtend{32,64}'s bit width.
Summary:
The bit width must be greater than zero, otherwise we shift by the
integer's width, which is UB.  Also (more obviously) the width must be
less than or equal to the integer's width, otherwise we shift by a
negative number, which is also UB.

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22442

llvm-svn: 275720
2016-07-17 18:19:23 +00:00
Justin Lebar cbba3c4aef Fix isShiftedInt and isShiftedUint for widths > 32.
Summary:
Previously we were doing 1 << S.  "1" is an int, so this doesn't work
when S >= 32.

This patch also adds some static_asserts to these functions to ensure
that we don't hit UB by shifting left too much.

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22441

llvm-svn: 275719
2016-07-17 18:19:21 +00:00
Justin Lebar f2d0066af7 Use a faster implementation of maxUIntN.
Summary:
On x86-64 with clang 3.8, before:

   mov     edx, 1
   mov     cl, dil
   shl     rdx, cl
   cmp     rdi, 64
   mov     rax, -1
   cmovne  rax, rdx
   ret

after:

  mov     ecx, 64
  sub     ecx, edi
  mov     rax, -1
  shr     rax, cl
  ret

Reviewers: rnk

Subscribers: dylanmckay, mkuper, llvm-commits

Differential Revision: https://reviews.llvm.org/D22440

llvm-svn: 275718
2016-07-17 18:19:19 +00:00
Teresa Johnson cd21a646f6 [ThinLTO] Perform profile-guided indirect call promotion
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.

Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.

The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).

Reviewers: mehdi_amini, xur

Subscribers: mehdi_amini, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21932

llvm-svn: 275707
2016-07-17 14:47:01 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Eric Christopher 7a0b7dfd05 Reword comment to be more clear.
llvm-svn: 275659
2016-07-16 01:55:45 +00:00
Richard Smith 3bfed96632 Fix modules buildbot after r275633.
llvm-svn: 275657
2016-07-16 01:05:39 +00:00
Justin Lebar 8d56f47cfe Don't do uint64_t(1) << 64 in maxUIntN.
Summary:
This shift is undefined behavior (and, as compiled by clang, gives the
wrong answer for maxUIntN(64)).

Reviewers: mkuper

Subscribers: llvm-commits, jroelofs, rsmith

Differential Revision: https://reviews.llvm.org/D22430

llvm-svn: 275656
2016-07-16 00:59:41 +00:00
Vedant Kumar 7a4bd83c6c [Support] Fix a doxygen comment (NFC)
There was a missing "<" on a line, so its contents wrapped around into
the description of the next argument.

llvm-svn: 275638
2016-07-15 22:44:52 +00:00
Alexei Starovoitov cfb51f54ba BPF: Use official ELF e_machine value
The same value for EM_BPF is being propagated to glibc,
elfutils, and binutils.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 275633
2016-07-15 22:27:55 +00:00
Zachary Turner b927e02e1b [pdb] Teach MsfBuilder and other classes about the Free Page Map.
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file.  We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now.  We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.

Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.

llvm-svn: 275629
2016-07-15 22:17:19 +00:00
Zachary Turner 5e534c7fb3 [pdb] Round trip the NameMap data structure to YAML.
llvm-svn: 275628
2016-07-15 22:17:08 +00:00
Zachary Turner faa554b2fd [pdb] Use MsfBuilder to handle the writing PDBs.
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged.  This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.

This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.

llvm-svn: 275627
2016-07-15 22:16:56 +00:00
Matt Arsenault 11d3e21f2b AMDGPU: Remove AMDGPU.ldexp
llvm-svn: 275618
2016-07-15 21:26:56 +00:00
Matt Arsenault 09b2c4aee8 AMDGPU: Remove legacy rsq.clamped intrinsic
Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining.

Also fix mismatch with non-IEEE rsq selecting to IEEE rsq.

llvm-svn: 275617
2016-07-15 21:26:52 +00:00
Michael Zolotukhin a78937afb2 Make processInstruction from LCSSA.cpp externally available.
Summary:
When a pass tries to keep LCSSA form it's often convenient to be able to update
LCSSA for a set of instructions rather than for the entire loop. This patch makes the
processInstruction from LCSSA externally available under a name
formLCSSAForInstruction.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22378

llvm-svn: 275613
2016-07-15 21:08:41 +00:00
Zachary Turner f52a899f4a [pdb] Introduce MsfBuilder for laying out PDB files.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D22308

llvm-svn: 275611
2016-07-15 20:43:38 +00:00
George Burgess IV 6d30aa03a0 [CFLAA] Add an initial CFLAnders implementation.
This adds an incomplete anders-style implementation for CFLAA. It's
incomplete in that it's missing interprocedural analysis, attrs
handling, etc. and that it needs more tests. More tests and features
will be added in future commits.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22291

llvm-svn: 275602
2016-07-15 19:53:25 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar 0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Sanjay Patel 32f900730c fix documentation comments; NFC
llvm-svn: 275587
2016-07-15 18:03:59 +00:00
Adam Nemet aad816083e [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
David Majnemer a940f360cb [AliasAnalysis] Give back AA results for fence instructions
Calling getModRefInfo with a fence resulted in crashes because fences
don't have a memory location.  Add a new predicate to Instruction
called isFenceLike which indicates that the instruction mutates memory
but not any single memory location in particular. In practice, it is a
proxy for the set of instructions which "mayWriteToMemory" but cannot be
used with MemoryLocation::get.

This fixes PR28570.

llvm-svn: 275581
2016-07-15 17:19:24 +00:00
Dehao Chen dcafd5ebfd [PM] Convert LoopInstSimplify Pass to new PM
Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass.

Reviewers: davidxl, silvas

Subscribers: silvas, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22280

llvm-svn: 275576
2016-07-15 16:42:11 +00:00
Jacques Pienaar 71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Igor Laevsky ee40d1e8da Re-submit r272891 "Prevent dangling pointer problems in BranchProbabilityInfo"
Most possibly problem was caused by the same reason as PR28400. This change
bypasses it by using CallbackVH instead of AssertingVH.

Differential Revision: https://reviews.llvm.org/D20957

llvm-svn: 275563
2016-07-15 14:31:16 +00:00
Sebastian Pop 4177480aad code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275561
2016-07-15 13:45:20 +00:00
Vedant Kumar f681e2e506 [Coverage] Mark a few more methods const (NFC)
llvm-svn: 275514
2016-07-15 01:19:33 +00:00
Peter Collingbourne 5c73220fa5 Move legacy LTO interface headers to legacy/ directory.
Differential Revision: https://reviews.llvm.org/D22173

llvm-svn: 275476
2016-07-14 21:21:16 +00:00
Lang Hames 69f4902ba6 [Object] Change Archive::findSym to return an Expected<Optional<Child>>.
As suggested by Rafael in review of D22079 - this was accidentally left out of
the final commit (r275316).

llvm-svn: 275469
2016-07-14 20:44:27 +00:00
Mehdi Amini 40f993df25 Add recently added TargetOptions::EnableIPRA member to operator==
llvm-svn: 275467
2016-07-14 20:22:13 +00:00
Jun Bum Lim c837af306e [PM] Port Dead Loop Deletion Pass to the new PM
Summary: Port Dead Loop Deletion Pass to the new pass manager.

Reviewers: silvas, davide

Subscribers: llvm-commits, sanjoy, mcrosier

Differential Revision: https://reviews.llvm.org/D21483

llvm-svn: 275453
2016-07-14 18:28:29 +00:00
Justin Lebar 288b3376ae [CodeGen] Refactor MachineMemOperand::Flags's target-specific flags.
Summary:
Make the target-specific flags in MachineMemOperand::Flags real, bona
fide enum values.  This simplifies users, prevents various constants
from going out of sync, and avoids the false sense of security provided
by declaring static members in classes and then forgetting to define
them inside of cpp files.

Reviewers: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22372

llvm-svn: 275451
2016-07-14 18:15:20 +00:00
Ahmed Bougacha 83dc5cb6a9 [GlobalISel] Fix G_OR opcode after the addition of a TargetOpcode.
r275367 fixed G_ADD and G_BR, but not G_OR.

llvm-svn: 275444
2016-07-14 17:29:49 +00:00
Ahmed Bougacha 9e511525a0 [CodeGen] Simplify reg bank/class union is+get into dyn_cast. NFC.
llvm-svn: 275443
2016-07-14 17:29:46 +00:00
Justin Lebar efbf9c8d5a [CodeGen] s/constexpr/LLVM_CONSTEXPR/ in MachineMemOperand.h.
llvm-svn: 275441
2016-07-14 17:16:40 +00:00
Justin Lebar a3b786a8c1 [CodeGen] Refactor MachineMemOperand's Flags enum.
Summary:
- Give it a shorter name (because we're going to refer to it often from
  SelectionDAG and friends).

- Split the flags and alignment into separate variables.

- Specialize FlagsEnumTraits for it, so we can do bitwise ops on it
  without losing type information.

- Make some enum values constants in MachineMemOperand instead.
  MOMaxBits should not be a valid Flag.

- Simplify some of the bitwise ops for dealing with Flags.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22281

llvm-svn: 275438
2016-07-14 17:07:44 +00:00
Ahmed Bougacha 6b943e33e6 [TableGen] Autobrief-ize Record. NFC.
llvm-svn: 275425
2016-07-14 14:53:14 +00:00
Ahmed Bougacha e8405ad5d0 [TableGen] Cleanup Record comments. NFC.
LLVM doesn't use exceptions anymore.
Also remove the implementation comments. Some of them diverged.

llvm-svn: 275424
2016-07-14 14:53:11 +00:00
Nico Weber 755cd760cd Revert r275401, it caused PR28551.
llvm-svn: 275420
2016-07-14 14:41:25 +00:00
Sebastian Pop 63847d04e7 code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275401
2016-07-14 12:18:53 +00:00
Sjoerd Meijer 38c2cd0c14 This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.

Differential Revision: http://reviews.llvm.org/D21183

llvm-svn: 275382
2016-07-14 07:44:20 +00:00
Dean Michael Berris 52735fc435 XRay: Add entry and exit sleds
Summary:
In this patch we implement the following parts of XRay:

- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.

There are some caveats here:

1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.

2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.

Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk

Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D19904

llvm-svn: 275367
2016-07-14 04:06:33 +00:00
Lang Hames fc209623e9 [Object] Re-apply r275316 now that I have the corresponding LLD patch ready.
llvm-svn: 275361
2016-07-14 02:24:01 +00:00
Adrian Prantl 0418ef2691 Synchronize LLVM and clang's ObjCDeclSpec::ObjCPropertyAttributeKind.
This adds Clang-specific DWARF constants for nullability and ObjC
class properties that are already generated by clang. This patch adds
dwarfdump support and a more comprehensive testcase.

<rdar://problem/27335745>

llvm-svn: 275354
2016-07-14 00:41:18 +00:00
Lang Hames ae610ab528 [Object] Revert r275316, Archive::child_iterator changes, while I update lld.
Should fix the bots broken by r275316.

llvm-svn: 275353
2016-07-14 00:37:04 +00:00
Justin Lebar d5bbd856e2 Force a semicolon at the end of the LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE() macro.
This silences a warning about an extra semicolon on gcc.

llvm-svn: 275349
2016-07-13 23:52:19 +00:00
Mehdi Amini cfed2564f7 Add EnableIPRA to TargetOptions, and move the cl::opt -enable-ipra to TargetMachine.cpp
Avoid exposing a cl::opt in a public header and instead promote this
option in the API.
Alternatively, we could land the cl::opt in CommandFlags.h so that
it is available to every tool, but we would still have to find an
option for clang.

llvm-svn: 275348
2016-07-13 23:39:46 +00:00
Mehdi Amini 4beea66232 [IPRA] Set callee saved registers to none for local function when IPRA is enabled.
IPRA try to optimize caller saved register by propagating register
usage information from callee to caller so it is beneficial to have
caller saved registers compare to callee saved registers when IPRA
is enabled. Please find more detailed explanation here
https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.

This change makes local function do not have any callee preserved
register when IPRA is enabled. A simple test case is also added to
verify this change.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D21561

llvm-svn: 275347
2016-07-13 23:39:34 +00:00
Vedant Kumar ef345e1d3f [Coverage] Return an ArrayRef to avoid copies (NFC)
llvm-svn: 275338
2016-07-13 23:12:26 +00:00
Vedant Kumar 7fcc5472e2 [Coverage] Mark a few methods const (NFC)
llvm-svn: 275337
2016-07-13 23:12:23 +00:00
Adam Nemet 7da74abf3d [LAA] Don't hold on to DominatorTree in the analysis result
llvm-svn: 275335
2016-07-13 22:36:35 +00:00
Adam Nemet b49d9a56eb [LAA] Don't hold on to TargetLibraryInfo in the analysis result
llvm-svn: 275334
2016-07-13 22:36:27 +00:00
Matthias Braun c4ab36abeb MIRYamlMapping: Update stale comment
llvm-svn: 275328
2016-07-13 22:23:19 +00:00
Adam Nemet 1824e411c6 [LAA] Don't hold on to DataLayout in the analysis result
In fact, don't even pass this to the ctor since we can get it from the
module.

llvm-svn: 275326
2016-07-13 22:18:51 +00:00
Adam Nemet 6616ad08f6 [LAA] Don't hold on to LoopInfo in the analysis result
llvm-svn: 275325
2016-07-13 22:18:48 +00:00
Adam Nemet 1556357677 [LAA] Don't hold on to AliasAnalysis in the analysis result
llvm-svn: 275322
2016-07-13 21:39:09 +00:00
Teresa Johnson 6df48b34bf Mark the textual headers in the module map for ProfileData
Follow on to r275312.

llvm-svn: 275319
2016-07-13 21:27:51 +00:00
Lang Hames c2773e97d2 [Object] Change Archive::child_iterator for better interop with Error/Expected.
See http://reviews.llvm.org/D22079

Changes the Archive::child_begin and Archive::children to require a reference
to an Error. If iterator increment fails (because the archive header is
damaged) the iterator will be set to 'end()', and the error stored in the
given Error&. The Error value should be checked by the user immediately after
the loop. E.g.:

Error Err;
for (auto &C : A->children(Err)) {
  // Do something with archive child C.
}
// Check the error immediately after the loop.
if (Err)
  return Err;

Failure to check the Error will result in an abort() when the Error goes out of
scope (as guaranteed by the Error class).

llvm-svn: 275316
2016-07-13 21:13:05 +00:00
Teresa Johnson 56a76961aa Define a module map entry for ProfileData.
As per Richard Smith, this should help avoid a modules bug exposed
by my r275216 commit:
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17560

llvm-svn: 275312
2016-07-13 20:19:09 +00:00
Andrew Kaylor 346dd7f1bd Reverting r275284 due to platform-specific test failures
llvm-svn: 275304
2016-07-13 19:09:16 +00:00
Justin Lebar 81edbbe259 [ADT] Add LLVM_MARK_AS_BITMASK_ENUM, used to enable bitwise operations on enums without static_cast.
Summary: Normally when you do a bitwise operation on an enum value, you
get back an instance of the underlying type (e.g. int).  But using this
macro, bitwise ops on your enum will return you back instances of the
enum.  This is particularly useful for enums which represent a
combination of flags.

Suppose you have a function which takes an int and a set of flags.  One
way to do this would be to take two numeric params:

  enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
  void Fn(int Num, int Flags);

  void foo() {
    Fn(42, F2 | F3);
  }

But now if you get the order of arguments wrong, you won't get an error.

You might try to fix this by changing the signature of Fn so it accepts
a SomeFlags arg:

  enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
  void Fn(int Num, SomeFlags Flags);

  void foo() {
    Fn(42, static_cast<SomeFlags>(F2 | F3));
  }

But now we need a static cast after doing "F2 | F3" because the result
of that computation is the enum's underlying type.

This patch adds a mechanism which gives us the safety of the second
approach with the brevity of the first.

  enum SomeFlags {
    F1 = 1, F2 = 2, F3 = 4, ..., F_MAX = 128,
    LLVM_MARK_AS_BITMASK_ENUM(F_MAX)
  };

  void Fn(int Num, SomeFlags Flags);

  void foo() {
    Fn(42, F2 | F3);  // No static_cast.
  }

The LLVM_MARK_AS_BITMASK_ENUM macro enables overloads for bitwise
operators on SomeFlags.  Critically, these operators return the enum
type, not its underlying type, so you don't need any static_casts.

An advantage of this solution over the previously-proposed BitMask class
[0, 1] is that we don't need any wrapper classes -- we can operate
directly on the enum itself.

The approach here is somewhat similar to OpenOffice's typed_flags_set
[2].  But we skirt the need for a wrapper class (and a good deal of
complexity) by judicious use of enable_if.  We SFINAE on the presence of
a particular enumerator (added by the LLVM_MARK_AS_BITMASK_ENUM macro)
instead of using a traits class so that it's impossible to use the enum
before the overloads are present.  The solution here also seamlessly
works across multiple namespaces.

[0] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150622/283369.html
[1] http://lists.llvm.org/pipermail/llvm-commits/attachments/20150623/073434b6/attachment.obj
[2] https://cgit.freedesktop.org/libreoffice/core/tree/include/o3tl/typed_flags_set.hxx

Reviewers: chandlerc, rsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22279

llvm-svn: 275292
2016-07-13 18:23:16 +00:00
Andrew Kaylor 12cccdd731 Fix for Bug 26903, adds support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 275284
2016-07-13 17:25:11 +00:00
Adam Nemet c2f791d8a7 [BFI] Add new LazyBFI analysis pass
Summary:
This is necessary for D21771.  In order to add the hotness attribute to
optimization remarks we need BFI to be available in all passes that emit
optimization remarks.

However we don't want to pay for computing BFI unless the hotness
attribute is requested.

This is achieved by making BFI lazy at the very high-level through a new
analysis pass -- BFI is not calculated unless requested.

I am adding a test to check the laziness under D21771 where the first
user of the analysis is added.

Reviewers: hfinkel, dexonsmith, davidxl

Subscribers: davidxl, dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D22141

llvm-svn: 275250
2016-07-13 05:01:48 +00:00
David Majnemer 17bdf445e4 [IR] Make getIndexedOffsetInType return a signed result
A GEPed offset can go negative, the result of getIndexedOffsetInType
should according be a signed type.

llvm-svn: 275246
2016-07-13 03:42:38 +00:00
Dehao Chen 9cba1f4e7e New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275222
2016-07-12 22:37:48 +00:00
Teresa Johnson 1e44b5d3ab Refactor indirect call promotion profitability analysis (NFC)
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).

Reviewers: davidxl, xur

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22182

llvm-svn: 275216
2016-07-12 21:13:44 +00:00
Dehao Chen b9f8e29290 [PM] Port LoopIdiomRecognize Pass to new PM
Summary: Port LoopIdiomRecognize Pass to new PM

Reviewers: davidxl

Subscribers: davide, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D22250

llvm-svn: 275202
2016-07-12 18:45:51 +00:00
Wei Ding 5b2636a152 AMDGPU: Add LLVM IR Intrinsic for v_lerp_u8
Differential Revision: http://reviews.llvm.org/D22239

llvm-svn: 275197
2016-07-12 18:02:14 +00:00
Krzysztof Parzyszek f5b9bb61f7 Add print/dump routines to LiveInterval::SubRange
llvm-svn: 275194
2016-07-12 17:37:44 +00:00
Vitaly Buka 204dc533c5 Revert "New pass manager for LICM."
Summary: This reverts commit r275118.

Subscribers: sanjoy, mehdi_amini

Differential Revision: http://reviews.llvm.org/D22259

llvm-svn: 275156
2016-07-12 06:25:32 +00:00
Craig Topper a6e6febe2c [AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR.
llvm-svn: 275155
2016-07-12 05:27:53 +00:00
Rui Ueyama ef5ec2da4a Re-enable TPI hash verification for enum records.
We didn't read unique names correctly. As a result, we computed
hashes on (non-)unique names instead of unique names.

llvm-svn: 275150
2016-07-12 03:25:03 +00:00
Mehdi Amini 0da268d13a Do not use bool in C header lto.h, use lto_bool_t instead
llvm-svn: 275130
2016-07-11 23:55:01 +00:00
Mehdi Amini e75aa6f674 Add a libLTO API to query a memory buffer and check if it contains ObjC categories
The linker supports a feature to force load an object from a static
archive if it defines an Objective-C category.
This API supports this feature by looking at every section in the
module to find if a category is defined in the module.

llvm-svn: 275125
2016-07-11 23:10:18 +00:00
Dehao Chen 7ef5820fa3 New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275118
2016-07-11 22:45:24 +00:00
Zachary Turner dbeaea7b35 Refactor the PDB writing to use a builder approach
llvm-svn: 275110
2016-07-11 21:45:26 +00:00
Sanjay Patel bb7d87ee25 fix documentation comments; NFC
llvm-svn: 275101
2016-07-11 20:50:39 +00:00
Alina Sbirlea 327955e057 Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.

Reviewers: llvm-commits, jlebar

Subscribers: arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21935

llvm-svn: 275100
2016-07-11 20:46:17 +00:00
Chad Rosier 4f0dad1674 [IPRA] Properly compute register usage at call sites.
Differential Revision: http://reviews.llvm.org/D21395
Patch by Vivek Pandya.
PR28144

llvm-svn: 275087
2016-07-11 18:45:49 +00:00
Davide Italiano e8ae0b5eb4 [PM/IPO] Port LowerTypeTests to the new PassManager.
There's a little bit of churn in this patch because the initialization
mechanism is now shared between the old and the new PM. Other than
that, it's just a pretty mechanical translation.

llvm-svn: 275082
2016-07-11 18:10:06 +00:00
Dehao Chen 9232f98279 Implement callsite-hotness based inline cost for Sample-based PGO
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.

E.g.

if (A1 && A2 && A3 && ..... && A10) {
  for (i=0; i < 100000000; i++) {
    callsite();
  }
}

Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.

In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.

Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.

Reviewers: davidxl, eraman, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22118

llvm-svn: 275073
2016-07-11 16:48:54 +00:00
Nirav Dave 8603062ee4 Fix branch relaxation in 16-bit mode.
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation
to generate jumps with 16-bit sized immediates in 16-bit mode.

This fixes PR22097.

Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight

Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D20830

llvm-svn: 275068
2016-07-11 14:23:53 +00:00
Nirav Dave 53a72f4d3c Provide support for preserving assembly comments
Preserve assembly comments from input in output assembly and flags to
toggle property. This is on by default for inline assembly and off in
llvm-mc.

Parsed comments are emitted immediately before an EOL which generally
places them on the expected line.

Reviewers: rtrieu, dwmw2, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20020

llvm-svn: 275058
2016-07-11 12:42:14 +00:00
Daniel Berlin e64985fc94 Allow BasicBlockEdge to be used in DenseMap
Summary: Add a DenseMapInfo specialization for BasicBlockEdge

Reviewers: hfinkel, chandlerc, majnemer

Differential Revision: http://reviews.llvm.org/D22207

llvm-svn: 275041
2016-07-11 04:37:53 +00:00
Hal Finkel 47646c0981 Add a 'Returned' intrinsic property corresponding to the 'returned' argument attribute
This will be used by the upcoming llvm.noalias intrinsic.

Differential Revision: http://reviews.llvm.org/D22201

llvm-svn: 275034
2016-07-11 01:28:42 +00:00
Hal Finkel e87ad547ef Add getReturnedArgOperand to Call/InvokeInst, CallSite
In order to make the optimizer smarter about using the 'returned' argument
attribute (generally, but motivated by my llvm.noalias intrinsic work), add a
utility function to Call/InvokeInst, and CallSite, to make it easy to get the
returned call argument (when one exists).

P.S. There is already an unfortunate amount of code duplication between
CallInst and InvokeInst, and this adds to it. We should probably clean that up
separately.

Differential Revision: http://reviews.llvm.org/D22204

llvm-svn: 275031
2016-07-10 23:01:32 +00:00
Sanjay Patel fedc01ad76 [DAG] make isConstantSplatVector() available to the rest of lowering
llvm-svn: 275025
2016-07-10 21:27:06 +00:00
Jan Vesely 2fa28c330c AMDGPU/R600: Add implicitarg.ptr intrinsic
Differential Revision: http://reviews.llvm.org/D21622

llvm-svn: 275024
2016-07-10 21:20:29 +00:00
Sanjay Patel 9bedcdb5f5 fix documentation comments; NFC
llvm-svn: 275021
2016-07-10 21:02:16 +00:00
Marcin Koscielnicki cf7cc724a7 [SystemZ] Utilize Test Data Class instructions.
This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128),
which maps straight to the test data class instructions.  A new IR pass
is added to recognize instructions that can be converted to TDC and
perform the necessary replacements.

Differential Revision: http://reviews.llvm.org/D21949

llvm-svn: 275016
2016-07-10 14:41:22 +00:00
Benjamin Kramer da5b4cc339 [codeview] Drop unused private inheritance.
There is no polymorphism here, and StreamRef already contains a
StreamInterface pointer. Dropping the base class makes StreamRef more
transparent to the compiler, for example it can find unused variables.

llvm-svn: 275013
2016-07-10 10:17:36 +00:00
David Majnemer 1b79e9a5b9 [pdb] Sanity check the stream map
Some abstractions in LLVM "know" that they are reading in-bounds,
FixedStreamArray, and provide a simple result.  This breaks down if the
stream map is bogus.

llvm-svn: 275010
2016-07-10 05:32:05 +00:00
David Majnemer 6211b1f1f9 [llvm-pdbdump] Propagate errors a little more consistently
PDBFile::getBlockData didn't really return any indication that it
failed.  It merely returned an empty buffer.

llvm-svn: 275009
2016-07-10 03:34:47 +00:00
Sean Silva db90d4d9c1 [PM] Port LoopVectorize to the new PM.
llvm-svn: 275000
2016-07-09 22:56:50 +00:00
Sean Silva bf71035438 Fix up an include guard.
This should have been done as part of the move in r274960.

llvm-svn: 274999
2016-07-09 22:56:39 +00:00
Sanjay Patel 6170b4bebd fix documentation comments; NFC
llvm-svn: 274981
2016-07-09 18:52:07 +00:00
Craig Topper 45a59a08bc [X86] Remove sse41 extract intrinsics. They are not used by clang and are not implemented by the x86 backend.
llvm-svn: 274967
2016-07-09 04:38:30 +00:00
Craig Topper 70610cf7b6 [X86] Remove and autoupgrade 512-bit non-temporal store intrinsics.
llvm-svn: 274966
2016-07-09 04:38:27 +00:00
Davide Italiano 92b933a55c [PM] Port CrossDSOCFI to the new pass manager.
llvm-svn: 274962
2016-07-09 03:25:35 +00:00
Sean Silva 0dacbd8f31 [PM] Fix a think-o. mv {Scalar,Vectorize}/SLPVectorize.h
llvm-svn: 274960
2016-07-09 03:11:29 +00:00
Davide Italiano cd96cfd8df [PM] Port LoopSimplify to the new pass manager.
While here move simplifyLoop() function to the new header, as
suggested by Chandler in the review.

Differential Revision:  http://reviews.llvm.org/D21404

llvm-svn: 274959
2016-07-09 03:03:01 +00:00
George Burgess IV c294d0dcc2 [CFLAA] Move the graph builder out from CFLSteens. NFC.
Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D22022

llvm-svn: 274958
2016-07-09 02:54:42 +00:00
Anna Thomas 9ad45adfd7 Revert "InstCombine rule to fold truncs whose value is available"
This reverts commit r274853.
Caused failure in ppcBE build

llvm-svn: 274943
2016-07-08 22:15:08 +00:00
Jingyue Wu 15f3e82d42 [TTI] Expose TTI::getGEPCost and use it in SLSR and NaryReassociate.
NFC.

llvm-svn: 274940
2016-07-08 21:48:05 +00:00
Adam Nemet 0a24048bb7 [BFI] Minor cleanup. NFC
Use typedef Result in BlockFrequencyAnalysis::run.  Fix typo in comment.

llvm-svn: 274936
2016-07-08 21:24:13 +00:00
Xinliang David Li 07e08fa36b [PM] name the new PM LAA class LoopAccessAnalysis (LAA) /NFC
llvm-svn: 274934
2016-07-08 21:21:44 +00:00
Wei Mi c022370767 Allow dead insts to be kept in DeadRemat only when they are rematerializable.
Because isReallyTriviallyReMaterializableGeneric puts many limits on
rematerializable instructions, this fix can prevent instructions with
tied virtual operands and instructions with virtual register uses from
being kept in DeadRemat, so as to workaround the live interval consistency
problem for the dummy instructions kept in DeadRemat.

But we still need to fix the live interval consistency problem. This patch
is just a short time relieve. PR28464 has been filed as a reminder.

Differential Revision: http://reviews.llvm.org/D19486

llvm-svn: 274928
2016-07-08 21:08:09 +00:00
Xinliang David Li 7853c1dd73 Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFC
llvm-svn: 274927
2016-07-08 20:55:26 +00:00
Justin Bogner 068a8054ae IR: Set a TargetPrefix for nvvm intrinsics
Since these are named nvvm_* rather than nvptx_*, we also need to
update getArchTypePrefix. It's a bit unusual for getArchTypePrefix not
to match the backend name, but I think this fits the intent of the
function in this case.

llvm-svn: 274890
2016-07-08 17:25:18 +00:00
Anna Thomas 3124f6273a InstCombine rule to fold truncs whose value is available
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.

This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.

Differential Revision: http://reviews.llvm.org/D21791

llvm-svn: 274853
2016-07-08 15:18:56 +00:00
Vassil Vassilev bf70516503 [modules] Add missing includes.
Patch by Cristina Cristescu!

Reviewed by Adrian Prantl (D21985)

llvm-svn: 274838
2016-07-08 12:00:08 +00:00
Craig Topper f7bf6de0af [AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift intrinsics.
I'm not sure if clang ever used these builtin names or not.

llvm-svn: 274827
2016-07-08 06:14:47 +00:00
Craig Topper 4f826b7ae5 [X86] Remove intrinsics that already have autoupgrade support.
llvm-svn: 274826
2016-07-08 06:14:41 +00:00
Wei Mi 90d195a5fd [PM] Port UnreachableBlockElim to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D22124

llvm-svn: 274824
2016-07-08 03:32:49 +00:00
Davide Italiano 16284df8ec [PM] Port InstSimplify to the new pass manager.
llvm-svn: 274796
2016-07-07 21:14:36 +00:00
Rui Ueyama 52a1dd76cb Add a reference for Elf_Chdr type.
llvm-svn: 274793
2016-07-07 20:19:19 +00:00
Tim Northover 917e744ea6 GlobalISel: remove redundant property setting. NFC.
AsmString is empty by default.

llvm-svn: 274789
2016-07-07 19:45:45 +00:00
Peter Collingbourne 73589f321b ThinLTO: Do not take into account whether a definition has multiple copies when promoting.
We currently do not touch a symbol's linkage in the case where a definition
has a single copy. However, this code is effectively unnecessary: either
the definition is not exported, in which case the internalize phase sets
its linkage to internal, or it is exported, in which case we need to promote
linkage to weak. Those two cases are already handled by existing code.

I believe that the only real functional change here is in the case where we
have a single definition which does not prevail (e.g. because the definition
in a native object file prevails). In that case we now lower linkage to
available_externally following the existing code path for that case.

As a result we can remove the isExported function parameter from the
thinLTOResolveWeakForLinkerInIndex function.

Differential Revision: http://reviews.llvm.org/D21883

llvm-svn: 274784
2016-07-07 18:31:51 +00:00
Justin Lebar e5c910f8ae [NVVM] Rename __nvvm_bar0 builtin back to __syncthreads.
__syncthreads was renamed to __nvvm_bar0 in r274664.  But __syncthreads
is part of our user-facing API, so we need to keep the name.

This will momentarily break clang; we need a matching patch there.

Patch by Justin Bogner.

llvm-svn: 274779
2016-07-07 18:14:55 +00:00
Justin Bogner a466cc33fa NVPTX: Remove the legacy ptx intrinsics
- Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but
  not all of these registers were already accessible via the nvvm
  name.
- Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0.

There's a fair amount of code motion here, but it's all very
mechanical.

llvm-svn: 274769
2016-07-07 16:40:17 +00:00
Chandler Carruth 168800c97d [LCG] Hoist the definitions of the stream operator friends to be inline
friend definitions.

Based on the experiments Sean Silva and Reid did, this seems the safest
course of action and also will work around a questionable warning
provided by GCC6 on the old form of the code. Thanks for Davide pointing
out the issue and other suggesting ways to fix.

llvm-svn: 274740
2016-07-07 07:52:07 +00:00
David Majnemer 7afb46d3c8 [LoopAccessAnalysis] Fix an integer overflow
We were inappropriately using 32-bit types to account for quantities
that can be far larger.

Fixed in PR28443.

llvm-svn: 274737
2016-07-07 06:24:36 +00:00
Rui Ueyama 830c078d8b Define endianness-aware type for Elf_Chdr.
llvm-svn: 274728
2016-07-07 03:53:00 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Matt Arsenault 3e96a709f1 Fix missing member initializers
This fixes the -Werror build with some combination of
warning flags.

llvm-svn: 274707
2016-07-06 23:30:54 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00
Matthias Braun 332bb5c236 AArch64: Replace a RegScavenger instance with LivePhysRegs
findScratchNonCalleeSaveRegister() just needs a simple liveness
analysis, use LivePhysRegs for that as it is simpler and does not depend
on the kill flags.

This commit adds a convenience function available() to LivePhysRegs:
This function returns true if the given register is not reserved and
neither the register nor any of its aliases are alive.

Differential Revision: http://reviews.llvm.org/D21865

llvm-svn: 274685
2016-07-06 21:31:27 +00:00
Chad Rosier 232e29ebea [MemorySSA] Reinstate the legacy printer and verifier.
Differential Revision: http://reviews.llvm.org/D22058

llvm-svn: 274679
2016-07-06 21:20:47 +00:00
Justin Bogner a463537a36 NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0
Everywhere where cuda.syncthreads or __syncthreads is used, use the
properly namespaced nvvm.barrier0 instead.

llvm-svn: 274664
2016-07-06 20:02:45 +00:00
Justin Bogner b3745b6d24 NVPTX: Make the llvm.nvvm.shfl intrinsics and builtin names consistent
The intrinsics here use nvvm, but the builtins and tablegen variable
names were using ptx. Stick to the modern names here.

llvm-svn: 274662
2016-07-06 19:52:27 +00:00
Matthias Braun f16acbd2f9 TailDuplicator: Remove live-in updating logic
This logic was introduced in r157663 and does not make any sense to me.
The motivating example in rdar://11538365 looks like this:

This is the tail:
BB#16: derived from LLVM BB %if.end68
    Live Ins: %R0 %R4 %R5
    Predecessors according to CFG: BB#15 BB#5
        tBLXi pred:14, pred:%noreg, <ga:@CFRelease>, %R0<kill>, <regmask>, %LR<imp-def,dead>, %SP<imp-use>, %SP<imp-def>
        t2B <BB#20>, pred:14, pred:%noreg
    Successors according to CFG: BB#20

This is the predBB:
BB#5:
    Live Ins: %R5
    Predecessors according to CFG: BB#4
        %R4<def> = t2MOVi 0, pred:14, pred:%noreg, opt:%noreg
        t2B <BB#16>, pred:14, pred:%noreg
    Successors according to CFG: BB#16

However this is invalid machine code to begin with, if %R0 is live-in to
BB#16 then it must be live-in to BB#5 as well if BB#5 does not define
it.  We should not need logic to retroactively fix broken machine code
and in fact the example from r157663 passes cleanly with the code
removed and I do not see any (newly) failing tests with the machine
verifier enabled.

Differential Revision: http://reviews.llvm.org/D22031

llvm-svn: 274655
2016-07-06 18:55:10 +00:00
Zachary Turner 8848a7a6b2 [pdb] Round trip the PDB stream between YAML and binary PDB.
This gets writing of the PDB stream working.

llvm-svn: 274647
2016-07-06 18:05:57 +00:00
Michael Kuperstein aa71bdd3af [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast  gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.

Differential Revision: http://reviews.llvm.org/D21251

llvm-svn: 274642
2016-07-06 17:30:56 +00:00
Zachary Turner ca4f02ce53 Add a default parameter for getRegisteredOptions.
llvm-svn: 274640
2016-07-06 17:25:16 +00:00
Reid Kleckner dafc5d75ea Prune RelocVisitor.h include to avoid including COFF.h from MCJIT.h
This helps to mitigate the conflict between COFF.h and winnt.h, which is
PR28399.

llvm-svn: 274637
2016-07-06 16:56:42 +00:00
Craig Topper 226274ab60 [X86] Remove GCC builtin names from sse/avx packed fp cmp intrinsics so clang can special handle some of the immediate values.
llvm-svn: 274607
2016-07-06 06:27:25 +00:00
Craig Topper 2839045e28 [AVX512] Remove GCC builtins from the vplzcntd/q intrinsics so we can emit native IR using the generic ctlz intrinsic in clang.
llvm-svn: 274602
2016-07-06 04:24:24 +00:00
George Burgess IV bfa401e5ad [CFLAA] Split into Anders+Steens analysis.
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.

So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.

Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.

This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21910

llvm-svn: 274589
2016-07-06 00:26:41 +00:00
Tim Northover e6ae6767d9 AArch64: TableGenerate system instruction operands.
The way the named arguments for various system instructions are handled at the
moment has a few problems:

  - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
  - That weird Mapping class that I have no idea what I was on when I thought
    it was a good idea.
  - Searches are performed linearly through the entire list.
  - We print absolutely all registers in upper-case, even though some are
    canonically mixed case (SPSel for example).
  - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
    to comments in our implementation, with a slightly opaque hex value
    indicating the canonical encoding LLVM will use.

This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.

llvm-svn: 274576
2016-07-05 21:23:04 +00:00
Tim Northover 88403d7a84 TableGen: promote "code" type from syntactic sugar.
It's being immediately converted to a "string", but being able to tell what
type the field was originally can be useful in backends.

llvm-svn: 274575
2016-07-05 21:22:55 +00:00
Simon Pilgrim 9769428e08 [X86][AVX512] Remove vector BROADCAST builtins.
llvm-svn: 274555
2016-07-05 14:49:58 +00:00
Michael Zuckerman bdc5f40dca [LLVM][INTRINSICS] adding intrinsics of CLFLUSHOPT
Differential Revision: http://reviews.llvm.org/D21789

llvm-svn: 274553
2016-07-05 14:42:12 +00:00
Lang Hames 2b1c093c43 [Support][Error] Make logAllUnhandledErrors take a Twine for the banner, rather
than a const string&.

llvm-svn: 274526
2016-07-04 22:47:53 +00:00
Craig Topper 5d16cd9d63 [AVX512] Remove masked VPERMD/VPERMQ/VPERMILPS/VPERMILPD intrinsics. They were autoupgraded to native IR in r274506 and r274506.
llvm-svn: 274519
2016-07-04 19:58:38 +00:00
Nicolai Haehnle 84c9f9919a Add writeonly IR attribute
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.

Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.

Reviewers: reames, joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18714

llvm-svn: 274485
2016-07-04 08:01:29 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Xinliang David Li 2ecff7dd5a Fix wrong comment
llvm-svn: 274453
2016-07-02 21:25:12 +00:00
Xinliang David Li 8a021317a2 [PM] Port LoopAccessInfo analysis to new PM
It is implemented as a LoopAnalysis pass as 
discussed and agreed upon.

llvm-svn: 274452
2016-07-02 21:18:40 +00:00
Simon Pilgrim 77dda7c2e0 [X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR
llvm-svn: 274443
2016-07-02 17:16:41 +00:00
Pirama Arumuga Nainar 9c3aec2035 Add RenderScript ArchType
Summary:
Add renderscript32 and renderscript64 ArchTypes.  This is to configure
the ABI requirement on 32-bit RenderScript that 'long' types have 64-bit
size and alignment.  64-bit RenderScript is the same as AArch64, but is
added here for completeness.

Reviewers: echristo, rsmith

Subscribers: aemerson, jfb, rampitec, dschuff, mehdi_amini, llvm-commits, srhines

Differential Revision: http://reviews.llvm.org/D21333

llvm-svn: 274412
2016-07-02 00:23:09 +00:00
Michael Kuperstein 071d8306b0 [PM] Port ConstantHoisting to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D21945

llvm-svn: 274411
2016-07-02 00:16:47 +00:00
Justin Bogner 0efaa349e4 IR: Set TargetPrefix for some X86 and AArch64 intrinsics where it was missing
llvm-svn: 274390
2016-07-01 22:07:11 +00:00
Adrian Prantl b340676439 Reapply "Define a module map entry for DebugInfo/CodeView."
This reapplies r274313 with two additional #include directives needed
when submodule visibility is enabled.

Fixes PR28384.

llvm-svn: 274358
2016-07-01 15:54:46 +00:00
Benjamin Kramer b0b52fc4c6 function_refify. NFC.
While there use emplace_back to create an expensive pair.

llvm-svn: 274344
2016-07-01 11:05:15 +00:00
Craig Topper 2bd8b4b180 [CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended.
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.

llvm-svn: 274337
2016-07-01 06:54:47 +00:00
Eric Christopher 36e601c6dc Add support for allowing us to create uniquely identified "COMDAT" or "ELF
Group" sections while lowering. In particular, for ELF sections this is
useful for creating function-specific groups that get merged into the
same named section.

Also use const Twine& instead of StringRef for the getELF functions
while we're here.

Differential Revision: http://reviews.llvm.org/D21743

llvm-svn: 274336
2016-07-01 06:07:38 +00:00
Xinliang David Li 94734eef33 [PM] refactor LoopAccessInfo code part-2
Differential Revision: http://reviews.llvm.org/D21636

llvm-svn: 274334
2016-07-01 05:59:55 +00:00
Adrian Prantl 257a676b8d Revert "Define a module map entry for DebugInfo/CodeView."
This reverts commit r274313.
While this fixed the build on Darwin, it broke Linux with local submodule
visibility.

llvm-svn: 274328
2016-07-01 03:17:02 +00:00
Reid Kleckner b5af11dfa3 [codeview] Add DISubprogram::ThisAdjustment
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.

This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).

Reviewers: aprantl, dexonsmith

Subscribers: aaboud, amccarth, llvm-commits

Differential Revision: http://reviews.llvm.org/D21614

llvm-svn: 274325
2016-07-01 02:41:21 +00:00
Matt Arsenault 370e8226c7 LoadStoreVectorizer: Check TTI for vec reg bit width
llvm-svn: 274322
2016-07-01 02:07:22 +00:00
Duncan P. N. Exon Smith 9d1f156418 Revert "code hoisting pass based on GVN"
This reverts commit r274305, since it breaks self-hosting:
  http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232

Note that the blamelist on lab.llvm.org:8011 is incorrect.  The previous
build was r274299, but somehow r274305 wasn't included in the blamelist:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules

llvm-svn: 274320
2016-07-01 01:51:40 +00:00
Duncan P. N. Exon Smith d26fdc83c9 CodeGen: Use MachineInstr& in LiveVariables API, NFC
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites.  This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.

llvm-svn: 274319
2016-07-01 01:51:32 +00:00
Adrian Prantl 8d62f7fc10 Define a module map entry for DebugInfo/CodeView.
This fixes the -fmodules build.

llvm-svn: 274313
2016-07-01 01:16:17 +00:00
Sebastian Pop 5c5798c57c code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 274305
2016-07-01 00:24:31 +00:00
Duncan P. N. Exon Smith 632987296f Target: Remove unused arguments from overrideSchedPolicy, NFC
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator.  One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.

llvm-svn: 274304
2016-07-01 00:23:27 +00:00
Matt Arsenault 08debb0244 Add LoadStoreVectorizer pass
This was contributed by Apple, and I've been working on
minimal cleanups and generalizing it.

llvm-svn: 274293
2016-06-30 23:11:38 +00:00
Reid Kleckner 69ae6848e9 [TableGen] Use a SmallVector for Record::Values to avoid debug iterators
Debug iterators are valuable so we don't want to turn them off
completely. However, llvm-tblgen is critical to build speed, so we can
skip them here.

Regenerating X86GenSubtargetInfo.inc in a clang-cl self-host debug build
now takes 39s instead of 1m29s.

Helps PR28222

llvm-svn: 274288
2016-06-30 23:04:07 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Matt Arsenault 727e279ac4 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

llvm-svn: 274281
2016-06-30 21:17:59 +00:00
Justin Bogner fbac64d678 CodeGen: Add the other BuildMI overload for MachineInstr&
The change in r274193 missed this variant.

llvm-svn: 274259
2016-06-30 18:32:12 +00:00
Rafael Espindola d86e8bb0ed Delete MCCodeGenInfo.
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.

llvm-svn: 274258
2016-06-30 18:25:11 +00:00