Commit Graph

33205 Commits

Author SHA1 Message Date
Daniel Sanders 766646517f [globalisel][tablegen] Add support for importing G_ATOMIC_CMPXCHG, G_ATOMICRMW_* rules from SelectionDAG.
GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider
range of orderings. This has then been used to import patterns using nodes such
as atomic_cmp_swap, atomic_swap, and atomic_load_*.

llvm-svn: 319232
2017-11-28 22:07:05 +00:00
Daniel Sanders 7fe7acc6b1 [aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal
The IRTranslator cannot generate these instructions at the moment so there's no
issue with not having implemented ISel for them yet. D40092 will add
G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a
further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into
G_ATOMIC_CMPXCHG with an external success check via the `Lower` action.

The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is
to import SelectionDAG rules while still supporting targets that prefer to
custom lower the original LLVM-IR-like operation.

llvm-svn: 319216
2017-11-28 20:21:15 +00:00
Zachary Turner 6900de1dfb [CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder.
The motivation behind this patch is that future directions require us to
be able to compute the hash value of records independently of actually
using them for de-duplication.

The current structure of TypeSerializer / TypeTableBuilder being a
single entry point that takes an unserialized type record, and then
hashes and de-duplicates it is not flexible enough to allow this.

At the same time, the existing TypeSerializer is already extremely
complex for this very reason -- it tries to be too many things. In
addition to serializing, hashing, and de-duplicating, ti also supports
splitting up field list records and adding continuations. All of this
functionality crammed into this one class makes it very complicated to
work with and hard to maintain.

To solve all of these problems, I've re-written everything from scratch
and split the functionality into separate pieces that can easily be
reused. The end result is that one class TypeSerializer is turned into 3
new classes SimpleTypeSerializer, ContinuationRecordBuilder, and
TypeTableBuilder, each of which in isolation is simple and
straightforward.

A quick summary of these new classes and their responsibilities are:

- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of
  bytes. Does not do any hashing. Every time you call it, it will
  re-serialize and return bytes again. The same instance can be re-used
  over and over to avoid re-allocations, and in exchange for this
  optimization the bytes returned by the serializer only live until the
  caller attempts to serialize a new record.

- ContinuationRecordBuilder : Turns a FieldList-like record into a series
  of fragments. Does not do any hashing. Like SimpleTypeSerializer,
  returns references to privately owned bytes, so the storage is
  invalidated as soon as the caller tries to re-use the instance. Works
  equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a
  long-standing theoretical limitation of the previous implementation.

- TypeTableBuilder : Accepts sequences of bytes that the user has already
  serialized, and inserts them by de-duplicating with a hash table. For
  the sake of convenience and efficiency, this class internally stores a
  SimpleTypeSerializer so that it can accept unserialized records. The
  same is not true of ContinuationRecordBuilder. The user is required to
  create their own instance of ContinuationRecordBuilder.

Differential Revision: https://reviews.llvm.org/D40518

llvm-svn: 319198
2017-11-28 18:33:17 +00:00
Francis Visoiu Mistrih 946e394e33 [CodeGen] Cleanup MachineOperand
* clang-format
* move doxygen from the implementation to headers
* remove duplicate doxygen

llvm-svn: 319193
2017-11-28 17:58:38 +00:00
Konstantin Zhuravlyov 06ae4ec78e AMDGPU: Add num spilled s/vgprs to metadata
This was requested by tools.

Differential Revision: https://reviews.llvm.org/D40321

llvm-svn: 319192
2017-11-28 17:51:08 +00:00
Francis Visoiu Mistrih 9d7bb0cb40 [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Francis Visoiu Mistrih 26d6fc1f0e [Support] Merge toLower / toUpper implementations
Merge the ones from StringRef and StringExtras.

llvm-svn: 319171
2017-11-28 14:22:27 +00:00
Francis Visoiu Mistrih 9d419d3b0c [CodeGen] Rename functions PrintReg* to printReg*
LLVM Coding Standards:
  Function names should be verb phrases (as they represent actions), and
  command-like function should be imperative. The name should be camel
  case, and start with a lower case letter (e.g. openFile() or isFoo()).

Differential Revision: https://reviews.llvm.org/D40416

llvm-svn: 319168
2017-11-28 12:42:37 +00:00
Chandler Carruth c34f789e38 Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.
The core idea is to (re-)introduce some redundancies where their cost is
hidden by the cost of materializing immediates for constant operands of
PHI nodes. When the cost of the redundancies is covered by this,
avoiding materializing the immediate has numerous benefits:
1) Less register pressure
2) Potential for further folding / combining
3) Potential for more efficient instructions due to immediate operand

As a motivating example, consider the remarkably different cost on x86
of a SHL instruction with an immediate operand versus a register
operand.

This pattern turns up surprisingly frequently, but is somewhat rarely
obvious as a significant performance problem.

The pass is entirely target independent, but it does rely on the target
cost model in TTI to decide when to speculate things around the PHI
node. I've included x86-focused tests, but any target that sets up its
immediate cost model should benefit from this pass.

There is probably more that can be done in this space, but the pass
as-is is enough to get some important performance on our internal
benchmarks, and should be generally performance neutral, but help with
more extensive benchmarking is always welcome.

One awkward part is that this pass has to be scheduled after
*everything* that can eliminate these kinds of redundancies. This
includes SimplifyCFG, GVN, etc. I'm open to suggestions about better
places to put this. We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model. Still, further thoughts on the pipeline position are
welcome.

I've also only implemented this in the new pass manager. If folks are
very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

I've tested this pretty heavily without issue. A wide range of
benchmarks internally show no change outside the noise, and I don't see
any significant changes in SPEC either. However, the size class
computation in tcmalloc is substantially improved by this, which turns
into a 2% to 4% win on the hottest path through tcmalloc for us, so
there are definitely important cases where this is going to make
a substantial difference.

Differential revision: https://reviews.llvm.org/D37467

llvm-svn: 319164
2017-11-28 11:32:31 +00:00
Rafael Espindola 3ecd20430c Use FILE_FLAG_DELETE_ON_CLOSE for TempFile on windows.
We won't see the temp file no more.

llvm-svn: 319137
2017-11-28 01:41:22 +00:00
Peter Collingbourne 1621c20ffc Reland r319090, "COFF: Do not create SectionChunks for discarded comdat sections." with a fix for debug sections.
If /debug was not specified, readSection will return a null
pointer for debug sections. If the debug section is associative with
another section, we need to make sure that the section returned from
readSection is not a null pointer before adding it as an associative
section.

Differential Revision: https://reviews.llvm.org/D40533

llvm-svn: 319133
2017-11-28 01:30:07 +00:00
Adrian Prantl 3e0e1d0934 Move getVariableSize from Verifier.cpp into DIVariable::getSize() (NFC)
llvm-svn: 319125
2017-11-28 00:57:51 +00:00
Rafael Espindola bce112c9e9 Add an F_Delete flag.
For now this only changes the handle Access.

llvm-svn: 319121
2017-11-28 00:12:44 +00:00
Rafael Espindola d19c2e8126 Add OpenFlags to the create(Unique|Temporary)File interfaces.
This will allow a future F_Delete flag to be specified when we want
the file to be automatically deleted on close.

llvm-svn: 319117
2017-11-27 23:44:11 +00:00
Peter Collingbourne c8477b8234 Revert r319090, "COFF: Do not create SectionChunks for discarded comdat sections."
Caused test failures in check-cfi on Windows.
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/20284

llvm-svn: 319100
2017-11-27 21:37:51 +00:00
Sanjay Patel 0de1a4bc2d [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result
This should fix PR31455:
https://bugs.llvm.org/show_bug.cgi?id=31455

Differential Revision: https://reviews.llvm.org/D28314

llvm-svn: 319094
2017-11-27 21:15:43 +00:00
Peter Collingbourne 3f2921f5ec COFF: Do not create SectionChunks for discarded comdat sections.
With this change, instead of creating a SectionChunk for each section
in the object file, we only create them when we encounter a prevailing
comdat section.

Also change how symbol resolution occurs between comdat symbols. Now
only the comdat leader participates in comdat resolution, and not any
other external associated symbols. This is more in line with how COFF
semantics are defined, and should allow for a more straightforward
implementation of non-ANY comdat types.

On my machine, this change reduces our runtime linking a release
build of chrome_child.dll with /nopdb from 5.65s to 4.54s (median of
50 runs).

Differential Revision: https://reviews.llvm.org/D40238

llvm-svn: 319090
2017-11-27 20:42:34 +00:00
David Blaikie eef5c23305 Rename MCTargetOptionsCommandFlags.h to .def as it is not a normal/modular header as much as it is for stamping out some global/static variables
llvm-svn: 319086
2017-11-27 19:55:16 +00:00
David Blaikie c14bfec487 Rename CommandFlags.h -> CommandFlags.def
Since this isn't a real header - it includes static functions and had
external linkage variables (though this change makes them static, since
that's what they should be) so can't be included more than once in a
program.

llvm-svn: 319082
2017-11-27 19:43:58 +00:00
Zachary Turner 96c6985b53 [BinaryStream] Support growable streams.
The existing library assumed that a stream's length would never
change.  This makes some things simpler, but it's not flexible
enough for what we need, especially for writable streams where
what you really want is for each call to write to actually append.

llvm-svn: 319070
2017-11-27 18:48:37 +00:00
Jonas Hahnfeld a8efe6d9a5 Delete obsolete function mergeUseListsImpl
mergeUseLists is implemented iteratively since r243590.

Differential Revision: https://reviews.llvm.org/D40491

llvm-svn: 319061
2017-11-27 17:55:47 +00:00
Nirav Dave db77e57ea8 [DAG] Do MergeConsecutiveStores again before Instruction Selection
Summary:

Now that store-merge is only generates type-safe stores, do a second
pass just before instruction selection to allow lowered intrinsics to
be merged as well.

Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33675

llvm-svn: 319036
2017-11-27 15:28:15 +00:00
Max Kazantsev f3c0aec971 [NFC] Add missing unit tests for EquivalenceClasses
llvm-svn: 319018
2017-11-27 11:20:58 +00:00
Oren Ben Simhon fa582b075c Control-Flow Enforcement Technology - Shadow Stack support (LLVM side)
Shadow stack solution introduces a new stack for return addresses only.
The HW has a Shadow Stack Pointer (SSP) that points to the next return address.
If we return to a different address, an exception is triggered.
The shadow stack is managed using a series of intrinsics that are introduced in this patch as well as the new register (SSP).
The intrinsics are mapped to new instruction set that implements CET mechanism.

The patch also includes initial infrastructure support for IBT.

For more information, please see the following:
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

Differential Revision: https://reviews.llvm.org/D40223

Change-Id: I4daa1f27e88176be79a4ac3b4cd26a459e88fed4
llvm-svn: 318996
2017-11-26 13:02:45 +00:00
Coby Tayree d8b17bedfa [x86][icelake]GFNI
galois field arithmetic (GF(2^8)) insns:
gf2p8affineinvqb
gf2p8affineqb
gf2p8mulb
Differential Revision: https://reviews.llvm.org/D40373

llvm-svn: 318993
2017-11-26 09:36:41 +00:00
David Blaikie f73efc6923 Remove dead code
(this header is not fully implemented (the out of line function
writeTypeRecordKind is called in an inline function but never
implemented - this fails to link under modular code generation) and not
included anywhere)

llvm-svn: 318987
2017-11-25 20:06:04 +00:00
Craig Topper 2c2b4c0191 [X86] Remove GCCBuiltin from intrinsics that are no longer used by clang.
llvm-svn: 318986
2017-11-25 20:00:37 +00:00
Craig Topper e485631cd1 [X86] Add separate intrinsics for scalar FMA4 instructions.
Summary:
These instructions zero the non-scalar part of the lower 128-bits which makes them different than the FMA3 instructions which pass through the non-scalar part of the lower 128-bits.

I've only added fmadd because we should be able to derive all other variants using operand negation in the intrinsic header like we do for AVX512.

I think there are still some missed negate folding opportunities with the FMA4 instructions in light of this behavior difference that I hadn't noticed before.

I've split the tests so that we can use different intrinsics for scalar testing between the two. I just copied the tests split the RUN lines and changed out the scalar intrinsics.

fma4-fneg-combine.ll is a new test to make sure we negate the fma4 intrinsics correctly though there are a couple TODOs in it.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39851

llvm-svn: 318984
2017-11-25 18:32:43 +00:00
Javed Absar 72bac8f337 [SCEV] : Simplify loop to range-loop.NFC.
llvm-svn: 318952
2017-11-24 14:35:38 +00:00
Benjamin Kramer cb100af21c [YAMLParser] Fix unused variable warning.
llvm-svn: 318936
2017-11-23 21:07:11 +00:00
Benjamin Kramer 0085d3c79a [YAMLParser] Don't crash on null keys in KeyValueNodes.
Found by clangd-fuzzer!

llvm-svn: 318935
2017-11-23 20:57:20 +00:00
Coby Tayree e8bdd383e9 [x86][icelake]BITALG
2/3
vpshufbitqmb encoding
3/3
vpshufbitqmb intrinsics
Differential Revision: https://reviews.llvm.org/D40222

llvm-svn: 318904
2017-11-23 11:15:50 +00:00
George Rimar 33894b619b Revert r318822 "[llvm-tblgen] - Stop using std::string in RecordKeeper."
It reported to have problems with memory sanitizers and DBUILD_SHARED_LIBS=ON.

llvm-svn: 318899
2017-11-23 06:52:44 +00:00
David Blaikie 9b55e99747 Instrumentation.h: Remove dead/untested code for DFSan JIT support
llvm-svn: 318887
2017-11-23 00:08:40 +00:00
Paul Robinson 920c60408b Add a missing include found by modules bot.
llvm-svn: 318873
2017-11-22 20:31:39 +00:00
Paul Robinson b02295641d Remove unnecessary include.
llvm-svn: 318861
2017-11-22 18:39:26 +00:00
Peter Collingbourne 048ac83973 CachePruning: Allow limiting the number of files in the cache directory.
The default limit is 1000000 but it can be configured with a cache
policy. The motivation is that some filesystems (notably ext4) have
a limit on the number of files that can be contained in a directory
(separate from the inode limit).

Differential Revision: https://reviews.llvm.org/D40327

llvm-svn: 318857
2017-11-22 18:27:31 +00:00
Paul Robinson 511b54cadc [DebugInfo] Dump a .debug_line section, including line-number program,
without any compile units.

Differential Revision: https://reviews.llvm.org/D40114

llvm-svn: 318842
2017-11-22 15:48:30 +00:00
George Rimar 860a7b7901 [llvm-tblgen] - Stop using std::string in RecordKeeper.
RecordKeeper::getDef() is a hot place, it shows up in profiling
and it creates std::string instance for each search in RecordMap
though RecordKeeper::RecordMap can use StringRef as a key
instead to avoid that. Patch do that change.

Differential revision: https://reviews.llvm.org/D40170

llvm-svn: 318822
2017-11-22 07:53:48 +00:00
Craig Topper fb0d4cd48c [SelectionDAG] Add a isel matcher op to check the type of node results other than result 0.
I plan to use this to check the type of the mask result of masked gathers in the X86 backend.

llvm-svn: 318820
2017-11-22 07:11:01 +00:00
Craig Topper 47c8739b08 [X86] Move the information about the feature bits used by compiler-rt and shared by Host.cpp to a .def file and TargetParser.h so clang can make use of it.
Since we keep Host.cpp and compiler-rt relatively in sync, clang can use this information as a proxy.

llvm-svn: 318814
2017-11-21 23:36:42 +00:00
Alina Sbirlea ff8b8aea2e Add MemorySSA as loop dependency, disabled by default [NFC].
Summary:
First step in adding MemorySSA as dependency for loop pass manager.
Adding the dependency under a flag.

New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null.
Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default.

Reviewers: sanjoy, davide, gberry

Subscribers: mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D40274

llvm-svn: 318772
2017-11-21 15:45:46 +00:00
Coby Tayree 3880f2a363 [x86][icelake]VNNI
Introducing Vector Neural Network Instructions, consisting of:
vpdpbusd{s}
vpdpwssd{s}
Differential Revision: https://reviews.llvm.org/D40208

llvm-svn: 318746
2017-11-21 10:04:28 +00:00
Coby Tayree 71e37cc9ff [x86][icelake]vbmi2
introducing vbmi2, consisting of
vpcompress{b,w}
vpexpand{b,w}
vpsh{l,r}d{w,d,q}
vpsh{l,r}dv{w,d,q}
Differential Revision: https://reviews.llvm.org/D40206

llvm-svn: 318745
2017-11-21 09:48:44 +00:00
Coby Tayree 7ca5e58736 [x86][icelake]vpclmulqdq introduction
an icelake promotion of pclmulqdq
Differential Revision: https://reviews.llvm.org/D40101

llvm-svn: 318741
2017-11-21 09:30:33 +00:00
Coby Tayree 2a1c02fcbc [x86][icelake]VAES introduction
an icelake promotion of AES
Differential Revision: https://reviews.llvm.org/D40078

llvm-svn: 318740
2017-11-21 09:11:41 +00:00
David Blaikie 62b076325a XRayRecord.h: Add missing #include
llvm-svn: 318713
2017-11-21 00:23:19 +00:00
David Blaikie 970ce642fe YAML/XRay/std::vector: Fix ODR violation by removing local specialization
There's a generic partial specialization for all std::vector<T> that
does what's desired, so no need for this full specialization that's
causing an ODR violation anyway.

llvm-svn: 318712
2017-11-21 00:23:17 +00:00
David Blaikie 2bc260aba2 Add ADL support to range based <algorithm> extensions
This adds support for ADL in the range based <algorithm> extensions
(llvm::for_each etc.).

Also adds the helper functions llvm::adl::begin and llvm::adl::end which wrap
std::begin and std::end with ADL support.

Saw this was missing from a recent llvm weekly post about adding llvm::for_each
and thought I might add it.

Patch by Stephen Dollberg!

Differential Revision: https://reviews.llvm.org/D40006

llvm-svn: 318703
2017-11-20 22:12:55 +00:00
Hiroshi Yamauchi 5c774b9235 Fix a lld-x86_64-darwin13 build error.
Summary:
Fix this build error

http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15112/steps/build_Lld/logs/stdio

after https://reviews.llvm.org/rL318693

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40266

llvm-svn: 318696
2017-11-20 21:38:43 +00:00
Hiroshi Yamauchi c94d4d70d8 Add heuristics for irreducible loop metadata under PGO
Summary:
Add the following heuristics for irreducible loop metadata:

- When an irreducible loop header is missing the loop header weight metadata,
  give it the minimum weight seen among other headers.
- Annotate indirectbr targets with the loop header weight metadata (as they are
  likely to become irreducible loop headers after indirectbr tail duplication.)

These greatly improve the accuracy of the block frequency info of the Python
interpreter loop (eg. from ~3-16x off down to ~40-55% off) and the Python
performance (eg. unpack_sequence from ~50% slower to ~8% faster than GCC) due to
better register allocation under PGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39980

llvm-svn: 318693
2017-11-20 21:03:38 +00:00
Teresa Johnson 3309002a86 [SROA] Correctly invalidate analyses when dead instructions deleted
Summary:
SROA can fail in rewriting alloca but still rewrite a phi resulting
in dead instruction elimination. The Changed flag was not being set
correctly, resulting in downstream passes using stale analyses.
The included test case will assert during the second BDCE pass as a
result.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39921

llvm-svn: 318677
2017-11-20 18:33:38 +00:00
Tony Jiang f75f4d6573 [MachineCSE] Add new callback for is caller preserved or constant physregs
The instructions addis,addi, bl are used to calculate the address of TLS thread
local variables. These TLS access code sequences are generated repeatedly every
time the thread local variable is accessed. By communicating to Machine CSE that
X2 is guaranteed to have the same value within the same function call (so called
Caller Preserved Physical Register), the redundant TLS access code sequences are
cleaned up.

Differential Revision: https://reviews.llvm.org/D39173

llvm-svn: 318661
2017-11-20 16:55:07 +00:00
Sanjay Patel fbd3e66b9a [LibCallSimplifier] partly fix pow(x, 0.5) -> sqrt() transforms
As the first test shows, we could transform an llvm intrinsic which never sets errno 
into a libcall which could set errno (even though it's marked readnone?), so that's 
not ideal.

It's possible that we can also transform a libcall which could set errno to an
intrinsic given the fast-math-flags constraint, but that's deferred to determine
exactly which set of FMF are needed.

Differential Revision: https://reviews.llvm.org/D40150

llvm-svn: 318628
2017-11-19 16:13:14 +00:00
Eric Fiselier 1ed231de44 Fix use of config.h in public headers.
The CodeGenCoverage.h header is installed, but it references
the build-only header "llvm/Config/config.h". This breaks use
of the CodeGenCoverage.h header once it is installed, because config.h isn't
available.

This patch fixes the error by moving the config.h include from
the CodeGenCoverage.h header (where it's not needed), to the
CodeGenCoverage.cpp source file.

llvm-svn: 318602
2017-11-18 22:42:26 +00:00
Daniel Sanders c54aa9c844 [globalisel][tablegen] Generalize pointer-type inference by introducing ptypeN. NFC
ptypeN is functionally the same as typeN except that it informs the
SelectionDAG importer that an operand should be treated as a pointer even
if it was written as iN. This is important for patterns that use iN instead
of iPTR to represent pointers. E.g.:
  (set GPR64:$dst, (load GPR64:$addr))

Previously, this was handled as a hardcoded special case for the appropriate
operands to G_LOAD and G_STORE.

llvm-svn: 318574
2017-11-18 00:16:44 +00:00
Rafael Espindola 51c63bb7ef Use TempFile in the implementation of LockFileManager.
This move some of the complexity over to the lower level TempFile.

It also makes it a bit more explicit where errors are ignored since we
now have a call to consumeError.

llvm-svn: 318550
2017-11-17 20:06:41 +00:00
Chandler Carruth 693eedb138 [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,
making it no longer even remotely simple.

The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.

The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop to
   outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
   unswitched branches as well as avoiding excessively cloning and
   reliance on complex parts of simplify-cfg to cleanup the cfg.
3) Update the analyses (where we can) rather than just blowing them away
   or relying on something else updating them.

Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.

This approach also adds an important principls specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentialy partition loops will work regardless of the total loop
size.

Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.

Some high-level, interesting limitationsV that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
  deleted. In fact, we should be able to delete *nothing* and do
  a minimal number of clones.
- There are many more interesting selection criteria for which branch to
  unswitch that we might want to look at. One that I'm interested in
  particularly are a set of conditions which all exit the loop and which
  can be merged into a single unswitched test of them.

Differential revision: https://reviews.llvm.org/D34200

llvm-svn: 318549
2017-11-17 19:58:36 +00:00
Aditya Nandakumar 69855491ee [GISel]: DCE copy instructions during legalization
We might have instructions such as ext(copy(trunc)), and while cleaning
up legalization artifacts, we can also dce the copies that are in
between legalization artifacts.

llvm-svn: 318501
2017-11-17 02:44:55 +00:00
Vedant Kumar 4d7f2b02d6 [SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC (reapply)
TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).

Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.

This was reverted in r318455 because some newly introduced asserts,
which I thought were NFC, were firing. I filed PR35338. For now I've
weakened the asserts.

Testing: check-llvm, check-clang, and a stage2 Rel+Deb build of clang

Differential Revision: https://reviews.llvm.org/D40104

llvm-svn: 318498
2017-11-17 01:48:33 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Zachary Turner bd159d32c4 Don't #include MemoryBuffer.h from Host.h.
It turns out this #include isn't used from Host.h anyway,
but by having it it causes circular include dependencies.
This issues only surfaced while I was working on a separate
patch, so I'm submitting this first so that it's independent
of the other, unrelated patch.

llvm-svn: 318489
2017-11-17 01:00:35 +00:00
Lang Hames afcb70d031 [Support] Support NetBSD PaX MPROTECT in sys::Memory.
Removes AllocateRWX, setWritable and setExecutable from sys::Memory and
standardizes on allocateMappedMemory / protectMappedMemory. The
allocateMappedMemory method is updated to request full permissions for memory
blocks so that they can be marked executable later.

llvm-svn: 318464
2017-11-16 23:04:44 +00:00
Rafael Espindola b60bb6904b Convert another use of createUniqueFile to TempFile::create.
This one requires a new small feature in TempFile: the ability to keep
the temporary file with the temporary name.

llvm-svn: 318458
2017-11-16 21:40:10 +00:00
Vedant Kumar 53418797fd Revert "[SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC."
This reverts commit r318448. It looks like some of the asserts need to
be weakened.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/16296

llvm-svn: 318455
2017-11-16 21:08:51 +00:00
Vedant Kumar 494814d52a [SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC.
TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).

Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.

Differential Revision: https://reviews.llvm.org/D40104

llvm-svn: 318448
2017-11-16 19:50:24 +00:00
Daniel Sanders 1eaf300fac [arc] Update TargetInfo to include the new backend name argument
Also update a comment about the usage of RegisterTarget() that didn't mention
the new argument.

llvm-svn: 318441
2017-11-16 19:10:26 +00:00
Guozhi Wei 433e8d3e04 [PPC] Change i32 constant in store instruction to i64
This patch changes all i32 constant in store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constant.

Differential Revision: https://reviews.llvm.org/D39352

llvm-svn: 318436
2017-11-16 18:27:34 +00:00
Dave Lee 67b4966ccd Add ELF dynamic symbol support to yaml2obj/obj2yaml
Summary:
This change introduces a `DynamicSymbols` field to the ELF specific YAML
supported by `yaml2obj` and `obj2yaml`. This grouping of symbols provides a way
to represent ELF dynamic symbols. The `DynamicSymbols` structure is identical to
the existing `Symbols`.

Reviewers: compnerd, jakehehrlich, silvas

Reviewed By: silvas

Subscribers: silvas, jakehehrlich, llvm-commits

Differential Revision: https://reviews.llvm.org/D39582

llvm-svn: 318433
2017-11-16 18:10:15 +00:00
Aaron Smith 3ca416ce72 [DebugInfo/PDB] Exclude the PDB/DIA files added in my previous commit from modulemap
llvm-svn: 318425
2017-11-16 17:24:49 +00:00
Yaxun Liu 407ca36b27 Let llvm.invariant.group.barrier accepts pointer to any address space
llvm.invariant.group.barrier may accept pointers to arbitrary address space.

This patch let it accept pointers to i8 in any address space and returns
pointer to i8 in the same address space.

Differential Revision: https://reviews.llvm.org/D39973

llvm-svn: 318413
2017-11-16 16:32:16 +00:00
Igor Laevsky e714ef49af [FuzzMutate] NFC. Move parseModule and writeModule from llvm-isel-fuzzer into FuzzMutate.
This is to be able to reuse them in the llvm-opt-fuzzer.

llvm-svn: 318407
2017-11-16 15:23:08 +00:00
Aaron Smith 89bca9e566 [DebugInfo/PDB] Adding getUndecoratedNameEx and IPDB interfaces for IDiaEnumTables and IDiaTable.
Initial changes to support debugging PE/COFF files with LLDB on Windows through DIA SDK.
There is another set of changes required on the LLDB side before this does anything.

Differential Revision: https://reviews.llvm.org/D39517

llvm-svn: 318403
2017-11-16 14:33:09 +00:00
Aaron Smith c6ef575909 Test commit. Add a missing dash to the standard llvm file header; NFC.
llvm-svn: 318400
2017-11-16 13:42:28 +00:00
Max Kazantsev 87f4a3de45 [SCEV][NFC] Introduce isSafeToExpandAt function to SCEVExpander
This function checks that:
1) It is safe to expand a SCEV;
2) It is OK to materialize it at the specified location.
For example, attempt to expand a loop's AddRec to the same loop's preheader should fail.

Differential Revision: https://reviews.llvm.org/D39236

llvm-svn: 318377
2017-11-16 05:10:56 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Daniel Sanders 725584e26d Add backend name to Target to enable runtime info to be fed back into TableGen
Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.

Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D39742

llvm-svn: 318352
2017-11-15 23:55:44 +00:00
Aditya Nandakumar 954eea074b [GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into GlobalISel/Utils for use elsewhere
llvm-svn: 318350
2017-11-15 23:45:04 +00:00
Sanjay Patel 3e29890a7f [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
This is a recommit of r316869 which was speculatively reverted with r317444 and 
subsequently shown to not be the cause of PR35210. That crash should be fixed
after r318237.

Original commit message:

The old PM sets the options of what used to be known as "latesimplifycfg" on the
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 318299
2017-11-15 16:33:11 +00:00
NAKAMURA Takumi ad51924eb4 GISelWorkList.h: Fix -fmodules build in rL318210.
llvm-svn: 318275
2017-11-15 07:34:35 +00:00
Fangrui Song e73534464d NFC Remove default argument of DataLayout::getPointerABIAlignment
Differential Revision: https://reviews.llvm.org/D40005

llvm-svn: 318272
2017-11-15 06:17:32 +00:00
Craig Topper 0749186a70 [X86] Add getHostCPUName support for cannonlake.
This adds an explicit model number check and fallback path to the unknown family 6 detection.

llvm-svn: 318270
2017-11-15 06:02:42 +00:00
Craig Topper 659d5fbe99 [X86] Correct the spelling of pentiumpro in X86TargetParser.def
Thanks to Erich Keane for spotting this.

llvm-svn: 318243
2017-11-15 01:01:50 +00:00
Mitch Phillips 2e7be2a65a [cfi-verify] Validate there are no register clobbers between CFI-check and instruction execution.
Summary:
This patch adds another failure mode for `validateCFIProtection(..)`, wherein any register that affects the indirect control flow instruction is clobbered to between the CFI-check and the instruction's execution.

Also includes a modification to make MCInstrDesc::hasDefOfPhysReg public.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Subscribers: llvm-commits, pcc, kcc

Differential Revision: https://reviews.llvm.org/D39820

llvm-svn: 318238
2017-11-15 00:35:26 +00:00
Vedant Kumar 865046fafe [PGO] Bump the indexed profile format version
Differential Revision: https://reviews.llvm.org/D39447

llvm-svn: 318228
2017-11-14 23:56:48 +00:00
Craig Topper bb5d7a5550 [X86] Fix the parameter order in the default implementation of X86_VENDOR macro in X86TargetParser.def
The default implementation doesn't do anything so the order doesn't matter, but good for cleanliness.

llvm-svn: 318226
2017-11-14 23:54:28 +00:00
Tim Renouf 39e7ce8f21 [AMDGPU] updated PAL metadata record keys
Summary: The ABI changed before specification was finalized.

Reviewers: kzhuravl, dstuttard

Subscribers: wdng, nhaehnle, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D39807

llvm-svn: 318213
2017-11-14 23:05:36 +00:00
Aditya Nandakumar e6201c8724 [GISel]: Rework legalization algorithm for better elimination of
artifacts along with DCE

Legalization Artifacts are all those insts that are there to make the
type system happy. Currently, the target needs to say all combinations
of extends and truncs are legal and there's no way of verifying that
post legalization, we only have *truly* legal instructions. This patch
changes roughly the legalization algorithm to process all illegal insts
at one go, and then process all truncs/extends that were added to
satisfy the type constraints separately trying to combine trivial cases
until they converge. This has the added benefit that, the target
legalizerinfo can only say which truncs and extends are okay and the
artifact combiner would combine away other exts and truncs.

Updated legalization algorithm to roughly the following pseudo code.

WorkList Insts, Artifacts;
collect_all_insts_and_artifacts(Insts, Artifacts);

do {
  for (Inst in Insts)
         legalizeInstrStep(Inst, Insts, Artifacts);
  for (Artifact in Artifacts)
         tryCombineArtifact(Artifact, Insts, Artifacts);
} while(!Insts.empty());

Also, wrote a simple wrapper equivalent to SetVector, except for
erasing, it avoids moving all elements over by one and instead just
nulls them out.

llvm-svn: 318210
2017-11-14 22:42:19 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Alex Bradbury 64e879745f Set hasSideEffects=0 for TargetOpcode::{CFI_INSTRUCTION,EH_LABEL,GC_LABEL,ANNOTATION_LABEL}
D37065 (committed as rL317674) explicitly set hasSideEffects for all 
TargetOpcode::* instructions where it was inferred previously. This is a 
follow-up to that patch, setting hasSideEffects=0 for CFI_INSTRUCTION, 
EH_LABEL, GC_LABEL and ANNOTATION_LABEL. All LLVM tests pass after this 
change.

This patch also modifies MachineInstr::isLabel returns true for a 
TargetOpcode::ANNOTATION_LABEL, which ensures that an annotation label won't 
be incorrectly considered safe to move.

Differential Revision: https://reviews.llvm.org/D39941

llvm-svn: 318174
2017-11-14 19:16:08 +00:00
Artem Belevich 55dcf5e586 Mark intrinsics operating on the whole warp as IntrInaccessibleMemOnly
It's needed to model the fact that they do access data from other threads in a
warp and thus can't be CSE'd.

llvm-svn: 318173
2017-11-14 19:14:00 +00:00
Serge Guelton a7be3aa785 Add missing const qualifier to AttributeSet::operator==
llvm-svn: 318162
2017-11-14 18:08:05 +00:00
Chandler Carruth 00a301d568 [PM] Port BoundsChecking to the new PM.
Registers it and everything, updates all the references, etc.

Next patch will add support to Clang's `-fexperimental-new-pass-manager`
path to actually enable BoundsChecking correctly.

Differential Revision: https://reviews.llvm.org/D39084

llvm-svn: 318128
2017-11-14 01:30:04 +00:00
Sam Clegg 999660761e [WebAssembly] Explicily disable comdat support for wasm output
For now at least.  We clearly need some kind of comdat or
linkonce_odr support for wasm but currently COMDAT is not
supported.

Disable COMDAT support in the same way we do the Mach-O.  This
also causes clang not to generated COMDATs.

Differential Revision: https://reviews.llvm.org/D39873

llvm-svn: 318123
2017-11-14 00:49:16 +00:00
Rafael Espindola e41151965f Add a move assignment operator to TempFile. NFC.
llvm-svn: 318122
2017-11-14 00:31:28 +00:00
Rafael Espindola 8c42d323c9 Simplify and rename variable.
std::error_code can represent success, so we don't need a
Optional<std::error_code>.

Rename the variable to avoid confusion with the type Error.

llvm-svn: 318111
2017-11-13 23:32:19 +00:00
Daniel Sanders 6d9d30a917 [tablegen] Handle atomic predicates for ordering inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
ordering into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318102
2017-11-13 23:03:47 +00:00
Daniel Sanders 87d196ca48 [tablegen] Handle atomic predicates for memory type inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318095
2017-11-13 22:26:13 +00:00
Serge Guelton 3347332ad3 Fix -Werror when compiling rL318083 (bis)
Statically assert the result and remove a runtime comparison, a direct consequence of the optimization introduced in rL318083.

llvm-svn: 318090
2017-11-13 21:40:57 +00:00
Serge Guelton 8dd0160dab Fix -Werror when compiling rL318083
Statically assert the result and remove a runtime comparison, a direct consequence of the optimization introduced in rL318083.

llvm-svn: 318087
2017-11-13 21:25:35 +00:00
Serge Guelton 25cbe525ef Reorder Value.def to optimize code size
If the first values in Value.def is the range of constant, then the code
generated by `isa<Constant>` is smaller by one operation (basically, an add is
removed). It turns out this small optimization reduces the size of the
statically linked clang binary by 400ko on my laptop. The theoritical
performance gain is non visible from my benchmarks, but the size dropdown is.

Differential Revision: https://reviews.llvm.org/D39373

llvm-svn: 318083
2017-11-13 20:57:40 +00:00
Rafael Espindola 58fe67a965 Create a TempFile class.
This just adds a TempFile class and replaces the use in
FileOutputBuffer with it.

The only difference for now is better error handling. Followup work includes:

- Convert other user of temporary files to it.
- Add support for automatically deleting on windows.
- Add a createUnnamed method that returns a potentially unnamed
  file. It would be actually unnamed on modern linux and have a
  unknown name on windows.

llvm-svn: 318069
2017-11-13 18:33:44 +00:00
Uriel Korach 2aa707bdaa [X86] test/testn intrinsics lowering to IR. llvm part.
Remove builtins from llvm and add AutoUpgrade support.
Also add fast-isel tests for the TEST and TESTN instructions.

Differential Revision: https://reviews.llvm.org/D38736

llvm-svn: 318036
2017-11-13 12:51:18 +00:00
Florian Hahn 0e9dec672d [PartialInliner] Inline vararg functions that forward varargs.
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.

It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.

The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.

`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.

Reviewers: davide, davidxl, grosser

Reviewed By: davide, davidxl, grosser

Subscribers: gyiu, grosser, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D39607

llvm-svn: 318028
2017-11-13 10:35:52 +00:00
Jina Nahias 9a7f9f123c [x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D38671

Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
llvm-svn: 318026
2017-11-13 09:16:39 +00:00
Daniel Sanders 7e52367398 [globalisel][tablegen] Import signextload and zeroextload.
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
  (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
  (sext:i32 (load:i16 $ptr)<<unindexedload>>)

I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.

Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.

llvm-svn: 317971
2017-11-11 03:23:44 +00:00
NAKAMURA Takumi 9f65a1ffc8 llvm/Support/TargetParser.h: Fix -fmodules build in rL317900.
llvm-svn: 317966
2017-11-11 02:05:47 +00:00
Daniel Neilson 6e4aa1e481 Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC)
Summary:
 The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.

llvm-svn: 317918
2017-11-10 19:38:12 +00:00
Lang Hames 43e7b7a57f [ADT] Rewrite mapped_iterator in terms of iterator_adaptor_base.
Summary:
This eliminates the boilerplate implementation of the iterator interface in
mapped_iterator.

This patch also adds unit tests that verify that the mapped function is applied
by operator* and operator->, and that references returned by the map function
are returned via operator*.

Reviewers: dblaikie, chandlerc

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D39855

llvm-svn: 317902
2017-11-10 17:41:28 +00:00
Craig Topper c77d00e327 [X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp
Summary:
I want to leverage this to clean up some of the code in clang. This will allow us to simplify D39521 which was trying to do some of the same.

If we accurately keep the code in Host.cpp synced with new CPUs added to compile-rt/libgcc we should be able to use this file as a proxy for what's implemented in the libraries.

The entries for the CPUs recognized by the libraries use separate macros that define additional parameters like the name for __builtin_cpu_is and an alias string for the couple cases where __builtin_cpu_is accepts two different names.

All of the macros contain an ARCHNAME that is usually the same as the __builtin_cpu_is string, but sometimes isn't. This represents the name recognized by X86.td and -march.

I'm following the precedent set by ARM and AArch64 and adding this information to lib/Support/TargetParser.cpp

Reviewers: erichkeane, echristo, asbirlea

Reviewed By: echristo

Subscribers: llvm-commits, aemerson, kristof.beyls

Differential Revision: https://reviews.llvm.org/D39782

llvm-svn: 317900
2017-11-10 17:10:57 +00:00
Igor Laevsky 13cc995c3d [llvm-opt-fuzzer] Introduce llvm-opt-fuzzer for fuzzing optimization passes
This change adds generic fuzzing tools capable of running libFuzzer tests on
any optimization pass or combination of them.

Differential Revision: https://reviews.llvm.org/D39555

llvm-svn: 317883
2017-11-10 12:19:08 +00:00
Jonas Paulsson 4b017e682d [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
Craig Topper 98a64388ab [X86] Remove GCCBuiltin from intrinsics that are no longer used by clang.
I've also added TODOs for intrinsic removal.

llvm-svn: 317876
2017-11-10 06:07:37 +00:00
Adrian Prantl 1c8c544946 Preserve debug info when DAG-combinging (zext (truncate x)) -> (and x, mask).
rdar://problem/27139077

llvm-svn: 317825
2017-11-09 19:50:20 +00:00
Zachary Turner 18f21a483b [Support] Make llvm::Error and Expected faster.
Whenever LLVM_ENABLE_ABI_BREAKING_CHECKS is enabled, which
is usually the case for example when asserts are enabled,
Error's destructor does some additional checking to make sure
that that it does not represent an error condition and that it
was checked.

However, this is -- by definition -- not the likely codepath.
Some profiling shows that at least with some compilers, simply
calling assertIsChecked -- in a release build with full
optimizations -- can account for up to 15% of the entire
runtime of the program, even though this function should almost
literally be a no-op.

The problem is that the assertIsChecked function can be considered
too big to inline depending on the compiler's inliner.  Since it's
unlikely to ever need to failure path though, we can move it out
of line and force it to not be inlined, so that the fast path
can be inlined.

In my test (using lld to link clang with CMAKE_BUILD_TYPE=Release
and LLVM_ENABLE_ASSERTIONS=ON), this reduces link time from 27
seconds to 23.5 seconds, which is a solid 15% gain.

llvm-svn: 317824
2017-11-09 19:31:52 +00:00
Andrew V. Tischenko 3543f0a712 Add -print-schedule scheduling comments to inline asm.
Differential Revision: https://reviews.llvm.org/D39728

llvm-svn: 317782
2017-11-09 12:45:40 +00:00
Sanjoy Das e3992c6328 [SectionMemoryManager] Abstract out mmap, munmap, mprotect even more ; NFC
Summary:
This will let ORC JIT clients plug in custom logic for the mmap, munmap and
mprotect paths.

Reviewers: loladiro, dblaikie

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39300

llvm-svn: 317770
2017-11-09 06:31:33 +00:00
Craig Topper 258da8a249 [X86] Rename the VEX scalar fma builtins to end with a '3' to match gcc
I think we need to use different builtins for the FMA4 instructions since those instructions zero the upper bits and FMA3 instructions pass the bits through.

So this moves the existing builtins to be the FMA3 versions. New versions will be added for FMA4.

llvm-svn: 317765
2017-11-09 04:10:42 +00:00
Craig Topper cfd510678f [X86] X86MaskedGatherSDNode shouldn't inherit from MaskedGatherScatterSDNode
The classof implementation in MaskedGatherScatterSDNode doesn't consider X86MaskedGatherSDNode so its misleading.

llvm-svn: 317733
2017-11-08 22:26:41 +00:00
Adrian Prantl a8e56458e6 Let replaceVTableHolder accept any type.
In Rust, a trait can be implemented for any type, and if a trait
object pointer is used for the type, then a virtual table will be
emitted for that trait/type combination.

We would like debuggers to be able to inspect trait objects, which
requires finding the concrete type associated with a given vtable.

This patch changes LLVM so that any type can be passed to
replaceVTableHolder. This allows the Rust compiler to emit the needed
debug info -- associating a vtable with the concrete type for which it
was emitted.

This is a DWARF extension: DWARF only specifies the meaning of
DW_AT_containing_type in one specific situation. This style of DWARF
extension is routine, though, and LLVM already has one such case for
DW_AT_containing_type.

Patch by Tom Tromey!

Differential Revision: https://reviews.llvm.org/D39503

llvm-svn: 317730
2017-11-08 22:04:43 +00:00
Dan Gohman 2c74fe977d Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Reid Kleckner 7adb2fdbba Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.

There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.

When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
  int a, c, d, e, f, g, h, i, j, k, l, m;
  void n(int o, int *b) {
    if (g)
      f = 0;
    for (; f < o; f++) {
      m = a;
      if (l > j * k > i)
        j = i = k = d;
      h = b[c] - e;
    }
  }

We get assembly that looks like this:
; BB#1:                                 ; %if.then
Lloh3:
	adrp	x9, _f@GOTPAGE
Lloh4:
	ldr	x9, [x9, _f@GOTPAGEOFF]
	mov	 w8, wzr
Lloh5:
	str		wzr, [x9]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.lt	LBB0_3
	b	LBB0_7
LBB0_2:                                 ; %entry.if.end_crit_edge
Lloh6:
	adrp	x8, _f@GOTPAGE
Lloh7:
	ldr	x8, [x8, _f@GOTPAGEOFF]
Lloh8:
	ldr		w8, [x8]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.ge	LBB0_7
LBB0_3:                                 ; %for.body.lr.ph

Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.

llvm-svn: 317726
2017-11-08 21:31:14 +00:00
Alex Bradbury fa18b9e73c Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred 
as 1. D37065 sets the previously inferred properties explicitly. This patch sets 
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has 
been updated so it still returns false for PHI.

Additionally, HexagonBitSimplify relied on a PHI node having the 
hasUnmodeledSideEffects property. This patch fixes that assumption.

Differential Revision: https://reviews.llvm.org/D37097

llvm-svn: 317721
2017-11-08 20:19:16 +00:00
Adrian McCarthy 75248a7ade NFC: Rename MCSafeSEHFragment to MCSymbolIdFragment
Summary:
This fragment emits a symbol ID and will be useful for more than just Safe SEH
tables (e.g., I plan to re-use it for Control Flow Guard tables).  This is
simply a rename refactor.

Reviewers: rnk

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39770

llvm-svn: 317703
2017-11-08 18:57:02 +00:00
Alex Bradbury cc988415fe [NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0
rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target 
sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if 
it has to guess properties from patterns. Unfortunately, 
guessInstructionProperties=0 can't be used with current upstream LLVM as 
instructions in the TargetOpcode namespace are always included and sometimes 
have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch 
provides the simplest possible fix to this problem, setting default values for 
these fields in the TargetOpcode scope. There is no intended functional 
change, as the explicitly set properties should match what was previously 
inferred. A number of the instructions had hasSideEffects=1 inferred 
unintentionally. This patch makes it explicit, while future patches (such as 
D37097) correct the property.

Differential Revision: https://reviews.llvm.org/D37065

llvm-svn: 317674
2017-11-08 09:26:06 +00:00
Matt Arsenault f6ee94c1c6 DAG: Add computeKnownBitsForFrameIndex
Some of the AMDGPU stack addressing modes require knowing the sign
bit is zero. We used to accomplish this by custom lowering
frame indexes, and then putting an AssertZext around a
TargetFrameIndex. This required specifically looking for
the AssextZext + frame index pattern which was moderately
disgusting. The same could probably be accomplished
with a target specific node, but would still
require special handling of frame indexes.

llvm-svn: 317671
2017-11-08 08:52:31 +00:00
Rafael Espindola 0d7a38a81d Convert FileOutputBuffer::commit to Error.
llvm-svn: 317656
2017-11-08 01:50:29 +00:00
Rafael Espindola e0df357dbd Convert FileOutputBuffer to Expected. NFC.
llvm-svn: 317649
2017-11-08 01:05:44 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Justin Lebar da9e0bd3a2 [NVPTX] Implement __nvvm_atom_add_gen_d builtin.
Summary:
This just seems to have been an oversight.  We already supported the f64
atomic add with an explicit scope (e.g. "cta"), but not the scopeless
version.

Reviewers: tra

Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39638

llvm-svn: 317623
2017-11-07 22:10:54 +00:00
Mitch Phillips 40d6663367 Extend SpecialCaseList to allow users to blame matches on entries in the file.
Summary:
Extends SCL functionality to allow users to find the line number in the file the SCL is built from through SpecialCaseList::inSectionBlame(...).

Also removes the need to compile the SCL before use. As the matcher now contains a list of regexes to test against instead of a single regex, the regexes can be individually built on each insertion rather than one large compilation at the end of construction.

This change also fixes a bug where blank lines would cause the parser to become out-of-sync with the line number. An error on line `k` was being reported as being on line `k - num_blank_lines_before_k`.

Note: This change has a cyclical dependency on D39486. Both these changes must be submitted at the same time to avoid a build breakage.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Subscribers: kcc, pcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D39485

llvm-svn: 317617
2017-11-07 21:16:46 +00:00
Paul Robinson e5400f8a6e [DWARFv5] Support DW_FORM_strp in the .debug_line header.
Supporting this form in .debug_line.dwo will be done as a follow-up.

Differential Revision: https://reviews.llvm.org/D33155

llvm-svn: 317607
2017-11-07 19:57:12 +00:00
Petar Jovanovic e2a585dddc Reland "Correct dwarf unwind information in function epilogue for X86"
Reland r317100 with minor fix regarding ComputeCommonTailLength function in
BranchFolding.cpp. Skipping top CFI instructions block needs to executed on
several more return points in ComputeCommonTailLength().

Original r317100 message:

"Correct dwarf unwind information in function epilogue for X86"

This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.

Patch by Violeta Vukobrat.

llvm-svn: 317579
2017-11-07 14:40:27 +00:00
Kristof Beyls af9814a1fc [GlobalISel] Enable legalizing non-power-of-2 sized types.
This changes the interface of how targets describe how to legalize, see
the below description.

1. Interface for targets to describe how to legalize.

In GlobalISel, the API in the LegalizerInfo class is the main interface
for targets to specify which types are legal for which operations, and
what to do to turn illegal type/operation combinations into legal ones.

For each operation the type sizes that can be legalized without having
to change the size of the type are specified with a call to setAction.
This isn't different to how GlobalISel worked before. For example, for a
target that supports 32 and 64 bit adds natively:

  for (auto Ty : {s32, s64})
    setAction({G_ADD, 0, s32}, Legal);

or for a target that needs a library call for a 32 bit division:

  setAction({G_SDIV, s32}, Libcall);

The main conceptual change to the LegalizerInfo API, is in specifying
how to legalize the type sizes for which a change of size is needed. For
example, in the above example, how to specify how all types from i1 to
i8388607 (apart from s32 and s64 which are legal) need to be legalized
and expressed in terms of operations on the available legal sizes
(again, i32 and i64 in this case). Before, the implementation only
allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0,
s128}, NarrowScalar).  A worse limitation was that if you'd wanted to
specify how to legalize all the sized types as allowed by the LLVM-IR
LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times
and probably would need a lot of memory to store all of these
specifications.

Instead, the legalization actions that need to change the size of the
type are specified now using a "SizeChangeStrategy".  For example:

   setLegalizeScalarToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerAndNarrowToLargest);

This example indicates that for type sizes for which there is a larger
size that can be legalized towards, do it by Widening the size.
For example, G_ADD on s17 will be legalized by first doing WidenScalar
to make it s32, after which it's legal.
The "NarrowToLargest" indicates what to do if there is no larger size
that can be legalized towards. E.g. G_ADD on s92 will be legalized by
doing NarrowScalar to s64.

Another example, taken from the ARM backend is:
   for (unsigned Op : {G_SDIV, G_UDIV}) {
     setLegalizeScalarToDifferentSizeStrategy(Op, 0,
         widenToLargerTypesUnsupportedOtherwise);
     if (ST.hasDivideInARMMode())
       setAction({Op, s32}, Legal);
     else
       setAction({Op, s32}, Libcall);
   }

For this example, G_SDIV on s8, on a target without a divide
instruction, would be legalized by first doing action (WidenScalar,
s32), followed by (Libcall, s32).

The same principle is also followed for when the number of vector lanes
on vector data types need to be changed, e.g.:

   setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal);
   setLegalizeVectorElementToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerTypesUnsupportedOtherwise);

As currently implemented here, vector types are legalized by first
making the vector element size legal, followed by then making the number
of lanes legal. The strategy to follow in the first step is set by a
call to setLegalizeVectorElementToDifferentSizeStrategy, see example
above.  The strategy followed in the second step
"moreToWiderTypesAndLessToWidest" (see code for its definition),
indicating that vectors are widened to more elements so they map to
natively supported vector widths, or when there isn't a legal wider
vector, split the vector to map it to the widest vector supported.

Therefore, for the above specification, some example legalizations are:
  * getAction({G_ADD, LLT::vector(3, 3)})
    returns {WidenScalar, LLT::vector(3, 8)}
  * getAction({G_ADD, LLT::vector(3, 8)})
    then returns {MoreElements, LLT::vector(8, 8)}
  * getAction({G_ADD, LLT::vector(20, 8)})
    returns {FewerElements, LLT::vector(16, 8)}


2. Key implementation aspects.

How to legalize a specific (operation, type index, size) tuple is
represented by mapping intervals of integers representing a range of
size types to an action to take, e.g.:

       setScalarAction({G_ADD, LLT:scalar(1)},
                       {{1, WidenScalar},  // bit sizes [ 1, 31[
                        {32, Legal},       // bit sizes [32, 33[
                        {33, WidenScalar}, // bit sizes [33, 64[
                        {64, Legal},       // bit sizes [64, 65[
                        {65, NarrowScalar} // bit sizes [65, +inf[
                       });

Please note that most of the code to do the actual lowering of
non-power-of-2 sized types is currently missing, this is just trying to
make it possible for targets to specify what is legal, and how non-legal
types should be legalized.  Probably quite a bit of further work is
needed in the actual legalizing and the other passes in GlobalISel to
support non-power-of-2 sized types.

I hope the documentation in LegalizerInfo.h and the examples provided in the
various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well
enough how this is meant to be used.

This drops the need for LLT::{half,double}...Size().


Differential Revision: https://reviews.llvm.org/D30529

llvm-svn: 317560
2017-11-07 10:34:34 +00:00
Adrian Prantl 25a09dd408 Make DIExpression::createFragmentExpression() return an Optional.
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.

llvm-svn: 317534
2017-11-07 00:45:34 +00:00
Davide Italiano 1a46affb45 [IPO/LowerTypesTest] Skip blockaddress(es) when replacing uses.
Blockaddresses refer to the function itself, therefore replacing them
would cause an assertion in doRAUW.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35201

This was found when trying CFI on a proprietary kernel by Dmitry Mikulin.

Differential Revision:  https://reviews.llvm.org/D39695

llvm-svn: 317527
2017-11-07 00:09:25 +00:00
Vedant Kumar 2b881f567f [DebugInfo] Unify logic to merge DILocations. NFC.
This makes DILocation::getMergedLocation() do what its comment says it
does when merging locations for an Instruction: set the common inlineAt
scope. This simplifies Instruction::applyMergedLocation() a bit.

Testing: check-llvm, check-clang

Differential Revision: https://reviews.llvm.org/D39628

llvm-svn: 317524
2017-11-06 23:15:21 +00:00
Bjorn Pettersson a42ed3e361 [MIRPrinter] Use %subreg.xxx syntax for subregister index operands
Summary:
Print %subreg.<subregidxname> instead of just the subregister
index when printing immediate operands corresponding to subreg
indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and
REG_SEQUENCE.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39696

llvm-svn: 317513
2017-11-06 21:46:06 +00:00
Sanjay Patel 629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
Martin Storsjo bed0c519c3 [ObjectYAML] Map relocation types for COFF ARMNT and ARM64
Differential Revision: https://reviews.llvm.org/D39668

llvm-svn: 317459
2017-11-06 07:20:58 +00:00
David L. Jones 82b22e0327 [PassManager, SimplifyCFG] Revert r316908 and r316869.
These cause Clang to crash with a segfault. See PR35210 for details.

llvm-svn: 317444
2017-11-06 00:32:01 +00:00
Harlan Haskins 2ad533c0f9 Use code voice for DIBuilder in LLVM C API
(This is a test commit)

llvm-svn: 317422
2017-11-04 20:31:20 +00:00
Aaron Ballman a5ee69a010 Move the srpm, ocaml_make_directory, llvm_vcsrevision_h, and llvm-headers projects into the Misc folder on IDEs like Visual Studio rather than leave them in the root directory. NFC.
llvm-svn: 317416
2017-11-04 19:59:14 +00:00
Sean Fertile 4595a915f6 [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Originally commited as r317374, but reverted in r317395 to update some missed
tests.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317408
2017-11-04 17:04:39 +00:00
Sean Fertile 39770ca0a1 Revert "[LTO][ThinLTO] Use the linker resolutions to mark global values ..."
Changes more tests then expected on one of the build bots.
reverting to investigate.

This reverts https://llvm.org/svn/llvm-project/llvm/trunk@317374

llvm-svn: 317395
2017-11-04 01:54:20 +00:00
David Blaikie 1be62f0327 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Sean Fertile 36528c2a9b [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317374
2017-11-03 21:45:55 +00:00
Peter Collingbourne c2935db629 Revert r317046, "Object: Move some code from ELF.h into ELF.cpp."
This change resulted in a measured 1.5-2% perf regression linking
chrome.

llvm-svn: 317371
2017-11-03 21:30:06 +00:00
David Blaikie 34eb96b03f GCOV: Move GCOV from IR & Support into ProfileData to fix layering
This class was split between libIR and libSupport, which breaks under
modular code generation. Move it into the one library that uses it,
ProfileData, to resolve this issue.

llvm-svn: 317366
2017-11-03 20:57:10 +00:00
Jun Bum Lim 0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
David Blaikie 526f30b8aa Modularize: Include some required headers
DenseMaps require the definition of a type to be available when using a
pointer to that type as a key to know how many bits are available for
tombstone/etc.

llvm-svn: 317360
2017-11-03 20:24:19 +00:00
Aaron Ballman 639ea374d6 Correcting some CRLFs that snuck in with my previous commit; NFC.
llvm-svn: 317357
2017-11-03 20:05:51 +00:00
Aaron Ballman ecf0e95267 Add llvm::for_each as a range-based extensions to <algorithm> and make use of it in some cases where it is a more clear alternative to std::for_each.
llvm-svn: 317356
2017-11-03 20:01:25 +00:00
Jun Bum Lim 0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim 2a58933519 Add CallSiteSplitting pass
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Mikael Holmen 6018104d5e [ADCE] Use MapVector for BlockInfo to make iteration order deterministic
Summary:
Also added a reserve() method to MapVector since we want to use that from
ADCE.

DenseMap does not provide deterministic iteration order so with that
we will handle the members of BlockInfo in random order, eventually
leading to random order of the blocks in the predecessor lists.

Without this change, I get the same predecessor order in about 90% of the
time when I compile a certain reproducer and in 10% I get a different one.

No idea how to make a proper test case for this.

Reviewers: kuhar, david2050

Reviewed By: kuhar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39593

llvm-svn: 317323
2017-11-03 14:15:08 +00:00
Clement Courbet 063bed9baf re-land [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.

llvm-svn: 317318
2017-11-03 12:12:27 +00:00
Puyan Lotfi a521c4ac55 mir-canon: First commit.
mir-canon (MIRCanonicalizerPass) is a pass designed to reorder instructions and
rename operands so that two similar programs will diff more cleanly after being
run through mir-canon than they would otherwise. This project is still a work
in progress and there are ideas still being discussed for improving diff
quality.

M    include/llvm/InitializePasses.h
M    lib/CodeGen/CMakeLists.txt
M    lib/CodeGen/CodeGen.cpp
A    lib/CodeGen/MIRCanonicalizerPass.cpp

llvm-svn: 317285
2017-11-02 23:37:32 +00:00
Hiroshi Yamauchi dce9def3dd Irreducible loop metadata for more accurate block frequency under PGO.
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.

The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.

This patch is a basic support for this.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39028

llvm-svn: 317278
2017-11-02 22:26:51 +00:00
Adrian Prantl 07eaa9b46e Clean up comments in include/llvm-c/DebugInfo.h
Patch by Harlan Haskins!

Differential Revision: https://reviews.llvm.org/D39568

llvm-svn: 317271
2017-11-02 21:35:37 +00:00
Adrian Prantl f2593d028f Add missing header guards.
llvm-svn: 317267
2017-11-02 20:58:58 +00:00
Mitch Phillips 6d2590baec Fixed line length style issue.
Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39395

llvm-svn: 317223
2017-11-02 18:04:44 +00:00
Chad Rosier cd1b93c4e4 [TargetParser][AArch64] Reorder enum to preserve 5.0.0 libLLVM ABI.
This is required for backporting r311659 to the 5.0.1 release.
PR35060

Differential Revision: https://reviews.llvm.org/D39558

llvm-svn: 317222
2017-11-02 17:52:27 +00:00
Clement Courbet 82bade615b Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet 1dc37b9c3b [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00
Yichao Yu 6fefc0d65e Allow inaccessiblememonly and inaccessiblemem_or_argmemonly to be overwriten on call site with operand bundle
Summary:
Similar to argmemonly, readonly and readnone.

Fix PR35128

Reviewers: andrew.w.kaylor, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D39434

llvm-svn: 317201
2017-11-02 12:18:33 +00:00
NAKAMURA Takumi 965602ac82 llvm-c/DebugInfo.h: Fix warning. [-Wdocumentation]
llvm-svn: 317191
2017-11-02 08:03:12 +00:00
Jake Ehrlich 03aeeb09c5 [yaml2obj][ELF] Add support for setting alignment in program headers
Sometimes program headers have larger alignments than any of the
sections they contain. Currently yaml2obj can't produce such files. A
bug recently appeared in llvm-objcopy that failed in such a case. I'd
like to be able to add tests to llvm-objcopy for such cases.

This change adds an optional alignment parameter to program headers that
will be used instead of calculating the alignment.

Differential Revision: https://reviews.llvm.org/D39130

llvm-svn: 317139
2017-11-01 23:14:48 +00:00
Petar Jovanovic bb5c84fb57 Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).

llvm-svn: 317136
2017-11-01 23:05:52 +00:00
whitequark 789164d426 [LLVM-C] Expose functions to create debug locations via DIBuilder.
These include:
  * Several functions for creating an LLVMDIBuilder,
  * LLVMDIBuilderCreateCompileUnit,
  * LLVMDIBuilderCreateFile,
  * LLVMDIBuilderCreateDebugLocation.

Patch by Harlan Haskins.

Differential Revision: https://reviews.llvm.org/D32368

llvm-svn: 317135
2017-11-01 22:18:52 +00:00
Daniel Sanders 466fe399b8 [globalisel][regbank] Warn about MIR ambiguities when register bank/class names clash.
llvm-svn: 317132
2017-11-01 22:13:05 +00:00
Rui Ueyama a16fe65b72 Rewrite FileOutputBuffer as two separate classes.
This patch is to rewrite FileOutputBuffer as two separate classes;
one for file-backed output buffer and the other for memory-backed
output buffer. I think the new code is easier to follow because two
different implementations are now actually separated as different
classes.

Unlike the previous implementation, the class that does not replace the
final output file using rename(2) does not create a temporary file at
all. Instead, it allocates memory using mmap(2) and use it. I think
this is an improvement because it is now guaranteed that the temporary
memory region doesn't trigger any I/O and there's now zero chance to
leave a temporary file behind. Also, it shouldn't impose new restrictions
because were using mmap IO too.

Differential Revision: https://reviews.llvm.org/D39449

llvm-svn: 317127
2017-11-01 21:38:14 +00:00
Dehao Chen c6c051f2ea Include GUIDs from the same module when computing GUIDs that needs to be imported.
Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to buidl the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D39480

llvm-svn: 317118
2017-11-01 20:26:47 +00:00
Daniel Sanders 9cbe7c7f93 [globalisel][tablegen] Add support for multi-insn emission
The importer will now accept nested instructions in the result pattern such as
(ADDWrr $a, (SUBWrr $b, $c)). This is only valid when the nested instruction
def's a single vreg and the parent instruction consumes a single vreg where a
nested instruction is specified. The importer will automatically create a vreg
to connect the two using the type information from the pattern. This vreg will
be constrained to the register classes given in the instruction definitions*.

* REG_SEQUENCE is explicitly rejected because of this. The definition doesn't
  constrain to a register class and it therefore needs special handling.

llvm-svn: 317117
2017-11-01 19:57:57 +00:00
Petar Jovanovic f2faee92aa Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844

llvm-svn: 317100
2017-11-01 16:04:11 +00:00
Geoff Berry eed6531ea2 [BranchProbabilityInfo] Handle irreducible loops.
Summary:
Compute the strongly connected components of the CFG and fall back to
use these for blocks that are in loops that are not detected by
LoopInfo when computing loop back-edge and exit branch probabilities.

Reviewers: dexonsmith, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39385

llvm-svn: 317094
2017-11-01 15:16:50 +00:00
NAKAMURA Takumi 1657f2ad99 Fix warnings discovered by rL317076. [-Wunused-private-field]
llvm-svn: 317091
2017-11-01 13:47:55 +00:00
NAKAMURA Takumi 53fc7e1763 Reformat.
llvm-svn: 317078
2017-11-01 05:14:35 +00:00
NAKAMURA Takumi 7c1ef4a88c Revert rL317019, "[ADT] Split optional to only include copy mechanics and dtor for non-trivial types."
Seems g++-4.8 (eg. Ubuntu 14.04) doesn't like this.

llvm-svn: 317077
2017-11-01 05:14:31 +00:00
Peter Collingbourne aedb4bf37f Object: Move some code from ELF.h into ELF.cpp.
Differential Revision: https://reviews.llvm.org/D39271

llvm-svn: 317046
2017-10-31 22:49:23 +00:00
Peter Collingbourne 4240c5861b Inline compareAddr function into its only caller. NFCI.
llvm-svn: 317045
2017-10-31 22:49:09 +00:00
Benjamin Kramer 0fad6dd3c4 Revert "[DWARF] Now that Optional is standard layout, put it into an union instead of splatting it."
GCC doesn't like it. This reverts commit r317028.

llvm-svn: 317030
2017-10-31 19:55:08 +00:00
Benjamin Kramer 8732bbec1e [DWARF] Now that Optional is standard layout, put it into an union instead of splatting it.
No functionality change intended.

llvm-svn: 317028
2017-10-31 19:40:03 +00:00
Benjamin Kramer 5168f257d5 [ADT] Split optional to only include copy mechanics and dtor for non-trivial types.
This makes uses of Optional more transparent to the compiler (and
clang-tidy) and generates slightly smaller code.

llvm-svn: 317019
2017-10-31 18:35:54 +00:00
Wolfgang Pieb 2d2798b3a7 [Metadata][NFC] Make MDNode::resolve() public in preparation for the fix to PR33930.
Reviewers: aprantl
llvm-svn: 317018
2017-10-31 18:25:28 +00:00
Teresa Johnson d1089e5cea [ThinLTO] Double bits of module hash used for renaming
Summary:
Use 64 instead of 32 bits of the module hash as the suffix when renaming
after promotion to reduce the likelihood of a collision (which we
observed in a binary when using 32 bits).

Reviewers: pcc

Subscribers: llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D39443

llvm-svn: 316996
2017-10-31 12:56:09 +00:00
David Green 64f53b4214 [LoopUnroll] Clean up remarks for unroll remainder
The optimisation remarks for loop unrolling with an unrolled remainder looks something like:

test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
            C[i] += A[i*N+j];
                 ^
test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
        for(int j = 0; j < N; j++)
        ^
This removes the first of the two messages.

Differential revision: https://reviews.llvm.org/D38725

llvm-svn: 316986
2017-10-31 10:47:46 +00:00
Max Kazantsev 488ec975bb Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors"
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
  int array[LEN];
  ...
  guard(0 <= index && index < LEN);
  use(array[index]);

Differential Revision: https://reviews.llvm.org/D37460

llvm-svn: 316975
2017-10-31 05:07:56 +00:00
Daniel Neilson f9c7d29c77 Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst

This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).

---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done

Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev

Reviewed By: sanjoy

Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38419

llvm-svn: 316950
2017-10-30 19:51:48 +00:00
Clement Courbet b2c3eb8cf1 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Sanjay Patel adf38911d8 [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
The old PM sets the options of what used to be known as "latesimplifycfg" on the 
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not 
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 316869
2017-10-29 20:49:31 +00:00
Saleem Abdulrasool 0f759db2bd ADT: add a helper to check if the Triple is ARM64
Add a trivial helper for checking if the architecture is AArch64 Little
Endian or Big Endian.

llvm-svn: 316837
2017-10-28 19:15:05 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Eugene Zelenko 8e07bd4887 [ADT] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316818
2017-10-28 00:24:26 +00:00
David Blaikie 8699f71310 Add a few missing headers for modularization/IWYU/etc
Several cases where class definitions are required for DenseMap pointer
traits handling.

llvm-svn: 316803
2017-10-27 22:12:46 +00:00
whitequark 131f98f054 [LLVM-C] Publicly expose getters of MetadataType, TokenType
Patch by Robert Widmann.

Expose getters for MetadataType and TokenType publicly in the C API.
Discovered a need for these while trying to wrap the intrinsics API.

Differential Revision: https://reviews.llvm.org/D38809

llvm-svn: 316762
2017-10-27 11:51:40 +00:00
NAKAMURA Takumi 363afa3b25 llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h: Fix -fmodules build introduced in rL316715.
llvm-svn: 316743
2017-10-27 05:45:11 +00:00
Sean Fertile 57d46b8436 Add subclass data to the FoldingSetNode for MemIntrinsicSDNodes.
Not having the subclass data on an MemIntrinsicSDNodes means it was possible
to try to fold 2 nodes with the same operands but differing MMO flags. This
would trip an assertion when trying to refine the alignment between the 2
MachineMemOperands.

Differential Revision: https://reviews.llvm.org/D38898

llvm-svn: 316737
2017-10-27 04:02:51 +00:00
Eugene Zelenko 57bd5a0274 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316724
2017-10-27 01:09:08 +00:00
David Blaikie 6265130054 InstructionSelectorImpl.h: Modularize/remove ODR violations by using a static member function to expose the debug name
llvm-svn: 316715
2017-10-26 23:39:54 +00:00
David Blaikie bd37bc3336 MCCodePadder.h: Include definition of type for use with DenseMap
Pointer traits require a full definition of a type to function
correctly, so the header must be included rather than only a forward
declaration.

llvm-svn: 316714
2017-10-26 23:39:52 +00:00
Aditya Nandakumar 14a1e474da [GISel]: Missed checking if it's okay to create a G_CONSTANT of DstTy in the legalizationCombiner
llvm-svn: 316694
2017-10-26 20:13:54 +00:00
Sean Fertile c70d28bff5 Represent runtime preemption in the IR.
Currently we do not represent runtime preemption in the IR, which has several
drawbacks:

  1) The semantics of GlobalValues differ depending on the object file format
     you are targeting (as well as the relocation-model and -fPIE value).
  2) We have no way of disabling inlining of run time interposable functions,
     since in the IR we only know if a function is link-time interposable.
     Because of this llvm cannot support elf-interposition semantics.
  3) In LTO builds of executables we will have extra knowledge that a symbol
     resolved to a local definition and can't be preemptable, but have no way to
     propagate that knowledge through the compiler.

This patch adds preemptability specifiers to the IR with the following meaning:

dso_local --> means the compiler may assume the symbol will resolve to a
 definition within the current linkage unit and the symbol may be accessed
 directly even if the definition is not within this compilation unit.

dso_preemptable --> means that the compiler must assume the GlobalValue may be
replaced with a definition from outside the current linkage unit at runtime.

To ease transitioning dso_preemptable is treated as a 'default' in that
low-level codegen will still do the same checks it did previously to see if a
symbol should be accessed indirectly. Eventually when IR producers emit the
specifiers on all Globalvalues we can change dso_preemptable to mean 'always
access indirectly', and remove the current logic.

Differential Revision: https://reviews.llvm.org/D20217

llvm-svn: 316668
2017-10-26 15:00:26 +00:00
Eugene Zelenko 5adb96cc92 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Jonas Devlieghere f63ee64c4b Re-land "[dwarfdump] Add -lookup option"
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 316619
2017-10-25 21:56:41 +00:00
Sanjoy Das f15a861601 Add a comment to clarify a future change
llvm-svn: 316614
2017-10-25 21:40:59 +00:00
Aditya Nandakumar d2a954d0ae Make the combiner check if shifts are legal before creating them
Summary: Make sure shifts are legal/specified by the legalizerinfo before creating it

Reviewers: qcolombet, dsanders, rovka, t.p.northover

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39264

llvm-svn: 316602
2017-10-25 18:49:18 +00:00
Peter Collingbourne 7cedc5607c Remove dead function declaration.
llvm-svn: 316597
2017-10-25 17:42:00 +00:00
Matthew Simpson cb58558c2f Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Daniil Fukalov 2bfbadcbc1 [inlineasm] Fix crash when number of matched input constraint operands overflows signed char
In a case when number of output constraint operands that has matched input operands
doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access
ConstraintOperands (that is std::vector) with negative index.

Reviewers: rampitec, arsenm

Differential Review: https://reviews.llvm.org/D39125

llvm-svn: 316574
2017-10-25 12:51:32 +00:00
George Rimar 0be860f695 [llvm-dwarfdump] - Fix array out of bounds access crash.
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.

Differential revision: https://reviews.llvm.org/D39185

llvm-svn: 316566
2017-10-25 10:23:49 +00:00
Peter Collingbourne 689e6c052e llvm-readobj: Add support for reading relocations in the Android packed format.
This is in preparation for testing lld's upcoming relocation packing
feature (D39152). I have verified that this implementation correctly
unpacks the relocations from a Chromium DSO built with gold and the
Android relocation packer for ARM32 and ARM64.

Differential Revision: https://reviews.llvm.org/D39272

llvm-svn: 316543
2017-10-25 03:37:12 +00:00
Adrian Prantl 2eb7cbf987 Implement salavageDebugInfo functionality for SelectionDAG.
Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.

The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.

rdar://problem/32121503

llvm-svn: 316525
2017-10-24 22:55:12 +00:00
Sam Clegg 4f00a71af5 Add Triple::isOSUnknown
Subscribers: aheejin

Differential Revision: https://reviews.llvm.org/D39256

llvm-svn: 316524
2017-10-24 22:48:19 +00:00
David Blaikie bfd094337c ValueMapper.h: Don't mark header functions as file local
llvm-svn: 316516
2017-10-24 21:29:21 +00:00
David Blaikie b685160d93 Transforms/Utils/Local.h: Don't mark header functions as file local
llvm-svn: 316515
2017-10-24 21:29:20 +00:00
David Blaikie 31c7b16f0c TargetOpcodes.h: Don't mark header functions as file local
llvm-svn: 316514
2017-10-24 21:29:19 +00:00
David Blaikie a4cceca999 Printable.h: Don't mark header functions as file local
llvm-svn: 316513
2017-10-24 21:29:19 +00:00
David Blaikie c6fbcd512a ConvertUTF.h: Don't mark header functions as file local
llvm-svn: 316512
2017-10-24 21:29:18 +00:00
David Blaikie cfa310ba2f AtomicOrdering.h: Don't mark header functions as file local
llvm-svn: 316511
2017-10-24 21:29:18 +00:00
David Blaikie e2bf1f14b3 LaneBitmask.h: Don't mark header functions as file local
llvm-svn: 316510
2017-10-24 21:29:17 +00:00
David Blaikie 994e907372 Type.h: Don't mark header functions as file local
llvm-svn: 316509
2017-10-24 21:29:16 +00:00
David Blaikie a92dbea14b RegisterUsageInfo.h: Add missing header for complete type needed for DenseMap traits
llvm-svn: 316504
2017-10-24 21:29:10 +00:00
Simon Pilgrim 513d8fbb3a Fix Wdocumentation warning. NFCI.
llvm-svn: 316498
2017-10-24 20:56:09 +00:00
Artem Belevich cb8f6328dc [NVPTX] allow address space inference for volatile loads/stores.
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.

Differential Revision: https://reviews.llvm.org/D39026

llvm-svn: 316495
2017-10-24 20:31:44 +00:00
David Blaikie b8522bd97d BinaryFormat/MachO.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316477
2017-10-24 17:29:14 +00:00
David Blaikie 9c56a58aeb ValueTracking.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316476
2017-10-24 17:29:14 +00:00
David Blaikie 21b7fc5016 MemoryBuiltins.h: Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316475
2017-10-24 17:29:13 +00:00
David Blaikie 789647686c IndirectCallSiteVisitor.h:findIndirectCallSites Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316474
2017-10-24 17:29:12 +00:00
David Blaikie bbef4d6c08 StringExtras.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316473
2017-10-24 17:29:12 +00:00
David Blaikie aedf1537c6 SmallVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316472
2017-10-24 17:29:11 +00:00
David Blaikie 797ccba543 DenseMap.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316471
2017-10-24 17:29:11 +00:00
David Blaikie 1303ccccd4 BitVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another
inline function in a header and also creates binary bloat from duplicate
definitions.

llvm-svn: 316470
2017-10-24 17:29:08 +00:00
Marek Olsak ce76ea0394 AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

llvm-svn: 316427
2017-10-24 10:27:13 +00:00
Marek Olsak 2114fc3bcb AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

llvm-svn: 316426
2017-10-24 10:26:59 +00:00
Sam McCall fb4a9b7ede Support formatv of TimePoint with strftime-style formats.
Summary:
Support formatv of TimePoint with strftime-style formats.

Extensions for millis/micros/nanos are added.
Inital use case is HH:MM:SS.MMM timestamps in clangd logs.

Reviewers: bkramer, ilya-biryukov

Subscribers: labath, llvm-commits

Differential Revision: https://reviews.llvm.org/D38992

llvm-svn: 316419
2017-10-24 08:30:19 +00:00
Bruno Cardoso Lopes 2555e41b4e [Modules] Add module for Config/llvm-config.h
Besides all the goodness from modularizing a header, this is necessary
to compile ToT with modules with the clang host compiler from Xcode 9 in
macOS 10.13, which our bots don't use yet.

rdar://problem/35038151

llvm-svn: 316414
2017-10-24 06:18:52 +00:00
Omer Paparo Bivas 2251c79aba [MC] Adding code padding for performance stability - infrastructure. NFC.
Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding.

Reviewers:
aaboud
zvi
craig.topper
gadi.haber

Differential revision: https://reviews.llvm.org/D34393

Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
2017-10-24 06:16:03 +00:00
Bob Haarman 9ce2d03e54 [raw_fd_ostream] report actual error in error messages
Summary:
Previously, we would emit error messages like "IO failure on output
stream". This change causes use to include information about what
actually went wrong, e.g. "No space left on device".

Reviewers: sunfish, rnk

Reviewed By: rnk

Subscribers: mehdi_amini, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39203

llvm-svn: 316404
2017-10-24 01:26:22 +00:00
Reid Kleckner 0e88118dd7 [codeview] Add support for inlinee lists
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.

Fixes PR34222

llvm-svn: 316398
2017-10-23 23:43:40 +00:00
Justin Lebar 7fb124131c [PM] Fix Typo
Patch by Nick Sarnie.

llvm-svn: 316397
2017-10-23 23:42:05 +00:00
Bob Wilson 7e55e68852 Add a new Simulator entry for the target triple environment.
Apple's iOS, tvOS and watchOS simulator platforms have never been clearly
distinguished in the target triples. Even though they are intended to
behave similarly to the corresponding device platforms, they have separate
SDKs and are really separate platforms from the compiler's perspective.
Clang now defines a macro when building for one of these simulator platforms
(r297866) but that relies on the very indirect mechanism of checking to see
which option was used to specify the minimum deployment target. That is not
so great. Swift would also like to distinguish these simulator platforms in
a similar way, but unlike Clang, Swift does not use a separate option to
specify the minimum deployment target -- it uses a -target option to
specify the target triple directly, including the OS version number.
Using a different target triple for the simulator platforms is a much
more direct and obvious way to specify this. Putting the "simulator" in
the environment component of the triple means the OS values can stay the
same and existing code the looks at the OS field will not be affected.

https://reviews.llvm.org/D39143
rdar://problem/34729432

llvm-svn: 316380
2017-10-23 21:51:50 +00:00
Daniel Sanders d66e0901ae [globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero.
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.

To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.

Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).

llvm-svn: 316360
2017-10-23 18:19:24 +00:00
Sam McCall f9cb007355 Support formatting formatv_objects.
Summary:
Support formatting formatv_objects.

While here, fix documentation about member-formatters, and attempted
perfect-forwarding (I think).

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38997

llvm-svn: 316330
2017-10-23 15:40:44 +00:00
George Rimar 7fc298afe4 [llvm-dwarfdump] - Teach tool about few GNU call_sites constants.
This teaches tool about following consants: 
DW_TAG_GNU_call_site,
DW_TAG_GNU_call_site_parameter,
DW_AT_GNU_call_site_value,
DW_AT_GNU_all_call_sites.

Constants documented here: https://sourceware.org/elfutils/DwarfExtensions

Differential revision: https://reviews.llvm.org/D39119

llvm-svn: 316321
2017-10-23 11:24:14 +00:00
Benjamin Kramer 24952ce5b9 Create fewer copies of StringMaps. No functionality change intended.
llvm-svn: 316301
2017-10-22 20:16:28 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Marina Yatsina f9371d821f Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Eugene Zelenko fce435764e [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316253
2017-10-21 00:57:46 +00:00
Krzysztof Parzyszek 9d19c8cac9 [Packetizer] Add function to check for aliasing between instructions
llvm-svn: 316243
2017-10-20 22:08:40 +00:00