Commit Graph

112713 Commits

Author SHA1 Message Date
Craig Topper 22d25a08ae [X86] Merge itineraries for CLC, CMC, and STC.
These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation.

llvm-svn: 329414
2018-04-06 16:16:43 +00:00
Mircea Trofin aa3fea6cb0 [GlobalOpt] Fix support for casts in ctors.
Summary:
Fixing an issue where initializations of globals where constructors use
casts were silently translated to 0-initialization.

Reviewers: davidxl, evgeny777

Reviewed By: evgeny777

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45198

llvm-svn: 329409
2018-04-06 15:54:47 +00:00
Dmitry Preobrazhensky 59399ae4cc [AMDGPU][MC][VI][GFX9] Added s_atc_probe* instructions
See bug 36839: https://bugs.llvm.org/show_bug.cgi?id=36839

Differential Revision: https://reviews.llvm.org/D45249

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329408
2018-04-06 15:48:39 +00:00
Pete Couperus b7b6e1da6c [ARC] Add <.f> suffix for F32_GEN4_{DOP|SOP}.
Add disassembler support for instructions which writeback STATUS32.
https://reviews.llvm.org/D45148

Patch by Yan Luo! (Yan.Luo2@synopsys.com)

llvm-svn: 329404
2018-04-06 15:43:11 +00:00
Dmitry Preobrazhensky 4732d876ee [AMDGPU][MC][GFX9] Added s_dcache_discard* instructions
See bug 36838: https://bugs.llvm.org/show_bug.cgi?id=36838

Differential Revision: https://reviews.llvm.org/D45247

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329397
2018-04-06 15:08:42 +00:00
Chad Rosier 45735b8e40 [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference.
The SimpleLoopUnrollPass isn't suppose to perform loop peeling.

Differential Revision: https://reviews.llvm.org/D45334

llvm-svn: 329395
2018-04-06 13:57:21 +00:00
Pavel Labath c9f07b06a1 DWARFVerifier: validate information in name index entries
Summary:
This patch add checks to verify that the information in the name index
entries is consistent with the debug_info section. Specifically, we
check that entries point to valid DIEs, and their names, tags, and
compile units match the information in the debug_info sections.

These checks are only run if the previous checks did not find any errors
in the name index headers. Attempting to proceed with the checks anyway
would likely produce a lot of spurious errors and the verification code
would need to be very careful to avoid crashing.

I also add a couple of more checks to the abbreviation-validation code
to verify that some attributes are always present (an index without a
DW_IDX_die_offset attribute is fairly useless).

The entry verification works only on indexes without any type units - I
haven't attempted to extend it to type units, as we don't even have a
DWARF v5-compatible type unit generator at the moment.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45323

llvm-svn: 329392
2018-04-06 13:34:12 +00:00
Simon Pilgrim 09eeb3a8b9 [X86][SandyBridge] Add (V)DPPS memory fold latencies
Noticed this during D44654

llvm-svn: 329389
2018-04-06 11:25:21 +00:00
Simon Pilgrim 8a83f16ccd [X86][SandyBridge] SBWriteResPair +5cy Memory Folds
As mentioned on D44647, this patch increases the default memory latency to +5cy , which more closely matches what most custom cases are doing for reg-mem instructions.

I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent.

As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests...

Differential Revision: https://reviews.llvm.org/D44654

llvm-svn: 329388
2018-04-06 11:00:51 +00:00
Hans Wennborg da8b71f292 Tweak an assert message in the verifier
llvm-svn: 329387
2018-04-06 10:20:19 +00:00
Simon Pilgrim fd1f4fe54e [X86][SkylakeServer] Merge 2 InstRW entries to the same sched group. NFCI.
llvm-svn: 329386
2018-04-06 10:16:36 +00:00
Hans Wennborg b230c763a4 EntryExitInstrumenter: Handle musttail calls
Inserting instrumentation between a musttail call and ret instruction
would create invalid IR. Instead, treat musttail calls as function
exits.

llvm-svn: 329385
2018-04-06 10:14:09 +00:00
Max Kazantsev 832563a782 [NFC] Add missing end of line symbols
llvm-svn: 329383
2018-04-06 09:47:06 +00:00
Francis Visoiu Mistrih 537d7eee90 [MIR] Add support for MachineFrameInfo::LocalFrameSize
MFI.LocalFrameSize was not serialized.

It is usually set from LocalStackSlotAllocation, so if that pass doesn't
run it is impossible do deduce it from the stack objects. Until now, this
information was lost.

llvm-svn: 329382
2018-04-06 08:56:25 +00:00
Pavel Labath 54ca2d688a [debug_loc] Fix typo in DWARFExpression constructor
Summary:
The positions of the DwarfVersion and AddressSize arguments were
reversed, which caused parsing for dwarf opcodes which contained
address-size-dependent operands (such as DW_OP_addr). Amusingly enough,
none of the address-size asserts fired, as dwarf version was always 4,
which is a valid address size.

I ran into this when constructing weird inputs for the DWARF verifier. I
I add a test case as hand-written dwarf -- I am not sure how to trigger
this differently, as having a DW_OP_addr inside a location list is a
fairly non-standard thing to do.

Fixing this error exposed a bug in the debug_loc.dwo parser, which was
always being constructed with an address size of 0. I fix that as well
by following the pattern in the non-dwo parser of picking up the address
size from the first compile unit (which is technically not correct, but
probably good enough in practice).

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45324

llvm-svn: 329381
2018-04-06 08:49:57 +00:00
Max Kazantsev 2f2fbebdc8 [NFC] Loosen restriction on preheader to fix buildbot
llvm-svn: 329379
2018-04-06 07:23:45 +00:00
Hiroshi Inoue a2eefb6d9a [PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset
VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this.
So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack.
For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions.

Differential Revision: https://reviews.llvm.org/D45079

llvm-svn: 329377
2018-04-06 05:41:16 +00:00
Robert Widmann f108d57f9b [LLVM-C] Audit Inline Assembly APIs for Consistency
Summary:
- Add a missing getter for module-level inline assembly
- Add a missing append function for module-level inline assembly
- Deprecate LLVMSetModuleInlineAsm and replace it with LLVMSetModuleInlineAsm2 which takes an explicit length parameter
- Deprecate LLVMConstInlineAsm and replace it with LLVMGetInlineAsm, a function that allows passing a dialect and is not mis-classified as a constant operation

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45346

llvm-svn: 329369
2018-04-06 02:31:29 +00:00
Manoj Gupta afb355bdc0 Fix lld-x86_64-darwin13 build fails.
Use double braces in std::array initialization
to keep Darwin builders happy.

llvm-svn: 329363
2018-04-05 23:23:29 +00:00
Sanjay Patel 04683de82f [InstCombine] FP: Z - (X - Y) --> Z + (Y - X)
This restores what was lost with rL73243 but without
re-introducing the bug that was present in the old code.

Note that we already have these transforms if the ops are 
marked 'fast' (and I assume that's happening somewhere in
the code added with rL170471), but we clearly don't need 
all of 'fast' for these transforms.

llvm-svn: 329362
2018-04-05 23:21:15 +00:00
Manoj Gupta 9d68b9eac5 Attempt to fix Mips breakages.
Summary:
Replace ArrayRefs by actual std::array objects so that there are
no dangling references.

Reviewers: rsmith, gkistanova

Subscribers: sdardis, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D45338

llvm-svn: 329359
2018-04-05 22:47:25 +00:00
Craig Topper fbe3132f67 [X86] Separate CDQ and CDQE in the scheduler model.
According to Agner's data, CDQE is closer to CWDE.

llvm-svn: 329354
2018-04-05 21:56:19 +00:00
Mandeep Singh Grang f3555650bd [IR] Change std::sort to llvm::sort in response to r327219
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting.  This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to
llvm::sort.  Refer D44363 for a list of all the required patches.

llvm-svn: 329353
2018-04-05 21:52:24 +00:00
Craig Topper 4cc3827791 [X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model
llvm-svn: 329351
2018-04-05 21:40:32 +00:00
Sanjay Patel 03e2526728 [InstCombine] nsz: -(X - Y) --> Y - X
This restores part of the fold that was removed with rL73243 (PR4374).

llvm-svn: 329350
2018-04-05 21:37:17 +00:00
Craig Topper 3b0b96c591 [X86] Add LEAVE instruction to the scheduler models using the same data as LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge.
This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64.

The Sandy Bridge version was missing a load port use.

llvm-svn: 329347
2018-04-05 21:16:26 +00:00
Wolfgang Pieb 3fb9e3f398 [DWARF v5][NFC]: Refactor DebugRnglists to prepare for the support of the DW_AT_ranges
attribute in conjunction with .debug_rnglists.

Reviewers: JDevlieghere
 
Differential Revision: https://reviews.llvm.org/D45307

llvm-svn: 329345
2018-04-05 21:01:49 +00:00
Konstantin Zhuravlyov c233ae8004 AMDGPU/Metadata: Always report a fixed number of hidden arguments
Currently it is 6. If the "feature" was not used, report dummy
hidden argument. Otherwise it does not match the kernarg size
reported in the kernel header.

Differential Revision: https://reviews.llvm.org/D45129

llvm-svn: 329341
2018-04-05 20:46:04 +00:00
Craig Topper c6bb36a3d0 [X86] Remove some InstRWs for plain store instructions on Sandy Bridge.
We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion.

llvm-svn: 329339
2018-04-05 20:04:06 +00:00
Lang Hames 7a598477aa [RuntimeDyld][PowerPC] Use global entry points for calls between sections.
Functions in different objects may use different TOCs, so calls between such
functions should use the global entry point of the callee which updates the
TOC pointer.

This should fix a bug that the Numba developers encountered (see
https://github.com/numba/numba/issues/2451).

Patch by Olexa Bilaniuk. Thanks Olexa!

No RuntimeDyld checker test case yet as I am not familiar enough with how
RuntimeDyldELF fixes up call-sites, but I do not want to hold up landing
this. I will continue to work on it and see if I can rope some powerpc
experts in.

llvm-svn: 329335
2018-04-05 19:37:05 +00:00
Mandeep Singh Grang 176c3efa0e [Bitcode] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: pcc, mehdi_amini, dexonsmith

Reviewed By: dexonsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45132

llvm-svn: 329334
2018-04-05 19:27:04 +00:00
Daniel Neilson 367c2aea4e [InstCombine] Properly change GEP type when reassociating loop invariant GEP chains
Summary:
This is a fix to PR37005.

Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug
whereby it will convert:
%src = getelementptr inbounds i8, i8* %base, <2 x i64> %val
%res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2
into:
%src = getelementptr inbounds i8, i8* %base, i64 %val2
%res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val

By swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2
is loop invariant.

This fix recreates new GEP instructions if the index operand swap would result in the type
of %src changing from vector to scalar, or vice versa.

Reviewers: sebpop, spatel

Reviewed By: sebpop

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45287

llvm-svn: 329331
2018-04-05 18:51:45 +00:00
Craig Topper 9eec2025c5 [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents.
Mostly vector load, store, and move instructions.

llvm-svn: 329330
2018-04-05 18:38:45 +00:00
Mandeep Singh Grang 9893fe218c [ARM] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: t.p.northover, RKSimon, MatzeB, bkramer

Reviewed By: bkramer

Subscribers: javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D44855

llvm-svn: 329329
2018-04-05 18:31:50 +00:00
Craig Topper 665f74414d [X86] Disassembler support for having an ADSIZE prefix affect instructions with 0xf2 and 0xf3 prefixes.
Needed to support umonitor from D45253.

llvm-svn: 329327
2018-04-05 18:20:14 +00:00
Sanjay Patel deaf4f354e [InstCombine] use pattern matchers for fsub --> fadd folds
This allows folding for vectors with undef elements.

llvm-svn: 329316
2018-04-05 17:06:45 +00:00
Sam Clegg cfd44a2e69 [WebAssembly] Allow for the creation of user-defined custom sections
This patch adds a way for users to create their own custom sections to
be added to wasm files. At the LLVM IR layer, they are defined through
the "wasm.custom_sections" named metadata. The expected use case for
this is bindings generators such as wasm-bindgen.

Patch by Dan Gohman

Differential Revision: https://reviews.llvm.org/D45297

llvm-svn: 329315
2018-04-05 17:01:39 +00:00
Craig Topper 6ecdb03f16 [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256.
llvm-svn: 329310
2018-04-05 16:32:48 +00:00
Andrea Di Biagio c74ad502ce [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.
This patch adds the ability to describe properties of the hardware retire
control unit.

Tablegen class RetireControlUnit has been added for this purpose (see
TargetSchedule.td).

A RetireControlUnit specifies the size of the reorder buffer, as well as the
maximum number of opcodes that can be retired every cycle.

A zero (or negative) value for the reorder buffer size means: "the size is
unknown". If the size is unknown, then llvm-mca defaults it to the value of
field SchedMachineModel::MicroOpBufferSize.  A zero or negative number of
opcodes retired per cycle means: "there is no restriction on the number of
instructions that can be retired every cycle".

Models can optionally specify an instance of RetireControlUnit. There can only
be up-to one RetireControlUnit definition per scheduling model.

Information related to the RCU (RetireControlUnit) is stored in (two new fields
of) MCExtraProcessorInfo.  llvm-mca loads that information when it initializes
the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).

This patch fixes PR36661.

Differential Revision: https://reviews.llvm.org/D45259

llvm-svn: 329304
2018-04-05 15:41:41 +00:00
Hiroshi Inoue bbf98aea83 [PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td
This patch adds L(W|H|B)ZXTLS_32 instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td.

llvm-svn: 329299
2018-04-05 15:27:06 +00:00
Philip Pfaffe 131fb978b0 Re-land r329273: [Plugins] Add a slim plugin API to work together with the new PM
Fix unittest: Do not link LLVM into the test plugin.
Additionally, remove an unrelated change that slipped in in r329273.

llvm-svn: 329293
2018-04-05 15:04:13 +00:00
Pavel Labath 510725c2d6 [Testing/Support]: Better matching of Error failure states
Summary:
The existing Failed() matcher only allowed asserting that the operation
failed, but it was not possible to verify any details of the returned
error.

This patch adds two new matchers, which make this possible:
- Failed<InfoT>() verifies that the operation failed with a single error
  of a given type.
- Failed<InfoT>(M) additionally check that the contained error info
  object is matched by the nested matcher M.

To make these work, I've changed the implementation of the ErrorHolder
class. Now, instead of just storing the string representation of the
Error, it fetches the ErrorInfo objects and stores then as a list of
shared pointers. This way, ErrorHolder remains copyable, while still
retaining the full information contained in the Error object.

In case the Error object contains two or more errors, the new matchers
will fail to match, instead of trying to match all (or any) of the
individual ErrorInfo objects. This seemed to be the most sensible
behavior for when one wants to match exact error details, but I could be
convinced otherwise...

Reviewers: zturner, lhames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44925

llvm-svn: 329288
2018-04-05 14:32:10 +00:00
Tim Northover b30388bf11 ARM: Do not spill CSR to stack on entry to noreturn functions
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.

Should fix PR9970.

Patch by myeisha (pmb).

llvm-svn: 329287
2018-04-05 14:26:06 +00:00
Krzysztof Parzyszek 62c4805c1f [Hexagon] Remove default values from lambda parameters
llvm-svn: 329286
2018-04-05 14:25:52 +00:00
Sam Parker 0e7deb8104 [DAGCombine] Revert r329160
Again, broke the big endian stage 2 builders.

llvm-svn: 329283
2018-04-05 13:46:17 +00:00
Sanjay Patel 236442e063 [InstCombine] cleanup; NFC
llvm-svn: 329282
2018-04-05 13:24:26 +00:00
Simon Pilgrim 1d793b8ac5 [SchedModel] Complete models shouldn't match against itineraries when they don't use them (PR35639)
For schedule models that don't use itineraries, checkCompleteness still checks that an instruction has a matching itinerary instead of skipping and going straight to matching the InstRWs. That doesn't seem to match what happens in TargetSchedule.cpp

This patch causes problems for a number of models that had been incorrectly flagged as complete.

Differential Revision: https://reviews.llvm.org/D43235

llvm-svn: 329280
2018-04-05 13:11:36 +00:00
Philip Pfaffe e6b49ef286 Revert "[Plugins] Add a slim plugin API to work together with the new PM"
This reverts commit ecf3ba1ab45edb1b0fadce716a7facf50dca4fbb/r329273.

llvm-svn: 329276
2018-04-05 12:42:12 +00:00
Philip Pfaffe e8f3ae9da0 [Plugins] Add a slim plugin API to work together with the new PM
Summary:
Add a new plugin API. This closes the gap between pass registration and out-of-tree passes for the new PassManager.

Unlike with the existing API, interaction with a plugin is always
initiated from the tools perspective. I.e., when a plugin is loaded, it
resolves and calls a well-known symbol `llvmGetPassPluginInfo` to obtain
details about the plugin. The fundamental motivation is to get rid of as
many global constructors as possible.  The API exposed by the plugin
info is kept intentionally minimal.

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: bollu, grosser, lksbhm, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D35258

llvm-svn: 329273
2018-04-05 11:29:37 +00:00
Florian Hahn 6e0043365b [LoopInterchange] Add stats counter for number of interchanged loops.
Reviewers: samparker, karthikthecool, blitz.opensource

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D45209

llvm-svn: 329269
2018-04-05 10:39:23 +00:00
Fedor Sergeev d29884c7e6 allow custom OptBisect classes set to LLVMContext
This patch introduces a way to set custom OptPassGate instances to LLVMContext.
A new instance field OptBisector and a new method setOptBisect() are added
to the LLVMContext classes. These changes allow to set a custom OptBisect class
that can make its own decisions on skipping optional passes.

Another important feature of this change is ability to set different instances
of OptPassGate to different LLVMContexts. So the different contexts can be used
independently in several compiling threads of one process.

One unit test is added.

Patch by Yevgeny Rouban.

Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov
Reviewed By: andrew.w.kaylor, fedor.sergeev
Differential Revision: https://reviews.llvm.org/D44464

llvm-svn: 329267
2018-04-05 10:29:37 +00:00
Florian Hahn 831a757728 [LoopInterchange] Preserve LoopInfo after interchanging.
LoopInterchange relies on LoopInfo being up-to-date, so we should
preserve it after interchanging. This patch updates restructureLoops to
move the BBs of the interchanged loops to the right place.

Reviewers: davide, efriedma, karthikthecool, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D45278

llvm-svn: 329264
2018-04-05 09:48:45 +00:00
Puyan Lotfi 8afc99363b [MIR-Canon] Fixing warnings in Non-assert builds.
llvm-svn: 329258
2018-04-05 06:56:44 +00:00
Craig Topper 15303dda0d [X86] Revert r329251-329254
It's failing on the bots and I'm not sure why.

This reverts:

[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents.
[X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256.
[X86] Remove some InstRWs for plain store instructions on Sandy Bridge.
[X86] Auto-generate complete checks. NFC

llvm-svn: 329256
2018-04-05 05:19:36 +00:00
Craig Topper 25c7110a37 [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents.
Mostly vector load, store, and move instructions.

llvm-svn: 329254
2018-04-05 04:42:03 +00:00
Craig Topper 4b1fdd4921 [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256.
llvm-svn: 329253
2018-04-05 04:42:02 +00:00
Craig Topper 5c36557426 [X86] Auto-generate complete checks. NFC
llvm-svn: 329251
2018-04-05 04:41:59 +00:00
Taewook Oh e0db533feb [CallSiteSplitting] Do not perform callsite splitting inside landing pad
Summary:
If the callsite is inside landing pad, do not perform callsite splitting.

Callsite splitting uses utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumtion is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results assertion failure.

Fundamental solution might be fixing llvm::SplitEdge to not to rely on the invalid assumption. However, it'll involve a lot of work because current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting to not to attempt splitting if the callsite is in a landing pad.

Attached test case will crash with assertion failure without the fix.

Reviewers: fhahn, junbuml, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45130

llvm-svn: 329250
2018-04-05 04:16:23 +00:00
Gerolf Hoflehner f41aa4fd85 [IR] Upgrade comment token in objc retain release marker
Older compiler issued '#' instead of ';'

llvm-svn: 329248
2018-04-05 02:44:46 +00:00
Puyan Lotfi d6f7313c8f [MIR-Canon] Improving performance by switching to named vregs.
No more skipping thounsands of vregs. Much faster running time.

llvm-svn: 329246
2018-04-05 00:27:15 +00:00
Puyan Lotfi 26c504fe1e [MIR-Canon] Adding support for multi-def -> user distance reduction.
llvm-svn: 329243
2018-04-05 00:08:15 +00:00
Sam Clegg 685c5e838a [WebAssembly] Only write 32-bits for WebAssembly::OPERAND_OFFSET32
A bug was found where an offset of -1 would generate an encoding
of max int64 which is invalid in the binary format.

Differential Revision: https://reviews.llvm.org/D45280

llvm-svn: 329238
2018-04-04 22:27:58 +00:00
Peter Collingbourne f11eb3ebe7 AArch64: Implement support for the shadowcallstack attribute.
The implementation of shadow call stack on aarch64 is quite different to
the implementation on x86_64. Instead of reserving a segment register for
the shadow call stack, we reserve the platform register, x18. Any function
that spills lr to sp also spills it to the shadow call stack, a pointer to
which is stored in x18.

Differential Revision: https://reviews.llvm.org/D45239

llvm-svn: 329236
2018-04-04 21:55:44 +00:00
Vitaly Buka 4296ea72ff Don't inline @llvm.icall.branch.funnel
Summary: @llvm.icall.branch.funnel is musttail with variable number of
arguments. After inlining current backend can't separate call targets from call
arguments.

Reviewers: pcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45116

llvm-svn: 329235
2018-04-04 21:46:27 +00:00
Zhaoshi Zheng a5531f287a [MemorySSA] Fix spelling errors in MemorySSA.cpp. NFC
llvm-svn: 329230
2018-04-04 21:08:11 +00:00
Evgeniy Stepanov 1f1a7a719d hwasan: add -hwasan-match-all-tag flag
Sometimes instead of storing addresses as is, the kernel stores the address of
a page and an offset within that page, and then computes the actual address
when it needs to make an access. Because of this the pointer tag gets lost
(gets set to 0xff). The solution is to ignore all accesses tagged with 0xff.

This patch adds a -hwasan-match-all-tag flag to hwasan, which allows to ignore
accesses through pointers with a particular pointer tag value for validity.

Patch by Andrey Konovalov.

Differential Revision: https://reviews.llvm.org/D44827

llvm-svn: 329228
2018-04-04 20:44:59 +00:00
Jessica Paquette bccd18b816 [MachineOutliner] Add `useMachineOutliner` target hook
The MachineOutliner has a bunch of target hooks that will call llvm_unreachable
if the target doesn't implement them. Therefore, if you enable the outliner on
such a target, it'll just crash. It'd be much better if it'd just *not* run
the outliner at all in this case.

This commit adds a hook to TargetInstrInfo that returns false by default.
Targets that implement the hook make it return true. The outliner checks the
return value of this hook to decide whether or not to continue.

llvm-svn: 329220
2018-04-04 19:13:31 +00:00
Eric Fiselier 96bbec79b4 [Analysis] Support aligned new/delete functions.
Summary:
Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well.
This allows the compiler to perform certain optimizations including eliding new/delete calls.

Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer

Reviewed By: bkramer

Subscribers: ckennelly, llvm-commits

Differential Revision: https://reviews.llvm.org/D44769

llvm-svn: 329218
2018-04-04 19:01:51 +00:00
Eric Fiselier e03d45fa8e Revert "[Analysis] Support aligned new/delete functions."
This reverts commit bee3bbd9bdd3ab3364b8fb0cdb6326bc1ae740e0.

llvm-svn: 329217
2018-04-04 18:23:00 +00:00
Mandeep Singh Grang 93ab79d205 [AArch64] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches.

Reviewers: t.p.northover, jmolloy, RKSimon, rengolin

Reviewed By: rengolin

Subscribers: dexonsmith, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D44853

llvm-svn: 329216
2018-04-04 18:20:28 +00:00
Eric Fiselier 0d5f3b0281 [Analysis] Support aligned new/delete functions.
Summary:
Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well.
This allows the compiler to perform certain optimizations including eliding new/delete calls.

Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer

Reviewed By: bkramer

Subscribers: ckennelly, llvm-commits

Differential Revision: https://reviews.llvm.org/D44769

llvm-svn: 329215
2018-04-04 18:12:01 +00:00
Craig Topper 498875fab0 [X86] Separate BSWAP32r and BSWAP64r scheduling data in SandyBridge/Haswell/Broadwell/Skylake scheduler models.
The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existant BSWAP16r.

llvm-svn: 329211
2018-04-04 17:54:19 +00:00
Zachary Turner 15b2bdfd8b [llvm-pdbutil] Add the ability to explain binary files.
Using this, you can use llvm-pdbutil to export the contents of a
stream to a binary file, then run explain on the binary file so
that it treats the offset as an offset into the stream instead
of an offset into a file.  This makes it easy to compare the
contents of the same stream from two different files.

llvm-svn: 329207
2018-04-04 17:29:09 +00:00
Lei Huang 09fda63af0 [Power9]Legalize and emit code for quad-precision fma instructions
Legalize and emit code for the following quad-precision fma:

  * xsmaddqp
  * xsnmaddqp
  * xsmsubqp
  * xsnmsubqp

Differential Revision: https://reviews.llvm.org/D44843

llvm-svn: 329206
2018-04-04 16:43:50 +00:00
Pavel Labath 0cc0306a75 Fix build breakage from r329201
Some compilers do not like having an enum type and a variable with the
same name (AccelTableKind). I rename the variable to TheAccelTableKind.

Suggestions for a better name welcome.

llvm-svn: 329202
2018-04-04 14:54:08 +00:00
Pavel Labath 6088c23431 Re-commit r329179 after fixing build&test issues
- MSVC was not OK with a static_assert referencing a non-static member
  variable, even though it was just in a sizeof(expression). I move the
  assert into the emit function, where it is probably more useful.
- Tests were failing in builds which did not have the X86 target
  configured. Since this functionality is not target-specific, I have
  removed the target specifiers from the .ll files.

llvm-svn: 329201
2018-04-04 14:42:14 +00:00
Dmitry Preobrazhensky 523872ea59 [AMDGPU][MC] Enabled instruction TBUFFER_LOAD_FORMAT_XYZ for SI/CI
See bug 36958: https://bugs.llvm.org/show_bug.cgi?id=36958

Differential Revision: https://reviews.llvm.org/D45099

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329197
2018-04-04 13:54:55 +00:00
Nico Weber 55fcd07d25 Revert r329179 (and follow-up unsuccessful fix attempts 329184, 329186); it doesn't build.
llvm-svn: 329190
2018-04-04 13:06:22 +00:00
Dmitry Preobrazhensky a0b8cd038c [AMDGPU][MC] Added support of 3-element addresses for MIMG instructions
See bug 35999: https://bugs.llvm.org/show_bug.cgi?id=35999

Differential Revision: https://reviews.llvm.org/D45084

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329187
2018-04-04 13:01:17 +00:00
Nico Weber be6a9b6d7d Attempt to fix bots more after r329179.
llvm-svn: 329186
2018-04-04 12:58:49 +00:00
Nico Weber 7e654e3231 Attempt to fix bots after r329179.
llvm-svn: 329184
2018-04-04 12:54:34 +00:00
Nico Weber 1cbd096914 Sort targetgen calls in lib/Target/*/CMakeLists.
Makes it easier to see mistakes such as the one fixed in r329178 and makes
the different target CMakeLists more consistent.

Also remove some stale-looking comments from the Nios2 target cmakefile.

No intended behavior change.

llvm-svn: 329181
2018-04-04 12:37:44 +00:00
Pavel Labath 69baab103a [CodeGen] Generate DWARF v5 Accelerator Tables
Summary:
This patch adds a DwarfAccelTableEmitter class, which generates an
accelerator table, as specified in DWARF v5 standard. At the moment it
only generates a DIE offset column and (if we are indexing more than one
compile unit) a CU column.

Indexing type units is not currently supported, as we don't even have
the ability to generate DWARF v5-compatible compile units.

The implementation is not data-source agnostic like the one generating
apple tables. This was not necessary as we currently only have one user
of this code, and without a second user it was not obvious to me how to
best abstract this. (The difference between these tables and the apple
ones is that they need a lot more metadata about the debug info they are
indexing).

The generation is triggered by the --accel-tables argument, which
supersedes the --dwarf-accel-tables arg -- the latter was a simple
on-off switch, but not we can choose between two kinds of accelerator
tables we can generate.

This is tested by parsing the generated tables with llvm-dwarfdump and
the DWARFVerifier, and I've also checked that GNU readelf is able to
make sense of the tables.

Differential Revision: https://reviews.llvm.org/D43286

llvm-svn: 329179
2018-04-04 12:28:20 +00:00
Nico Weber 644d456a5f Remove duplicate tablegen lines from AVR target.
They were added in r285274, in what looks like a merge mishap.
AVRGenMCCodeEmitter.inc is the only non-dupe tablegen invocation added in that
revision.

Also sort the tablegen lines to make this easier to spot in the future.

llvm-svn: 329178
2018-04-04 12:27:43 +00:00
Benjamin Kramer 1fc0da4849 Make helpers static. NFC.
llvm-svn: 329170
2018-04-04 11:45:11 +00:00
Nicolai Haehnle 2f5a73820c AMDGPU: Dimension-aware image intrinsics
Summary:
These new image intrinsics contain the texture type as part of
their name and have each component of the address/coordinate as
individual parameters.

This is a preparatory step for implementing the A16 feature, where
coordinates are passed as half-floats or -ints, but the Z compare
value and texel offsets are still full dwords, making it difficult
or impossible to distinguish between A16 on or off in the old-style
intrinsics.

Additionally, these intrinsics pass the 'texfailpolicy' and
'cachectrl' as i32 bit fields to reduce operand clutter and allow
for future extensibility.

v2:
- gather4 supports 2darray images
- fix a bug with 1D images on SI

Change-Id: I099f309e0a394082a5901ea196c3967afb867f04

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44939

llvm-svn: 329166
2018-04-04 10:58:54 +00:00
Nicolai Haehnle eb7311ffb1 StructurizeCFG: Test for branch divergence correctly
Fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform, so the branch is
non-uniform.

This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.

As discovered after committing an earlier version of this change, this
exposes a subtle interaction between this pass and DivergenceAnalysis:
since we remove and re-create branch instructions, we can no longer rely
on DivergenceAnalysis for branches in subregions that were already
processed by the pass.

Explicitly remove branch instructions from DivergenceAnalysis to
avoid dangling pointers as a matter of defensive programming, and
change how we detect non-uniform subregions.

Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4

Differential Revision: https://reviews.llvm.org/D43743

llvm-svn: 329165
2018-04-04 10:58:15 +00:00
Nicolai Haehnle 3ffd383a15 AMDGPU: Fix copying i1 value out of loop with non-uniform exit
Summary:
When an i1-value is defined inside of a loop and used outside of it, we
cannot simply use the SGPR bitmask from the loop's last iteration.

There are also useful and correct cases of an i1-value being copied between
basic blocks, e.g. when a condition is computed outside of a loop and used
inside it. The concept of dominators is not sufficient to capture what is
going on, so I propose the notion of "lane-dominators".

Fixes a bug encountered in Nier: Automata.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743
Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D40547

llvm-svn: 329164
2018-04-04 10:57:58 +00:00
John Brawn 21d9b33d62 [AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y)
Differential Revision: https://reviews.llvm.org/D44573

llvm-svn: 329163
2018-04-04 10:12:53 +00:00
Sam Parker 7ec722d603 [DAGCombine] Improve ReduceLoadWidth for SRL
Recommitting rL321259. Previosuly this caused an issue with PPCBE but
I didn't receieve a reproducer and didn't have the time to follow up.
If the issue appears again, please provide a reproducer so I can fix
it.

Original commit message:

If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.

Differential Revision: https://reviews.llvm.org/D41350

llvm-svn: 329160
2018-04-04 09:26:56 +00:00
Mikhail Maltsev 68f35bcc85 [ARM] Do not convert some vmov instructions
Summary:
Patch https://reviews.llvm.org/D44467 implements conversion of invalid
vmov instructions into valid ones. It turned out that some valid
instructions also get converted, for example

  vmov.i64 d2, #0xff00ff00ff00ff00 ->
  vmov.i16 d2, #0xff00

Such behavior is incorrect because according to the ARM ARM section
F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD
instructions, "On assembly, the data type must be matched in the table
if possible."

This patch fixes the isNEONmovReplicate check so that the above
instruction is not modified any more.

Reviewers: rengolin, olista01

Reviewed By: rengolin

Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D44678

llvm-svn: 329158
2018-04-04 08:54:19 +00:00
Craig Topper a30db995b3 [X86] Use the same predicate for the load for PMOVSXBQ and PMOVZXBQ.
These both use a 16-bit load, but one used loadi16_anyext and the other used extloadi32i16. The only difference between them is that loadi16_anyext checked that the load was at least 2 byte aligned and non-volatile. But the alignment doesn't matter here. Just use extloadi32i16 for both.

llvm-svn: 329154
2018-04-04 07:00:24 +00:00
Craig Topper a3cac956fc [X86] Use loadi16/loadi32 predicates in multiply patterns
llvm-svn: 329153
2018-04-04 07:00:19 +00:00
Craig Topper 88e38e3e3e [X86] Remove more dead code left over from the handling of i8/i16 UMUL_LOHI/SMUL_LOHI that is no longer needed. NFC
llvm-svn: 329152
2018-04-04 07:00:16 +00:00
Max Kazantsev 613af1f7ca [SCEV] Prove implications for SCEVUnknown Phis
This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis.
If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values
`L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also
prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient
to prove the predicate for values that come from the same predecessor block.

The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0`
given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and
non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot.

Differential Revision: https://reviews.llvm.org/D44001
Reviewed By: anna

llvm-svn: 329150
2018-04-04 05:46:47 +00:00
Craig Topper afa22edcf0 [X86] Remove dead code for handling i8/i16 UMUL_LOHI/SMUL_LOHI from X86ISelDAGToDAG.cpp. NFC
These are promoted to i16/i32 multiplies by a DAG combine.

llvm-svn: 329147
2018-04-04 04:38:55 +00:00
Craig Topper 3064c15dc3 [X86] Remove some code that was only needed when i1 was a legal type. NFC
llvm-svn: 329146
2018-04-04 04:38:54 +00:00
Craig Topper 7d3aba6687 [SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store.
Summary:
Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors.

This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store.

Reviewers: efriedma, jmolloy, ABataev

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39760

llvm-svn: 329142
2018-04-04 03:47:17 +00:00
Vlad Tsyrklevich b324733169 Fix bad #include path in r329139
llvm-svn: 329140
2018-04-04 01:34:42 +00:00
Vlad Tsyrklevich e3446017ed Add the ShadowCallStack pass
Summary:
The ShadowCallStack pass instruments functions marked with the
shadowcallstack attribute. The instrumented prolog saves the return
address to [gs:offset] where offset is stored and updated in [gs:0].
The instrumented epilog loads/updates the return address from [gs:0]
and checks that it matches the return address on the stack before
returning.

Reviewers: pcc, vitalybuka

Reviewed By: pcc

Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44802

llvm-svn: 329139
2018-04-04 01:21:16 +00:00
Nico Weber 086b1c8118 Minor no-op cmake file style fix.
llvm-svn: 329137
2018-04-04 00:50:22 +00:00
Lang Hames b1e5043cff Reapply r329133 with fix.
llvm-svn: 329136
2018-04-04 00:34:54 +00:00
Lang Hames 4e319acd84 Revert r329133 "[RuntimeDyld][AArch64] Add some error pluming / generation..."
This broke a number of buildbots. Looking in to it now...

llvm-svn: 329135
2018-04-04 00:12:12 +00:00
Jessica Paquette 5fa2a63785 [MachineOutliner] Test for X86FI->getUsesRedZone() as well as Attribute::NoRedZone
This commit is similar to r329120, but uses the existing getUsesRedZone() function
in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a
function *truly* uses a redzone instead of just the noredzone attribute on a
function.

Thus, after this commit, it's possible to outline from x86 without using
-mno-red-zone and still get outlining results.

This also adds a new test for the new redzone behaviour.

llvm-svn: 329134
2018-04-03 23:32:41 +00:00
Lang Hames b92b10f3ec [RuntimeDyld][AArch64] Add some error pluming / generation to catch unhandled
relocation types on AArch64.

llvm-svn: 329133
2018-04-03 23:19:20 +00:00
Farhana Aleen e80aeac0f2 [AMDGPU] performMinMaxCombine should not optimize patterns of vectors to min3/max3.
Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3.

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D45219

llvm-svn: 329131
2018-04-03 23:00:30 +00:00
Evandro Menezes 6b8d8f4010 [AArch64] Adjust the cost model for Exynos M3
Fix typo and simplify matching expression.

llvm-svn: 329130
2018-04-03 22:57:17 +00:00
Ikhlas Ajbar 1376d934ed [Hexagon] peel loops with runtime small trip counts
Move the check canPeel() to Hexagon Target before setting PeelCount.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 329129
2018-04-03 22:55:09 +00:00
Sanjay Patel 81b3b10a95 [InstCombine] allow more fmul folds with 'reassoc'
The tests marked with 'FIXME' require loosening the check
in SimplifyAssociativeOrCommutative() to optimize completely;
that's still checking isFast() in Instruction::isAssociative().

llvm-svn: 329121
2018-04-03 22:19:19 +00:00
Jessica Paquette 642f6c61a3 [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfo
This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It
returns true if the function is known to use a redzone, false if it is known
to not use a redzone, and no value otherwise.

This removes the requirement to pass -mno-red-zone when outlining for AArch64.

https://reviews.llvm.org/D45189

llvm-svn: 329120
2018-04-03 21:56:10 +00:00
Farhana Aleen 936947349a Revert "MSG"
This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de.

This was committed by mistake.

llvm-svn: 329119
2018-04-03 21:51:45 +00:00
Vlad Tsyrklevich 07cf78cdad Fix bad copy-and-paste in r329108
llvm-svn: 329118
2018-04-03 21:40:27 +00:00
Jessica Paquette d506bf8e3d [MachineOutliner][NFC] Make outlined functions have internal linkage
The linkage type on outlined functions was private before. This meant that if
you set a breakpoint in an outlined function, the debugger wouldn't be able to
give a sane name to the outlined function.

This commit changes the linkage type to internal and updates any tests that
relied on the prefixes on the names of outlined functions.
 

llvm-svn: 329116
2018-04-03 21:36:00 +00:00
Farhana Aleen 3ab409dc86 MSG
llvm-svn: 329114
2018-04-03 21:20:39 +00:00
Gor Nishanov d4712715dd [coroutines] Respect alloca alignment requirements when building coroutine frame
Summary:
If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment.

For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned.

```
%PackedStruct = type <{ i64 }>
...
  %data = alloca %PackedStruct, align 8
```

If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame:

```
%f.Frame = type { ..., i16, [6 x i8], %PackedStruct }
```

Reviewers: rnk, lewissbaker, modocache

Reviewed By: modocache

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D45221

llvm-svn: 329112
2018-04-03 20:54:20 +00:00
Florian Hahn 9467ccf447 [LoopInterchange] Add remark for calls preventing interchanging.
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.


Reviewers: sebpop, karthikthecool, blitz.opensource

Reviewed By: sebpop

Differential Revision: https://reviews.llvm.org/D45206

llvm-svn: 329111
2018-04-03 20:54:04 +00:00
Vlad Tsyrklevich d17f61ea3b Add the ShadowCallStack attribute
Summary:
Introduce the ShadowCallStack function attribute. It's added to
functions compiled with -fsanitize=shadow-call-stack in order to mark
functions to be instrumented by a ShadowCallStack pass to be submitted
in a separate change.

Reviewers: pcc, kcc, kubamracek

Reviewed By: pcc, kcc

Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44800

llvm-svn: 329108
2018-04-03 20:10:40 +00:00
Aaron Smith 47f18b91bb [DebugInfoPDB] Add a few missing definitions to PDBTypes.h
The missing definitions are from cvconst.h shipped with DIA SDK.

Correct the url to MSDN for MemoryTypeEnum and set the underlying 
type of PDB_StackFrameType and PDB_MemoryType to uint16_t.

llvm-svn: 329104
2018-04-03 19:41:27 +00:00
Jun Bum Lim 7ab1b32b5e [CodeGen]Add NoVRegs property on PostRASink and ShrinkWrap
Summary:
This change declare that PostRAMachineSinking and ShrinkWrap require NoVRegs
property, so now the MachineFunctionPass can enforce this check.
These passes are disabled in NVPTX & WebAssembly.

Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier

Reviewed By: dschuff, thegameg

Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D45183

llvm-svn: 329095
2018-04-03 18:17:34 +00:00
Alexey Bataev d5b1f7892f [SLP] Fixed formatting, NFC.
llvm-svn: 329091
2018-04-03 17:48:14 +00:00
Alexey Bataev f7226ed67d [DEBUGINFO] Add option that allows to disable emission of flags in .loc directives.
Summary:
Some targets do not support extended format of .loc directive and
support only simple format: .loc <FileID> <Line> <Column>. Patch adds
MCAsmInfo flag and option that allows emit .loc directive without
additional flags.

Reviewers: echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45184

llvm-svn: 329089
2018-04-03 17:28:55 +00:00
Daniel Neilson 901acfab0c [InstCombine] Fold compare of int constant against a splatted vector of ints
Summary:
Folding patterns like:
  %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
  %cast = bitcast <4 x i8> %vec to i32
  %cond = icmp eq i32 %cast, 0
into:
  %ext = extractelement <4 x i8> %insvec, i32 0
  %cond = icmp eq i32 %ext, 0

Combined with existing rules, this allows us to fold patterns like:
  %insvec = insertelement <4 x i8> undef, i8 %val, i32 0
  %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
  %cast = bitcast <4 x i8> %vec to i32
  %cond = icmp eq i32 %cast, 0
into:
  %cond = icmp eq i8 %val, 0

When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant.

Reviewers: spatel, anna, mkazantsev

Reviewed By: spatel

Subscribers: efriedma, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D44997

llvm-svn: 329087
2018-04-03 17:26:20 +00:00
Alexey Bataev 428e9d9d87 [SLP] Fix PR36481: vectorize reassociated instructions.
Summary:
If the load/extractelement/extractvalue instructions are not originally
consecutive, the SLP vectorizer is unable to vectorize them. Patch
allows reordering of such instructions.

Patch does not support reordering of the repeated instruction, this must
be handled in the separate patch.

Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43776

llvm-svn: 329085
2018-04-03 17:14:47 +00:00
Alexey Bataev df989c54cf Recommit "[SLP] Fix issues with debug output in the SLP vectorizer."
The primary issue here is that using NDEBUG alone isn't enough to guard
debug printing -- instead the DEBUG() macro needs to be used so that the
specific pass debug logging check is employed. Without this, every
asserts-enabled build was printing out information when it hit this.

I also fixed another place where we had multiple statements in a DEBUG
macro to use {}s to be a bit cleaner. And I fixed a place that used
errs() rather than dbgs().

llvm-svn: 329082
2018-04-03 16:40:33 +00:00
Krzysztof Parzyszek 9fa6ffe290 [Hexagon] Remove -mhvx-double and the corresponding subtarget feature
Specifying the HVX vector length should be done via the -mhvx-length
option.

llvm-svn: 329079
2018-04-03 16:06:36 +00:00
Puyan Lotfi 764b386e20 Adding optional Name parameter to createVirtualRegister and createGenericVirtualRegister.
llvm-svn: 329076
2018-04-03 15:53:49 +00:00
Benjamin Kramer 2fc3b18922 Revert "[SLP] Fix PR36481: vectorize reassociated instructions."
This reverts commit r328980 and r329046. Makes the vectorizer crash.

llvm-svn: 329071
2018-04-03 14:40:33 +00:00
Andrea Di Biagio 823e5f90db [MC] Fix -Wmissing-field-initializer warning after r329067.
This should fix the problem reported by the lld buildbots:
 - Builder lld-x86_64-darwin13, Build #19782
 - Builder lld-perf-testsuite, Build #1419

llvm-svn: 329068
2018-04-03 13:52:26 +00:00
Andrea Di Biagio 9da4d6db33 [MC][Tablegen] Allow the definition of processor register files in the scheduling model for llvm-mca
This patch allows the description of register files in processor scheduling
models. This addresses PR36662.

A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td.
Targets can optionally describe register files for their processors using that
class. In particular, class RegisterFile allows to specify:
 - The total number of physical registers.
 - Which target registers are accessible through the register file.
 - The cost of allocating a register at register renaming stage.

Example (from this patch - see file X86/X86ScheduleBtVer2.td)

  def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>

Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar
(btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM
register definitions only cost 1 physical register.

The syntax allows to specify an empty set of register classes.  An empty set of
register classes means: this register file models all the registers specified by
the Target.  For each register class, users can specify an optional register
cost. By default, register costs default to 1.  A value of 0 for the number of
physical registers means: "this register file has an unbounded number of
physical registers".

This patch is structured in two parts.

* Part 1 - MC/Tablegen *

A first part adds the tablegen definition of RegisterFile, and teaches the
SubtargetEmitter how to emit information related to register files.

Information about register files is accessible through an instance of
MCExtraProcessorInfo.
The idea behind this design is to logically partition the processor description
which is only used by external tools (like llvm-mca) from the processor
information used by the llvm machine schedulers.
I think that this design would make easier for targets to get rid of the extra
processor information if they don't want it.

* Part 2 - llvm-mca related *

The second part of this patch is related to changes to llvm-mca.

The main differences are:
 1) class RegisterFile now needs to take into account the "cost of a register"
when allocating physical registers at register renaming stage.
 2) Point 1. triggered a minor refactoring which lef to the removal of the
"maximum 32 register files" restriction.
 3) The BackendStatistics view has been updated so that we can print out extra
details related to each register file implemented by the processor.

The effect of point 3. is also visible in tests register-files-[1..5].s.

Differential Revision: https://reviews.llvm.org/D44980

llvm-svn: 329067
2018-04-03 13:36:24 +00:00
Hiroshi Inoue 08a1775f28 [PowerPC] reorder entries in P9InstrResources.td in alphabetical order; NFC
Reorder entries added in my previous commit (rL328969) to keep alphabetical order.

llvm-svn: 329064
2018-04-03 12:49:42 +00:00
Alexander Potapenko ac70668cff MSan: introduce the conservative assembly handling mode.
The default assembly handling mode may introduce false positives in the
cases when MSan doesn't understand that the assembly call initializes
the memory pointed to by one of its arguments.

We introduce the conservative mode, which initializes the first
|sizeof(type)| bytes for every |type*| pointer passed into the
assembly statement.

llvm-svn: 329054
2018-04-03 09:50:06 +00:00
Serguei Katkov 2ace8dc1c3 [SCEV] Fix PR36974.
The patch changes the usage of dominate to properlyDominate
to satisfy the condition !(a < a) while using std::max.

It is actually NFC due to set data structure is used to keep
the Loops and no two identical loops can be in collection.
So in reality there is no difference between usage of
dominate and properlyDominate in this particular case.
However it might be changed so it is better to fix it.

llvm-svn: 329051
2018-04-03 07:29:00 +00:00
Craig Topper 9b6a65b9ef [X86] Reduce number of OpPrefix bits in TSFlags to 2. NFCI
TSFlag doesn't need to disambiguate NoPrfx from PS. So shift the encodings so PS is NoPrfx|0x4.

llvm-svn: 329049
2018-04-03 06:37:04 +00:00
Max Kazantsev c01e47b43f [SCEV] Make computeExitLimit more simple and more powerful
Current implementation of `computeExitLimit` has a big piece of code
the only purpose of which is to prove that after the execution of this
block the latch will be executed. What it currently checks is actually a
subset of situations where the exiting block dominates latch.

This patch replaces all these checks for simple particular cases with
domination check over loop's latch which is the only necessary condition
of taking the exiting block into consideration. This change allows to
calculate exact loop taken count for simple loops like

  for (int i = 0; i < 100; i++) {
    if (cond) {...} else {...}
    if (i > 50) break;
    . . .
  }

Differential Revision: https://reviews.llvm.org/D44677
Reviewed By: efriedma

llvm-svn: 329047
2018-04-03 05:57:19 +00:00
Chandler Carruth 597bfd8448 [SLP] Fix issues with debug output in the SLP vectorizer.
The primary issue here is that using NDEBUG alone isn't enough to guard
debug printing -- instead the DEBUG() macro needs to be used so that the
specific pass debug logging check is employed. Without this, every
asserts-enabled build was printing out information when it hit this.

I also fixed another place where we had multiple statements in a DEBUG
macro to use {}s to be a bit cleaner. And I fixed a place that used
`errs()` rather than `dbgs()`.

llvm-svn: 329046
2018-04-03 05:27:28 +00:00
Yonghong Song d3b522f519 bpf: fix incorrect SELECT_CC lowering
Commit 37962a331c77 ("bpf: Improve expanding logic in LowerSELECT_CC")
intended to improve code quality for certain jmp conditions. The
commit, however, has a couple of issues:
  (1). In code, just swap is not enough, ConditionalCode CC
       should also be swapped, otherwise incorrect code will
       be generated.
  (2). The ConditionalCode swap should be subject to
       getHasJmpExt(). If getHasJmpExt() is False, certain
       conditional codes will not be supported and swap
       may generate incorrect code.

The original goal for this patch is to optimize jmp operations
which does not have JmpExt turned on. If JmpExt is on,
better code could be generated. For example, the test
select_ri.ll is introduced to demonstrate the optimization.
The same result can be achieved with -mcpu=v2 flag.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 329043
2018-04-03 03:56:37 +00:00
Ikhlas Ajbar b7322e8ac7 peel loops with runtime small trip counts
For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 329042
2018-04-03 03:39:43 +00:00
Haicheng Wu 7f0daaeb86 [SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth
We use two approaches for determining the minimum bitwidth.

   * Demanded bits
   * Value tracking

If demanded bits doesn't result in a narrower type, we then try value tracking.
We need this if we want to root SLP trees with the indices of getelementptr
instructions since all the bits of the indices are demanded.

But there is a missing piece though. We need to be able to distinguish "demanded
and shrinkable" from "demanded and not shrinkable". For example, the bits of %i
in

%i = sext i32 %e1 to i64
%gep = getelementptr inbounds i64, i64* %p, i64 %i

are demanded, but we can shrink %i's type to i32 because it won't change the
result of the getelementptr. On the other hand, in

%tmp15 = sext i32 %tmp14 to i64
%tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0

it doesn't make sense to shrink %tmp15 and we can skip the value tracking.

Ideas are from Matthew Simpson!

Differential Revision: https://reviews.llvm.org/D44868

llvm-svn: 329035
2018-04-03 00:05:10 +00:00
Brian Gesiak 64521bed0d [Coroutines] Avoid assert splitting hidden coros
Summary:
When attempting to split a coroutine with 'hidden' visibility (for
example, a C++ coroutine that is inlined when compiled with the option
'-fvisibility-inlines-hidden'), LLVM would hit an assertion in
include/llvm/IR/GlobalValue.h:240: "local linkage requires default
visibility". The issue is that the visibility is copied from the source
of the function split in the `CloneFunctionInto` function, but the linkage
is not. To fix, create the new function first with external linkage,
then copy the linkage from the original function *after* `CloneFunctionInto`
is called.

Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the
explicit call to `setDSOLocal` can be removed in CoroSplit.cpp.

Test Plan: check-llvm

Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk

Reviewed By: rnk

Subscribers: llvm-commits, eric_niebler

Differential Revision: https://reviews.llvm.org/D44185

llvm-svn: 329033
2018-04-02 23:39:40 +00:00
Rafael Espindola 8c58750cc4 Align stubs for external and common global variables to pointer size.
This patch fixes PR36885: clang++ generates unaligned stub symbol
holding a pointer.

Patch by Rahul Chaudhry!

llvm-svn: 329030
2018-04-02 23:20:30 +00:00
Reid Kleckner 298ffc609b [InstCombine] Don't strip function type casts from musttail calls
Summary:
The cast simplifications that instcombine does here do not make any
attempt to obey the verifier rules for musttail calls. Therefore we have
to disable them.

Reviewers: efriedma, majnemer, pcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45186

llvm-svn: 329027
2018-04-02 22:49:44 +00:00
Reid Kleckner a9e9918ee4 Treat inlining a notail call as a regular, non-tail call
Otherwise, we end up inlining a musttail call into a non-tail position,
which breaks verifier invariants.

Fixes PR31014

llvm-svn: 329015
2018-04-02 21:23:16 +00:00
Lang Hames 3fdfc04e53 [ORC] Create a new SymbolStringPool by default in ExecutionSession constructor.
This makes the common case of constructing an ExecutionSession tidier.

llvm-svn: 329013
2018-04-02 20:57:56 +00:00
Sanjay Patel cbb0450540 [InstCombine] add folds for icmp + sub (PR36969)
(A - B) >u A --> A <u B
C <u (C - D) --> C <u D

https://rise4fun.com/Alive/e7j

Name: ugt
  %sub = sub i8 %x, %y
  %cmp = icmp ugt i8 %sub, %x
=>
  %cmp = icmp ult i8 %x, %y
  
Name: ult
  %sub = sub i8 %x, %y
  %cmp = icmp ult i8 %x, %sub
=>
  %cmp = icmp ult i8 %x, %y

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=36969

llvm-svn: 329011
2018-04-02 20:37:40 +00:00
Harlan Haskins bee4b5894a Fix header mismatch in DIBuilder Type APIs
Some of the headers changed slightly, and the accompanying
implementation didn't change. This caused a silent failure.

llvm-svn: 329003
2018-04-02 19:11:44 +00:00
Zachary Turner d11328a1bb [llvm-pdbutil] Add an export subcommand.
This command can dump the binary contents of a stream to a file.
This is useful when you want to do side-by-side comparisons of
a specific stream from two PDBs to examine the differences between
them.  You can export both of them to a file, then open them up
side by side in a hex editor (for example), so as to eliminate any
differences that might arise from the contents being on different
blocks in the PDB.

In subsequent patches I plan to improve the "explain" subcommand
so that you can explain the contents of a binary file that isn't
necessarily a full PDB, but one of these dumped streams, by telling
the subcommand how to interpret the contents.

llvm-svn: 329002
2018-04-02 18:35:21 +00:00
Nico Weber 868112181b Remove HAVE_LIBPSAPI, HAVE_SHELL32.
These used to be set in the old autoconf build, but the cmake build has had a
"TODO: actually check for these" comment since it was checked in, and they
were set to 1 on mingw unconditionally.  It seems safe to say that they always
exist under mingw, so just remove them and assume they're set exactly when on
mingw (with msvc, we use `pragma comment` instead of linking these via flags).

llvm-svn: 328992
2018-04-02 17:32:48 +00:00
Rong Xu 5a8d4c3357 [DeadArgumentElim] Clone function level metadatas
Some Function level metadatas, such as function entry count, are not cloned in
DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim
after internalization.

This patch clones the metadatas in the original function to the new function.

Differential Revision: https://reviews.llvm.org/D44127

llvm-svn: 328991
2018-04-02 17:27:38 +00:00
Nico Weber f3db8e3c70 Remove HAVE_DIRENT_H.
The autoconf manual: "This macro is obsolescent, as all current systems with
directory libraries have <dirent.h>. New programs need not use this macro."

llvm-svn: 328989
2018-04-02 17:17:29 +00:00
Dmitry Preobrazhensky b181c7312e [AMDGPU][MC][GFX9] Added instructions v_cvt_norm_*16_f16, v_sat_pk_u8_i16
See bug 36847: https://bugs.llvm.org/show_bug.cgi?id=36847

Differential Revision: https://reviews.llvm.org/D45097

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328988
2018-04-02 17:09:20 +00:00
Gor Nishanov b0316d96ae [coroutines] Add support for llvm.coro.noop intrinsics
Summary:
A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing.
To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed.

Reviewers: EricWF, modocache, rnk, lewissbaker

Reviewed By: modocache

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45114

llvm-svn: 328986
2018-04-02 16:55:12 +00:00
Dmitry Preobrazhensky 6bad04ecf5 [AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions
Fixed a bug which caused Tablegen crash.

See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837

Differential Revision: https://reviews.llvm.org/D45085

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328983
2018-04-02 16:10:25 +00:00
Krzysztof Parzyszek 0831f57afe [Hexagon] Clean up some code in HexagonAsmPrinter, NFC
llvm-svn: 328981
2018-04-02 15:06:55 +00:00
Alexey Bataev 3decaf4275 [SLP] Fix PR36481: vectorize reassociated instructions.
Summary:
If the load/extractelement/extractvalue instructions are not originally
consecutive, the SLP vectorizer is unable to vectorize them. Patch
allows reordering of such instructions.

Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43776

llvm-svn: 328980
2018-04-02 14:51:37 +00:00
Nico Weber f492f58182 Revert r328975, it makes TableGen assert on the bots.
llvm-svn: 328978
2018-04-02 14:20:23 +00:00
Nico Weber 9f03e9de77 Remove HAVE_WRITEV that's unused after r255837.
llvm-svn: 328977
2018-04-02 14:18:13 +00:00
Dmitry Preobrazhensky 32c450ae6a [AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions
See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837

Differential Revision: https://reviews.llvm.org/D45085

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328975
2018-04-02 13:52:23 +00:00
Nico Weber 2eada78a50 Attempt to heal bots after r328970.
llvm-svn: 328974
2018-04-02 13:49:35 +00:00
Lama Saba 927468309f [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346
If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.

This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence
of a load and a store.

The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies.
breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.

Differential revision: https://reviews.llvm.org/D41330

Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9
llvm-svn: 328973
2018-04-02 13:48:28 +00:00
Hiroshi Inoue 6d48493817 [PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td
This patch adds L(D|W|H|B)XTLS instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td.

llvm-svn: 328969
2018-04-02 12:18:21 +00:00
Jonas Devlieghere 9e3e7a99e8 [dsymutil] Upstream emitting of papertrail warnings.
When running dsymutil as part of your build system, it can be desirable
for warnings to be part of the end product, rather than just being
emitted to the output stream. This patch upstreams that functionality.

Differential revision: https://reviews.llvm.org/D44639

llvm-svn: 328965
2018-04-02 10:40:43 +00:00
Craig Topper 96729cd64b [X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model.
Data taken from Table 16-17 in the Intel Optimization Manual.

llvm-svn: 328962
2018-04-02 06:34:16 +00:00
Craig Topper 6a814904da [X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide.
Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/

llvm-svn: 328961
2018-04-02 05:54:34 +00:00
Craig Topper 8104f266a4 [X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models.
Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those.

llvm-svn: 328960
2018-04-02 05:33:28 +00:00
Craig Topper dc74094398 [X86] Fix the SchedRW for AVX512 shift instructions.
It was being inadvertently defaulted to an FADD scheduler class.

llvm-svn: 328959
2018-04-02 03:15:02 +00:00
Craig Topper 5fb1dc2d22 [X86] Give the AVX512 VEXTRACT instructions the same SchedRWs as the SSE/AVX versions.
llvm-svn: 328958
2018-04-02 02:44:55 +00:00
Craig Topper caec723a1a [X86] Add an itinerary to BTR64rr.
llvm-svn: 328956
2018-04-02 01:12:34 +00:00
Craig Topper 02daec00a2 [X86] Make sure all the classes declare in the Haswell scheduler model are prefixed with HW.
The tablegen files all share a namespace so we shouldn't use a generic names in a specific scheduler model.

llvm-svn: 328955
2018-04-02 01:12:32 +00:00
Craig Topper c90d906b16 [X86] Give VINSERTPS the same intinerary as INSERTPS.
llvm-svn: 328954
2018-04-02 00:48:11 +00:00
Harlan Haskins b7881bbfa2 Add C API bindings for DIBuilder 'Type' APIs
This patch adds a set of unstable C API bindings to the DIBuilder interface for
creating structure, function, and aggregate types.

This patch also removes the existing implementations of these functions from
the Go bindings and updates the Go API to fit the new C APIs.

llvm-svn: 328953
2018-04-02 00:17:40 +00:00
Craig Topper dc4a6d1ef6 [X86] Cleanup ADCX/ADOX instruction definitions.
Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW.

llvm-svn: 328952
2018-04-01 23:58:50 +00:00
Petr Hosek 934e5d5436 [AArch64] Reserve x18 register on Fuchsia
This register is reserved as a platform register on Fuchsia.

Differential Revision: https://reviews.llvm.org/D45105

llvm-svn: 328950
2018-04-01 23:44:04 +00:00
Craig Topper 8a1787ae22 [DebugCounter] Make -debug-counter cl::Hidden.
llvm-svn: 328948
2018-04-01 22:16:52 +00:00
Craig Topper f5730c38e9 [LegacyPassManager] Make 'print-module-scope' cl::Hidden like the rest of the printing options.
llvm-svn: 328947
2018-04-01 21:54:26 +00:00
Craig Topper 9f834810ea [X86] Give ADC8/16/32/64mi the same scheduling information as ADC8/16/32/64mr and SBB8/16/32/64mi.
It doesn't make a lot of sense that it would be different.

llvm-svn: 328946
2018-04-01 21:54:24 +00:00
Chandler Carruth 4244625c51 [x86] Correct the operand structure of the ADOX instruction.
This also moves to define it in the same way as ADCX which seems to use
constraints a bit better.

This is pulled out of the review for reducing the use of popf for
restoring EFLAGS, but is independent. There are still more problems with
our definitions for these instructions that Craig is going to look at
but this is at least less broken and he can start from this to improve
them more fully.

Thanks to Craig for the review here.

llvm-svn: 328945
2018-04-01 21:53:18 +00:00
Chandler Carruth 06b343c6ed [x86] Expose more of the condition conversion routines in the public API
for X86's instruction information. I've now got a second patch under
review that needs these same APIs. This bit is nicely orthogonal and
obvious, so landing it. NFC.

llvm-svn: 328944
2018-04-01 21:47:55 +00:00
Nicolai Haehnle 4254d45a79 AMDGPU: Make isIntrinsicSourceOfDivergence table-driven
Summary:
This is in preparation for the new dimension-aware image intrinsics,
which I'd rather not have to list here by hand.

Change-Id: Iaa16e3a635a11283918ce0d9e1e618591b0bf6fa

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44938

llvm-svn: 328939
2018-04-01 17:09:14 +00:00
Nicolai Haehnle 5d0d30304c AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsics
Summary:
Avoids having to list all intrinsics manually.

This is in preparation for the new dimension-aware image intrinsics,
which I'd rather not have to list here by hand.

Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44937

llvm-svn: 328938
2018-04-01 17:09:07 +00:00
Mandeep Singh Grang fe1d28e83d [DebugInfo] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: echristo, zturner, samsonov

Reviewed By: echristo

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45134

llvm-svn: 328935
2018-04-01 16:18:49 +00:00
Teresa Johnson 974706ebf7 [ThinLTO] Add an import cutoff for debugging/triaging
Summary:
Adds -import-cutoff=N which will stop importing during the thin link
after N imports. Default is -1 (no  limit).

Reviewers: wmi

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D45127

llvm-svn: 328934
2018-04-01 15:54:40 +00:00
David Green f80ebc8d21 [LoopRotate] Rotate loops with loop exiting latches
If a loop has a loop exiting latch, it can be profitable
to rotate the loop if it leads to the simplification of
a phi node. Perform rotation in these cases even if loop
rotate itself didnt simplify the loop to get there.

Differential Revision: https://reviews.llvm.org/D44199

llvm-svn: 328933
2018-04-01 12:48:24 +00:00
Craig Topper 9b8cd5fe55 [X86] Don't check for folding into a store when deciding if we can promote an i16 mul.
There's no RMW mul operation.

llvm-svn: 328931
2018-04-01 06:29:32 +00:00
Craig Topper db6caabccc [X86] Check if the load and store are to the same pointer before preventing i16 RMW shifts and subtracts from being promoted.
llvm-svn: 328930
2018-04-01 06:29:28 +00:00
Craig Topper ae2de57db0 [X86] Allow i16 subtracts to be promoted if the load is on the LHS and its not being stored.
llvm-svn: 328928
2018-04-01 06:29:25 +00:00
Craig Topper 9bc0d881a3 [X86] Remove unneeded temporary variable. NFC
This Promote flag was alwasys set to true except in the default case. But in the default case we don't need to set PVT and can just return false.

llvm-svn: 328926
2018-04-01 06:29:21 +00:00
Mandeep Singh Grang 97bcade70f [Analysis] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer D44363 for a list of all the required patches.

Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon

Reviewed By: dexonsmith

Subscribers: david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D44944

llvm-svn: 328925
2018-04-01 01:46:51 +00:00
Sanjay Patel 6124cae8f7 [DAGCombine] (float)((int) f) --> ftrunc (PR36617)
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, 
so replace a pair of casts with the equivalent node. We don't have to account for 
special cases (NaN, INF) because out-of-range casts are undefined.

Differential Revision: https://reviews.llvm.org/D44909

llvm-svn: 328921
2018-03-31 17:55:44 +00:00
Simon Pilgrim 3b8ad346f9 [X86][Btver2] Add MMX_PSHUFB to the JWritePSHUFB InstRW entries
llvm-svn: 328918
2018-03-31 09:15:54 +00:00
Simon Pilgrim 8c8ebd7945 Fix trailing whitespace. NFCI.
llvm-svn: 328917
2018-03-31 09:14:14 +00:00
Puyan Lotfi 57c4f38c35 [MIR-Canon] Adding support for local idempotent instruction hoisting.
llvm-svn: 328915
2018-03-31 05:48:51 +00:00
Craig Topper 13a0f83a05 [X86] Add SchedRW for PMULLD
Summary:
It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput.

This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet.

I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs.

Reviewers: RKSimon, GGanesh, courbet

Reviewed By: RKSimon

Subscribers: gchatelet, gbedwell, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D44972

llvm-svn: 328914
2018-03-31 04:54:32 +00:00
Teresa Johnson db83aceb06 [ThinLTO] Add an option to force summary call edges cold for debugging
Summary:
Useful to selectively disable importing into specific modules for
debugging/triaging/workarounds.

Reviewers: eraman

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D45062

llvm-svn: 328909
2018-03-31 00:18:08 +00:00
Fangrui Song 956ee79795 Fix a bunch of typoes. NFC
llvm-svn: 328907
2018-03-30 22:22:31 +00:00
Ekaterina Romanova 0b01dfbba6 Prevent data races in concurrent ThinLTO processes.
Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache files corruption).

1. Place temp file to the same place where the caching directory is (instead of creating it the directory pointed to by TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to non-atomic "direct" write to the cache file.
2. if rename failed do not write to the cache file directly (direct write to the file is non-atomic and could cause data race conditions).
3. if cache file doesn't exist (e.g., because 'rename' failed or because some other reasons), bypass using the cache altogether.

Differential Revision:  https://reviews.llvm.org/D45076

llvm-svn: 328904
2018-03-30 21:35:42 +00:00
Jacob Gravelle 40926451d2 [WebAssembly] Register wasm passes with the PassRegistry
Summary:
This exposes WebAssembly passes for use on the command line (as
arguments to -print-before and the like).

Reviewers: dschuff, sunfish

Subscribers: MatzeB, jfb, sbc100, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D45103

llvm-svn: 328901
2018-03-30 20:36:58 +00:00
Krzysztof Parzyszek 74096f7258 [Hexagon] Reduce excessive indentation in .s output
llvm-svn: 328898
2018-03-30 19:30:28 +00:00
Krzysztof Parzyszek 0f983d69a4 [Hexagon] Avoid creating invalid offsets in packetizer
Two memory instructions with a dependency only on the address register
between the two (the first one of them being post-incrememnt) can be
packetized together after the offset on the second was updated to the
incremement value. Make sure that the new offset is valid for the
instruction.

llvm-svn: 328897
2018-03-30 19:28:37 +00:00
Andrea Di Biagio dc97172b2f [X86][BtVer2] Fixed the number of micro opcodes for AVX vector converts and
VSQRT instructions.

There were still a few AVX instructions with an incorrect number of opcodes.
These should be fixed now.

llvm-svn: 328892
2018-03-30 18:53:47 +00:00
Peter Collingbourne d03bf12c1b DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped
This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan.

This change resolves bug 36314.

Patch by Sam Kerner!

Differential Revision: https://reviews.llvm.org/D44784

llvm-svn: 328890
2018-03-30 18:37:55 +00:00
Puyan Lotfi 399b46c98d [MIR] Adding support for Named Virtual Registers in MIR.
llvm-svn: 328887
2018-03-30 18:15:54 +00:00
Andrea Di Biagio 3eaa26bb64 [X86][BtVer2] Fix the number of uOps for horizontal operations.
llvm-svn: 328886
2018-03-30 18:15:30 +00:00
Tim Shen 8f9f026965 [NVPTX] Enable StructuredCFG for NVPTX
Summary:
Make NVPTX require structured CFG. Added a temporary flag to
"roll back" the behavior for easy deployment.

Combined with D45008, this fixes several internal Nvidia GPU test
failures that we suspect to be ptxas miscompiles (PR27738).

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D45070

llvm-svn: 328885
2018-03-30 17:51:03 +00:00
Tim Shen 1a8c6776a3 [BlockPlacement] Disable block placement tail duplciation in structured CFG.
Summary:
Tail duplication easily breaks the structure of CFG, e.g. duplicating on
a region entry. If the structure is intended to be preserved, then we
may want to configure tail duplication, or disable it for structured
CFG. From our benchmark results disabling it doesn't cause performance
regression.

Notice that this currently affects AMDGPU backend. In the next patch, I
also plan to turn on requiresStructuredCFG for NVPTX.

All unit tests still pass.

Reviewers: jlebar, arsenm

Subscribers: jholewinski, sanjoy, wdng, tpr, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45008

llvm-svn: 328884
2018-03-30 17:51:00 +00:00
Robert Widmann 478fce9ebf [LLVM-C] Finish exception instruction bindings - Round 2
Summary:
Previous revision caused a leak in the echo test that got caught by the ASAN bots because of missing free of the handlers array and was reverted in r328759.  Resubmitting the patch with that correction.

Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.

Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits, vlad.tsyrklevich

Differential Revision: https://reviews.llvm.org/D45100

llvm-svn: 328883
2018-03-30 17:49:53 +00:00
Zachary Turner d5cf5cf637 [llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining.
This will show more detail when using `llvm-pdbutil explain` on an
offset in the DBI or PDB streams.  Specifically, it will dig into
individual header fields and substreams to give a more precise
description of what the byte represents.

llvm-svn: 328878
2018-03-30 17:16:50 +00:00
Derek Schuff a2726e9ab6 [WebAssembly] Refactor tablegen for store instructions (NFC)
Summary: Add patterns similar to loads.

Differential Revision: https://reviews.llvm.org/D45064

llvm-svn: 328876
2018-03-30 17:02:50 +00:00
Krzysztof Parzyszek fce30c2ba3 Revert "peel loops with runtime small trip counts"
This reverts commit r328854, it breaks some Hexagon tests.

llvm-svn: 328875
2018-03-30 16:55:44 +00:00
Stanislav Mekhanoshin 74e2974ac6 [AMDGPU] Fixed some instructions latencies
Differential Revision: https://reviews.llvm.org/D45073

llvm-svn: 328874
2018-03-30 16:19:13 +00:00
Sanjay Patel e09b7dcf3d [SelectionDAG] Removing FABS folding from DAGCombiner
The code has bugs dealing with -0.0.

Since D44550 introduced FABS pattern folding in InstCombine, 
this patch removes the now-redundant code that causes 
https://bugs.llvm.org/show_bug.cgi?id=36600.

Patch by Mikhail Dvoretckii!

Differential Revision: https://reviews.llvm.org/D44683

llvm-svn: 328872
2018-03-30 15:42:52 +00:00
Krzysztof Parzyszek 4f99836a9e [Hexagon] Recognize and handle :endloop01
llvm-svn: 328870
2018-03-30 15:29:47 +00:00
Krzysztof Parzyszek 46abcb236b [Hexagon] Fix printing :mem_noshuf on compiler-generated packets
llvm-svn: 328869
2018-03-30 15:09:05 +00:00
Andrea Di Biagio 073a9d74ca [X86][BtVer2] Add missing ReadAfterLd to RM variants of AVX horizontal adds and
most vector logic instructions.

Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input
operand.

llvm-svn: 328867
2018-03-30 14:48:08 +00:00
Krzysztof Parzyszek 3f55ad8fae [Hexagon] Remove unused scheduling classes
llvm-svn: 328866
2018-03-30 14:34:32 +00:00
Krzysztof Parzyszek 1ca23d9837 [Hexagon] Pass pointer to SelectionDAG to dump functions
llvm-svn: 328864
2018-03-30 14:29:15 +00:00
Vlad Tsyrklevich 894c028d56 Revert "[LLVM-C] Finish exception instruction bindings"
This reverts commit r328759. It was causing LSan failures on sanitizer-x86_64-linux-bootstrap

llvm-svn: 328858
2018-03-30 06:21:28 +00:00
Michael Bedy 59e5ef793c [AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.
Summary:
The phase attempts to transform operations that extract a portion of a value
into an SDWA src operand in cases where that value is used only once. It
was not prepared for this use to be the preserved portion of a value for
dst:UNUSED_PRESERVE, resulting in a crash or assert.

This change either rejects the illegal SDWA attempt, or in the case where
dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded
extract instruction.

Reviewers: arsenm, #amdgpu

Reviewed By: arsenm, #amdgpu

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44364

llvm-svn: 328856
2018-03-30 05:03:36 +00:00
Ikhlas Ajbar 66c8ba5a50 peel loops with runtime small trip counts
For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 328854
2018-03-30 03:05:34 +00:00
Eli Friedman 208fe67a78 [MachineCopyPropagation] Handle COPY with overlapping source/dest.
MachineCopyPropagation::CopyPropagateBlock has a bunch of special
handling for COPY instructions. This handling assumes that COPY
instructions do not modify the source of the copy; this is wrong if
the COPY destination overlaps the source.

To fix the bug, check explicitly for this situation, and fall back to
the generic instruction handling.

This bug can't happen for most register classes because they don't
have this sort of overlap, but there are a few register classes
where this is possible. The testcase uses the AArch64 QQQQ register
class.

Differential Revision: https://reviews.llvm.org/D44911

llvm-svn: 328851
2018-03-30 00:56:03 +00:00
Eugene Zelenko 7fb5d41e44 [IR] Fix some Clang-tidy modernize-use-auto warnings; other minor fixes (NFC).
llvm-svn: 328850
2018-03-30 00:47:31 +00:00
Rafael Espindola 4b4d85fd4d Style update. NFC.
Rename 3 functions to start with lowercase letters. Don't repeat the
name in the comments.

llvm-svn: 328848
2018-03-29 23:32:54 +00:00
David Blaikie f423062aff Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation
llvm-svn: 328844
2018-03-29 22:42:08 +00:00
David Blaikie 7883340331 Remove unused header to fix layering.
llvm-svn: 328842
2018-03-29 22:35:59 +00:00
David Blaikie 4778bb88ef Remove unused headers to fix layering
llvm-svn: 328840
2018-03-29 22:31:39 +00:00
David Blaikie c90289b5d3 llvm-c: Split Utils out of Scalar.h
To fix layering (so that Scalar.h, a libScalarOpts header, isn't
included from Utils - which libScalarOpts depends on).

llvm-svn: 328839
2018-03-29 22:31:38 +00:00
David Blaikie bd0c88078a Remove some unneeded #includes to fix layering
llvm-svn: 328838
2018-03-29 22:31:36 +00:00
Craig Topper ee3c19fd7f [X86] Add ReadAfterLds to some 3 src instructions
Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first.

I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already.

Differential Revision: https://reviews.llvm.org/D44726

llvm-svn: 328834
2018-03-29 22:03:05 +00:00
Matt Arsenault efd1b30436 AMDGPU: Fix build warning in release
llvm-svn: 328832
2018-03-29 21:44:44 +00:00
Matt Arsenault 03ae399d50 AMDGPU: Support realigning stack
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.

I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.

llvm-svn: 328831
2018-03-29 21:30:06 +00:00
Evgeniy Stepanov 50635dab26 Add msan custom mapping options.
Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan.
As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html

Patch by vit9696(at)avp.su.

Differential Revision: https://reviews.llvm.org/D44926

llvm-svn: 328830
2018-03-29 21:18:17 +00:00
Craig Topper 3f2dbec652 [X86] Remove ReadAfterLd from BMI and TBM instructions that don't have a register operand in their memory form
The memory form of these instructions only read an input from memory. They don't have any register operands.

Differential Revision: https://reviews.llvm.org/D44836

llvm-svn: 328828
2018-03-29 21:03:53 +00:00
Craig Topper 89310f56c8 [X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI.
These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd

Differential Revision: https://reviews.llvm.org/D44838

llvm-svn: 328823
2018-03-29 20:41:39 +00:00
Philip Reames 5c14ed89f6 [NFC][LICM] Rearrange checks to have the cheap bail out first
llvm-svn: 328822
2018-03-29 20:32:15 +00:00
Matt Arsenault ffb132e74b AMDGPU: Increase default stack alignment
8 and 16-byte values are common, so increase the default
alignment to avoid realigning the stack in most functions.

llvm-svn: 328821
2018-03-29 20:22:04 +00:00
Matt Arsenault 6c041a3cab AMDGPU: Fix selection error on constant loads with < 4 byte alignment
llvm-svn: 328818
2018-03-29 19:59:28 +00:00
Philip Reames e4b728e82b Fix an accidental circular dependence
llvm-svn: 328816
2018-03-29 19:22:12 +00:00
Mandeep Singh Grang 10d8b85570 [Mips] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: sdardis, RKSimon, dsanders, atanasyan

Reviewed By: atanasyan

Subscribers: atanasyan, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D44869

llvm-svn: 328815
2018-03-29 19:05:26 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
Paul Robinson b271f31d8d Reapply "[DWARFv5] Emit file 0 to the line table."
DWARF v5 specifies that the root file (also given in the DW_AT_name
attribute of the compilation unit DIE) should be emitted explicitly to
the line table's list of files.  This makes the line table more
independent of the .debug_info section.
We emit the new syntax only for DWARF v5 and later.

Fixes the bug found by asan. Also XFAIL the new test for Darwin, which
is stuck on DWARF v2, and fix up other tests so they stop failing on
Windows.  Last but not least, don't break "clang -g" of an assembler
file that has .file directives in it.

Differential Revision: https://reviews.llvm.org/D44054

llvm-svn: 328805
2018-03-29 17:16:41 +00:00
Haicheng Wu c7cc87922e [JumpThreading] Don't select an edge that we know we can't thread
In r312664 (D36404), JumpThreading stopped threading edges into
loop headers. Unfortunately, I observed a significant performance
regression as a result of this change. Upon further investigation,
the problematic pattern looked something like this (after
many high level optimizations):

while (true) {
    bool cond = ...;
    if (!cond) {
        <body>
    }
    if (cond)
        break;
}

Now, naturally we want jump threading to essentially eliminate the
second if check and hook up the edges appropriately. However, the
above mentioned change, prevented it from doing this because it would
have to thread an edge into the loop header.

Upon further investigation, what is happening is that since both branches
are threadable, JumpThreading picks one of them at arbitrarily. In my
case, because of the way that the IR ended up, it tended to pick
the one to the loop header, bailing out immediately after. However,
if it had picked the one to the exit block, everything would have
worked out fine (because the only remaining branch would then be folded,
not thraded which is acceptable).

Thus, to fix this problem, we can simply eliminate loop headers from
consideration as possible threading targets earlier, to make sure that
if there are multiple eligible branches, we can still thread one of
the ones that don't target a loop header.

Patch by Keno Fischer!
Differential Revision: https://reviews.llvm.org/D42260

llvm-svn: 328798
2018-03-29 16:01:26 +00:00
Pavel Labath ea0f841c3b .debug_names: Correctly align the AugmentationStringSize field
We should align the value of the field, not the overall section offset.

This distinction matters if one of the debug_names contributions is not
of size which is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.

llvm-svn: 328796
2018-03-29 15:12:45 +00:00
Krzysztof Parzyszek dc7a557e6a [Hexagon] Add support to handle bit-reverse load intrinsics
Patch by Sumanth Gundapaneni.

llvm-svn: 328774
2018-03-29 13:52:46 +00:00
Pavel Labath 2d1fc4375f .debug_names: Parse DW_IDX_die_offset as a reference
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.

This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.

llvm-svn: 328773
2018-03-29 13:47:57 +00:00
Simon Pilgrim 71c5f3fffd [X86][SSE] Don't bother re-adding combined target shuffles to the work list
We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain.

Found while investigating adding SimplifyDemandedVectorElts to target shuffles.

Differential Revision: https://reviews.llvm.org/D44942

llvm-svn: 328771
2018-03-29 11:18:41 +00:00
Simon Dardis 32a27fc77a [Mips] Remove dead code
I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code.

Patch By: Wei-Ren Chen.

Reviewers: ehostunreach, sdardis

Differential Revision: https://reviews.llvm.org/D44867

llvm-svn: 328767
2018-03-29 09:21:20 +00:00
David Green b0aa36f9c2 [LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface
The existing LoopRotation.cpp is implemented as one of loop passes instead of
being a utility. The user cannot easily perform the loop rotation selectively
(or on demand) under different optimization level. For example, the loop
rotation is needed as part of the logic to convert a loop into a loop with
bottom test for a transformation. If the loop rotation is simply added as a
loop pass before the transformation, the pass is skipped if it is compiled at
–O0 or if it is explicitly disabled by the user, causing the compiler to
generate incorrect code. Furthermore, as a loop pass it will rotate all loops
instead of just the relevant loops.

We provide a utility interface for the loop rotation so that the loop rotation
can be called on demand. The changeset is as follows:

- Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main
  implementation of class LoopRotate into this file.
- Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the
  interface LoopRotation(...).
- Original LoopRotation.cpp is changed to use the utility function LoopRotation
  in LoopRotationUtils.cpp. This is done in the same way community did for
  mem-to-reg implementation.

Patch by Jin Lin!

Differential Revision: https://reviews.llvm.org/D44595

llvm-svn: 328766
2018-03-29 08:48:15 +00:00
Benjamin Kramer 6b995a4a7e [Transforms] Make sure to include the c binding header when defining c binding functions
Otherwise the definitions can't see the extern C declarations and get
name mangled, making it impossible for users to call them. This breaks
the Go bindings.

llvm-svn: 328765
2018-03-29 07:56:53 +00:00
Max Kazantsev 18f93894db [NFC] Fix meaningless assert in SCEV
llvm-svn: 328764
2018-03-29 07:54:59 +00:00
Craig Topper a21758fa2c [X86] Don't pass getRegisterName from the InstPrinters into EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC
The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places.

llvm-svn: 328762
2018-03-29 04:14:04 +00:00
Robert Widmann 6775f52fe0 [LLVM-C] Finish exception instruction bindings
Summary:
Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.

Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.

Reviewers: whitequark, deadalnix, echristo

Reviewed By: echristo

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D44496

llvm-svn: 328759
2018-03-29 03:43:15 +00:00
Craig Topper 7456af88f4 [X86] Rename RIi64_NOREX tblgen class to just Ii64. Make RIi64 inherit from it. NFC
This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place.

llvm-svn: 328757
2018-03-29 03:14:57 +00:00
Craig Topper 7441ffff84 [X86] Cleanup inheritance of the X86InstrFormats.td classes. NFC
EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V.

llvm-svn: 328756
2018-03-29 03:14:56 +00:00
George Burgess IV af0b06f4fd [MemorySSA] Turn an assert into a condition
Eli pointed out that variadic functions are totally a thing, so this
assert is incorrect.

No test-case is provided, since the only way this assert fires is if a
specific DenseMap falls back to doing `isEqual` checks, and that seems
fairly brittle (and requires a pyramid of growing
`call void (i8, ...) @varargs(i8 0)`).

llvm-svn: 328755
2018-03-29 03:12:03 +00:00
George Burgess IV 3588fd4865 [MemorySSA] Consider callsite args for hashing and equality.
We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of
prior work when optimizing uses in MemorySSA. Because we weren't
accounting for callsite arguments in either the hash code or equality
tests for `MemoryLocOrCall`s, we optimized uses too aggressively in
some rare cases.

Fix by Daniel Berlin.

Should fix PR36883.

llvm-svn: 328748
2018-03-29 00:54:39 +00:00
David Blaikie b3f471a4bd Remove some unused includes to fix layering.
llvm-svn: 328745
2018-03-29 00:29:45 +00:00
David Blaikie 8ad9a97310 Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency
Thanks to echristo for the pointers on direction.

llvm-svn: 328737
2018-03-28 22:28:50 +00:00
Craig Topper aac23d7881 [X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from scheduler regexs.
Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end.

I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future.

llvm-svn: 328730
2018-03-28 20:40:24 +00:00
Jun Bum Lim f90fe701ef [PostRAMachineSink] preserve CFG
Summary: Mark CFG is preserved  since this pass do not make any change in CFG.

Reviewers: sebpop, mzolotukhin, mcrosier

Reviewed By: mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44845

llvm-svn: 328727
2018-03-28 19:56:26 +00:00
Krzysztof Parzyszek 440ba3ae5c [Hexagon] Add support for "new" circular buffer intrinsics
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" versions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.

We need to use pseudo instructions for these instructions until
after register allocation. The problem is that these instructions
allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for
the CSx set-up until after register allocation when the Mx
register has been fixed for the instruction.

There is a related clang patch.

Patch by Brendon Cahoon.

llvm-svn: 328724
2018-03-28 19:38:29 +00:00
David Blaikie eb8cc04ea2 Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back
llvm-svn: 328720
2018-03-28 18:03:25 +00:00
Jessica Paquette 4aa14dbcc2 [MachineOutliner] Simplify call outlining + require valid callee save info for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.

llvm-svn: 328719
2018-03-28 17:52:31 +00:00
David Blaikie a373d18eb7 Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717
2018-03-28 17:44:36 +00:00
Peter Collingbourne d579c31d68 [llvm-ar] Support multiple dashed options
This allows syntax like:
$ llvm-ar -c -r -u file.a file.o

This is in addition to the other formats that are already supported:
$ llvm-ar cru file.a file.o
$ llvm-ar -cru file.a file.o

Patch by Tom Anderson!

Differential Revision: https://reviews.llvm.org/D44452

llvm-svn: 328716
2018-03-28 17:21:14 +00:00
Dmitry Preobrazhensky 622bde8bc7 [AMDGPU][MC] Added ds_add_src2_f32
See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833

Differential Revision: https://reviews.llvm.org/D44779

Reviewers: arsenm, artem.tamazov, timcorringham
llvm-svn: 328713
2018-03-28 16:21:56 +00:00
Dmitry Preobrazhensky 2456ac696a [AMDGPU][MC] Added PCK variants of image load/store instructions
See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834

Differential Revision: https://reviews.llvm.org/D44795

Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle
llvm-svn: 328710
2018-03-28 15:44:16 +00:00
Dmitry Preobrazhensky a917e88585 [AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x
See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835

Differential Revision: https://reviews.llvm.org/D44825

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328707
2018-03-28 14:53:13 +00:00
Dmitry Preobrazhensky dd2b929ffb [AMDGPU][MC][GFX9] Added s_scratch* instructions
See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836

Differential Revision: https://reviews.llvm.org/D44832

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328704
2018-03-28 14:08:03 +00:00
Simon Pilgrim b1bc6cd96b [X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCI
Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names

llvm-svn: 328701
2018-03-28 13:53:21 +00:00
Alexander Potapenko 202f809437 Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""
This reverts commit r328676.

Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang:

$ cat t.c
void foo() {}
$ clang -no-integrated-as   -c  t.c -g
/tmp/t-dcdec5.s: Assembler messages:
/tmp/t-dcdec5.s:8: Error: file number less than one
clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation)

llvm-svn: 328699
2018-03-28 12:36:46 +00:00
Andrea Di Biagio 5076b98fb9 [X86][BtVer2] Fix the number of micro opcodes for AES[ENC|DEC] and other YMM instructions.
Similar to r328694. The number of micro opcodes should be 2 for those
instructions.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328698
2018-03-28 12:12:04 +00:00
Alexander Potapenko 4e7ad0805e [MSan] Introduce ActualFnStart. NFC
This is a step towards the upcoming KMSAN implementation patch.
KMSAN is going to prepend a special basic block containing
tool-specific calls to each function. Because we still want to
instrument the original entry block, we'll need to store it in
ActualFnStart.

For MSan this will still be F.getEntryBlock(), whereas for KMSAN
it'll contain the second BB.

llvm-svn: 328697
2018-03-28 11:35:09 +00:00
Tim Renouf cdac172e2a Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.

It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.

Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a
llvm-svn: 328695
2018-03-28 11:21:07 +00:00
Andrea Di Biagio 010924e35c [X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions.
The Jaguar backend natively supports 128-bit data types. Operations on YMM
registers are split into two COPs (complex operations). Each COP consumes a slot
in the dispatch group, and in the reorder buffer.

The scheduling model for Jaguar should mark those instructions as `let
NumMicroOps = 2`.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328694
2018-03-28 10:49:33 +00:00
Alexander Potapenko e1d5877847 [MSan] Add an isStore argument to getShadowOriginPtr(). NFC
This is a step towards the upcoming KMSAN implementation patch.
The isStore argument is to be used by getShadowOriginPtrKernel(),
it is ignored by getShadowOriginPtrUserspace().

Depending on whether a memory access is a load or a store, KMSAN
instruments it with different functions, __msan_metadata_ptr_for_load_X()
and __msan_metadata_ptr_for_store_X().

Those functions may return different values for a single address,
which is necessary in the case the runtime library decides to ignore
particular accesses.

llvm-svn: 328692
2018-03-28 10:17:17 +00:00
Christof Douma a1e77c0e02 [ARM] Support float literals under XO
Follow up patch of r328313 to support the UseVMOVSR constraint. Removed
some unneeded instructions from the test and removed some stray
comments.

Differential Revision: https://reviews.llvm.org/D44941

llvm-svn: 328691
2018-03-28 10:02:26 +00:00
Mikael Holmen 6c062b7641 [RegisterCoalescing] Don't move COPY if it would interfere with another value
Summary:
RegisterCoalescer::removePartialRedundancy tries to hoist B = A from
BB0/BB2 to BB1:

  BB1:
       ...
  BB0/BB2:  ----
       B = A;   |
       ...      |
       A = B;   |
         |-------
         |

It does so if a number of conditions are fulfilled. However, it failed
to check if B was used by any of the terminators in BB1. Since we must
insert B = A before the terminators (since it's not a terminator itself),
this means that we could erroneously insert a new definition of B before a
use of it.

Reviewers: wmi, qcolombet

Reviewed By: wmi

Subscribers: MatzeB, llvm-commits, sdardis

Differential Revision: https://reviews.llvm.org/D44918

llvm-svn: 328689
2018-03-28 06:01:30 +00:00
Lang Hames a95b0df5ed [ORC] Fix ORC on platforms without indirection support.
Previously this crashed because a nullptr (returned by
createLocalIndirectStubsManagerBuilder() on platforms without
indirection support) functor was unconditionally invoked.

Patch by Andres Freund. Thanks Andres!

llvm-svn: 328687
2018-03-28 03:41:45 +00:00
Matt Arsenault bd49eccca1 AMDGPU: Really implement getFrameRegister
Currently this seems to only really be used for debug
info.

llvm-svn: 328677
2018-03-27 23:26:59 +00:00
Paul Robinson 07480bd177 Reapply "[DWARFv5] Emit file 0 to the line table."
DWARF v5 specifies that the root file (also given in the DW_AT_name
attribute of the compilation unit DIE) should be emitted explicitly to
the line table's list of files.  This makes the line table more
independent of the .debug_info section.

Fixes the bug found by asan. Also XFAIL the new test for Darwin, which
is stuck on DWARF v2, and fix up other tests so they stop failing on
Windows.  Last but not least, don't break "clang -g" of an assembler
file that has .file directives in it.

Differential Revision: https://reviews.llvm.org/D44054

llvm-svn: 328676
2018-03-27 22:40:34 +00:00
Jessica Paquette 2519ee7081 [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operands
If an ADRP appears with, say, a CPI operand, we shouldn't outline it.

This moves the check for unsafe operands so that it occurs before the special-case
for ADRPs. Also add a test for outlining ADRPs.

llvm-svn: 328674
2018-03-27 22:23:48 +00:00
Tim Renouf e4208bfa5b [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).

This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.

Reviewers: kzhuravl, nhaehnle, timcorringham

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D44468

Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 328673
2018-03-27 21:35:00 +00:00
Paul Robinson 7cb26ad2ef [DWARF] Suppress split line tables more carefully.
If a given split type unit does not have source locations, don't have
it refer to the split line table.
If no split type unit refers to the split line table, don't emit the
line table at all.

This will save a little space on rare occasions, but also refactors
things a bit to improve which class is responsible for what.

Responding to review comments on r326395.

Differential Revision: https://reviews.llvm.org/D44220

llvm-svn: 328670
2018-03-27 21:28:59 +00:00
Tim Renouf 4db0960420 [CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value
Summary:
Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing"
broke -print-machineinstrs for us on AMDGPU, because we have custom
pseudo source values, and MIR serialization does not implement that.

This commit at least restores the functionality of -print-machineinstrs,
even if it does not properly implement the missing MIR serialization
functionality.

Differential Revision: https://reviews.llvm.org/D44871

Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66
llvm-svn: 328668
2018-03-27 21:14:04 +00:00
Sterling Augustine 33dc01861a Initialize variable added in r328617.
llvm-svn: 328667
2018-03-27 21:11:57 +00:00
Simon Pilgrim a2f26788a3 [X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes
Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer.

Differential Revision: https://reviews.llvm.org/D44924

llvm-svn: 328664
2018-03-27 20:38:54 +00:00
Wolfgang Pieb ab068eaa57 [DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address
Reviewers: dblakie, aprantl

Differential Revision: https://reviews.llvm.org/D44811

llvm-svn: 328662
2018-03-27 20:27:36 +00:00
Graydon Hoare 926cd9b837 [YAML] Escape non-printable multibyte UTF8 in Output::scalarString.
The existing YAML Output::scalarString code path includes a partial and
incorrect implementation of YAML escaping logic. In particular, the logic put
in place in rL321283 escapes non-printable bytes only if they are not part of a
multibyte UTF8 sequence; implicitly this means that all multibyte UTF8
sequences -- printable and non -- are passed through verbatim.

The simplest solution to this is to direct the Output::scalarString method to
use the standalone yaml::escape function, and this _almost_ works, except that
the existing code in that function _over_ escapes: any multibyte UTF8 sequence
is escaped, even printable ones. While this is permitted for YAML, it is also
more aggressive (and hard to read for non-English locales) than necessary,
and the entire point of rL321283 was to back off such aggressive over-escaping.

So in this change, I have both redirected Output::scalarString to use
yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to
non-printables. This preserves behaviour of any existing clients while giving
them a path to more moderate escaping should they desire.

Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun

Reviewed By: thegameg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44863

llvm-svn: 328661
2018-03-27 19:52:45 +00:00
Xin Tong 0272cb077f 80-line wrap. NFC
llvm-svn: 328660
2018-03-27 19:43:02 +00:00
Matt Arsenault 17f3338015 AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills
Before this was not done if the function had no calls in it. This
is still a possible issue with any callable function, regardless
of calls present.

llvm-svn: 328659
2018-03-27 19:42:55 +00:00
Matt Arsenault 95329f8c53 AMDGPU: Set natural stack alignment in DataLayout
Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.

llvm-svn: 328656
2018-03-27 19:26:40 +00:00
Rong Xu 662f38b16f [PGO] Fix branch probability remarks assert
Fixed counter/weight overflow that leads to an assertion. Also fixed the help
string for pgo-emit-branch-prob option.

Differential Revision: https://reviews.llvm.org/D44809

llvm-svn: 328653
2018-03-27 18:55:56 +00:00
Matt Arsenault 0a0c871f60 AMDGPU: Fix crash when MachinePointerInfo invalid
The combine on a select of a load only triggers for
addrspace 0, and discards the MachinePointerInfo. The
conservative default needs to be used for this.

llvm-svn: 328652
2018-03-27 18:39:45 +00:00
Matt Arsenault e9f3679031 AMDGPU: Fix FP restore from being reordered with stack ops
In a function, s5 is used as the frame base SGPR. If a function
is calling another function, during the call sequence
it is copied to a preserved SGPR and restored.

Before it was possible for the scheduler to move stack operations
before the restore of s5, since there's nothing to associate
a frame index access with the restore.

Add an implicit use of s5 to the adjcallstack pseudo which ends
the call sequence to preven this from happening. I'm not 100%
satisfied with this solution, but I'm not sure what else would be
better.

llvm-svn: 328650
2018-03-27 18:38:51 +00:00
Krzysztof Parzyszek 0375cd46ef [Hexagon] Implement TTI::shouldMaximizeVectorBandwidth
llvm-svn: 328648
2018-03-27 18:10:47 +00:00
Stefan Pintilie 659f040351 [Power9] Fix the resource list for the COPY instruction.
The COPY instruction was listed as a 4 cycle instruction.
It is now listed correctly as a 2 cycle ALU instruction.

llvm-svn: 328647
2018-03-27 17:51:53 +00:00
Pirama Arumuga Nainar ddd7b06842 Remap values in PromotedFloats
Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats.

Patch by Yan Luo (Yan.Luo2@synopsys.com)

Reviewers: pirama

Reviewed By: pirama

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D44872

llvm-svn: 328644
2018-03-27 17:42:36 +00:00
Krzysztof Parzyszek 0a15d24134 [Hexagon] Rudimentary support for auto-vectorization for HVX
This implements a set of TTI functions that the loop vectorizer uses.
The only purpose of this is to enable testing. Auto-vectorization is
disabled by default, enabled by -hexagon-autohvx.

llvm-svn: 328639
2018-03-27 17:07:52 +00:00
Rafael Auler d058b882be [AArch64] Decorate AArch64 instrs with OPERAND_PCREL
Summary:
This is a canonical way to teach objdump to print the target
symbols for branches when disassembling AArch64 code.

Reviewers: evandro, t.p.northover, espindola

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D44851

llvm-svn: 328638
2018-03-27 16:58:01 +00:00
Fedor Sergeev 98014e433f [NFC] OptPassGate extracted from OptBisect
Summary:
This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes.

This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect.

Patch by Yevgeny Rouban.

Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D44821

llvm-svn: 328637
2018-03-27 16:57:20 +00:00
Krzysztof Parzyszek 52396bb9c5 Use .set instead of = when printing assignment in assembly output
On Hexagon "x = y" is a syntax used in most instructions, and is not
treated as a directive.

Differential Revision: https://reviews.llvm.org/D44256

llvm-svn: 328635
2018-03-27 16:44:41 +00:00
Krzysztof Parzyszek 5d93fdfa89 [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target
The default implementation returns false and keeps the current behavior.

Differential Revision: https://reviews.llvm.org/D44735

llvm-svn: 328632
2018-03-27 16:14:11 +00:00
Simon Pilgrim 5f7ab4fedf [X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class
llvm-svn: 328620
2018-03-27 12:26:12 +00:00
Strahinja Petrovic 06cf6a6490 [PowerPC] Secure PLT support
This patch supports secure PLT mode for PowerPC 32 architecture.

Differential Revision: https://reviews.llvm.org/D42112

llvm-svn: 328617
2018-03-27 11:23:53 +00:00
Alexander Richardson e8059b1de4 [MIPS] Add static_assert that all Fixups are handled in getFixupKind
Summary:
I recently added a new Fixup kind to our fork of LLVM but forgot to add
it to the table in MipsAsmBackend.cpp. With this static_assert the error
would have been caught instead of zero-initializing the array entries for
the new fixups.

Reviewers: sdardis, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44895

llvm-svn: 328616
2018-03-27 10:08:12 +00:00
Max Kazantsev b1ad66ff12 [LoopUnroll][NFC] Remove redundant canPeel check
We check `canPeel` twice: when evaluating the number of iterations to be peeled
and within the method `peelLoop` that performs peeling. This method is only
executed if the calculated peel count is positive. Thus, the check in `peelLoop` can
never fail. This patch replaces this check with an assert.

Differential Revision: https://reviews.llvm.org/D44919
Reviewed By: fhahn

llvm-svn: 328615
2018-03-27 09:40:51 +00:00
Sam Parker 90b7f4f72c [IRCE] Enable decreasing loops of non-const bound
As a follow-up to r328480, this updates the logic for the decreasing
safety checks in a similar manner:
- CanBeMax is replaced by CannotBeMaxInLoop which queries
  isLoopEntryGuardedByCond on the maximum value.
- SumCanReachMin is replaced by isSafeDecreasingBound which includes
  some logic from parseLoopStructure and, again, has been updated to
  use isLoopEntryGuardedByCond on the given bounds.

Differential Revision: https://reviews.llvm.org/D44776

llvm-svn: 328613
2018-03-27 08:24:53 +00:00
Max Kazantsev ee5dd8306f [NFC] Fix comments in getExact()
llvm-svn: 328612
2018-03-27 08:13:55 +00:00
Max Kazantsev 7094c8deb2 [SCEV] Make exact taken count calculation more optimistic
Currently, `getExact` fails if it sees two exit counts in different blocks. There is
no solid reason to do so, given that we only calculate exact non-taken count
for exiting blocks that dominate latch. Using this fact, we can simply take min
out of all exits of all blocks to get the exact taken count.

This patch makes the calculation more optimistic with enforcing our assumption
with asserts. It allows us to calculate exact backedge taken count in trivial loops
like

  for (int i = 0; i < 100; i++) {
    if (i > 50) break;
    . . .
  }

Differential Revision: https://reviews.llvm.org/D44676
Reviewed By: fhahn

llvm-svn: 328611
2018-03-27 07:30:38 +00:00
Max Kazantsev a63d333881 [SCEV] Add one more case in computeConstantDifference
This patch teaches `computeConstantDifference` handle calculation of constant
difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.

Differential Revision: https://reviews.llvm.org/D43759
Reviewed By: anna

llvm-svn: 328609
2018-03-27 04:54:00 +00:00
David Blaikie 60e62438d2 Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here
llvm-svn: 328595
2018-03-26 23:48:52 +00:00
Aaron Smith f13938382c [DebugInfoPDB] Print the method name along with the variant value
Before this change, using dumpProperties() with PDBSymbolData
would look like this:

  get_locationType: 3
  1

After this change:

  get_locationType: 3
  get_value: 1

llvm-svn: 328590
2018-03-26 22:53:38 +00:00
Aaron Smith 1af50bcf89 [DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData
llvm-svn: 328587
2018-03-26 22:17:12 +00:00
Aaron Smith ed81a9db29 [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA
This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

llvm-svn: 328586
2018-03-26 22:13:22 +00:00
Aaron Smith 53708a5e9e [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA
These are used in finding line numbers for PDBSymbolData

llvm-svn: 328585
2018-03-26 22:10:02 +00:00
Simon Pilgrim 28e7bcbba6 [X86] Add WriteCRC32 scheduler class
Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis.

Differential Revision: https://reviews.llvm.org/D44647

llvm-svn: 328582
2018-03-26 21:06:14 +00:00
Rafael Espindola 78fdca3cd5 Use local symbols for creating .stack-size.
llvm-svn: 328581
2018-03-26 20:40:22 +00:00
Paul Robinson 82e4864730 Use correct format specifier.
Review comment on r328235 by James Henderson.

llvm-svn: 328578
2018-03-26 19:55:01 +00:00
Eli Friedman 88e2bac94d [MemorySSA] Fix exponential compile-time updating MemorySSA.
MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for
each block, it computes the previous definition for each predecessor,
then takes those definitions and combines them. But currently it doesn't
remember results which it already computed; this means it can visit the
same block multiple times, which adds up to exponential time overall.

To fix this, this patch adds a cache. If we computed the result for a
block already, we don't need to visit it again because we'll come up
with the same result. Well, unless we RAUW a MemoryPHI; in that case,
the TrackingVH will be updated automatically.

This matches the original source paper for this algorithm.

The testcase isn't really a test for the bug, but it adds coverage for
the case where tryRemoveTrivialPhi erases an existing PHI node. (It's
hard to write a good regression test for a performance issue.)

Differential Revision: https://reviews.llvm.org/D44715

llvm-svn: 328577
2018-03-26 19:52:54 +00:00
Krzysztof Parzyszek 4a5a80c370 [Hexagon] Assertion failure in HexagonSubtarget.cpp
In restoreLatency, replace range-for loop with std::find.

Patch by Jyotsna Verma.

llvm-svn: 328574
2018-03-26 19:04:58 +00:00
Simon Pilgrim fcf49df21c [X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

llvm-svn: 328573
2018-03-26 19:01:06 +00:00
Reid Kleckner 41fb2dba9c [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Summary:
Re-lands r328386 and r328443, reverting r328482.

Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).

I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.

I also tested this by self-hosting clang and it passed tests on win64.

Reviewers: mstorsjo, hans

Subscribers: hiraditya, mstorsjo, llvm-commits

Differential Revision: https://reviews.llvm.org/D44900

llvm-svn: 328570
2018-03-26 18:49:48 +00:00
Simon Pilgrim f33d905293 [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)
Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.

These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).

Differential Revision: https://reviews.llvm.org/D44879

llvm-svn: 328566
2018-03-26 18:19:28 +00:00
David Blaikie 7c4b5d92f1 Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h
This header also wasn't self contained/modular - but with no users, it
didn't seem worth fixing because it'd break so easily again.

llvm-svn: 328565
2018-03-26 18:10:31 +00:00
Mandeep Singh Grang 1b9ff45157 [XCore] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: dblaikie, RKSimon, robertlytton

Reviewed By: robertlytton

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44875

llvm-svn: 328564
2018-03-26 18:08:26 +00:00
Sanjay Patel 0e3167cb30 [InstCombine] improve code comment; NFC
llvm-svn: 328560
2018-03-26 17:52:02 +00:00
Lei Huang be0afb0870 [Power9]Legalize and emit code for quad-precision convert from double-precision
Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558
2018-03-26 17:46:25 +00:00
Stefan Pintilie 26d4f923c4 [PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.
A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

llvm-svn: 328556
2018-03-26 17:39:18 +00:00
Zaara Syeda 17e4eeaa8b Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores
Disable https://reviews.llvm.org/D40196 with setting option
hoist-const-stores to false since failing s390 buildbot.

llvm-svn: 328555
2018-03-26 17:22:33 +00:00
Krzysztof Parzyszek 3ca233414b [Pipeliner] Several node-ordering fixes
First, we change the heuristic that is used to ignore the recurrent
node-sets in the node ordering. In certain cases it's not important
to focus on the recurrent node-sets.  Instead, the algorithm begins
by considering all the instructions in the node ordering step.

Second, a minor change to the bottom up traversal, which needs to
consider loop carried dependences (modeled as anti dependences).
Previously, these instructions were skipped, which caused problems
because the instruction ends up having both predecessors and
sucessors in the schedule.

Third, consider anti-dependences as a tie breaker when choosing
between instructions in the node ordering. We want to make sure
that the source of the anti-dependence does not end up with both
predecesssors and sucessors in the final node ordering.

Patch by Brendon Cahoon.

llvm-svn: 328554
2018-03-26 17:07:41 +00:00
Tim Corringham 7116e8963d [AMDGPU] Improve disassembler error handling
Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.

While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.

The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44685

llvm-svn: 328553
2018-03-26 17:06:33 +00:00
Simon Pilgrim 86ea53123d [X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs
We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....

llvm-svn: 328551
2018-03-26 17:02:02 +00:00
Krzysztof Parzyszek 8c07d0c42c [Pipeliner] Check for affine expression in isLoopCarriedOrder
The pipeliner must add a loop carried dependence between two memory
operations if the base register is not an affine (linear) exression.
The current implementation doesn't check how the base register is
defined, which allows non-affine expressions, and then the pipeliner
does not add a loop carried dependence when one is needed.

This patch adds code to isLoopCarriedOrder that checks if the base
register of the memory operations is defined by a phi, and the loop
definition for the phi is a constant increment value.  This is a very
simple check for a linear expression.

Patch by Brendon Cahoon.

llvm-svn: 328550
2018-03-26 16:58:40 +00:00
David Blaikie 535ca36e5e Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header
llvm-svn: 328549
2018-03-26 16:57:31 +00:00
David Blaikie a1b2bf4c71 Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header
llvm-svn: 328548
2018-03-26 16:52:10 +00:00
Krzysztof Parzyszek 9f041b1830 [Pipeliner] Add missing loop carried dependences
The pipeliner is not adding a dependence edge for a loop carried
dependence, and ends up scheduling a load from iteration n prior
to an aliased store in iteration n-1.

The code that adds the loop carried dependences in the pipeliner
doesn't check if the memory objects for loads and stores are
"identified" (i.e., distinct) objects. If they are not, then the
code that adds the dependences needs to be conservative. The
objects can be used to check dependences only when they are
distinct objects.

The code that checks for loop carried dependences has been updated
to classify loads and stores that are not identified as "unknown"
values. A store with an "unknown" value can potentially create
a loop carried dependence with any pending load.

Patch by Brendon Cahoon.

llvm-svn: 328547
2018-03-26 16:50:11 +00:00
Krzysztof Parzyszek 16e66f5901 [Pipeliner] Fix renaming in pipeliner when eliminating phis
The phi renaming code in the pipeliner uses the wrong value when
rewriting phi uses, which results in an undefined value. In this
case, the original phi is no longer needed due to the order of
instruction in the pipelined loop. The pipeliner was assuming, in
this case, the the phi loop definition should be used to
rewrite the uses. However, the pipeliner needs to check to make
sure that the loop definition has already been scheduled. If not,
then the phi initial value needs to be used instead.

Patch by Brendon Cahoon.

llvm-svn: 328545
2018-03-26 16:41:36 +00:00
Krzysztof Parzyszek 3f72a6b7a1 [Pipeliner] Fix number of phis to generate in the epilog
The pipeliner was generating too many phis in the epilog blocks, which
caused incorrect code generation when rewriting an instruction that uses
the phi.

In this case, there 3 prolog and epilog stages. An existing phi was
scheduled at stage 1. When generating the code for the 2nd epilog an
extra new phi was generated.

To fix this, we need to update the code that calculates the maximum
number of phis that can be generated, which is based upon the current
prolog stage and the stage of the original phi. In this case, when the
prolog stage is 1 and the original phi stage is 1, the maximum number
of phis to generate is 2.

Patch by Brendon Cahoon.

llvm-svn: 328543
2018-03-26 16:37:55 +00:00
Krzysztof Parzyszek a212204453 [Pipeliner] Use latency to compute RecMII
The patch contains severals changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.

The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a 
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.

The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).

The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This was when the
last instruction is a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.

The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound. 

Patch by Brendon Cahoon.

llvm-svn: 328542
2018-03-26 16:33:16 +00:00
Simon Pilgrim 8815105cd5 [X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs
llvm-svn: 328541
2018-03-26 16:24:13 +00:00
Krzysztof Parzyszek f13bbf1d58 [Pipeliner] Fix assert caused by pipeliner serialization
The pipeliner is asserting because the serialization step that 
occurs at the end is deleting an instruction.  The assert
occurs later on because there is a use without a definition.  

The problem occurs when an instruction defines a value used 
by a REQ_SEQUENCE and that value is used by a COPY instruction.
The latencies between these instructions are zero, so they are
put in to the same packet.  The serialization code is unable to
handle this correctly, and ends up putting the REG_SEQUENCE
before its definition.

There is special code in the serialization step that attempts
to handle zero-cost instructions (phis, copy, reg_sequence)
differently than regular instructions. Unfortunately, this means
the order does not come out correct.

This patch simplifies the code by changing the seperate steps for
handling zero-cost and regular instructions. Only phis are
handled separate now, since they should occurs first. Then, this
patch adds checks to make use the MoveUse is set to the smallest
value if there are multiple uses in a cycle.

Patch by Brendon Cahoon.

llvm-svn: 328540
2018-03-26 16:23:29 +00:00
Sebastian Pop d870aea03e [InstCombine] reassociate loop invariant GEP chains to enable LICM
This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sinked down
in the GEP chain. This patch will produce the following output:

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext

llvm-svn: 328539
2018-03-26 16:19:31 +00:00
Krzysztof Parzyszek 40df8a2b98 [Pipeliner] Enable more base+offset dependence changes in pipeliner
The pipeliner changes dependences between base+offset instructions
(loads and stores) so that the instructions have more flexibility
to be scheduled with respect to each other. This occurs when the
pipeliner is able to compute that the instructions will not alias
if their order is changed. The prevous code enforced the alias
property by checking if the base register is the same, and that the
offset values are either both positive or negative.

This patch improves the alias check by using the API
areMemAccessesTriviallyDisjoint instead. This enables more cases,
especially if the offset is a negative value. The pipeliner uses
the function by creating a new instruction with the offset used
in the next iteration.

Patch by Brendon Cahoon.

llvm-svn: 328538
2018-03-26 16:17:06 +00:00
Krzysztof Parzyszek 55cb4986a4 [Pipeliner] Fix calculation when reusing phis
A schedule may require that a phi from the original loop is used in
multiple iterations in the scheduled loop. When this occurs, we generate
multiple phis in the pipelined loop to save the value across iterations.

When we generate the new phis and update the register names in the
pipelined loop, the pipeliner attempts to reuse a previously generated
phi, when possible. The calculation for the name of the new phi needs
to account for the version/iteration of the original phi. Also, in the
epilog, the code only needs to check backwards for a previous iteration
until reaching the first prolog block.

Patch by Brendon Cahoon.

llvm-svn: 328537
2018-03-26 16:10:48 +00:00
Simon Pilgrim aa40148cae [X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write)
llvm-svn: 328536
2018-03-26 16:10:08 +00:00
Krzysztof Parzyszek 8e1363df4e [Pipeliner] Fix check for order dependences when finalizing instructions
The code in orderDepdences that looks at the order dependences between
instructions was processing all the successor and predecessor order
dependences. However, we really only want to check for an order dependence
for instructions scheduled in the same cycle.

Also, fixed how the pipeliner handles output dependences. An output
dependence is also a potential loop carried dependence. The pipeliner
didn't handle this case properly so an invalid schedule could be created
that allowed an output dependence to be scheduled in the next iteration
at the same cycle.

Patch by Brendon Cahoon.

llvm-svn: 328516
2018-03-26 16:05:55 +00:00
Krzysztof Parzyszek 3a0a15afe7 [Pipeliner] Fix in the pipeliner phi reuse code
When the definition of a phi is used by a phi in the next iteration,
the pipeliner was assuming that the definition is processed first.
Because of the assumption, an incorrect phi name was used. This patch
has a check to see if the phi definition has been processed already.

Patch by Brendon Cahoon.

llvm-svn: 328510
2018-03-26 15:58:16 +00:00
Krzysztof Parzyszek b9b75b8cb6 [Pipeliner] Pipeliner should mark physical registers as used
The software pipeliner attempts to delete dead instructions after
generating the pipelined loop. The code looks for uses of each 
instruction. Physical registers should be treated differently because
the use chains do not exist. The code that checks for dead 
instructions should assume that definitions of physical registers
are used if the operand doesn't contain the dead flag.

Patch by Brendon Cahoon.

llvm-svn: 328509
2018-03-26 15:53:23 +00:00
Krzysztof Parzyszek 785b6cec11 [Pipeliner] Correctly update memoperands in the epilog
The pipeliner needs to be conservative when updating the memoperands
of instructions in the epilog. Previously, the pipeliner was changing
the offset of the memoperand based upon the scheduling stage. However,
that is incorrect when control flow branches around the kernel code.
The bug enabled a load and store to the same stack offset to be swapped.

This patch fixes the bug by updating the size of the memoperands to be
UINT_MAX. This conservative value means that dependences will be created
between other loads and stores.

Patch by Brendon Cahoon.

llvm-svn: 328508
2018-03-26 15:45:55 +00:00
Erik Pilkington 615e753e09 [demangler] Fix a bug in r328464 found by oss-fuzz.
llvm-svn: 328507
2018-03-26 15:34:36 +00:00
Krzysztof Parzyszek 56f0fc4716 [Hexagon] Give priority to post-incremementing memory accesses in LSR
llvm-svn: 328506
2018-03-26 15:32:03 +00:00
Simon Pilgrim 0b73b29388 [X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

This also adds missing vcvttss2si tests 

llvm-svn: 328505
2018-03-26 15:30:47 +00:00
Sanjay Patel 4fd4fd610c [InstCombine] distribute fmul over fadd/fsub
This replaces a large chunk of code that was looking for compound
patterns that include these sub-patterns. Existing tests ensure that
all of the previous examples are still folded as expected.

We still need to loosen the FMF check.

llvm-svn: 328502
2018-03-26 15:03:57 +00:00
Simon Pilgrim 3aa9344605 [X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs
These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.

llvm-svn: 328501
2018-03-26 14:44:24 +00:00
Sanjay Patel 2455fef497 [InstCombine] check uses before creating instructions for fmul distribution
As the tests show, we could create extra instructions without any obvious benefit.

llvm-svn: 328498
2018-03-26 14:25:43 +00:00
Simon Pilgrim 67df1cf597 [X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs
The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps

llvm-svn: 328497
2018-03-26 14:03:40 +00:00
Nicolai Haehnle 4f850eabb6 AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes
Differential revision: https://reviews.llvm.org/D44820

Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
2018-03-26 13:56:53 +00:00
Simon Pilgrim caa203aed5 [X86][Btver2] Double the AGU and schedule pipe resources for YMM
Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model.

llvm-svn: 328491
2018-03-26 13:15:20 +00:00
Krzysztof Parzyszek 0b377e0ae9 [LSR] Allow giving priority to post-incrementing addressing modes
Implement TTI interface for targets to indicate that the LSR should give
priority to post-incrementing addressing modes.

Combination of patches by Sebastian Pop and Brendon Cahoon.

Differential Revision: https://reviews.llvm.org/D44758

llvm-svn: 328490
2018-03-26 13:10:09 +00:00
Max Kazantsev a55749312b [LoopUnroll] Fix dangling pointers in SCEV
Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on
fact that exit count of outer loops cannot rely on exiting blocks of
inner loops, which is true in current implementation of backedge taken count
calculation but is wrong in general. As result, when we only forget the loop that
we have just unrolled, we may still have cached data for its outer loops (in particular,
exit counts) which keeps references on blocks of inner loop that could have been
changed or even deleted.

The attached test demonstrates a situaton when after unrolling of innermost loop
the outermost loop contains a dangling pointer on non-existant block. The problem
shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV
smarter about exit count calculation. I am not sure if the bug exists without this patch,
it appears that now it is accidentally correct just because in practice exact backedge
taken count for outer loops with complex control flow inside is never calculated.
But when SCEV learns to do so, this problem shows up.

This patch replaces existing logic of SCEV loop invalidation with a correct one, which
happens to be invalidation of outermost loop (which also leads to invalidation of all
loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers
on removed blocks, or just outdated information that has changed after unrolling.

Differential Revision: https://reviews.llvm.org/D44818
Reviewed By: samparker

llvm-svn: 328483
2018-03-26 11:31:46 +00:00
Hans Wennborg 311b63f13b Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32"
This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up
patch at D44876 fixes this, but let's revert back to green for now until that's
ready to land.

(Also reverts r328443.)

> Both GCC and MSVC only look at the low byte of a boolean when it is
> passed.

llvm-svn: 328482
2018-03-26 10:07:51 +00:00
Benjamin Kramer 8840f644b4 [DeadArgElim] Strip allocsize attributes when deleting an argument.
Since allocsize refers to the argument number it gets invalidated when
an argument is removed and the numbers shift.

llvm-svn: 328481
2018-03-26 09:44:24 +00:00
Sam Parker 53a423a417 [IRCE] Enable increasing loops of variable bounds
CanBeMin is currently used which will report true for any unknown
values, but often a check is performed outside the loop which covers
this situation:
    
for (int i = 0; i < N; ++i)
  ...
    
if (N > 0)
  for (int i = 0; i < N; ++i)
    ...
    
So I've add 'LoopGuardedAgainstMin' which reports whether N is
greater than the minimum value which then allows loop with a variable
loop count to be optimised. I've also moved the increasing bound
checking into its own function and replaced SumCanReachMax is another
isLoopEntryGuardedByCond function.

llvm-svn: 328480
2018-03-26 09:29:42 +00:00
Martin Storsjo 439824622a [ARM] Simplify constructing the ARMArchFeature string. NFC.
Differential Revision: https://reviews.llvm.org/D44819

llvm-svn: 328478
2018-03-26 08:41:10 +00:00
Craig Topper 6f28d3c954 [X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT.
llvm-svn: 328474
2018-03-26 05:05:12 +00:00
Craig Topper cdfcf8ecda [X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models.
I've used Agner's data as best I could to get the values to converge on.

llvm-svn: 328473
2018-03-26 05:05:10 +00:00
Craig Topper fbf2d850e3 [X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions.
llvm-svn: 328472
2018-03-26 04:20:36 +00:00
Craig Topper c049cb7823 [X86] Correct the itineraries for the dot production instructions.
llvm-svn: 328471
2018-03-26 02:17:15 +00:00
Craig Topper 4367874bc5 [X86] Use the same itinerary for VCVTDQ2PD as the SSE version so that the generated scheduler classes will merge.
llvm-svn: 328470
2018-03-26 02:17:14 +00:00
Craig Topper 659f85af14 [X86] Swap the itineraries on the memory and register forms of CVTDQ2PD.
They were backwards.

llvm-svn: 328469
2018-03-26 02:17:13 +00:00
Craig Topper 4bf23eddaf [X86] Give VMOVSX/ZX the same itinerary as the SSE version so they'll reuse the same generated scheduler class.
llvm-svn: 328468
2018-03-26 02:17:12 +00:00
Craig Topper 6e8d99bbea [X86] Give vpmsadbw the same itinerary as the SSE version so they'll be able to share the same generated scheduler class.
llvm-svn: 328466
2018-03-25 23:52:06 +00:00
Craig Topper 15fef89ad9 [X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 for Skylake.
This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX.

llvm-svn: 328465
2018-03-25 23:40:56 +00:00
Erik Pilkington 8a1cb33ba5 [demangler] Use a back-patching scheme to resolve forward references.
Strictly in a conversion operator's type, a <template-param> refers to a
<template-arg> that is further ahead in the mangled name. Instead of
doing a second parse to resolve these, introduce a
ForwardTemplateReference Node and back-patch the referenced
<template-arg> when we're in the right context.

This is also a correctness fix, previously we would only do a second
parse if the <template-param> was out of bounds in the current set of
<template-args>. This lead to misdemangles (gasp!) when the conversion
operator was a member of a templated struct, for instance.

llvm-svn: 328464
2018-03-25 22:50:33 +00:00
Erik Pilkington 8c7013d4ca [demangler] Tweak how parameter pack sizes are determined.
Rather than eagerly propagating up parameter pack sizes in Node ctors,
find the parameter pack size during printing. This is being done to
support back-patching forward referencing <template-param>s.

llvm-svn: 328463
2018-03-25 22:49:57 +00:00
Erik Pilkington c728786b1d [demangler] Support for clang's enable_if attribute.
Fixes PR33569.

llvm-svn: 328462
2018-03-25 22:49:16 +00:00
Sanjay Patel 93e64dd9a1 [PatternMatch] allow undef elements when matching vector FP +0.0
This continues the FP constant pattern matching improvements from:
https://reviews.llvm.org/rL327627
https://reviews.llvm.org/rL327339
https://reviews.llvm.org/rL327307

Several integer constant matchers also have this ability. I'm
separating matching of integer/pointer null from FP positive zero
and renaming/commenting to make the functionality clearer.

llvm-svn: 328461
2018-03-25 21:16:33 +00:00
Simon Pilgrim 68a8fbc102 [X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI.
llvm-svn: 328460
2018-03-25 20:16:53 +00:00
Simon Pilgrim fecb0b7874 [X86][SkylakeClient] Fix missing comma
llvm-svn: 328458
2018-03-25 19:17:17 +00:00
Simon Pilgrim 351e4fa0e2 [ARM] Remove sched model instregex entries that don't match any instructions (D44687)
Reviewed by @javed.absar

llvm-svn: 328457
2018-03-25 19:07:17 +00:00
Simon Pilgrim 854ac7490d [X86] Add missing full stop to comment. NFCI.
llvm-svn: 328456
2018-03-25 18:49:48 +00:00
Craig Topper 972bdbd415 [X86][SkylakeClient] Fix a set of regular expressions that were checking for optionally starting with 'Y' instead of 'V'
These bad regexs were introduced by r328435

llvm-svn: 328454
2018-03-25 17:33:14 +00:00
Simon Pilgrim 562e8b4eae [X86][MMX] MOVQ2DQ/MOVDQ2Q are better described as WriteVecMove than WriteMove
Not that it makes a difference to current cost values, but will when we try to better model GPR-SIMD transfer costs

llvm-svn: 328453
2018-03-25 17:28:06 +00:00
Simon Pilgrim 25acc0a79b [X86][SkylakeServer] Merge multiple instregex. NFCI
llvm-svn: 328452
2018-03-25 17:25:37 +00:00
Craig Topper a985919d3e [X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont
Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44644

llvm-svn: 328451
2018-03-25 15:58:12 +00:00
Sanjay Patel 841aac04d4 [InstCombine] peek through more icmp of FP cast + bitcast
This is an extension of rL328426 as noted in D44367. 

llvm-svn: 328448
2018-03-25 14:01:42 +00:00
Simon Pilgrim e3547af7be [X86] Add the ability to override memory folding latency to schedules and add 1uop for memory folds for Intel models
The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up.

Differential Revision: https://reviews.llvm.org/D44840

llvm-svn: 328446
2018-03-25 10:21:19 +00:00
Craig Topper e8f4e747bf [X86] Consistently prefix all defs in X86ScheduleSLM.td with 'SLM'.
llvm-svn: 328444
2018-03-25 01:28:43 +00:00
Martin Storsjo 98720156b9 [X86] Update a partially stale comment, since SVN r328386. NFC.
llvm-svn: 328443
2018-03-24 23:00:00 +00:00
Simon Pilgrim 31a9633724 [X86][SkylakeClient] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328435
2018-03-24 20:40:14 +00:00
Simon Pilgrim c21deec37b [X86][Broadwell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328434
2018-03-24 19:37:28 +00:00
Mandeep Singh Grang 98bc25a0f2 [RISCV] Use init_array instead of ctors for RISCV target, by default
Summary:
LLVM defaults to the newer .init_array/.fini_array scheme for static
constructors rather than the less desirable .ctors/.dtors (the UseCtors
flag defaults to false). This wasn't being respected in the RISC-V
backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the the appropriate
flag for UseInitArray.
This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call
InitializeELF(TM.Options.UseInitArray).

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits

Differential Revision: https://reviews.llvm.org/D44750

llvm-svn: 328433
2018-03-24 18:37:19 +00:00
Simon Pilgrim 2b5967f510 [X86][Haswell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328432
2018-03-24 18:36:01 +00:00
Simon Pilgrim efcf1d85b3 [X86][SandyBridge] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328431
2018-03-24 18:12:59 +00:00
Mandeep Singh Grang db00e2e20f [Hexagon] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44857

llvm-svn: 328430
2018-03-24 17:34:37 +00:00
Mandeep Singh Grang 860adef9e6 [AMDGPU] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Reviewers: tstellar, RKSimon, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44856

llvm-svn: 328429
2018-03-24 17:15:04 +00:00
Sanjay Patel 745a9c62c2 [InstCombine] peek through FP casts for sign-bit compares (PR36682)
This pattern came up in PR36682:
https://bugs.llvm.org/show_bug.cgi?id=36682
https://godbolt.org/g/LhuD9A

Equality checks are planned as a follow-up enhancement.

Differential Revision: https://reviews.llvm.org/D44367

llvm-svn: 328426
2018-03-24 15:45:02 +00:00
Sanjay Patel 286074e8a1 [InstCombine] fix formatting; NFC
llvm-svn: 328425
2018-03-24 15:41:59 +00:00
Craig Topper 097b47a0fc [X86] Add a new disassembler opcode map for 3DNow. Stop treating 3DNow as an attribute.
This reduces the size of llvm-mc by at least 150k since we no longer have to multiply the attribute across 7 tables.

llvm-svn: 328416
2018-03-24 07:48:54 +00:00
Craig Topper e865641aea [X86] Merge the Has3DNow0F0FOpcode TSFlag into the OpMap encoding. NFC
The 3DNow instructions are encoded a little weird, but we can still represent it as an opcode map.

llvm-svn: 328410
2018-03-24 06:04:12 +00:00
Craig Topper 2c0a62ab9a [X86] Add a DAG combine to simplify PMULDQ/PMULUDQ nodes
These nodes only use the lower 32 bits of their inputs so we can use SimplifyDemandedBits to simplify them.

Differential Revision: https://reviews.llvm.org/D44375

llvm-svn: 328405
2018-03-24 01:52:01 +00:00
Eric Christopher fe6e6d93d9 Allow FDE references outside the +/-2GB range supported by PC relative
offsets for code models other than small/medium. For JIT application,
memory layout is less controlled and can result in truncations
otherwise.

Patch based on one by Olexa Bilaniuk!

llvm-svn: 328400
2018-03-24 00:07:38 +00:00
David Blaikie 53f51c1df8 Remove unused header from EntryExitInstrumenter
Fixes layering, since Transforms/Utils doesn't depend on CodeGen, so
shouldn't include headers from it.

llvm-svn: 328399
2018-03-24 00:06:14 +00:00
Craig Topper bc6d2ec8ce [X86] Correct the value AdSizeX in X86II enum. NFC
Should be NFC since nothing used the enum value. The instruction descriptions are generated from tablegen which had the correct value.

llvm-svn: 328398
2018-03-24 00:02:46 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
David Blaikie bf121cf44a Fix layering by moving Support/CodeGenCWrappers.h to Target
This includes llvm-c/TargetMachine.h which is logically part of
libTarget (since libTarget implements llvm-c/TargetMachine.h's
functions).

llvm-svn: 328394
2018-03-23 23:58:21 +00:00
David Blaikie ab7f17f4ec Fix layering by moving X86DisassemblerDecoderCommon to Support
This is used from llvm tblgen and the X86Disassembler - the only common
library (apart from TableGen, which probably doesn't make sense to have
as a dependency from a release tool (rather than a use-while-building-llvm
tool) of LLVM)

llvm-svn: 328393
2018-03-23 23:58:20 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Philip Reames 6a1f3446b5 [GuardWidening] Group code by class [NFC]
llvm-svn: 328387
2018-03-23 23:41:47 +00:00
Reid Kleckner e27b410661 [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Both GCC and MSVC only look at the low byte of a boolean when it is
passed.

llvm-svn: 328386
2018-03-23 23:38:53 +00:00
David Blaikie 4fe1fe1418 Fix Layering, move instrumentation transform headers into Instrumentation subdirectory
llvm-svn: 328379
2018-03-23 22:11:06 +00:00
Fedor Sergeev 6660fd0f95 [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass
Summary:
This was motivated by absence of PrunEH functionality in new PM.
It was decided that a proper way to do PruneEH is to add NoUnwind inference
into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.

This change generalizes attribute handling implemented for (a removal of)
Convergent attribute, by introducing a generic builder-like class
   AttributeInferer

It registers all the attribute inference requests, storing per-attribute
predicates into a vector, and then goes through an SCC Node, scanning all
the instructions for not breaking attribute assumptions.

The main idea is that as soon all the instructions from all the functions
of SCC Node conform to attribute assumptions then we are free to infer
the attribute as set for all the functions of SCC Node.

It handles two distinct cases of attributes:
   - those that might break due to derefinement of the function code

     for these attributes we are allowed to apply inference only if all the
     functions are "exact definitions". Example - NoUnwind.

   - those that do not care about derefinement

     for these attributes we are allowed to apply inference as soon as we see
     any function definition. Example - removal of Convergent attribute.

Also in this commit:
* Converted all the FunctionAttrs tests to use FileCheck and added new-PM
  invocations to them

* FunctionAttrs/convergent.ll test demonstrates a difference in behavior between
   new and old PM implementations. Marked with FIXME.

* PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg
  combo as intended

* some of "other" tests were updated since function-attrs now infers 'nounwind'
  even for old PM pipeline

* -disable-nounwind-inference hidden option added as a possible workaround for a supposedly
  rare case when nounwind being inferred by default presents a problem

Reviewers: chandlerc, jlebar

Reviewed By: jlebar

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D44415

llvm-svn: 328377
2018-03-23 21:46:16 +00:00
Sanjay Patel 32381d7c7e [InstCombine] simplify code for FP intrinsic shrinking; NFCI
llvm-svn: 328372
2018-03-23 21:18:12 +00:00
Krzysztof Parzyszek 998df2ca4f [Hexagon] Make findLoopInstr member of HexagonInstrInfo
llvm-svn: 328367
2018-03-23 20:43:02 +00:00
Krzysztof Parzyszek 8038dad7db [Hexagon] Correct update of instruction offet in HW loop fixup
llvm-svn: 328366
2018-03-23 20:41:44 +00:00
Krzysztof Parzyszek bcf0a96f9e [Hexagon] Boost profit for word-mask immediates, reduce for others
This avoids unnecessary splitting due to uninteresting immediates.

llvm-svn: 328364
2018-03-23 20:11:00 +00:00
Zachary Turner f228276262 [PDB] Resubmit "Support embedding natvis files in PDBs."
This was reverted several times due to what ultimately turned out
to be incompatibilities in our serialized hash table format.

Several changes went in prior to this to fix those issues since
they were more fundamental and independent of supporting injected
sources, so now that those are fixed this change should hopefully
pass.

llvm-svn: 328363
2018-03-23 19:57:25 +00:00
Krzysztof Parzyszek ca93f5e605 [Hexagon] Assume all extendable branches to be of size 8 in relaxation
The branch relaxation pass collects sizes of all instructions at the
beginning, before any changes have been made. It then performs one pass
over all branches to see which ones need to be extended. It does not
account for the case when a previously valid branch becomes out-of-range
due to relaxing other branches.
This approach fixes this problem by assuming from the beginning that
all extendable branches have been extended. This may cause unneeded
relaxation in some cases, but avoids iteration and recomputing instruction
sizes.

llvm-svn: 328360
2018-03-23 19:47:13 +00:00
Krzysztof Parzyszek 6f503b96fb [Hexagon] Incorrectly removing dead flag and adding kill flag
The HexagonExpandCondsets pass is incorrectly removing the dead
flag on a definition that is really dead, and adding a kill flag
to a use that is tied to a definition. This causes an assert later
during the machine scheduler when querying the live interval
information.

Patch by Brendon Cahoon.

llvm-svn: 328357
2018-03-23 19:39:37 +00:00
Benjamin Kramer faa9b438ce [Hexagon] Silence unused variable warning in Release builds
llvm-svn: 328356
2018-03-23 19:39:16 +00:00
Krzysztof Parzyszek e247526cc9 [Hexagon] Fold offset in base+immediate loads/stores
Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz  =>  memw(Rx,#n) = Rz.

Patch by Jyotsna Verma.

llvm-svn: 328355
2018-03-23 19:30:34 +00:00
Craig Topper 4529d3abcb [X86] Add itinerary to RCPSS*_Int and similar instructions.
llvm-svn: 328353
2018-03-23 19:15:05 +00:00
Craig Topper 02fb3907f1 [X86] Add itineraries to ADD.*_DB instructions to match their normal counterparts.
llvm-svn: 328352
2018-03-23 19:15:03 +00:00
Tony Tye 7a893d4e34 [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.

Differential Revision: https://reviews.llvm.org/D43736

llvm-svn: 328349
2018-03-23 18:45:18 +00:00
Zachary Turner a6fb536e5b [PDB] Make our PDBs look more like MS PDBs.
When investigating bugs in PDB generation, the first step is
often to do the same link with link.exe and then compare PDBs.

But comparing PDBs is hard because two completely different byte
sequences can both be correct, so it hampers the investigation when
you also have to spend time figuring out not just which bytes are
different, but also if the difference is meaningful.

This patch fixes a couple of cases related to string table emission,
hash table emission, and the order in which we emit strings that
makes more of our bytes the same as the bytes generated by MS PDBs.

Differential Revision: https://reviews.llvm.org/D44810

llvm-svn: 328348
2018-03-23 18:43:39 +00:00
Krzysztof Parzyszek 5f7ba9a74c [Hexagon] Always generate mux out of predicated transfers if possible
HexagonGenMux would collapse pairs of predicated transfers if it assumed
that the predicated .new forms cannot be created. Turns out that generating
mux is preferable in almost all cases.
Introduce an option -hexagon-gen-mux-threshold that controls the minimum
distance between the instruction defining the predicate and the later of
the two transfers. If the distance is closer than the threshold, mux will
not be generated. Set the threshold to 0 by default.

llvm-svn: 328346
2018-03-23 18:43:09 +00:00
Krzysztof Parzyszek 80f10e4fe5 [Hexagon] Avoid early if-conversion for one sided branches
Patch by Anand Kodnani.

llvm-svn: 328344
2018-03-23 18:00:18 +00:00
Simon Pilgrim 6c63e6c222 [X86][Btver2] Cleanup TEST instructions to use JFPA (+JFPX on ymms) function unit
llvm-svn: 328343
2018-03-23 17:59:22 +00:00
Alex Shlyapnikov 83e7841419 [HWASan] Port HWASan to Linux x86-64 (LLVM)
Summary:
Porting HWASan to Linux x86-64, first of the three patches, LLVM part.

The approach is similar to ARM case, trap signal is used to communicate
memory tag check failure. int3 instruction is used to generate a signal,
access parameters are stored in nop [eax + offset] instruction immediately
following the int3 one.

One notable difference is that x86-64 has to untag the pointer before use
due to the lack of feature comparable to ARM's TBI (Top Byte Ignore).

Reviewers: eugenis

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D44699

llvm-svn: 328342
2018-03-23 17:57:54 +00:00
Ana Pazos 41573804f2 [ARM] Fix "Constant pool entry out of range!" in Thumb1 mode
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.

In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() is not calculating postOffset correctly by
properly accounting for the padding that is required for the constant pool
that immediately follows the jump table branch  instruction.

Reviewers: t.p.northover, eli.friedman

Reviewed By: t.p.northover

Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D44709

llvm-svn: 328341
2018-03-23 17:53:27 +00:00
Krzysztof Parzyszek 570c6440cd [Hexagon] Two fixes in early if-conversion
- Fix checking for vector predicate registers.
- Avoid speculating llvm.lifetime.end intrinsic.

Patch by Harsha Jagasia and Brendon Cahoon.

llvm-svn: 328339
2018-03-23 17:46:09 +00:00
Simon Pilgrim e5c0a041ff [X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit
Add missing non-VEX and (V)PMOVMSKB instructions to the pattern

llvm-svn: 328338
2018-03-23 17:38:59 +00:00
Andrew Kaylor a237866faf Fix a block copying problem in LICM
Differential Revision: https://reviews.llvm.org/D44817

llvm-svn: 328336
2018-03-23 17:36:18 +00:00
Fangrui Song c244a15801 [ADT] Simplify getMemory. NFC
llvm-svn: 328334
2018-03-23 17:26:12 +00:00
Krzysztof Parzyszek c98802de09 [Hexagon] Copy subregisters in HexagonStoreWiden
When converting an instruction to the wider version, copy any
subregisters if the original operand has a subregister.

Patch by Brendon Cahoon.

llvm-svn: 328333
2018-03-23 17:22:55 +00:00
Simon Pilgrim 256f149bf0 [X86][Btver2] Vector permutes use a JFPU01 scheduler pipe and JFPX/JVALU function unit
llvm-svn: 328331
2018-03-23 16:17:56 +00:00
Simon Pilgrim ee282b3160 [X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units
llvm-svn: 328328
2018-03-23 15:35:13 +00:00
Zaara Syeda 6535993625 Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant stores
This patch adds functions to allow MachineLICM to hoist invariant stores.
Currently, MachineLICM does not hoist any store instructions, however
when storing the same value to a constant spot on the stack, the store
instruction should be considered invariant and be hoisted. The function
isInvariantStore iterates each operand of the store instruction and checks
that each register operand satisfies isCallerPreservedPhysReg. The store
may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore.
This patch also adds the PowerPC changes needed to consider the stack
register as caller preserved.

Differential Revision: https://reviews.llvm.org/D40196

llvm-svn: 328326
2018-03-23 15:28:15 +00:00
Simon Pilgrim 1335b9c0ca [X86][Btver2] Cleanup DPPS/DPPD instructions to use JFPA/JFPM function units
llvm-svn: 328324
2018-03-23 15:17:50 +00:00
Sanjay Patel 713ca3d36a [InstCombine] reduce code duplication; NFC
llvm-svn: 328323
2018-03-23 15:07:35 +00:00
Sanjay Patel 6de89ce3f7 [InstCombine] improve variable name; NFC
llvm-svn: 328322
2018-03-23 14:48:31 +00:00
John Brawn e3b44f9de6 [AArch64] Don't reduce the width of loads if it prevents combining a shift
Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.

Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and
make it prevent the width reduction if this is what would happen, though do
allow it if reducing the load width will let us eliminate a later sign or zero
extend.

Differential Revision: https://reviews.llvm.org/D44794

llvm-svn: 328321
2018-03-23 14:47:07 +00:00
Simon Pilgrim 5792e10ffb [X86][Btver2] Fix MicroOps counts for DPPS/YMM memory folded instructions
This was due to a misunderstanding over what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops.

llvm-svn: 328320
2018-03-23 14:45:03 +00:00
Simon Pilgrim 8619962c73 [X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units
Fixes throughput to match Agner/Fam16h-SoG as well.

llvm-svn: 328318
2018-03-23 14:27:26 +00:00
Matthew Simpson 6c289a1c74 [SLP] Stop counting cost of gather sequences with multiple uses
When building the SLP tree, we look for reuse among the vectorized tree
entries. However, each gather sequence is represented by a unique tree entry,
even though the sequence may be identical to another one. This means, for
example, that a gather sequence with two uses will be counted twice when
computing the cost of the tree. We should only count the cost of the definition
of a gather sequence rather than its uses. During code generation, the
redundant gather sequences are emitted, but we optimize them away with CSE. So
it looks like this problem just affects the cost model.

Differential Revision: https://reviews.llvm.org/D44742

llvm-svn: 328316
2018-03-23 14:18:27 +00:00
Alexey Bataev bff360865b [DEBUGINFO] Add flag for DWARF2 to use sections as references.
Summary:
Some targets does not support labels inside debug sections, but support
references in form `section+offset`. Patch adds initial support
for this.

Reviewers: echristo, probinson, jlebar

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D43943

llvm-svn: 328314
2018-03-23 13:35:54 +00:00
Christof Douma 4a025cc79d [ARM] Support float literals under XO
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.

This patch also restores using fpcmp with immediate 0 under fp-armv8.

Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
2018-03-23 13:02:03 +00:00
Florian Hahn f73c3ece7f Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters.
Reverted for now, due to it causing verifier failures.

llvm-svn: 328312
2018-03-23 12:49:39 +00:00
Simon Pilgrim 9ea14bbbb0 [X86][Znver1] Fix instregex entries that don't match any instructions (D44687)
Reviewed by @GGanesh and @craig.topper

llvm-svn: 328309
2018-03-23 12:08:23 +00:00
Simon Pilgrim 2755893834 [X86][SandyBridge] Fix missing comma that was causing string concatenation of 2 instregex entries
Found while updating D44687

llvm-svn: 328308
2018-03-23 11:56:38 +00:00
Florian Hahn b1feec087e [IPSCCP] Use constant range information for comparisons of parameters.
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762

llvm-svn: 328307
2018-03-23 11:56:00 +00:00
Simon Pilgrim a1e3ea01ef [X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs
llvm-svn: 328304
2018-03-23 11:27:31 +00:00
Florian Hahn 588e640ea1 [AArch64] Clean-up a few over-eager regexps in models.
Patch by Simon Pilgrim <llvm-dev@redking.me.uk>

That is a slightly modified version of the AArch64 changes from
Simon's D44687 .

llvm-svn: 328303
2018-03-23 11:00:42 +00:00
Florian Hahn 52436a587e [LoopUnroll] Simplify induction variables after peeling too.
Loop peeling also has an impact on the induction variables, so we should
benefit from induction variable simplification after peeling too.

Reviewers: sanjoy, bogner, mzolotukhin, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43878

llvm-svn: 328301
2018-03-23 10:38:12 +00:00
Martin Storsjo e1a64fe95c [ARM] Error out on .arm assembler directives on windows
Windows on arm is thumb only.

Differential Revision: https://reviews.llvm.org/D43005

llvm-svn: 328298
2018-03-23 09:10:03 +00:00
Martin Storsjo db75aa96d3 Revert "[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))"
This reverts commit r328252. This change broke building a number
of projects when targeting ARM and AArch64, see PR36873.

llvm-svn: 328297
2018-03-23 08:36:47 +00:00
Craig Topper dfeea84d63 [X86] Give VPCMPEQQ the same itinerary as its SSE counterpart.
llvm-svn: 328296
2018-03-23 06:58:55 +00:00
Craig Topper 4787b7f434 [X86] Correct the latencies of SNB integer vector multiplies based on Agner's data. Add missing MMX multiplies.
llvm-svn: 328295
2018-03-23 06:41:43 +00:00
Craig Topper 659c66dfc1 [X86] Match vpblendvb/vblendvps/vblendvpd itineraries to the SSE equivalent. Change pblendvb/blendvps/blendvpd to use WriteFVarBlend
llvm-svn: 328294
2018-03-23 06:41:41 +00:00
Craig Topper 7580a7997d [X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version.
llvm-svn: 328293
2018-03-23 06:41:40 +00:00
Craig Topper d5ac3ae8d3 [X86] Give VLDDQUrm and LDDQUrm the same itinerary.
llvm-svn: 328292
2018-03-23 06:41:39 +00:00
Craig Topper 7f142b8bf1 [X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB sheduler model.
The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agners table so I'm going with that.

llvm-svn: 328291
2018-03-23 06:41:38 +00:00
Craig Topper fae4173b47 [X86] Add VEXTRB/W/D/Q to Zen scheduler model.
The SSE versions were present, but not the VEX version.

llvm-svn: 328290
2018-03-23 06:41:36 +00:00
Craig Topper 6ef55d1887 [X86] Fix the itinerary for vextractps to match extractps.
llvm-svn: 328289
2018-03-23 06:41:35 +00:00
Nirav Dave 5b3e8791b4 [DAG] Fix node id invalidation in Instruction Selection.
Invalidation should be bit negation. Add missing negation.

llvm-svn: 328287
2018-03-23 01:22:39 +00:00
Michael Zolotukhin fab7a676c2 State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'.
That removes some redundant recomputations from the passes pipeline.

llvm-svn: 328272
2018-03-22 23:44:40 +00:00
David Blaikie 301627f875 Move SampleProfile.h into IPO along with the rest of the IPO pass headers
llvm-svn: 328262
2018-03-22 22:42:44 +00:00
Craig Topper adb173314d [X86] Correct the VROUND regular expressions in Znver1 scheduler model to account for r328254
llvm-svn: 328260
2018-03-22 22:17:11 +00:00
David Blaikie 376294c23a Finish moving the IPSCCP pass from Scalar to IPO - moving the registration
llvm-svn: 328259
2018-03-22 22:07:53 +00:00
Evgeny Stupachenko 579507a53a Revert r325687 (workaround for PR36032).
Summary:
Revert r325687 workaround for PR36032 since
 a fix was committed in r326154.

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D44768

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 328257
2018-03-22 22:04:39 +00:00
Craig Topper 40d3b32e12 [X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD
This makes the Y position consistent with other instructions.

This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary.

llvm-svn: 328254
2018-03-22 21:55:20 +00:00
Guozhi Wei 17ff975eb1 [DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))
In our real world application, we found the following optimization is missed in DAGCombiner

(zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst))

If the user of original zext is an add, it may enable further lea optimization on x86.

This patch add a new function CombineZExtLogicopShiftLoad to do this optimization.

Differential Revision: https://reviews.llvm.org/D44402

llvm-svn: 328252
2018-03-22 21:47:25 +00:00
David Blaikie 3bbf5af0ac Fix layering between SCCP and IPO SCCP
Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but
this meant Transforms/Scalar including Transfroms/IPO headers, creating
a circular dependency. (IPO depends on Scalar already) - so move the IPO
SCCP shims out into IPO and the basic library implementation accessible
from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation.

llvm-svn: 328250
2018-03-22 21:41:29 +00:00
Roman Tereshin d96de6f6ae [MIR] Making MIR Printing, opt -dot-cfg, and -debug printing faster
Value::printAsOperand has been scanning the entire module just to
print a single value as an operand, regardless being asked to print a
type or not at all, and regardless really needing to scan the module
to print a type.

It made some of the users of the method exceptionally slow on large
IR-modules (or large MIR-files with large IR-modules embedded).

This patch defers scanning a module looking for struct types, mostly
numbered struct types, as much as possible, speeding up those users
w/o changing any APIs at all.

See speedup examples below:

Release Build:

# 83 seconds -> 5.5 seconds
time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \
  -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \
  sqlite3.O0.ll.regbankselected.mir

# 133 seconds -> 6.2 seconds
time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output

Release + Asserts Build:

# 95 seconds -> 5.5 seconds
time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \
  -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \
  sqlite3.O0.ll.regbankselected.mir

# 146 seconds -> 6.2 seconds
time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output

# 1096 seconds -> 553 seconds
time ./bin/llc -debug-only=isel -fast-isel=false -stop-after=isel \
  sqlite3.O0.ll -o /dev/null 2> err

where sqlite3.O0.ll is non-optimized IR produced from
sqlite-amalgamation (http://sqlite.org/download.html), which is entire
SQLite3 implementation in a single C-file.

Benchmarked on 4-cores / 8 threads PCI-E SSD iMac running macOS

Reviewers: dexonsmith, bkramer, void, chandlerc, aditya_nandakumar, dsanders, qcolombet, 

Reviewed By: bogner

Subscribers: thegameg, llvm-commits

Differential Revision: https://reviews.llvm.org/D44132

llvm-svn: 328246
2018-03-22 21:29:07 +00:00
Mircea Trofin 29a21bab08 Revert "Revert "[InstrProf] Support for external functions in text format.""
Summary:
This reverts commit 364eb09576a7667bc6d3ff80c52a83014ccac976 and separates out
the portion that was fixing binary reader error propagation - turns out, there
are production cases where that causes a regression.

Will re-introduce the error propagation fix separately.

The fix to the text reader error propagation is still "in".

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44807

llvm-svn: 328244
2018-03-22 21:26:52 +00:00
Craig Topper 58afb4ea58 [X86][SkylakeClient] Fix a bunch of instructions that were incorrectly assigned Port015 instead of Port01.
The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient.

llvm-svn: 328241
2018-03-22 21:10:07 +00:00
Jessica Paquette df82274f3c [MachineOutliner][NFC] Refactoring + comments in runOnModule
Split up some of the if/else branches in runOnModule. Elaborate on some
comments. Replace a call to getOrCreateMachineFunction with getMachineFunction.

This makes it clearer what's happening in runOnModule, and ensures that the
outliner doesn't create any MachineFunctions which will never be used by the
outliner (or anything else, really).

llvm-svn: 328240
2018-03-22 21:07:09 +00:00
Jun Bum Lim 2ecb7ba4c6 [CodeGen] Add a new pass for PostRA sink
Summary:
This pass sinks COPY instructions into a successor block, if the COPY is not
used in the current block and the COPY is live-in to a single successor
(i.e., doesn't require the COPY to be duplicated).  This avoids executing the
the copy on paths where their results aren't needed.  This also exposes
additional opportunites for dead copy elimination and shrink wrapping.

These copies were either not handled by or are inserted after the MachineSink
pass. As an example of the former case, the MachineSink pass cannot sink
COPY instructions with allocatable source registers; for AArch64 these type
of copy instructions are frequently used to move function parameters (PhyReg)
into virtual registers in the entry block..

For the machine IR below, this pass will sink %w19 in the entry into its
successor (%bb.1) because %w19 is only live-in in %bb.1.

```
   %bb.0:
      %wzr = SUBSWri %w1, 1
      %w19 = COPY %w0
      Bcc 11, %bb.2
    %bb.1:
      Live Ins: %w19
      BL @fun
      %w0 = ADDWrr %w0, %w19
      RET %w0
    %bb.2:
      %w0 = COPY %wzr
      RET %w0
```
As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be
able to see %bb.0 as a candidate.

With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted  in spec2000/2006/2017 on AArch64.

Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz

Reviewed By: sebpop

Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D41463

llvm-svn: 328237
2018-03-22 20:06:47 +00:00
Paul Robinson 7947468e69 [DWARF] Replace assert with diagnostic. PR36868.
llvm-svn: 328235
2018-03-22 19:37:56 +00:00
David Blaikie 2965a01e98 Move the initialization of the Meta Renamer pass over to IPO along with the rest of it that was moved in r328209
llvm-svn: 328234
2018-03-22 19:36:54 +00:00
Nirav Dave 8c5f47ac40 [DAG, X86] Fix ISel-time node insertion ids
As in SystemZ backend, correctly propagate node ids when inserting new
unselected nodes into the DAG during instruction Seleciton for X86
target.

Fixes PR36865.

Reviewers: jyknight, craig.topper

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D44797

llvm-svn: 328233
2018-03-22 19:32:07 +00:00
Craig Topper 4a3be6e578 [X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to as best as I understand how they are implemented.
llvm-svn: 328231
2018-03-22 19:22:51 +00:00
Daniel Neilson 710d7b9945 [InstCombineCalls] Update deprecated API usage (NFC)
Summary:
Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The
former has been deprecated.

llvm-svn: 328227
2018-03-22 18:36:15 +00:00
Simon Pilgrim bcb86bb927 [X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe
llvm-svn: 328226
2018-03-22 18:29:16 +00:00
Simon Pilgrim 0e031afa95 [X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe
The ymm instructions are double pumped as well.

llvm-svn: 328222
2018-03-22 17:43:12 +00:00
Zachary Turner 71d36ad9f9 [Codeview/PDB] Rename some methods for clarity.
NFC, this just renames some methods to better express what they
do, and also adds a few helper methods to add some symmetry to the
API in a few places (for example there was a getStringFromId but not
a getIdFromString method in the string table).

llvm-svn: 328221
2018-03-22 17:37:28 +00:00
Aditya Nandakumar b3297ef051 [GISel]: Fix incorrect IRTranslation while translating null pointer types
https://reviews.llvm.org/D44762

Currently IRTranslator produces
%vreg17<def>(p0) = G_CONSTANT 0;

instead we should build
%vreg16(s64) = G_CONSTANT 0
%vreg17(p0) = G_INTTOPTR %vreg16

reviewed by @aemerson.

llvm-svn: 328218
2018-03-22 17:31:38 +00:00
Simon Pilgrim e5b51f6786 [X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe
llvm-svn: 328217
2018-03-22 17:25:38 +00:00
Craig Topper 7ccb5ebed8 [ARM] Enable the full InstRW overlap check for ARMScheduleR52.td
This fixes a few issues with the R52 instregexs to enable the full overlap checking

Differential Revision: https://reviews.llvm.org/D44767

llvm-svn: 328216
2018-03-22 17:17:47 +00:00
Matt Morehouse 236cdaf84c [SimplifyCFG] Create attribute for fuzzing-specific optimizations.
Summary:
When building with libFuzzer, converting control flow to selects or
obscuring the original operands of CMPs reduces the effectiveness of
libFuzzer's heuristics.

This patch provides an attribute to disable or modify certain optimizations
for optimal fuzzing signal.

Provides a less aggressive alternative to https://reviews.llvm.org/D44057.

Reviewers: vitalybuka, davide, arsenm, hfinkel

Reviewed By: vitalybuka

Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44232

llvm-svn: 328214
2018-03-22 17:07:51 +00:00
Alexey Bataev 07254641d3 [DWARF] Add EmitDwarfOffset function, NFC.
Added EmitDwarfOffset function after discussion with Eric Christofer.

llvm-svn: 328212
2018-03-22 16:43:21 +00:00
Anna Thomas 9b1176b0ef [LoopPredication] Add profitability check based on BPI
Summary:
LoopPredication is not profitable when the loop is known to always exit
through some block other than the latch block.
A coarse grained latch check can cause loop predication to predicate the
loop, and unconditionally deoptimize.

However, without predicating the loop, the guard may never fail within the
loop during the dynamic execution because the non-latch loop termination
condition exits the loop before the latch condition causes the loop to
exit.
We teach LP about this using BranchProfileInfo pass.

Reviewers: apilipenko, skatkov, mkazantsev, reames

Reviewed by: skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44667

llvm-svn: 328210
2018-03-22 16:03:59 +00:00
David Blaikie 0368417595 Move MetaRenamer from Transforms/UTils to Transforms/IPO since it implements part of IPO.h
llvm-svn: 328209
2018-03-22 15:57:47 +00:00
Paul Robinson 938d9a0778 [DWARF] Fix mixing assembler -g with DWARF .file directives.
We were effectively overriding an explicit '.file' directive with info
for the assembler source.  That shouldn't happen.

Fixes PR36636, really, even for .s files emitted by Clang.

Differential Revision: https://reviews.llvm.org/D44265

llvm-svn: 328208
2018-03-22 15:48:01 +00:00
Benjamin Kramer de18a2e6ff Revert "[InstrProf] Support for external functions in text format."
This reverts commit r328132. Breaks FDO selfhost. I'm seeing
error: /tmp/profraw: Invalid instrumentation profile data (bad magic)

llvm-svn: 328207
2018-03-22 15:29:55 +00:00
Florian Hahn 9bc0bc4b9b [CallSiteSplitting] Preserve DominatorTreeAnalysis.
The dominator tree analysis can be preserved easily.
Some other kinds of analysis can probably be preserved
too.

Reviewers: junbuml, dberlin

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D43173

llvm-svn: 328206
2018-03-22 15:23:33 +00:00
Sanjay Patel 3bf58317f7 [MC] fix documentation comments; NFC
llvm-svn: 328205
2018-03-22 15:23:21 +00:00
Simon Pilgrim 53b2c3329a [X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI.
Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage.

llvm-svn: 328203
2018-03-22 14:56:18 +00:00
Pavel Labath 79cd942c23 DWARFVerifier: verify debug_names abbreviation table
Summary:
This commit adds checks of the abbreviation table in a DWARF v5 Name
Index. The most interesting/useful check is the one which checks that
each index attributes is encoded using the correct form class, but it
also checks for the more obvious errors like unknown
forms/tags/attributes and duplicated attributes.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44736

llvm-svn: 328202
2018-03-22 14:50:44 +00:00
Sanjay Patel 94c91b78e7 [InstCombine] add folds for xor-of-icmp signbit tests (PR36682)
This is a retry of r328119 which was reverted at r328145 because
it could crash by trying to combine icmps with different operand
types. This version has a check for that and additional tests.

Original commit message:

This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=36682

There's also a leftover improvement from the long-ago-closed:
https://bugs.llvm.org/show_bug.cgi?id=5438

https://rise4fun.com/Alive/dC1

llvm-svn: 328197
2018-03-22 14:08:16 +00:00
Simon Pilgrim 3b2ff1faa9 [X86][CLMUL] Use the default CLMUL scheduler classes directly. NFCI.
Models were completely overriding all CLMUL instructions when the WriteCLMUL default classes could be used for exactly the same coverage.

llvm-svn: 328194
2018-03-22 13:37:30 +00:00
Simon Pilgrim 6bdd6b32fd [X86][CLMUL] Fix/add missing itinerary tags to (V)PCLMULQDQ instructions
PCLMULQDQrm was using the rr itinerary.

Difference in itineraries between PCLMULQDQ/VPCLMULQDQ variants was causing an unnecessary duplication of scheduler class entries.

llvm-svn: 328193
2018-03-22 13:36:06 +00:00
Simon Pilgrim 7684e055b3 [X86] Use the default AES scheduler classes directly. NFCI.
Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage.

Removes 6 unnecessary scheduler classes from every model.

Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default.
llvm-svn: 328192
2018-03-22 13:18:08 +00:00
Florian Hahn 3bb822e7d6 [CloneFunction] Preserve DT in DuplicateInstructionsInSplitBetween.
DuplicateInstructionsInSplitBetween can preserve the DT by passing
through DT to SplitEdge.

Reviewers: sanjoy, junbuml, anna, kuhar

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D44629

llvm-svn: 328189
2018-03-22 11:38:53 +00:00
Craig Topper df7855fc8d [X86] Remove unused SchedWriteRes classes. NFC
llvm-svn: 328181
2018-03-22 04:52:08 +00:00
Craig Topper fc179c6dd5 [X86][Skylake] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

This reduces the size of the optimized llc binary on my computer by 400K. Presumably because we went from 5000+ scheduler classes per CPU to ~2000.

llvm-svn: 328179
2018-03-22 04:23:41 +00:00
Aaron Smith 523de05a1f [DIA] Add IPDBSectionContrib interfaces and DIA implementation
To resolve symbol context at a particular address, we need to
determine the compiland for the address. We are able to determine
the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT,
PDBSymbolTypeEnum symbols indirectly through line information. 
However no such information is availabile for PDBSymbolData, 
i.e. variables.

The Section Contribution table from PDBs has information about
each compiland's contribution to sections by address. For example,
a piece of a contribution looks like,

  VA         RelativeVA  Sect No.  Offset    Length    Compiland
  14000087B0 000087B0    0001      000077B0  000000BB  exe_main.obj

So given an address, it's possible to determine its compiland with
this information.

llvm-svn: 328178
2018-03-22 04:08:15 +00:00
Aaron Smith 58a32a478f [PDB] Get more DIA table enumerators
Rename the original function and make it a static template.

llvm-svn: 328177
2018-03-22 03:57:06 +00:00
David Blaikie 2be3922807 Fix a couple of layering violations in Transforms
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.

Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.

Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.

llvm-svn: 328165
2018-03-21 22:34:23 +00:00
Mircea Trofin 556f8f6adc [InstrProf] Encapsulates access to AddrToMD5Map.
Summary:
This fixes a unittest failure introduced by D44717

D44717 introduced lazy sorting of the internal data structures of the
symbol table. The AddrToMD5Map getter was potentially exposing
inconsistent (unsorted) state. We could sort in the accessor, however,
a client may store the pointer and thus bypass the internal state
management of the symbol table. The alternative in this CL blocks
direct access to the state, thus ensuring consistent
externally-observable state.

Reviewers: davidxl, xur, eraman

Reviewed By: xur

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44757

llvm-svn: 328163
2018-03-21 22:27:31 +00:00
Zachary Turner eb62999455 [PDB] Don't ignore bucket 0 when writing the PDB string table.
The hash table is a list of buckets, and the *value* stored in
the bucket cannot be 0 since that is reserved.  However, the code
here was incorrectly skipping over the 0'th bucket entirely.
The 0'th bucket is perfectly fine, just none of these buckets
can contain the value 0.

As a result, whenever there was a string where hash(S) % Size
was equal to 0, we would write the value in the next bucket
instead.  We never caught this in our tests due to *another*
bug, which is that we would iterate the entire list of buckets
looking for the value, only using the hash value as a starting
point.  However, the real algorithm stops when it finds 0 in
a bucket since it takes that to mean "the item is not in the
hash table".

The unit test is updated to carefully construct a set of hash
values that will cause one item to hash to 0 mod bucket count,
and the reader is also updated to return an error indicating that
the item is not found when it encounters a 0 bucket.

llvm-svn: 328162
2018-03-21 22:23:59 +00:00
Artem Belevich 30512869ff [NVPTX] Make tensor shape part of WMMA intrinsic's name.
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.

Differential Revision: https://reviews.llvm.org/D44719

llvm-svn: 328158
2018-03-21 21:55:02 +00:00
Reid Kleckner 8562c1a198 [PDB] Remove unused private variable, re-applying r327900 after relanding more natvis changes[4~
llvm-svn: 328156
2018-03-21 21:47:26 +00:00
Reid Kleckner 440219d53e [WebAssembly] Really disable wasm register name matcher
The "ShouldEmitMatchRegisterName" bit wasn't taking effect because the
WebAssembly target didn't point to the custom WebAssemblyAsmParser
record.

llvm-svn: 328155
2018-03-21 21:46:47 +00:00
Rafael Espindola c51dc906ea Handle abbr_offset with relocations.
This is mostly just plumbing to get a DWARFDataExtractor where we
compute abbr_offset so we can use getRelocatedValue.

This is part of PR36793.

llvm-svn: 328154
2018-03-21 21:31:25 +00:00
Reid Kleckner 762331be07 Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)"
This asserts when compiling safe_numerics_unittest.cpp in Chromium with
MSan.

llvm-svn: 328145
2018-03-21 20:35:36 +00:00
Sanjay Patel e235942a1e [InstSimplify] fp_binop X, NaN --> NaN
We propagate the existing NaN value when possible.

Differential Revision: https://reviews.llvm.org/D44521

llvm-svn: 328140
2018-03-21 19:31:53 +00:00
Craig Topper 2854dc93e1 [X86] Rewrite getOperandBias in X86BaseInfo.h to be a little more structured and update comments to be more clear about what it does. NFC
llvm-svn: 328136
2018-03-21 19:30:28 +00:00
David Blaikie 8820929011 Sink Analysis/ObjectUtil(canBeOmittedFromSymbolTable) into IR so it can be legitimately be used by Object/IRSymtab
llvm-svn: 328135
2018-03-21 19:23:45 +00:00
Mircea Trofin 71349ff07d [InstrProf] Support for external functions in text format.
Summary:
External functions appearing as indirect call targets could not be
found in the SymTab, and the value:counter record was represented,
in the text format, using an empty string for the name. This would
then cause a silent parsing error when reading.

This CL:
- adds explicit support for such functions
- fixes the places where we would not propagate errors when reading
- addresses a performance issue due to eager resorting of the SymTab.

Reviewers: xur, eraman, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44717

llvm-svn: 328132
2018-03-21 19:06:06 +00:00
David Blaikie 99e172532c Reapply Support layering fixes.
Compiler.h is used by Demangle (which Support depends on) - so sink it
into Demangle to avoid a circular dependency

DataTypes.h is used by llvm-c (which Support depends on) - so sink it
into llvm-c.

DataTypes.h could probably be fixed the other way - making llvm-c depend
on Support instead of Support depending on llvm-c - if anyone feels
that's the better option, happy to work with them on that.

I /think/ this'll address the layering issues that previous attempts to
commit this have triggered in the Modules buildbot, but I haven't been
able to reproduce that build so can't say for sure. If anyone's having
trouble with this - it might be worth taking a look to see if there's a
quick fix/something small I missed rather than revert, but no worries.

llvm-svn: 328123
2018-03-21 17:31:49 +00:00
Krzysztof Parzyszek b4bb75d6ad [Hexagon] Generalize DAG mutation for function calls
Add barrier edges to check for any physical register. The previous code
worked for the function return registers: r0/d0, v0/w0.

Patch by Brendon Cahoon.

llvm-svn: 328120
2018-03-21 17:23:32 +00:00
Sanjay Patel 778032f39d [InstCombine] add folds for xor-of-icmp signbit tests (PR36682)
This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=36682

There's also a leftover improvement from the long-ago-closed:
https://bugs.llvm.org/show_bug.cgi?id=5438

https://rise4fun.com/Alive/dC1

llvm-svn: 328119
2018-03-21 17:17:13 +00:00
Nicolai Haehnle 87aec1b194 TableGen: Remove redundant loop in ListInit::resolveReferences
Summary:
Recursive lookups are handled by the Resolver, so the loop was purely
a waste of runtime.

Change-Id: I2bd23a68b478aea0bbac1a86ca7635adffa28688

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D44624

llvm-svn: 328118
2018-03-21 17:13:10 +00:00
Nicolai Haehnle 420e28c78c TableGen: Streamline how defs are instantiated
Summary:
Instantiating def's and defm's needs to perform the following steps:

- for defm's, clone multiclass def prototypes and subsitute template args
- for def's and defm's, add subclass definitions, substituting template
  args
- clone the record based on foreach loops and substitute loop iteration
  variables
- override record variables based on the global 'let' stack
- resolve the record name (this should be simple, but unfortunately it's
  not due to existing .td files relying on rather silly implementation
  details)
- for def(m)s in multiclasses, add the unresolved record as a multiclass
  prototype
- for top-level def(m)s, resolve all internal variable references and add
  them to the record keeper and any active defsets

This change streamlines how we go through these steps, by having both
def's and defm's feed into a single addDef() method that handles foreach,
final resolve, and routing the record to the right place.

This happens to make foreach inside of multiclasses work, as the new
test case demonstrates. Previously, foreach inside multiclasses was not
forbidden by the parser, but it was de facto broken.

Another side effect is that the order of "instantiated from" notes in error
messages is reversed, as the modified test case shows. This is arguably
clearer, since the initial error message ends up pointing directly to
whatever triggered the error, and subsequent notes will point to increasingly
outer layers of multiclasses. This is consistent with how C++ compilers
report nested #includes and nested template instantiations.

Change-Id: Ica146d0db2bc133dd7ed88054371becf24320447

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D44478

llvm-svn: 328117
2018-03-21 17:12:53 +00:00
Krzysztof Parzyszek c715a5d2b8 [Hexagon] Eliminate subregisters from PHI nodes before pipelining
The pipeliner needs to remove instructions from the SlotIndexes
structure when they are deleted. Otherwise, the SlotIndexes map
has stale data, and an assert will occur when adding new
instructions.

This patch also changes the pipeliner to make the back-edge of
a loop carried dependence 1 cycle. The 1 cycle latency is added
to the anti-dependence that represents the back-edge. This
changes eliminates a couple of hacks added to the pipeliner to
handle the latency of the back-edge. It is needed to correctly
pipeline the test case for the sub-register elimination pass.

llvm-svn: 328113
2018-03-21 16:39:11 +00:00
Reid Kleckner 18574836f9 [WebAssembly] Suppress unused function warning for register name matcher
llvm-svn: 328112
2018-03-21 16:20:58 +00:00
Simon Pilgrim ec2f878779 [X86][Haswell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

llvm-svn: 328111
2018-03-21 16:19:03 +00:00
Simon Pilgrim 96b605d34c [X86][SandyBridge] Merge more VEX/non-VEX instregex patterns (NFCI) (PR35955)
llvm-svn: 328110
2018-03-21 16:05:58 +00:00
Alex Bradbury 65d6ea5e68 [RISCV] Codegen support for RV32F floating point comparison operations
This patch also includes extensive tests targeted at select and br+fcmp IR
inputs. A sequence of br+fcmp required support for FPR32 registers to be added
to RISCVInstrInfo::storeRegToStackSlot and
RISCVInstrInfo::loadRegFromStackSlot.

llvm-svn: 328104
2018-03-21 15:11:02 +00:00
Daniel Neilson 6f1eb58e92 [MemCpyOpt] Update to new API for memory intrinsic alignment
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
MemCpyOpt pass to cease using:
1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific
alignments through the new API.
2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new
API that allows setting source and destination alignments independently.

We also add a few tests to fill gaps in the testing of this pass.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 328097
2018-03-21 14:14:55 +00:00
Justin Lebar 038cbc5c13 Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions.
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.

Backed out for causing performance regressions.  Re-landing
because we've determined that these regressions were noise.

Original Differential Revision: https://reviews.llvm.org/D44102

llvm-svn: 328096
2018-03-21 14:08:21 +00:00
Pavel Labath 646ab4113f Fix build broken by r328090
- constexpr is needed for out-of-class definition of the Type static
  member by some compilers
- MSVC is confused by the initialization of the static constexpr char[]
  member when it happens in a template specialization. Explicitly
  specifying the length of the array seems to be enough to help it
  figure things out.

llvm-svn: 328093
2018-03-21 12:18:03 +00:00
Pavel Labath 9025f9559d [dwarf] Unify unknown dwarf enum formatting code
Summary:
We have had at least three pieces of code (in DWARFAbbreviationDeclaration,
DWARFAcceleratorTable and DWARFDie) that have hand-rolled support for
dumping unknown dwarf enum values. While not terrible, they are a bit
distracting and enable small differences to creep in (Unknown_ffff vs.
Unknown_0xffff). I ended up needing to add a fourth place
(DWARFVerifier), so it seems it would be a good time to centralize.

This patch creates an alternative to the XXXString dumping functions in
the BinaryFormat library, which formats an unknown value as
DW_TYPE_unknown_1234, instead of just an empty string. It is based on
the formatv function, as that allows us to avoid materializing the
string for unknown values (and because this way I don't have to invent a
name for the new functions :P).

In this patch I add formatters for dwarf attributes, forms, tags, and
index attributes as these are the ones in use currently, but adding
other enums is straight-forward.

Reviewers: dblaikie, JDevlieghere, aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44570

llvm-svn: 328090
2018-03-21 11:46:37 +00:00
Jonas Devlieghere 2fdda6882f Revert layering changes
This reverts:
  r328072 "Move Compiler.h from Support to Demangler to fix layering."
  r328073 "Fix the actual user of DataTypes.h in llvm-c to avoid the circular dependency"

Failing bots:
  http://green.lab.llvm.org/green/job/clang-stage2-coverage-R/
  http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/

llvm-svn: 328085
2018-03-21 10:35:09 +00:00
Bjorn Pettersson 5c25f88536 [SelectionDAG] Support multiple dangling debug info for one value
Summary:
When building the selection DAG we sometimes need to postpone
the handling of a dbg.value until the value it should refer to
is created. This is done by using the DanglingDebugInfoMap.
In the past this map has been limited to hold one dangling
dbg.value per value. This patch removes that restriction.

Reviewers: aprantl, rnk, probinson, vsk

Reviewed By: aprantl

Subscribers: Ka-Ka, llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D44610

llvm-svn: 328084
2018-03-21 09:44:34 +00:00
Craig Topper 5a69a0011b [X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
llvm-svn: 328076
2018-03-21 06:28:42 +00:00
David Blaikie 6336aedd39 Move Compiler.h from Support to Demangler to fix layering.
Support depends on Demangle (Support/Unix/Signals.inc), so Demangle
including Support/Compiler.h created a circular dependency.

Leave a forwarding shim of Compiler.h because it makes more sense for
users (a deeper fix might involve splitting Support into lower and upper
Support - but that also sounds a bit weird/awkward) than thinking about
the dependency on the Demangler.

llvm-svn: 328072
2018-03-21 04:07:05 +00:00
Craig Topper 137a4dd84d [X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads.
llvm-svn: 328071
2018-03-21 03:41:33 +00:00
Craig Topper d25f1acf67 [X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis.
Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server.

Fixes PR36808.

llvm-svn: 328061
2018-03-20 23:39:48 +00:00
Philip Reames 37a1a29fcb [MustExecute] Shwo the effect of using full loop info variant
Most basic possible test for the logic used by LICM.

Also contains a speculative build fix for compiles which complain about a definition of a stuct K; followed by a declaration as class K;

llvm-svn: 328058
2018-03-20 23:00:54 +00:00
Derek Schuff 73a98f5eea [WebAssembly] Update torture compile test expectations
The tests compile after r328049

llvm-svn: 328057
2018-03-20 23:00:13 +00:00
Philip Reames 23aed5ef6f [MustExecute] Move isGuaranteedToExecute and related rourtines to Analysis
Next step is to actually merge the implementations and get both implementations tested through the new printer.

llvm-svn: 328055
2018-03-20 22:45:23 +00:00
Derek Schuff 39b5367cba [WebAssembly] Strip threadlocal attribute from globals in single thread mode
The default thread model for wasm is single, and in this mode thread-local
global variables can be lowered identically to non-thread-local variables.

Differential Revision: https://reviews.llvm.org/D44703

llvm-svn: 328049
2018-03-20 22:01:32 +00:00
Simon Pilgrim 572bfa562a [X86] Drop unnecessary InstRW overrides for WriteFMA
As noticed on D44687, these already match the WriteFMA def so can be removed.

llvm-svn: 328045
2018-03-20 21:15:23 +00:00
Craig Topper 0f110a88be [ReachingDefAnalysis] Fix what I assume to be a typo ReachingDedDefaultVal->ReachingDefDefaultVal.
Unless Ded has some many I don't know about.

llvm-svn: 328043
2018-03-20 20:53:21 +00:00
Shoaib Meenai 3f689c8632 [ObjCARC] Add funclet token to ARC marker
The inline assembly generated for the ARC autorelease elision marker
must have a funclet token if it's emitted inside a funclet, otherwise
the inline assembly (and all subsequent code in the funclet) will be
marked unreachable by WinEHPrepare.

Note that this only applies for the non-O0 case, since at O0, clang
emits the autorelease elision marker itself rather than deferring to the
backend. The fix for clang is handled in a separate change.

Differential Revision: https://reviews.llvm.org/D44641

llvm-svn: 328042
2018-03-20 20:45:41 +00:00
Martin Storsjo 07589fc496 [X86] Don't use the MSVC stack protector names on mingw
Mingw uses the same stack protector functions as GCC provides
on other platforms as well.

Patch by Valentin Churavy!

Differential Revision: https://reviews.llvm.org/D27296

llvm-svn: 328039
2018-03-20 20:37:51 +00:00
Alexey Bataev 858a7dd6d7 [DEBUGINFO] Add -no-dwarf-debug-ranges option.
Summary:
Added option -no-dwarf-debug-ranges option to disable emission of
.debug_ranges section.

Reviewers: probinson, echristo

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D44384

llvm-svn: 328030
2018-03-20 20:21:38 +00:00
Derek Schuff e4825975d8 [WebAssembly] Added initial AsmParser implementation.
It uses the MC framework and the tablegen matcher to do the
heavy lifting. Can handle both explicit and implicit locals
(-disable-wasm-explicit-locals). Comes with a small regression
test.

This is a first basic implementation that can parse most llvm .s
output and round-trips most instructions succesfully, but in order
to keep the commit small, does not address all issues.

There are a fair number of mismatches between what MC / assembly
matcher think a "CPU" should look like and what WASM provides,
some already have workarounds in this commit (e.g. the way it
deals with register operands) and some that require further work.
Some of that further work may involve changing what the
Disassembler outputs (and what s2wasm parses), so are probably
best left to followups.

Some known things missing:
- Many directives are ignored and not emitted.
- Vararg calls are parsed but extra args not emitted.
- Loop signatures are likely incorrect.
- $drop= is not emitted.
- Disassembler does not output SIMD types correctly, so assembler
  can't test them.

Patch by Wouter van Oortmerssen

Differential Revision: https://reviews.llvm.org/D44329

llvm-svn: 328028
2018-03-20 20:06:35 +00:00
Evandro Menezes 36afbee1d8 [AArch64] Adjust the cost model for Exynos M3
Fix typo in the number of integer dividers.

llvm-svn: 328027
2018-03-20 20:00:29 +00:00
Krzysztof Parzyszek 65059ee284 [Hexagon] Add heuristic to exclude critical path cost for scheduling
Patch by Brendon Cahoon.

llvm-svn: 328022
2018-03-20 19:26:27 +00:00
Krzysztof Parzyszek 9315c0de9b [Hexagon] Fix fall-through warnings in HexagonMCDuplexInfo.cpp
llvm-svn: 328021
2018-03-20 19:23:18 +00:00
Nirav Dave ce71989188 [MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI.
llvm-svn: 328019
2018-03-20 19:12:41 +00:00
Craig Topper c2dbd677bd [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code
I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything.

If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all.

I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it.

Differential Revision: https://reviews.llvm.org/D44061

llvm-svn: 328017
2018-03-20 18:49:28 +00:00
Krzysztof Parzyszek eb0c510ecd [X86] Add phony registers for high halves of regs with low halves
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.

This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.

Differential Revision: https://reviews.llvm.org/D43353

llvm-svn: 328016
2018-03-20 18:46:55 +00:00
Philip Reames ce998adf0a [MustExecute] Use the annotation style printer
As suggested in the original review (https://reviews.llvm.org/D44524), use an annotation style printer instead.

Note: The switch from -analyze to -disable-output in tests was driven by the fact that seems to be the idiomatic style used in annoation passes.  I tried to keep both working, but the old style pass API for printers really doesn't make this easy.  It invokes (runOnFunction, print(Module)) repeatedly.  I decided the extra state wasn't worth it given the old pass manager is going away soonish anyway.
llvm-svn: 328015
2018-03-20 18:43:44 +00:00
Zachary Turner fced530650 Revert "Resubmit "Support embedding natvis files in PDBs.""
This is still failing on a different bot this time due to some
issue related to hashing absolute paths.  Reverting until I can
figure it out.

llvm-svn: 328014
2018-03-20 18:37:03 +00:00
Artem Belevich 914d4babec [NVPTX] Make tensor load/store intrinsics overloaded.
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.

Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.

Updated tests to use non-default address spaces.

Differential Revision: https://reviews.llvm.org/D43268

llvm-svn: 328006
2018-03-20 17:18:59 +00:00
Philip Reames 89f2241770 Add an analysis printer for must execute reasoning
Many of our loop passes make use of so called "must execute" or "guaranteed to execute" facts to prove the legality of code motion. The basic notion is that we know (by assumption) an instruction didn't fault at it's original location, so if the location we move it to is strictly post dominated by the original, then we can't have introduced a new fault.

At the moment, the testing for this logic is somewhat adhoc and done mostly through LICM. Since I'm working on that code, I want to improve the testing. This patch is the first step in that direction. It doesn't actually test the variant used by the loop passes - I need to move that to the Analysis library first - but instead exercises an alternate implementation used by SCEV. (I plan on merging both implementations.)

Note: I'll be replacing the printing logic within this with an annotation based version in the near future.  Anna suggested this in review, and it seems like a strictly better format.  

Differential Revision: https://reviews.llvm.org/D44524

llvm-svn: 328004
2018-03-20 17:09:21 +00:00
Zachary Turner 132d7a134f Resubmit "Support embedding natvis files in PDBs."
The issue causing this to fail in certain configurations
should be fixed.

It was due to the fact that DIA apparently expects there to be
a null string at ID 1 in the string table.  I'm not sure why this
is important but it seems to make a difference, so set it.

llvm-svn: 328002
2018-03-20 17:06:39 +00:00
Krzysztof Parzyszek 4c6b65f685 [Hexagon] Correct the computation of TopReadyCycle and BotReadyCycle of SU
TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either
the first instruction or the last instruction in a packet.

Patch by Ikhlas Ajbar.

llvm-svn: 328000
2018-03-20 17:03:27 +00:00
Michael Zolotukhin fb3f509e01 [XRay] Lazily compute MachineLoopInfo instead of requiring it.
Summary:
Currently X-Ray Instrumentation pass has a dependency on MachineLoopInfo
(and thus on MachineDominatorTree as well) and we have to compute them
even if X-Ray is not used. This patch changes it to a lazy computation
to save compile time by avoiding these redundant computations.

Reviewers: dberris, kubamracek

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D44666

llvm-svn: 327999
2018-03-20 17:02:29 +00:00
Krzysztof Parzyszek 73be83dec5 [Hexagon] Check weak dependences when only 1 instruction is available
Patch by Brendon Cahoon.

llvm-svn: 327997
2018-03-20 16:22:06 +00:00
Alexey Bataev 648ed2dedb [DEBUGINFO] Add flag -no-dwarf-pub-sections to disable pub sections.
Summary:
Added a flag -no-dwarf-pub-sections, which allows to disable
emission of DWARF public sections.

Reviewers: probinson, echristo

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D44385

llvm-svn: 327994
2018-03-20 16:04:40 +00:00
Simon Pilgrim 62690e9d0e [X86][Haswell][Znver1] Fix typo in fldl instregexs
Missing comma was casing 2 instregex entries to be concatenated together by mistake.

Found while investigating PR35548

llvm-svn: 327992
2018-03-20 15:44:47 +00:00
Krzysztof Parzyszek 5ffd808a27 [Hexagon] Improve scheduling heuristic for large basic blocks
This patch changes the isLatencyBound heuristic to look at the
path length based upon the number of packets needed to schedule
a basic block. For small basic blocks, the heuristic uses a small
threshold for isLatencyBound. For large basic blocks, the
heuristic uses a large threshold.

The goal is to increase the priority of an instruction in a small
basic block that has a large height or depth relative to the code
size. For large functions, the height and depth are ignored
because it increases the live range of a register and causes more
spills. That is, for large functions, it is more important to
schedule instructions when available, and attempt to keep the defs
and uses closer together.

Patch by Brendon Cahoon.

llvm-svn: 327987
2018-03-20 14:54:01 +00:00
Geoff Berry 0b64402adb [AArch64][Falkor] Correct load/store increment scheduling details
llvm-svn: 327982
2018-03-20 13:46:35 +00:00
Krzysztof Parzyszek 2c4231d888 [Hexagon] Fix division by zero in machine scheduler
llvm-svn: 327980
2018-03-20 13:28:46 +00:00
Alex Bradbury 80c8eb7696 [RISCV] Add codegen for RV32F floating point load/store
As part of this, add support for load/store from the constant pool. This is
used to materialise f32 constants.

llvm-svn: 327979
2018-03-20 13:26:12 +00:00
Alex Bradbury 76c29ee815 [RISCV] Add codegen for RV32F arithmetic and conversion operations
Currently, only a soft floating point ABI is supported.

llvm-svn: 327976
2018-03-20 12:45:35 +00:00
Krzysztof Parzyszek dca383123f [Hexagon] Improve scheduling based on register pressure
Patch by Brendon Cahoon.

llvm-svn: 327975
2018-03-20 12:28:43 +00:00
Simon Pilgrim 4a83f802cc [X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do.

llvm-svn: 327974
2018-03-20 12:26:55 +00:00
Xin Tong a713ebea24 [MergeICmps] Break eargerly out of loop
llvm-svn: 327972
2018-03-20 12:03:25 +00:00
Xin Tong bdbd97ed9a [MergeICmp] Fix a bug in entry block shuffled to middle of the chain
Summary: Fix a bug in entry block shuffled to middle of the chain.

Reviewers: davide, courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44642

llvm-svn: 327971
2018-03-20 11:57:54 +00:00
Igor Laevsky 3ce2d7f270 [llvm-opt-fuzzer] Add irce to the fuzzing options
llvm-svn: 327969
2018-03-20 11:32:13 +00:00
Bjorn Pettersson bf3213e485 [CGP] Avoid segmentation fault when doing PHI node simplifications
Summary:
Made PHI node simplifiations more robust in several ways:

- Minor refactoring to let the SimplificationTracker own the
sets with new PHI/Select nodes that are introduced. This is
maybe not mapping to the original intention with the
SimplificationTracker, but IMHO it encapsulates the logic behind
those sets a little bit better.

- MatchPhiNode can sometimes populate the Matched set with
several entries, where it maps one PHI node to different candidates
for replacement. The Matched set is changed into a SmallSetVector
to make sure we get a deterministic iteration when doing
the replacements.

- As described above we may get several different replacements
for a single PHI node. The loop in MatchPhiSet that is doing
the replacements could end up calling eraseFromParent several
times for the same PHI node, resulting in segmentation faults.
This problem was supposed to be fixed in rL327250, but due to
the non-determinism(?) it only appeared to be fixed (I still
got crashes sometime when turning on/off -print-after-all etc
to get different iteration order in the DenseSets).
With this patch we follow the deterministic ordering in the
Matched set when replacing the PHI nodes. If we find a new
replacement for an already replaced PHI node we replace the
new replacement by the old replacement instead. This is quite
similar to what happened in the rl327250 patch, but here we
also recursively verify that the old replacement hasn't been
replaced already.

- It was really hard to track down the fault described above
(segementation fault due to doing eraseFromParent multiple
times for the same instruction). The fault was intermittent and
small changes in the code, or simply turning on -print-after-all
etc could make the problem go away. This was basically due to
the iteration over PhiNodesToMatch in MatchPhiSet no being
deterministic. Therefore I've changed the data structure for
the SimplificationTracker::AllPhiNodes into an SmallSetVector.
This gives a deterministic behavior.

Reviewers: skatkov, john.brawn

Reviewed By: skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44571

llvm-svn: 327961
2018-03-20 09:06:37 +00:00
Andrei Elovikov 8b8253fdc7 [LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast.
Summary:
It turned out to be error-prone to expect the callers to handle that - better to
leave the decision to this routine and make the required data to be explicitly
passed to the function.

This handles the case that was missed in the r322473 and fixes the assert
mentioned in PR36524.

Reviewers: dorit, mssimpso, Ayal, dcaballe

Reviewed By: dcaballe

Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits

Differential Revision: https://reviews.llvm.org/D43812

llvm-svn: 327960
2018-03-20 09:04:39 +00:00
Martin Storsjo 802b434156 [X86] Properly implement the calling convention for f80 for mingw/x86_64
In these cases, both parameters and return values are passed
as a pointer to a stack allocation.

MSVC doesn't use the f80 data type at all, while it is used
for long doubles on mingw.

Normally, this part of the calling convention is handled
within clang, but for intrinsics that are lowered to libcalls,
it may need to be handled within llvm as well.

Differential Revision: https://reviews.llvm.org/D44592

llvm-svn: 327957
2018-03-20 06:19:38 +00:00
Lang Hames 2c83285716 [ORC] Don't fully qualify explicit destructor call -- it confuses some compilers.
This should fix the builder failure at
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/19224

llvm-svn: 327955
2018-03-20 05:56:58 +00:00
Craig Topper ad7c685791 [X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version.
Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8

llvm-svn: 327948
2018-03-20 05:00:20 +00:00
Craig Topper 4778fa7e8a [X86] Fix the SchedRW for memory forms of CMP and TEST.
They were incorrectly marked as RMW operations. Some of the CMP instrucions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW.

TEST used the same tablegen class as some of the CMPs.

llvm-svn: 327947
2018-03-20 03:55:17 +00:00
Lang Hames 4cca7d229e [ORC] Rename SymbolSource to MaterializationUnit, and make the materialization
operation all-or-nothing, rather than allowing materialization on a per-symbol
basis.

This addresses a shortcoming of per-symbol materialization: If a
MaterializationUnit (/SymbolSource) wants to materialize more symbols than
requested (which is likely: most materializers will want to materialize whole
modules) then it needs a way to notify the symbol table about the extra symbols
being materialized. This process (checking what has been requested against what
is being provided and notifying the symbol table about the difference) has to
be repeated at every level of the JIT stack. Making materialization
all-or-nothing eliminates this issue, simplifying both materializer
implementations and the symbol table (VSO class) API. The cost is that
per-symbol materialization (e.g. for individual symbols in a module) now
requires multiple MaterializationUnits.

llvm-svn: 327946
2018-03-20 03:49:29 +00:00
Craig Topper 3e9462607e [X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models.
Move it from a load+store group on SNB to a load only group, the same group as CMP.

llvm-svn: 327944
2018-03-20 03:02:03 +00:00
Craig Topper 7c90e29cf8 [X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model.
I assume these match the generic immediate version like they do in the other models.

llvm-svn: 327943
2018-03-20 03:01:59 +00:00
Quentin Colombet 508f68233d [ShrinkWrap] Take into account landing pad
When scanning the function for CSRs uses and defs, also check if
the basic block are landing pads.
Consider that landing pads needs the CSRs to be properly set.
That way we force the prologue/epilogue to always be pushed out
of the problematic "throw" region. The "throw" region is
problematic because the jumps are not properly modeled.

Fixes PR36513

llvm-svn: 327942
2018-03-20 02:44:40 +00:00
Shiva Chen cbd498ac10 [RISCV] Preserve stack space for outgoing arguments when the function contain variable size objects
E.g.

bar (int x)
{
  char p[x];

  push outgoing variables for foo.
  call foo
}

We need to generate stack adjustment instructions for outgoing arguments by
eliminateCallFramePseudoInstr when the function contains variable size
objects to avoid outgoing variables corrupt the variable size object.

Default hasReservedCallFrame will return !hasFP().
We don't want to generate extra sp adjustment instructions when hasFP()
return true, So We override hasReservedCallFrame as !hasVarSizedObjects().

Differential Revision: https://reviews.llvm.org/D43752

llvm-svn: 327938
2018-03-20 01:39:17 +00:00
Craig Topper 2330d6cd55 [X86] Fix the SNB scheduler for BLENDVB.
PBLENDVBrr0 was with the memory version of VBLENDVB and PBLENDVBrm0 was missing.

llvm-svn: 327937
2018-03-20 01:30:21 +00:00
Vitaly Buka 849217abdf Object: Fix handling of @@@ in .symver directive
Summary:
name@@@nodename is going to be replaced with name@@nodename if symbols is
defined in the assembled file, or name@nodename if undefined.
https://sourceware.org/binutils/docs/as/Symver.html

Fixes PR36623

Reviewers: pcc, espindola

Subscribers: mehdi_amini, hiraditya

Differential Revision: https://reviews.llvm.org/D44274

llvm-svn: 327930
2018-03-20 00:45:03 +00:00
Vitaly Buka 0d03881eb5 Object: Move attribute calculation into RecordStreamer. NFC
Summary: Preparation for D44274

Reviewers: pcc, espindola

Subscribers: hiraditya

Differential Revision: https://reviews.llvm.org/D44276

llvm-svn: 327928
2018-03-20 00:38:33 +00:00
Aaron Smith 6738960588 [SelectionDAG] Transfer DbgValues when integer operations are promoted
Summary:
DbgValue nodes were not transferred when integer DAG nodes were promoted. For example, if an i32 add node was promoted to an i64 add node by DAGTypeLegalizer::PromoteIntegerResult(), its DbgValue node was not transferred to the new node. The simple fix is to update SetPromotedInteger() to transfer DbgValues. 

Add AArch64/dbg-value-i8.ll to test this change and fix ARM/debug-info-d16-reg.ll which had the wrong DILocalVariable nodes with arg numbers even though they are not for function parameters.

Patch by Se Jong Oh!

Reviewers: vsk, JDevlieghere, aprantl

Reviewed By: JDevlieghere

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D44546

llvm-svn: 327919
2018-03-19 22:58:50 +00:00
Jessica Paquette 563548d8f3 [MachineOutliner] AArch64: Emit CFI instructions when outlining calls
When outlining calls, the outliner needs to update CFI to ensure that, say,
exception handling works. This commit adds that functionality and adds a test
just for call outlining.

Call outlining stuff in machine-outliner.mir should be moved into
machine-outliner-calls.mir in a later commit.

llvm-svn: 327917
2018-03-19 22:48:40 +00:00
Craig Topper 956fec2a4a [DAGCombiner] Fix type in comment. NFC
llvm-svn: 327916
2018-03-19 22:25:26 +00:00
Craig Topper ab6076514d [X86] Simplify the AVX512 code in LowerTruncate a little.
We don't need to create an ISD::TRUNCATE node to return, we started with one and can return it. Also remove the call to getExtendInVec, the result is just going to be a getNode of that value passed in.

llvm-svn: 327914
2018-03-19 21:58:02 +00:00
Aaron Smith da61120749 [PDB] Add a method to get the full path of the source file for PDBSymbolCompiland
Summary:
Redefine PDBSymbolCompiland::getSourceFileName() to return the filename (w/o directory) of the source file that is used to compile the compiland. This is because the result returned previously is ambiguous. It could be the filename, relative path or full path of the source file. 

Move the implementation of SymbolFilePDB::GetSourceFileNameForPDBCompiland() into a new method PDBSymbolCompiland::getSourceFileFullPath(). 

Reviewers: zturner, rnk, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D44458

llvm-svn: 327910
2018-03-19 21:20:04 +00:00
Aaron Smith 06173e8b46 [PDB] Add exclusive methods to derived symbol class
Summary: This commit adds two methods to the PDBSymboFunc class used in parsing symbols. getLineNumbers() is used to determine a Function symbol's declaration and getCompilandId() is used to initialize the SymbolContext field sc.comp_unit.

Reviewers: zturner, rnk, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D44457

llvm-svn: 327909
2018-03-19 21:18:39 +00:00
Zachary Turner a21558897b Revert "Support embedding natvis files in PDBs."
This is causing a test failure on a certain bot, so I'm removing
this temporarily until we can figure out the source of the error.

llvm-svn: 327903
2018-03-19 20:41:59 +00:00
Zachary Turner 426885b10c Remove an unused private variable.
llvm-svn: 327900
2018-03-19 20:22:48 +00:00
Craig Topper 3b967466d5 [X86] Replace a couple calls to getExtendInVec with getNode and the appropriate target independent EXTEND_VECTOR_INREG opcode.
llvm-svn: 327899
2018-03-19 20:20:22 +00:00