86328 Commits

Author SHA1 Message Date
Amjad Aboud d7cfb48485 Added support for macro emission in dwarf (supporting DWARF version 4).
Differential Revision: http://reviews.llvm.org/D15495

llvm-svn: 257060
2016-01-07 14:28:20 +00:00
James Molloy 9971a6841c [GlobalsAA] Partially back out r248576
See PR25822 for a more full summary, but we were conflating the concepts of "capture" and "escape". We were proving nocapture and using that proof to infer noescape, which is not true. Escaped-ness is a function-local property - as soon as a value is used in a call argument it escapes. Capturedness is a related but distinct property. It implies a *temporally limited* escape. Consider:

  static int a;
  int b;
  int g(int * nocapture arg);
  int f() {
    a = 2;  // Even though a escapes to g, it is not captured so can be treated as non-escaping here.
    g(&a);  // But here it must be treated as escaping.
    g(&b);  // Now that g(&a) has returned we know it was not captured so we can treat it as non-escaping again.
  }

The original commit did not sufficiently understand this nuance and so caused PR25822 and PR26046.

r248576 included both a performance improvement (which has been backed out) and a related conformance fix (which has been kept along with its testcase).

llvm-svn: 257058
2016-01-07 13:33:28 +00:00
Michael Zuckerman a6df006b50 [AVX512] add PSHUFHW and PSHUFLW Intrinsic
Differential Revision: http://reviews.llvm.org/D15925

llvm-svn: 257056
2016-01-07 12:35:43 +00:00
Simon Pilgrim bcc11a059e [X86][AVX] Match broadcast loads through a bitcast
AVX1 v8i32/v4i64 shuffles are bitcast to v8f32/v4f64; this patch peeks through bitcasts to check for a load node to allow broadcasts to occur.

Follow up to D15310

llvm-svn: 257055
2016-01-07 11:34:27 +00:00
Dylan McKay 5c96de3ad7 Added AVRTargetObjectFile class and AVR.h
llvm-svn: 257049
2016-01-07 10:53:15 +00:00
Tamas Berghammer 904d5fe496 Mark arm as the 32bit variant of aarch64 in Triple
Change Triple::get32BitArchVariant to return arm/armeb as the 32bit
variant of aarch64/aarch64_be and do the same change for the opposite
direction in Triple::get64BitArchVariant.

Differential revision: http://reviews.llvm.org/D15529

llvm-svn: 257048
2016-01-07 10:41:12 +00:00
Junmo Park 1238610aa1 Remove extra whitespace. NFC.
llvm-svn: 257047
2016-01-07 10:26:32 +00:00
Simon Pilgrim 83e44c66ae [X86][SSE] Add INSERTPS as a target shuffle
Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS.

llvm-svn: 257046
2016-01-07 10:24:19 +00:00
Michael Zuckerman 4a1566827d [AVX512] add PSHUFD Intrinsic
Differential Revision: http://reviews.llvm.org/D15934

llvm-svn: 257044
2016-01-07 09:24:12 +00:00
Tim Northover bd41cf880c ARM: support TLS accesses on Darwin platforms
Darwin TLS accesses most closely resemble ELF's general-dynamic situation,
since they have to be able to handle all possible situations. The descriptors
and so on are obviously slightly different though.

llvm-svn: 257039
2016-01-07 09:03:03 +00:00
Jonas Paulsson 3939b690f6 [SystemZ] Add hasSideEffects flag on Serialize instruction.
Serialize will perform a hardware serialization operation, and is
acting as a memory barrier. Therefore it must have the hasSideEffects
flag set so it will be treated as a global memory object.

Reviewed by Ulrich Weigand

llvm-svn: 257036
2016-01-07 07:20:55 +00:00
Craig Topper 68cffb17a0 [X86] Remove superfluous mayLoad flag. The pattern already implies it.
llvm-svn: 257035
2016-01-07 06:42:10 +00:00
Craig Topper 79e0ef82e8 [X86] Add hasSideEffects=0 to VBROADCASTI128.
llvm-svn: 257034
2016-01-07 06:37:55 +00:00
Craig Topper 04cc5d25c7 [X86] Add OpSize32 to MOVSX32_NOREX instructions to match their other versions.
llvm-svn: 257033
2016-01-07 06:37:52 +00:00
Craig Topper 0b165557b2 [X86] Add hasSideEffects=0 and mayLoad=1 to MOVZX64* instructions. While there remove a superfluous _Q from the instruction names.
llvm-svn: 257032
2016-01-07 05:57:39 +00:00
Craig Topper fc678ba944 [X86] STOSQ without a rep prefix doesn't read or write RCX.
llvm-svn: 257030
2016-01-07 05:18:49 +00:00
David Majnemer 0e90f46e10 Undo spurious change made in r256965
llvm-svn: 257028
2016-01-07 04:31:35 +00:00
Philip Reames afdbcc6a84 [Statepoints] Add test cases around vectors and stabilize test
Contrary to what my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them. Add a couple of tests which show this working.

Also, add a triple to the same test file to hopefully fix a failing bot.

llvm-svn: 257025
2016-01-07 04:15:31 +00:00
Haicheng Wu 08b9462540 [AArch64 MachineCombine] Enhance/Add support for general reassociation to reduce the critical path
Allow fadd/fmul to be reassociated in aarch64.

llvm-svn: 257024
2016-01-07 04:01:02 +00:00
Philip Reames 3e2cf5320c [Statepoints] Initial support for relocating vectors of pointers
Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong.

The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate.

Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. 

Differential Revision: http://reviews.llvm.org/D15632

llvm-svn: 257022
2016-01-07 03:32:11 +00:00
Dan Gohman 0c6f5ac50a [WebAssembly] Add -m:e to the target triple.
This enables ELF-style name mangling, which primarily means using ".L" for
private symbols.

llvm-svn: 257020
2016-01-07 03:19:23 +00:00
Ahmed Bougacha a7324a2823 [Linker] Also treat a DIImportedEntity scope DISubprogram as needed.
Follow-up to r257000: DIImportedEntity can reach a DISubprogram via
its entity, but also via its scope. Handle the latter case as well.

PR26037.

llvm-svn: 257019
2016-01-07 03:14:59 +00:00
Philip Reames 103d2381d6 [RS4GC] Add an option to suppress vector splitting
At the moment, this is essentially a diagnostic option so that I can start collecting failing test cases, but we will eventually migrate to removing the vector splitting code entirely.

llvm-svn: 257015
2016-01-07 02:20:11 +00:00
Kostya Serebryany 152ac7ad70 [libFuzzer] add a position hint to the dictionary-based mutator
llvm-svn: 257013
2016-01-07 01:49:35 +00:00
Quentin Colombet 9ed52e9a9e [ShrinkWrapping] Give up on irreducible CFGs.
We need to know whether or not a given basic block is in a loop for the analysis
to be correct.
Loop information may be incomplete on irreducible CFGs, therefore we may
generate incorrect code if we use it in those situations.

This fixes PR25988.

llvm-svn: 257012
2016-01-07 01:23:49 +00:00
Teresa Johnson b951558294 Always treat DISubprogram reached by DIImportedEntity as needed.
It is illegal to have a null entity in a DIImportedEntity, so
we must link in a DISubprogram metadata node referenced by one,
even if the associated function is not linked in or inlined anywhere.

Fixes PR26037.

llvm-svn: 257000
2016-01-07 00:06:27 +00:00
Mehdi Amini 0535003bef Fix PR26051: Memcpy optimization should introduce a call to memcpy before the store destination position
This is a conservative fix, I expect Amaury to relax this.
Follow-up for r256923

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 256999
2016-01-06 23:50:22 +00:00
Sanjay Patel 882a8eed3e rangify; NFCI
llvm-svn: 256998
2016-01-06 23:45:05 +00:00
Simon Pilgrim bc82dedd26 [X86] Determine if target shuffle can contain zero elements
getTargetShuffleMask may return shuffle masks with SM_SentinelZero (-2) values (currently just for PSHUFB but VPERM2X128 as well with this patch). Although some calling functions can make use of this (mainly for shuffle combining), others can not and their inclusion makes shuffle mask comparisons more difficult.

This patch adds a flag to getTargetShuffleMask to indicate if the calling function can't handle SM_SentinelZero; getTargetShuffleMask will then return false if it occurs to make handling much easier.

I've tidied up some uses of getTargetShuffleMask to better indicate what is going on - more could be done but at present I don't have test cases to demonstrate it.

Some upcoming patches will make use of this to both support more uses where SM_SentinelZero is not permitted (e.g. combineShuffleToAddSub), and also will allow us to add INSERTPS support to getTargetShuffleMask as part of better zero handling discussed in D14261.

Differential Revision: http://reviews.llvm.org/D15378

llvm-svn: 256992
2016-01-06 23:24:40 +00:00
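The flag described in r256992 can be illustrated with a minimal C++ sketch. The helper name `acceptTargetShuffleMask` and the plain `std::vector<int>` mask are assumptions made for illustration and are not the signature of the real getTargetShuffleMask; only the sentinel value -2 is taken from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sentinel meaning "this lane is zero"; the value -2 is quoted in the commit message.
constexpr int SM_SentinelZero = -2;

// Hypothetical helper: reports failure when the mask contains a zero sentinel
// and the caller has said it cannot handle one.
bool acceptTargetShuffleMask(const std::vector<int> &Mask,
                             bool AllowSentinelZero) {
  if (AllowSentinelZero)
    return true;
  return std::find(Mask.begin(), Mask.end(), SM_SentinelZero) == Mask.end();
}
```

A caller that only compares masks element by element would pass `false` and bail out early, while a shuffle combiner that understands zeroed lanes would pass `true`.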
Weiming Zhao 0f1762caf9 Recommit r256952 "Filtering IR printing for print-after-all/print-before-all"
Fix lit test fail due to outputting an extra line.

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256987
2016-01-06 22:55:03 +00:00
Justin Bogner a43eacbf9e Bitcode: Fix reading and writing of ConstantDataVectors of halfs
In r254991 I allowed ConstantDataVectors to contain elements of
HalfTy, but I missed updating the bitcode reader and writer to handle
this, so now we crash if we try to emit bitcode on programs that have
constant vectors of half.

This fixes the issue and adds test coverage for reading and writing
constant sequences in bitcode.

llvm-svn: 256982
2016-01-06 22:31:32 +00:00
Nicolai Haehnle a61e5a8d4e AMDGPU/SI: Fix crash when inline assembly is used in a graphics shader
Summary:
This is admittedly something that you could only run into by manually
playing around with shader assembly because the SITypeWriter pass is
skipped for compute.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15902

llvm-svn: 256980
2016-01-06 22:01:04 +00:00
Sanjay Patel c2d6461a4a [LibCallSimplifier] less indenting; NFCI
llvm-svn: 256973
2016-01-06 20:52:21 +00:00
Chen Li 78bde83003 [SplitLandingPadPredecessors] Create a PHINode for the original landingpad only if it has some uses
Summary: This patch adds a check in SplitLandingPadPredecessors to see if the original landingpad instruction has any uses. If not, we don't need to create a PHINode for it in the joint block since it would be dead code anyway. The motivation for this patch is a bug in which SplitLandingPadPredecessors created a PHINode of token type landingpad, which failed the verifier since a PHINode cannot be of token type. However, the created PHINode will never be used in our code pattern. This patch works around the bug, and we might add support in SplitLandingPadPredecessors for token type landingpads with uses in the future.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15835

llvm-svn: 256972
2016-01-06 20:32:05 +00:00
Amaury Sechet 3235c08253 Promote aggregate store to memset when possible
Summary: As per title. This will allow the optimizer to pick up on it.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15923

llvm-svn: 256969
2016-01-06 19:47:24 +00:00
Amaury Sechet 5fc9f6999d Remove useless DEBUG
llvm-svn: 256968
2016-01-06 19:45:09 +00:00
Philip Reames 5eb90a7835 Consolidate MemRefs handling from BranchFolding and correct latent bug
Move the logic from BranchFolding to use the shared infrastructure for merging MMOs introduced in 256909. This has the effect of making BranchFolding more capable.

In the process, fix a latent bug. The existing handling for merging didn't handle the case where one of the instructions being merged had overflowed and dropped MemRefs. This was a latent bug in the places the code was commoned from, but potentially reachable in BranchFolding.

Once this is in, we're left with a single place to consider implementing MMO unique-ing as proposed in http://reviews.llvm.org/D15230.

Differential Revision: http://reviews.llvm.org/D15913

llvm-svn: 256966
2016-01-06 19:33:12 +00:00
David Majnemer eea7582bfa [WinEH] Remove calculateCatchReturnSuccessorColors
The functionality that calculateCatchReturnSuccessorColors provides was
once non-trivial: it was a computation layered on top of funclet
coloring.

These days, LLVM IR directly encodes what
calculateCatchReturnSuccessorColors computed, obsoleting the need for
it.

No functionality change is intended.

llvm-svn: 256965
2016-01-06 19:26:30 +00:00
Sanjay Patel cddcd7256c [LibCallSimplifier] use instruction-level fast-math-flags for tan/atan transform
llvm-svn: 256964
2016-01-06 19:23:35 +00:00
Quentin Colombet eb61e8e6b0 [X86] Correctly model TLS calls w.r.t. frame requirements.
TLS calls need the stack frame to be properly set up and this
implies that such calls need ADJUSTSTACK_xxx markers.

Fixes PR25820.

llvm-svn: 256959
2016-01-06 19:09:26 +00:00
Nico Weber 891419adc2 Make WinCOFFObjectWriter.cpp's timestamp writing not use ENABLE_TIMESTAMPS
LLVM_ENABLE_TIMESTAMPS controls if timestamps are embedded into llvm's
binaries. Turning it off is useful for deterministic builds.

r246905 made it so that the define suddenly also controls if the binaries that
the llvm binaries _create_ embed timestamps or not – but this shouldn't be a
configure-time option. r256203/r256204 added a driver option to toggle this on
and off, so this patch now passes this driver option in LLVM_ENABLE_TIMESTAMPS
builds so that if LLVM_ENABLE_TIMESTAMPS is set, the build of LLVM is
deterministic – but the built clang can still write timestamps into other
executables when requested.

This also allows removing some of the test machinery added in r292012 to work
around this problem.

See PR24740 for background.
http://reviews.llvm.org/D15783

llvm-svn: 256958
2016-01-06 19:05:19 +00:00
Sanjay Patel ab69e9f497 refactor divrem8 lowering; NFCI
The code duplication contributed to PR25754:
https://llvm.org/bugs/show_bug.cgi?id=25754

llvm-svn: 256957
2016-01-06 18:47:09 +00:00
Michael Kuperstein 037c9984db [ShrinkWrap] Fix FindIDom to only have one kind of failure.
FindIDom() can fail in two different ways - it can either return nullptr or the
block itself, depending on the circumstances. Some users of FindIDom() check
one error condition, while others check the other.

Change it to always return nullptr on failure.
This fixes PR26004.

Differential Revision: http://reviews.llvm.org/D15847

llvm-svn: 256955
2016-01-06 18:40:11 +00:00
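The single-failure-value convention from r256955 can be sketched as follows. This is a simplified stand-in, not the ShrinkWrap code: the `Block` struct, `nearestCommonDominator`, and `findIDomOfPreds` are hypothetical names, and the dominator tree is modelled with bare parent pointers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, minimal block: a parent pointer in the dominator tree plus
// the list of CFG predecessors.
struct Block {
  Block *IDom = nullptr; // nullptr for the entry block
  std::vector<Block *> Preds;
};

static int depth(const Block *B) {
  int D = 0;
  for (; B->IDom; B = B->IDom)
    ++D;
  return D;
}

// Walks both blocks up the dominator tree until they meet.
Block *nearestCommonDominator(Block *A, Block *B) {
  int DA = depth(A), DB = depth(B);
  for (; DA > DB; --DA)
    A = A->IDom;
  for (; DB > DA; --DB)
    B = B->IDom;
  while (A != B) {
    A = A->IDom;
    B = B->IDom;
    if (!A || !B)
      return nullptr; // the blocks live in unrelated trees
  }
  return A;
}

// Returns the common dominator of B's predecessors, or nullptr on any
// failure. A result equal to B itself is also reported as nullptr, so
// callers have exactly one error condition to test.
Block *findIDomOfPreds(Block &B) {
  Block *IDom = nullptr;
  for (Block *P : B.Preds) {
    IDom = IDom ? nearestCommonDominator(IDom, P) : P;
    if (!IDom)
      return nullptr;
  }
  return IDom == &B ? nullptr : IDom;
}
```

With this shape, every caller can write `if (!findIDomOfPreds(B))` instead of remembering which of two failure encodings applies.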
Weiming Zhao b243c95c6a Revert r256952 due to lit test fails.
llvm-svn: 256954
2016-01-06 18:31:44 +00:00
Dan Gohman 8f59cf756f [WebAssembly] Don't use range-based loop for a list that's being modified
The first instruction in a block is what the rend() iterator points to, so
if it moves, we need to re-evaluate rend() so that we continue to iterate
through the rest of the instructions.

llvm-svn: 256953
2016-01-06 18:29:35 +00:00
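The rend() problem described in r256953 can be reproduced with `std::list`, whose iterators stay attached to their elements when nodes are spliced. The sketch below is an illustration under that assumption, not the WebAssembly backend code; `visitReverse` is a hypothetical name.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Walks the list backwards. When it visits the value 4 it moves the first
// element to sit just before the current one, which mimics an instruction
// being moved while the block is traversed in reverse.
std::vector<int> visitReverse(std::list<int> L, bool CacheREnd) {
  std::vector<int> Visited;
  auto CachedEnd = L.rend(); // wraps an iterator to the *current* first element
  for (auto I = L.rbegin(); I != (CacheREnd ? CachedEnd : L.rend()); ++I) {
    Visited.push_back(*I);
    if (*I == 4)
      L.splice(std::prev(I.base()), L, L.begin());
  }
  return Visited;
}
```

With a cached end, the walk over `{1, 2, 3, 4}` stops as soon as it passes the moved element and never reaches 3 and 2; re-evaluating `rend()` on every iteration visits all four elements.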
Weiming Zhao eac0636805 Filtering IR printing for print-after-all/print-before-all
Summary:
This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc.
Examples:
  -print-after-all -print-funcs=foo,bar

Reviewers: mcrosier, joker.eph

Subscribers: tejohnson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256952
2016-01-06 18:20:25 +00:00
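A function filter of the kind r256952 describes might look roughly like the sketch below. The helper names `parsePrintFuncs` and `shouldPrintFunction` are hypothetical, and treating an empty filter as "print everything" is an assumption, not something stated in the commit message.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_set>

// Splits a comma-separated option value such as "foo,bar" into a set of names.
std::unordered_set<std::string> parsePrintFuncs(const std::string &Value) {
  std::unordered_set<std::string> Names;
  std::istringstream In(Value);
  std::string Name;
  while (std::getline(In, Name, ','))
    if (!Name.empty())
      Names.insert(Name);
  return Names;
}

// Assumption: an empty filter means that every function is printed.
bool shouldPrintFunction(const std::unordered_set<std::string> &Filter,
                         const std::string &FunctionName) {
  return Filter.empty() || Filter.count(FunctionName) != 0;
}
```

Under these assumptions, `-print-funcs=foo,bar` would restrict the IR dumps to `foo` and `bar` and leave the output unchanged when the option is absent.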
Weiming Zhao c7c18d6d14 Fix option desc in FunctionAttrs; NFC
Summary: The example in desc should match with actual option name

Reviewers: jmolloy

Differential Revision: http://reviews.llvm.org/D15800

llvm-svn: 256951
2016-01-06 18:18:16 +00:00
Geoff Berry 12fe2279f3 ScheduleDAGInstrs: Bug fix for missed memory dependency.
Summary:
In buildSchedGraph(), when adding memory dependencies for loads, move
the call to adjustChainDeps() after the call to
addChainDependency(AliasChain) to handle the case where
addChainDependency(AliasChain) ends up not adding a dependency and
instead putting the SU on the RejectMemNodes list.  The call to
adjustChainDeps() must be done after the call to addChainDependency() in
order to process the SU added to the RejectMemNodes list to create
memory dependencies for it.

Reviewers: hfinkel, atrick, jonpa, resistor

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15927

llvm-svn: 256950
2016-01-06 18:14:26 +00:00
Philip Reames fe46cadcf9 [BasicAA] Extract WriteOnly predicate on parameters [NFC]
Since writeonly is the only missing attribute and special case left for the memset/memcpy family of intrinsics, rearrange the code to make that much more clear.

llvm-svn: 256949
2016-01-06 18:10:35 +00:00
JF Bastien 1dede3f95f WebAssembly: add missing expected failures exposed by r256890
llvm-svn: 256948
2016-01-06 17:08:56 +00:00
JF Bastien e6ec487cf7 WebAssembly: add new expected failures exposed by r256890
llvm-svn: 256945
2016-01-06 16:15:51 +00:00
Krzysztof Parzyszek 2d0418e842 [Hexagon] Add system instructions for cache manipulation
llvm-svn: 256936
2016-01-06 14:22:22 +00:00
Amaury Sechet 457cc4db9e Revert "GlobalsAA: Take advantage of ArgMemOnly, InaccessibleMemOnly and InaccessibleMemOrArgMemOnly attributes"
Summary:
This reverts commit 5a9e526f29cf8510ab5c3d566fbdcf47ac24e1e9.

As per discussion in D15665

This also adds a test case so that the regressions introduced by that diff are not reintroduced.

Reviewers: vaivaswatha, jmolloy, hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15919

llvm-svn: 256932
2016-01-06 13:23:52 +00:00
Matthew Simpson bf894faa15 [LV] Avoid creating empty reduction entries (NFC)
This patch prevents us from unintentionally creating entries in the reductions
map for PHIs that are not actually reductions. This is currently not an issue
since we bail out if we encounter PHIs other than inductions or reductions.
However the behavior could become problematic as we add support for additional
recurrence types.

llvm-svn: 256930
2016-01-06 12:50:29 +00:00
Artyom Skrobov 51f2d11be9 PR25754: avoid generating UDIVREM8_ZEXT_HREG nodes with i64 result
Reviewers: spatel, srking

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15331

llvm-svn: 256924
2016-01-06 09:41:10 +00:00
Amaury Sechet d3b2c0fd94 Improve load/store to memcpy for aggregate
Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that aliases the load, as long as no operation aliases the store.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15903

llvm-svn: 256923
2016-01-06 09:30:39 +00:00
Simon Pilgrim 267163e713 [X86][SSE] There is no zmm addsubpd/addsubps instruction.
Replace the assert in combineShuffleToAddSub with an early out.

llvm-svn: 256922
2016-01-06 09:08:49 +00:00
Simon Pilgrim eaabd64a11 [X86][SSE] An empty target shuffle mask is always a failure.
As discussed on D15378, move the mask.empty() tests to after the switch statement and consider any shuffle decode where the extracted target shuffle mask is empty as a failure.

llvm-svn: 256921
2016-01-06 08:59:32 +00:00
Craig Topper 1b94d9a3cc [X86] Use PS instead of TB for instructions that have PD/XS/XD variations. Use OpSize32 on an instruction that has an OpSize16 variant.
llvm-svn: 256918
2016-01-06 06:18:41 +00:00
Craig Topper 275600390f [X86] Fix an incorrect usage of In32BitMode that should have been Not64BitMode.
llvm-svn: 256917
2016-01-06 06:18:37 +00:00
Philip Reames 2d2fc4adf1 Fix a warning [NFC]
llvm-svn: 256916
2016-01-06 05:53:09 +00:00
David Majnemer b70e23c390 [SimplifyLibCalls] Teach SimplifyLibCalls about operand bundles
If we replace one call-site with another, be sure to move over any
operand bundles that lingered on the old call-site.

This fixes PR26036.

llvm-svn: 256912
2016-01-06 05:01:34 +00:00
Philip Reames ae050a5703 [BasicAA] Remove special casing of memset_pattern16 in favor of generic attribute inference
Most of the properties of memset_pattern16 can be now covered by the generic attributes and inferred by InferFunctionAttrs.  The only exceptions are:
- We don't yet have a writeonly attribute for the first argument.
- We don't have an attribute for modeling the access size facts encoded in MemoryLocation.cpp.  

Differential Revision: http://reviews.llvm.org/D15879

llvm-svn: 256911
2016-01-06 04:53:16 +00:00
Philip Reames cdf46d1b52 [BasicAA] Delete dead code related to memset/memcpy/memmove intrinsics [NFCI]
We only need to describe the writeonly property of one of the arguments. All of the rest of the semantics are nicely described by existing attributes in Intrinsics.td.

Differential Revision: http://reviews.llvm.org/D15880

llvm-svn: 256910
2016-01-06 04:43:03 +00:00
Philip Reames c86ed0055d Extract helper function to merge MemoryOperand lists [NFC]
In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned out we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder.

I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface.

Differential Revision: http://reviews.llvm.org/D15757

llvm-svn: 256909
2016-01-06 04:39:03 +00:00
Junmo Park 3a40237c03 Delete trailing whitespace; NFC
llvm-svn: 256908
2016-01-06 03:53:36 +00:00
Junmo Park 3ec882feed Delete trailing whitespace; NFC
llvm-svn: 256906
2016-01-06 03:41:30 +00:00
Yunzhong Gao 34c0199378 Do not define NOGDI. Mingw defines LOGFONTW type in wingdi.h and the mingw
version of shlobj.h includes shobjidl.h and the latter uses the LOGFONTW type.

llvm-svn: 256904
2016-01-06 03:01:10 +00:00
Yunzhong Gao d84c13cdb8 Another attempt at fixing the i686-mingw32-RA-on-linux buildbot. I am getting
confused with what version of mingw is actually installed on the buildbot, and
for now I will just assume this is an unknown version which does not ship with
VersionHelpers.h.

llvm-svn: 256902
2016-01-06 02:48:42 +00:00
Yunzhong Gao b15585f0ea Another attempt at fixing the i686-mingw32-RA-on-linux buildbot.
llvm-svn: 256901
2016-01-06 02:32:31 +00:00
Kostya Serebryany 80eb76abf4 [libFuzzer] extend the dictionary mutator to optionally overwrite data with the dict entry
llvm-svn: 256900
2016-01-06 02:13:04 +00:00
Yunzhong Gao d7009f31a1 Hopefully fix a mingw32 buildbot (i686-mingw32-RA-on-linux) which does not have
the VersionHelpers.h header.

llvm-svn: 256896
2016-01-06 01:36:45 +00:00
Yunzhong Gao fb2a9c4209 Fixing PR25717: fatal IO error writing large outputs to console on Windows.
This patch is similar to the Python issue#11395. We need to cap the output
size to 32767 on Windows to work around the size limit of WriteConsole().
Reference: https://bugs.python.org/issue11395

Writing a test for this bug turns out to be harder than I thought. I am
still working on it (see phabricator review D15705).

Differential Revision: http://reviews.llvm.org/D15553

llvm-svn: 256892
2016-01-06 00:50:06 +00:00
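The capping strategy from r256892 amounts to splitting one large write into pieces that each stay under the limit. The sketch below shows that idea with a caller-supplied write callback instead of the real console API; `writeInChunks` and `MaxConsoleChunk` are hypothetical names, and only the figure 32767 comes from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Largest number of bytes handed to a single write call (from the commit message).
constexpr std::size_t MaxConsoleChunk = 32767;

// Feeds Data to Write in pieces of at most MaxChunk bytes and returns the
// number of write calls that were needed.
std::size_t writeInChunks(
    const std::string &Data,
    const std::function<void(const char *, std::size_t)> &Write,
    std::size_t MaxChunk = MaxConsoleChunk) {
  std::size_t Calls = 0;
  for (std::size_t Offset = 0; Offset < Data.size();) {
    const std::size_t Size = std::min(MaxChunk, Data.size() - Offset);
    Write(Data.data() + Offset, Size);
    Offset += Size;
    ++Calls;
  }
  return Calls;
}
```

Passing the sink as a callback keeps the sketch portable and testable; a Windows build would plug its console write routine in at that point.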
Sanjay Patel 3d07ec973f rangify; NFCI
llvm-svn: 256891
2016-01-06 00:45:42 +00:00
Dan Gohman 797f639e79 [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.

This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.

Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.

Differential Revision: http://reviews.llvm.org/D15544

llvm-svn: 256890
2016-01-06 00:43:06 +00:00
Sanjay Patel 8260d0a9fa use std::max ; NFCI
llvm-svn: 256889
2016-01-06 00:36:59 +00:00
Sanjay Patel c7ddb7fcdb A (B + C) = A B + A C ; NFCI
llvm-svn: 256884
2016-01-06 00:32:15 +00:00
Sanjay Patel f2ea8a25ed fix typo; NFC
llvm-svn: 256883
2016-01-06 00:23:12 +00:00
Mike Aizatsky 8b11f877e4 [libfuzzer] print_new_cov_pcs experimental option.
Differential Revision: http://reviews.llvm.org/D15901

llvm-svn: 256882
2016-01-06 00:21:22 +00:00
Sanjay Patel f5c2d129d8 fix typos; NFC
llvm-svn: 256881
2016-01-06 00:18:29 +00:00
Kostya Serebryany 226b734d73 [libFuzzer] make trace-based fuzzing not crash in presence of threads
llvm-svn: 256876
2016-01-06 00:03:35 +00:00
Manuel Jacob 3eedd11329 [Statepoints] Check for the "gc-leaf-function" attribute on call sites as well.
Reviewers: sanjoy, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15900

llvm-svn: 256875
2016-01-05 23:59:08 +00:00
Sanjay Patel 29095ea1b0 [LibCallSimplifier] use instruction-level fast-math-flags for fmin/fmax transforms
llvm-svn: 256871
2016-01-05 20:46:19 +00:00
Nicolai Haehnle 6035504ab3 AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland
Due to the SGPR init bug, every program claims to use the same number
of SGPRs anyway, so there's no point in trying to shift those registers
down from their initial spot of reservation.

Add a test that uses VGPR spilling and blocks most SGPRs from being used for
the scratch resource register. Previously, this would run into an assertion.

Differential Revision: http://reviews.llvm.org/D15724

llvm-svn: 256870
2016-01-05 20:42:49 +00:00
Amaury Sechet a0c242cdfd Implement load to store => memcpy in MemCpyOpt for aggregates
Summary:
Most of the tool chain is able to optimize scalar and memcpy-like operations efficiently, while it isn't that good with aggregates. In order to improve the support of aggregates, we try to change aggregate manipulation into either scalar or memcpy-like ones whenever possible without losing information.

This is one such opportunity.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15894

llvm-svn: 256868
2016-01-05 20:17:48 +00:00
Oleg Ranevskyy 2e83790c37 [Clang/Support/Windows/Unix] Command lines created by clang may exceed the command length limit set by the OS
Summary:
Hi Rafael,

Would you be able to review this patch, please?

(Clang part of the patch is D15832).

When clang runs an external tool, e.g. a linker, it may create a command line that exceeds the length limit.

Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check whether the command line length fits within the OS limit. Two problems in this function may cause the limit to be exceeded:

1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program path to the command line when it runs the program.

2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for *each* argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails.

Reviewers: rafael, ygao, probinson

Subscribers: asl, llvm-commits

Differential Revision: http://reviews.llvm.org/D15831

llvm-svn: 256866
2016-01-05 19:56:12 +00:00
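The two accounting problems listed in r256866 can be made concrete with a small length check. This is a sketch under stated assumptions, not the body of llvm::sys::argumentsFitWithinSystemLimits: `commandLineFits` is a hypothetical name, quoting and escaping are ignored, and the 32768-character limit is the figure quoted in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check: counts the program path, one trailing space per
// flattened item, and the terminating NUL against the limit.
bool commandLineFits(const std::string &Program,
                     const std::vector<std::string> &Args,
                     std::size_t Limit = 32768) {
  std::size_t Length = Program.size() + 1; // program path and its separator
  for (const std::string &Arg : Args)
    Length += Arg.size() + 1; // each argument is followed by a space
  Length += 1; // terminating NUL must also fit
  return Length <= Limit;
}
```

Leaving out either the program path or the final NUL makes a command line that is exactly at the limit look acceptable, which is the failure mode the commit message describes.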
Sanjay Patel a1c5347982 [InstCombine] insert a new shuffle before its uses (PR26015)
Although this solves the test case in PR26015:
https://llvm.org/bugs/show_bug.cgi?id=26015

And may solve PR25999:
https://llvm.org/bugs/show_bug.cgi?id=25999

...I suspect this is not the best solution. I think we want to insert the new shuffle
just ahead of the earliest ExtractElementInst that we're replacing, but I don't know 
how that should be implemented.

Differential Revision: http://reviews.llvm.org/D15878

llvm-svn: 256857
2016-01-05 19:09:47 +00:00
Manuel Jacob 0aa9f7fdad Add function for testing string attributes to InvokeInst and CallSite. NFC.
llvm-svn: 256856
2016-01-05 19:08:33 +00:00
David Majnemer 861a0ae349 [X86] Determine if we have an OpaqueSPAdjustment earlier
We queried hasFP before we hit ExpandISelPseudos.  ExpandISelPseudos
manipulated state that hasFP relied on, potentially changing the result
after it has been queried elsewhere.

While I am not aware of any particular bug due to this state of affairs,
it seems best to avoid it entirely by changing the state during DAG
construction.

llvm-svn: 256849
2016-01-05 17:46:36 +00:00
Michael Zuckerman 5cbae95916 [AVX512] add PSLLD and PSLLQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15885

llvm-svn: 256840
2016-01-05 15:17:39 +00:00
MinSeong Kim 4a9a4e198f [MISched] Explanatory error message when machine model is not complete. NFC
When not all instructions have a scheduling class,
the error message now provides a possible solution.

Differential Revision: http://reviews.llvm.org/D15854

llvm-svn: 256839
2016-01-05 14:50:15 +00:00
MinSeong Kim a7385ebf78 [AArch64] Add support for Samsung Exynos-M1
Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A).

Differential Revision: http://reviews.llvm.org/D15663

llvm-svn: 256828
2016-01-05 12:51:59 +00:00
Artyom Skrobov 8c6992344d (NFC) Change SubtargetFeatures::ToggleFeature and
SubtargetFeatures::ApplyFeatureFlag to be static, so that
MCSubtargetInfo doesn't need to instantiate SubtargetFeatures
for nothing. Also change the return type to void, as it
wasn't ever used.

This is a partial commit of http://reviews.llvm.org/D15746

llvm-svn: 256823
2016-01-05 10:25:56 +00:00
Junmo Park 3b8c715b2f Remove extra whitespace. NFC.
llvm-svn: 256820
2016-01-05 09:36:47 +00:00
Simon Pilgrim d47ac60f00 [X86][SSE] Merge PerformBLENDICombine into PerformShuffleCombine
PBLEND/BLENDPD/BLENDPS are no different to the other target shuffles and this will make future improvements to the target shuffle combines more straightforward.

llvm-svn: 256819
2016-01-05 09:12:17 +00:00
Craig Topper e00bffbc13 [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization.
llvm-svn: 256818
2016-01-05 07:44:14 +00:00
Craig Topper 9583f51348 [X86] Add OpSize32 to OR32mrLocked instruction to match the normal OR32mr instruction.
llvm-svn: 256817
2016-01-05 07:44:11 +00:00
Craig Topper ad2ce36be0 [AVX512] Add hasSideEffects=0 to kunpck instructions since they lack a pattern in their instructions.
llvm-svn: 256816
2016-01-05 07:44:08 +00:00
David Majnemer 59eb733af1 [SimplifyCFG] Further improve our ability to remove redundant catchpads
In r256814, we managed to remove catchpads which were trivially redundant
because they were the same SSA value.  We can do better using the same
algorithm but with a smarter data structure by hashing the SSA values
within the catchpad and comparing them structurally.

llvm-svn: 256815
2016-01-05 07:42:17 +00:00
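Structural deduplication of the kind r256815 describes can be sketched with a hash set keyed on operand lists. Everything here is a stand-in: `Handler`, `OperandHash`, and `removeRedundantHandlers` are hypothetical names, and plain integers take the place of the SSA values inside a catchpad.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical handler: a label plus the operands that define its behaviour.
struct Handler {
  std::string Name;
  std::vector<int> Operands;
};

// Hashes an operand list element by element; equality of the lists is
// checked by std::vector's operator==, so a collision is never mistaken
// for a duplicate.
struct OperandHash {
  std::size_t operator()(const std::vector<int> &Ops) const {
    std::size_t H = Ops.size();
    for (int Op : Ops)
      H = H * 31 + std::hash<int>{}(Op);
    return H;
  }
};

// Keeps the first handler for each distinct operand list, in original order.
std::vector<Handler> removeRedundantHandlers(const std::vector<Handler> &Handlers) {
  std::unordered_set<std::vector<int>, OperandHash> Seen;
  std::vector<Handler> Kept;
  for (const Handler &H : Handlers)
    if (Seen.insert(H.Operands).second)
      Kept.push_back(H);
  return Kept;
}
```

Comparing by contents instead of by identity is what lets two distinct handlers with identical operands collapse into one.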
David Majnemer 2fa8651a8f [SimplifyCFG] Remove redundant catchpads
Remove duplicate catchpad handlers from a catchswitch.

llvm-svn: 256814
2016-01-05 06:27:50 +00:00
Matt Arsenault 905042774d AMDGPU: Remove redundant let mayLoad = 1
This is already set on the SMRD format class.

llvm-svn: 256813
2016-01-05 04:50:28 +00:00
Manuel Jacob 75cbfdcf03 [RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC.
Summary:
Previously there were three conditionals, checking for global
variables, undef values and everything constant except these two, all three
returning the same value.  This commit replaces them by one conditional.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15818

llvm-svn: 256812
2016-01-05 04:06:21 +00:00
Manuel Jacob 83eefa6d20 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

llvm-svn: 256811
2016-01-05 04:03:00 +00:00
Tom Stellard 5cd09ade38 AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA
Summary: This fixes a regression caused by r256282.

Reviewers: arsenm, cfang

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15736

llvm-svn: 256810
2016-01-05 03:40:16 +00:00
Joseph Tremoulet 0d808888c1 [WinEH] Simplify unreachable catchpads
Summary:
At least for CoreCLR, a catchpad which immediately executes an
`unreachable` instruction indicates that the exception can never have a
matching type, and so such catchpads can be removed, and so can their
catchswitches if the catchswitch becomes empty.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15846

llvm-svn: 256809
2016-01-05 02:37:41 +00:00
David Majnemer 869be0a4a6 Revert "[X86] Use push-pop for materializing small constants under 'minsize'"
The red zone consists of 128 bytes beyond the stack pointer so that the
allocation of objects in leaf functions doesn't require decrementing
rsp.  In r255656, we introduced an optimization that would cheaply
materialize certain constants via push/pop.  Push decrements the stack
pointer and stores its result at what is now the top of the stack.
However, this means that using push/pop would encroach on the red zone.
PR26023 gives an example where this corrupts an object in the red zone.
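
A toy model makes the corruption concrete. The sketch below is an assumption-laden illustration, not LLVM code: the `Stack` class, its dictionary-backed memory, and the 8-byte slot size are invented for the example. It keeps a leaf-function local in the red zone and then materializes a constant with push/pop.

```python
RED_ZONE = 128  # bytes below rsp that a leaf function may use without moving rsp

class Stack:
    def __init__(self, rsp):
        self.rsp = rsp
        self.mem = {}  # address -> 8-byte value (toy memory model)

    def store_local(self, offset, value):
        """Store a red-zone local at rsp - offset without adjusting rsp."""
        assert 0 < offset <= RED_ZONE
        self.mem[self.rsp - offset] = value

    def load_local(self, offset):
        return self.mem[self.rsp - offset]

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

def materialize_with_push_pop(stack, constant):
    """The size optimization being reverted: push the constant, pop it into a register."""
    stack.push(constant)
    return stack.pop()

s = Stack(rsp=0x1000)
s.store_local(8, 0xDEADBEEF)           # local lives at rsp - 8, inside the red zone
reg = materialize_with_push_pop(s, 1)  # the push writes to that same rsp - 8 slot
```

After the call, `rsp` is back where it started and `reg` holds the constant, but the red-zone local now reads back as 1 instead of 0xDEADBEEF.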

llvm-svn: 256808
2016-01-05 02:32:06 +00:00
Tom Stellard 2c82ee60c3 AMDGPU/SI: Consolidate FLAT patterns
Summary:
We had two sets of identical FLAT patterns, one inside the
HasFlatAddressSpace predicate and one inside the useFlatForGlobal
predicate.  This patch merges these sets into a single pattern
under the isCIVI predicate.

The reason we can remove the predicates is that when MUBUF instructions
are legal, the instruction selector will prefer selecting those over
FLAT instructions because MUBUF patterns have a higher complexity score.
So, in this case having patterns for FLAT instructions will have no effect.

This change also simplifies the process for forcing global address space
loads to use FLAT instructions, since we now only have to disable the
MUBUF patterns instead of having to disable the MUBUF patterns and
enable the FLAT patterns.

Reviewers: arsenm, cfang

Subscribers: llvm-commits
llvm-svn: 256807
2016-01-05 02:26:37 +00:00
Philip Reames a694a0b141 [MDA] Don't be quite as conservative for noalias functions
If we encounter a noalias call that alias analysis can't analyse, we can fall down into the generic call handling rather than giving up entirely. I noticed this while reading through the code for another purpose.

I can't seem to write a test case which changes; that sorta makes sense given any test case would have to be an inconsistency in AA. Suggestions welcome.

Differential Revision: http://reviews.llvm.org/D15825

llvm-svn: 256802
2016-01-05 00:49:14 +00:00
Matthias Braun 7e762e4f9c MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

llvm-svn: 256799
2016-01-05 00:45:35 +00:00
Nicolai Haehnle 5b50497617 AMDGPU: add +xnack feature
Summary:
Enabling this feature will account for the two SGPRs used by the hardware
to store the XNACK_MASK physically.

The hardware only requires this reservation when the XNACK feature is
explicitly enabled. At some point, HSA will probably want to do that, but
it does increase SGPR register pressure, so leave it disabled by default
for now (but do add a small test).

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15869

llvm-svn: 256794
2016-01-04 23:35:53 +00:00
Chen Li c6021038f6 [InstructionCombining] prepareICWorklistFromFunction halts in infinite loop with instructions of token type
Summary: This patch fixes a bug in prepareICWorklistFromFunction, where the loop becomes infinite with instructions of token type. The patch checks if the instruction is token type, and if so it updates EndInst with the current instruction.

Reviewers: reames, majnemer

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15859

llvm-svn: 256792
2016-01-04 23:28:57 +00:00
Eric Christopher 49a7d6c473 Clarify that the bypassSlowDivision optimization operates on a single BB [v2]
Update some comments to be more explicit.

Change bypassSlowDivision and the functions it calls so that they take
BasicBlock*s and Instruction*s, rather than Function::iterator&s and
BasicBlock::iterator&s.

Change the APIs so that the caller is responsible for updating the
iterator, rather than the callee. This makes control flow much easier
to follow.

Patch by Justin Lebar!

llvm-svn: 256789
2016-01-04 23:18:58 +00:00
David Majnemer b33f3a239a [LICM] Fix a small oversight introduced in r256763
r256763 had promoteLoopAccessesToScalars check for the existence of a
catchswitch when the exit blocks were populated but
promoteLoopAccessesToScalars may be called with a prepopulated set of
exit blocks which would also need to be checked.

This fixes PR26019.

llvm-svn: 256788
2016-01-04 23:16:22 +00:00
Philip Reames 2466719e44 [MemoryBuiltins] Remove isOperatorNewLike by consolidating non-null inference handling
This patch removes the isOperatorNewLike predicate since it was only being used to establish a non-null return value and we have attributes specifically for that purpose with generic handling. To keep approximately the same behaviour for existing frontends, I added the various operator-new-like functions (i.e. instances of operator new) to InferFunctionAttrs. It's not really clear to me why this isn't handled in Clang, but I didn't want to break existing code and any subtle assumptions it might have.

Once this patch is in, I'm going to start separating the isAllocLike family of predicates. These appear to be used for a mixture of things which should be more clearly separated and documented. Today, they're being used to indicate (at least) aliasing facts, CSE-ability, and default values from an allocation site.

Differential Revision: http://reviews.llvm.org/D15820

llvm-svn: 256787
2016-01-04 22:49:23 +00:00
Xinliang David Li 204efe2de8 [PGO] Simplify string parsing
Patch Suggested by Vedant.

llvm-svn: 256785
2016-01-04 22:09:26 +00:00
Xinliang David Li 120fe2e898 [PGO] Refactor string writer code
For readability and code sharing.
(Adapted from Suggestions by Vedant).

llvm-svn: 256784
2016-01-04 22:01:02 +00:00
Haicheng Wu 9d6c94006e [LIR] General refactoring to simplify code and ease future code review
This is a resubmission of r256336 which was reverted in r256361. The issue was the lack of the invariant check of the memset value in processLoopMemSet().

The original message:

Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able.

llvm-svn: 256783
2016-01-04 21:43:14 +00:00
Simon Pilgrim e6955f3211 [X86][SSE] Ensure BLENDPD/BLENDPS/PBLEND inputs are both of the correct input type
llvm-svn: 256782
2016-01-04 21:41:11 +00:00
Xinliang David Li 0c677878d3 [PGO]: Use efficient 'join' API for uncompressed string
llvm-svn: 256781
2016-01-04 21:31:09 +00:00
Xinliang David Li 13ea29b5a1 [PGO]: reserve space for string to avoid excessive memory realloc/copy (non linear)
llvm-svn: 256776
2016-01-04 20:26:05 +00:00
Tom Stellard 3da5672755 AMDGPU/SI: Move VI SMEM pattern back into VIInstructions.td
Summary: This was accidentally moved to CIInstructions.td in r256282

Reviewers: cfang, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15763

llvm-svn: 256775
2016-01-04 20:23:10 +00:00
Aditya Nandakumar 12d060481a Remove dead instructions before Redoing
Before reevaluating instructions, iterate over all instructions
to be reevaluated and remove trivially dead instructions. If
any of an instruction's operands become trivially dead, mark them
for deletion until all trivially dead instructions have been removed.
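
The iteration described above is a standard worklist cleanup. The Python sketch below shows the general idea only; the function name and the dictionary-based use/def representation are assumptions made for the example, and it does not reproduce the pass this commit touches.

```python
def remove_trivially_dead(worklist, operands, users, has_side_effects):
    """Delete dead instructions reachable from worklist; return the deleted set.

    operands: dict inst -> list of operand instructions
    users: dict inst -> set of instructions using it (updated in place)
    has_side_effects: set of instructions that must be kept even if unused
    """
    deleted = set()
    pending = list(worklist)
    while pending:
        inst = pending.pop()
        if inst in deleted or users[inst] or inst in has_side_effects:
            continue  # already gone, still used, or not removable
        deleted.add(inst)
        for op in operands[inst]:
            users[op].discard(inst)
            if not users[op]:
                pending.append(op)  # operand may have just become dead
    return deleted

# Illustrative chain a -> b -> c where c is unused, plus a store that uses d.
OPERANDS = {"a": [], "b": ["a"], "c": ["b"], "d": [], "st": ["d"]}
USERS = {"a": {"b"}, "b": {"c"}, "c": set(), "d": {"st"}, "st": set()}
DELETED = remove_trivially_dead(["c", "st"], OPERANDS, USERS, {"st"})
```

Deleting `c` leaves `b` unused, which in turn leaves `a` unused, so all three go; the store and its operand stay.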

llvm-svn: 256773
2016-01-04 19:48:14 +00:00
Geoff Berry 9e934b0cc2 [AArch64] Optimize some simple TBZ/TBNZ cases.
Summary:
Add some AArch64 dag combines to optimize some simple TBZ/TBNZ cases:

 (tbz (and x, m), b) -> (tbz x, b)
 (tbz (shl x, c), b) -> (tbz x, b-c)
 (tbz (shr x, c), b) -> (tbz x, b+c)
 (tbz (xor x, -1), b) -> (tbnz x, b)
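
Each rewrite above holds only under a side condition that the summary leaves implicit. The Python sketch below is a brute-force check on 8-bit values rather than the AArch64 DAG-combine code; `WIDTH`, `tbz`, `tbnz`, and `check_combines` are illustrative names, and `shr` is modelled as a logical shift.

```python
WIDTH = 8                # assumed toy register width
MASK = (1 << WIDTH) - 1

def tbz(x, b):
    """True when bit b of x is zero, i.e. a TBZ branch would be taken."""
    return ((x >> b) & 1) == 0

def tbnz(x, b):
    """True when bit b of x is one, i.e. a TBNZ branch would be taken."""
    return not tbz(x, b)

def check_combines():
    for x in range(1 << WIDTH):
        for b in range(WIDTH):
            # (tbz (xor x, -1), b) -> (tbnz x, b): always valid
            assert tbz(x ^ MASK, b) == tbnz(x, b)
            # (tbz (and x, m), b) -> (tbz x, b): needs bit b set in m
            for m in range(1 << WIDTH):
                if (m >> b) & 1:
                    assert tbz(x & m, b) == tbz(x, b)
            for c in range(WIDTH):
                # (tbz (shl x, c), b) -> (tbz x, b-c): needs b >= c
                if b >= c:
                    assert tbz((x << c) & MASK, b) == tbz(x, b - c)
                # (tbz (shr x, c), b) -> (tbz x, b+c): needs b+c < WIDTH
                if b + c < WIDTH:
                    assert tbz(x >> c, b) == tbz(x, b + c)
    return True
```

Calling `check_combines()` runs every combination and raises an `AssertionError` if any rewrite fails under its stated condition.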

Reviewers: jmolloy, mcrosier, t.p.northover

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15702

llvm-svn: 256765
2016-01-04 18:55:47 +00:00
Paul Robinson e95fa4258e Clang-format my previous change (r256313)
llvm-svn: 256764
2016-01-04 18:49:15 +00:00
David Majnemer 219055f9df [LICM] Don't insert instructions after a catchswitch when performing loop promotion
Inserting after a catchswitch results in verifier errors, bail out on
promotion if a catchswitch is a loop exit.

llvm-svn: 256763
2016-01-04 17:42:19 +00:00
Nick Lewycky 947ca8ac52 Fix typo in comment. NFC
llvm-svn: 256761
2016-01-04 16:44:44 +00:00
Joseph Tremoulet 52f729a613 [WinEH] Update CoreCLR EH state numbering
Summary:
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions" are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.
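
The HandlerParentState relation can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `pads` dictionary, the pad names, and `handler_parent` are hypothetical, the result is a pad rather than a state number, and TryParentState is not modelled.

```python
def handler_parent(pad, pads):
    """Nearest ancestor handler via ParentPad, skipping catchswitches.

    pads: dict name -> (kind, parent_name_or_None), a hypothetical stand-in
    for the ParentPad linkage on EH pads.
    """
    _kind, parent = pads[pad]
    while parent is not None and pads[parent][0] == "catchswitch":
        parent = pads[parent][1]
    return parent

# Illustrative nesting: a cleanuppad inside a catchpad inside an outer cleanuppad.
PADS = {
    "outer_cleanup": ("cleanuppad", None),
    "cs": ("catchswitch", "outer_cleanup"),
    "cp": ("catchpad", "cs"),
    "inner_cleanup": ("cleanuppad", "cp"),
}
```

Here the catchpad `cp` resolves to `outer_cleanup` because its immediate parent, the catchswitch, is skipped.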

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D15325

llvm-svn: 256760
2016-01-04 16:16:01 +00:00
Nicolai Haehnle e705aadd67 AMDGPU: Avoid assertions after SGPR spilling failed
Summary:
The comment explains it: emitError does not necessarily exit the compilation
process, and then using NoRegister leads to assertions later on.
This generates incorrect code, of course, but the user should know to not use
the result when an error has been emitted.

It would be nice to have a test-case for this inside the LLVM repository,
but llc exits on error. shader-db tests trigger the underlying issue at least
on Tonga.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15826

llvm-svn: 256757
2016-01-04 15:50:01 +00:00
Michael Zuckerman cf0b6db9ef [AVX512] add PSRAD and PSRAQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15851

llvm-svn: 256754
2016-01-04 13:45:45 +00:00
Michael Zuckerman 000fca44a8 [AVX512] add PSRAW Intrinsic
Differential Revision: http://reviews.llvm.org/D15850

llvm-svn: 256751
2016-01-04 12:50:36 +00:00
Jeroen Ketema 5a02dc46cb [MC] Fix file name in file header
llvm-svn: 256749
2016-01-04 12:22:34 +00:00
Michael Zuckerman 068bc2f219 [AVX512] add PSRLV Intrinsic
Differential Revision: http://reviews.llvm.org/D15838

llvm-svn: 256747
2016-01-04 11:39:06 +00:00
Chandler Carruth 7664127f8c Fix a horrible infloop in value tracking in the face of dead code.
Amazingly, we just never triggered this without:
1) Moving code around for MetadataTracking so that a certain *different*
   amount of inlining occurs in the per-TU compile step.
2) Then you LTO opt or clang with a bootstrap, and get inlining, loop
   opts, and GVN line up everything *just* right.

I don't really know how we didn't hit this before. We really need to be
fuzz testing stuff, it shouldn't be hard to trigger. I'm working on
crafting a reduced nice test case, and will submit that when I have it,
but I want to get LTO build bots going again.

llvm-svn: 256735
2016-01-04 07:23:12 +00:00
Craig Topper 5a6dda5376 [TableGen] Use some free space in Init to store the opcode for UnOpInit/BinOpInit/TernOpInit allowing those types to be a little smaller. NFC
llvm-svn: 256733
2016-01-04 06:28:49 +00:00
David Majnemer ca1c9f074f [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

llvm-svn: 256730
2016-01-04 04:49:41 +00:00
David Majnemer 42a0730c42 [LICM] Make instruction sinking funclet-aware
We had two bugs here:
- We might try to sink into a catchswitch, causing verifier failures.
- We will succeed in sinking into a cleanuppad but we didn't update the
  funclet operand bundle.

This fixes PR26000.

llvm-svn: 256728
2016-01-04 03:37:39 +00:00
Craig Topper cfd8173327 [TableGen] Change TGParser::SetValue to take an ArrayRef instead of std::vector reference. Use None in many places where a default constructed vector was being passed. NFC
llvm-svn: 256726
2016-01-04 03:15:08 +00:00
Craig Topper 1e23ed9eaa [TableGen] Fix a bug that caused the wrong name for a record built from a multiclass containing a defm called NAME that references another multiclass that contains a defm that uses NAME concatenated with other strings.
It would end up doing the concatenations from the second multiclass twice. This occurred because SetValue detected a self assignment when trying to set the value of NAME to a VarInit called NAME. NAME is special here and it will get cleaned up later. So add a flag to suppress the self assignment check for this case.

Strangely the self-assignment error was returning false indicating it wasn't an error, but it wasn't doing the right thing. So this also changes it to report an error.

This fixes the names of some AVX512 FMA instructions that showed this double expansion.

llvm-svn: 256725
2016-01-04 03:05:14 +00:00
Craig Topper e30b8ca149 Use std::is_sorted and std::none_of instead of manual loops. NFC
llvm-svn: 256719
2016-01-03 19:43:40 +00:00
Xinliang David Li 76c3f38774 [PGO] Cleanup: remove redundant calls in lowering
CoverageMapping data's section and alignment is
already set during creation. No need to call it again
during lowering.

llvm-svn: 256716
2016-01-03 19:38:51 +00:00
Xinliang David Li 5205ca0c70 [PGO] Cleanup: Use covmap header definition in the template file
This is the last remaining instrumentation-related structure
that needs to be migrated to use the centralized template
definition.  With this change, instrumentation code
related to coverage module header will be kept in sync
with the coverage mapping reader. The remaining code
which makes implicit assumptions about covmap control
structure layout in the lowering pass will be cleaned
up in a different patch. This patch is not intended to
have any functional change.

llvm-svn: 256715
2016-01-03 19:26:07 +00:00
Xinliang David Li 946558df2d [PGO] Code refactoring to use header struct def /NFC
llvm-svn: 256712
2016-01-03 18:57:40 +00:00
Simon Pilgrim 5bf96e41c5 [SelectionDAG] Pulled out common code for CONCAT_VECTORS node creation
Pulled out the similar CONCAT_VECTORS creation code from the 2/3 operand getNode() calls (to handle all UNDEF and all BUILD_VECTOR cases). Added a similar handler to the general getNode() call as well.

llvm-svn: 256709
2016-01-03 18:24:19 +00:00
Dimitry Andric 227b928abc Fix several accidental DOS line endings in source files
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings.  Convert these to native line endings.

There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.

Reviewers: joerg, aaron.ballman

Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15848

llvm-svn: 256707
2016-01-03 17:22:03 +00:00
Craig Topper 90b18c4014 Use an ArrayRef to simplify repeated calculation of the array end. NFC
llvm-svn: 256702
2016-01-03 08:45:36 +00:00
Craig Topper 5f167729aa Use std::is_sorted instead of manual loops. NFC
llvm-svn: 256701
2016-01-03 07:33:45 +00:00
Xinliang David Li 37c1fa047d [PGO] simple refactoring (NFC)
llvm-svn: 256695
2016-01-03 04:38:13 +00:00
NAKAMURA Takumi ded575e4eb WinEHPrepare.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
llvm-svn: 256694
2016-01-03 01:41:00 +00:00
Joseph Tremoulet d425dd13ad [Verifier] Add braces to satisfy buildbots. NFC
Fix build break introduced by r256691.

llvm-svn: 256692
2016-01-02 15:50:34 +00:00
Joseph Tremoulet 131a462690 [WinEH] Verify catchswitch handlers
Summary:
The handler list must be nonempty and consist solely of CatchPads.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15842

llvm-svn: 256691
2016-01-02 15:25:25 +00:00
Joseph Tremoulet 06125e52a7 [WinEH] Tighten parentPad verifier checks
Summary: A catchswitch cannot be a parent of a cleanuppad or another catchswitch.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15841

llvm-svn: 256690
2016-01-02 15:24:24 +00:00
Joseph Tremoulet 71e5676de4 [WinEH] Update catchrets with cloned successors
Summary:
Add a pass to update catchrets when their successors get cloned; the
existing pass doesn't catch these because it walks the funclet whose
blocks are being cloned but the catchret is in a child funclet.

Also update the test for removing incoming PHI values; when the
predecessor is a catchret, the relevant color is the catchret's parentPad,
not its block's color.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15840

llvm-svn: 256689
2016-01-02 15:22:36 +00:00
Yaron Keren c47c6ac0a5 Correct misleading formatting of several ifs followed by two statements without braces.
While the original code would work with or without braces, it makes sense to
set HaveSemi to true only if (!HaveSemi), otherwise it's already true, so I
put the assignment inside the if block. This addresses PR25998.

llvm-svn: 256688
2016-01-02 13:40:36 +00:00
David Majnemer dbdc9c274d [WinEH] Add additional verification
Recolor the IR to make sure our computed colors are not hiding any bugs.
Also, verifyFunction if we are running some post-preparation operations;
some of these operations can hide latent bugs.

llvm-svn: 256687
2016-01-02 09:26:36 +00:00
David Majnemer 011980cd50 [X86] Add intrinsics for reading and writing to the flags register
LLVM's targets need to know if stack pointer adjustments occur after the
prologue.  This is needed to correctly determine if the red-zone is
appropriate to use or if a frame pointer is required.

Normally, LLVM can figure this out very precisely by reasoning about the
contents of the MachineFunction.  There is an interesting corner case:
inline assembly.

The vast majority of inline assembly which will perform a push or pop
does so to pair up with pushf or popf as appropriate.  Unfortunately,
this inline assembly doesn't mark the stack pointer as clobbered
because, well, it isn't.  The stack pointer is decremented and then
immediately incremented.  Because of this, LLVM was changed in r256456
to conservatively assume that inline assembly contains a sequence of
stack operations.  This is unfortunate because the vast majority of
inline assembly will not end up manipulating the stack pointer in any
way at all.

Instead, let's provide a more principled solution: an intrinsic.
FWIW, other compilers (MSVC and GCC among them) also provide this
functionality as an intrinsic.

llvm-svn: 256685
2016-01-01 06:50:01 +00:00
Sanjay Patel bee05caa6b [LibCallSimplifier] propagate FMF when shrinking binary calls
llvm-svn: 256682
2015-12-31 23:40:59 +00:00
Craig Topper 74658dfaad [X86] Remove a return after llvm_unreachable.
llvm-svn: 256681
2015-12-31 22:40:48 +00:00
Craig Topper 69653af748 [X86] Move shuffle decoding for constant pool into the X86CodeGen library to remove a layering violation in the Util library.
llvm-svn: 256680
2015-12-31 22:40:45 +00:00
Sanjay Patel aa23114cb4 [LibCallSimplifier] propagate FMF when shrinking unary calls
llvm-svn: 256679
2015-12-31 21:52:31 +00:00
Sanjay Patel 96475cbd22 Variable names start with an upper case letter; NFC
llvm-svn: 256676
2015-12-31 16:16:58 +00:00
Sanjay Patel d707db97a9 fix formatting; NFC
llvm-svn: 256675
2015-12-31 16:10:49 +00:00
Michael Zuckerman 0dc468880d [AVX512] add PSRLQ and PSRLD Intrinsic
Differential Revision: http://reviews.llvm.org/D15770

llvm-svn: 256673
2015-12-31 15:22:04 +00:00
Michael Kuperstein d36e24a166 [X86] Avoid folding scalar loads into unary sse intrinsics
Not folding these cases tends to avoid partial register updates:
sqrtss (%eax), %xmm0
Has a partial update of %xmm0, while
movss (%eax), %xmm0
sqrtss %xmm0, %xmm0
Has a clobber of the high lanes immediately before the partial update,
avoiding a potential stall.

Given this, we only want to fold when optimizing for size.
This is consistent with the patterns we already have for some of
the fp/int converts, and in X86InstrInfo::foldMemoryOperandImpl()

Differential Revision: http://reviews.llvm.org/D15741

llvm-svn: 256671
2015-12-31 09:45:16 +00:00
Asaf Badouh af6569afd2 [X86][PKU] Add {RD,WR}PKRU intrinsics
Differential Revision: http://reviews.llvm.org/D15808

llvm-svn: 256670
2015-12-31 08:31:13 +00:00
Craig Topper fd2c6a3be0 [TableGen] Modify the AsmMatcherEmitter to only apply the table growth from r252440 to the Hexagon target.
This restores the previous behavior of not including the mnemonic in the classes table for every target that starts instruction lines with the mnemonic. Not only did the table size increase by 1 entry, but the class enum increased in size which caused every class in the array to increase in size. It also grew the size of the function that parses tokens into classes by a substantial amount.

This adds a new HasMnemonicFirst flag to all AsmParsers. It's set to 1 by default and Hexagon target overrides it to 0.

For the X86 target alone this recovers 324KB of size on the llvm-mc executable.

I believe the current state is still a bad design choice for the Hexagon target as it causes most of the parsing to do a linear search through the entire match table, comparing operands against every instruction until it finds one that works. At least for the other targets we do a binary search based on mnemonic to narrow the range over which to do the linear scan.
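
The difference between the two lookup strategies can be sketched in Python. This is not the generated matcher: the table contents, `find_match`, and `find_match_linear` are made up for illustration, and operand classes are reduced to plain tuples.

```python
import bisect

def find_match(table, keys, mnemonic, operands):
    """Binary search on the mnemonic, then a linear scan over that range only.

    table: list of (mnemonic, operand_classes) sorted by mnemonic
    keys: the mnemonic column of table, precomputed once
    """
    lo = bisect.bisect_left(keys, mnemonic)
    hi = bisect.bisect_right(keys, mnemonic)
    for i in range(lo, hi):
        if table[i][1] == operands:
            return i
    return None

def find_match_linear(table, mnemonic, operands):
    """Without a leading mnemonic to key on, every entry has to be compared."""
    for i, (m, ops) in enumerate(table):
        if m == mnemonic and ops == operands:
            return i
    return None

# Illustrative match table, sorted by mnemonic.
TABLE = [
    ("add", ("reg", "imm")),
    ("add", ("reg", "reg")),
    ("mov", ("reg", "reg")),
    ("sub", ("reg", "imm")),
]
KEYS = [m for m, _ in TABLE]
```

Both functions return the same index, but the first only compares operands against the entries that share the mnemonic.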

llvm-svn: 256669
2015-12-31 08:18:23 +00:00
Xinliang David Li e413f1a0fc [PGO]: Implement Func PGO name string compression
This is part of the effort/preparation to reduce the size of
instr-pgo (object, binary, memory footprint, and raw data).

The functionality is currently off by default and not yet
used by any clients.

llvm-svn: 256667
2015-12-31 07:57:16 +00:00
Sanjay Patel 41160c2094 [ValueTracking] fix bug computing isKnownToBeAPowerOfTwo() with arithmetic shift right (PR25900)
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=25900

If we think that an arithmetic right shift of a power of two is always a power of two, 
an sdiv gets wrongly converted to udiv.
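
A concrete 8-bit case shows why the assumption is unsound. The Python helpers below are an illustrative model of i8 arithmetic and are not taken from ValueTracking; all names are invented for the example.

```python
def to_signed8(x):
    """Interpret an 8-bit pattern (0..255) as a two's-complement value."""
    return x - 256 if x & 0x80 else x

def ashr8(x, n):
    """Arithmetic shift right of an 8-bit pattern, replicating the sign bit."""
    return (to_signed8(x) >> n) & 0xFF

def is_pow2(x):
    """True when exactly one bit is set in the pattern."""
    return x != 0 and (x & (x - 1)) == 0

def sdiv8(a, b):
    """Signed 8-bit division truncating toward zero, returned as a bit pattern."""
    sa, sb = to_signed8(a), to_signed8(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    return q & 0xFF

def udiv8(a, b):
    """Unsigned 8-bit division."""
    return (a // b) & 0xFF

divisor = ashr8(0x80, 1)  # 0x80 has a single bit set; the shift copies the sign bit
```

The shift turns the pattern 1000_0000 into 1100_0000, which has two bits set, and dividing 100 by it gives different results for signed and unsigned division.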

Differential Revision: http://reviews.llvm.org/D15827

llvm-svn: 256655
2015-12-30 22:40:52 +00:00
Teresa Johnson 96f7f81aa3 [ThinLTO] Rename variables used in metadata linking (NFC)
As suggested in review for r255909, rename MDMaterialized to AllowTemps,
and identify the name of the boolean flag being set in calls to
saveMetadataList.

llvm-svn: 256653
2015-12-30 21:13:55 +00:00
Teresa Johnson cc428573a2 Ensure MDNode used as key in metadata linking map cannot be RAUWed
As suggested in review for r255909, add a way to ensure that temporary
MD used as keys in the MetadataToID map during ThinLTO importing are not
RAUWed.

Add support for marking an MDNode as not replaceable. Clear the new
CanReplace flag when adding a temporary MD node to the MetadataToID map
and clear it when destroying the map.

llvm-svn: 256648
2015-12-30 19:32:24 +00:00
Teresa Johnson 26aa93586a [ThinLTO] Check MDNode values saved for metadata linking (NFC)
Add an assert suggested in review for r255909 to ensure that MDNodes
saved in the map used for metadata linking are either temporary or
resolved.

Also add a comment clarifying why we may need to save off non-MDNode
metadata.

llvm-svn: 256646
2015-12-30 19:13:57 +00:00
Sanjay Patel 16395dd709 fix formatting; NFC
llvm-svn: 256645
2015-12-30 18:31:30 +00:00
Teresa Johnson 61b406ec75 Rename MDValue* to Metadata* (NFC)
Renamed MDValue* to Metadata*, and MDValueToValIDMap to MetadataToIDs,
as per review for r255909.

llvm-svn: 256593
2015-12-29 23:00:22 +00:00
Manuel Jacob 67f1d3ac63 [RS4GC] Use DenseMap::count() instead of DenseMap::find()/DenseMap::end(). NFC.
llvm-svn: 256586
2015-12-29 22:16:41 +00:00
Sanjay Patel ac6e910c42 don't repeat function names in comments; NFC
llvm-svn: 256584
2015-12-29 22:11:50 +00:00
Sanjay Patel d1d9db5889 use auto with dyn_casted values; NFC
llvm-svn: 256581
2015-12-29 22:00:37 +00:00
Manuel Jacob e3773d632e [PlaceSafepoints] Assert that the gc.safepoint_poll function is present in the module.
If running the PlaceSafepoints pass on a module which doesn't have the
gc.safepoint_poll function without disabling entry and backedge safepoints,
previously the pass crashed with an obscure error because of a null pointer.
Now it fails the assert instead.

llvm-svn: 256580
2015-12-29 21:57:55 +00:00
Sanjay Patel 7a7abc9a3b use auto with dyn_casted values; NFC
llvm-svn: 256579
2015-12-29 21:49:08 +00:00
Sanjay Patel b120ae96d3 fix formatting; NFC
llvm-svn: 256574
2015-12-29 19:34:53 +00:00
Sanjay Patel 4104f78640 use range-based for-loops; NFCI
llvm-svn: 256573
2015-12-29 19:14:23 +00:00
Sanjay Patel faeee6f44d use range-based for-loop; NFCI
llvm-svn: 256572
2015-12-29 18:30:09 +00:00
Chad Rosier 6b4326367a Add command line options to force function/loop alignments.
These are being added for testing purposes.
http://reviews.llvm.org/D15648

llvm-svn: 256571
2015-12-29 18:18:07 +00:00
Sanjay Patel 59309cc090 don't repeat function names in comments; NFC
llvm-svn: 256569
2015-12-29 18:14:06 +00:00
Geoff Berry 43dc285915 [JumpThreading] Fix opcode bonus in getJumpThreadDuplicationCost()
The code that was meant to adjust the duplication cost based on the
terminator opcode was not being executed in cases where the initial
threshold was hit inside the loop.

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15536

llvm-svn: 256568
2015-12-29 18:10:16 +00:00
Sanjay Patel 7dd45697ca use range-based for-loops; NFCI
llvm-svn: 256566
2015-12-29 17:15:22 +00:00
Philip Reames 87a8677e3d [MemoryBuiltins] Delete dead code [NFC]
llvm-svn: 256565
2015-12-29 17:04:43 +00:00
Michael Zuckerman 80821ee77c [AVX512] add PSRLW Intrinsic
Differential Revision: http://reviews.llvm.org/D15751

llvm-svn: 256558
2015-12-29 13:04:35 +00:00
Chandler Carruth 5bd31b37ca [ptr-traits] Provide a real MCFragment address for the sentinel instead
of casting the integer '4' to such a pointer. There is no reason to
expect '4' to be a portable or reliable pointer of this form. The only
reason this ever worked is because the PointerIntPair that this actually
gets used with has an artificially *low* presumed alignment that allowed
it to work. When the alignment of PointerIntPair is derived from the
actual type's alignment, the asserts start firing on this pointer. I'm
amazed we never managed to do anything that triggered the alignment
sanitizer with it, as this is just flat out UB.
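
The relationship between alignment and usable low bits can be sketched as follows. This is a Python illustration with hypothetical helper names, not the PointerIntPair implementation, and the alignment of 8 is an assumed value chosen for the example.

```python
def low_zero_bits(alignment):
    """Number of low address bits guaranteed zero for a power-of-two alignment."""
    assert alignment > 0 and (alignment & (alignment - 1)) == 0
    return alignment.bit_length() - 1

def can_pack(address, alignment):
    """True if the address leaves free the low bits that the alignment promises."""
    free_bits = low_zero_bits(alignment)
    return (address & ((1 << free_bits) - 1)) == 0
```

With a presumed alignment of 4, the value 4 passes as a pointer because only two low bits need to be zero. With an assumed true alignment of 8, three bits are required and `can_pack(4, 8)` is false, which is the kind of check that starts failing once the alignment is derived from the actual type.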

If folks dislike this approach to providing a sentinel fragment address,
there are a myriad of other alternatives, suggestions welcome. But this
one has the distinct advantage of not requiring the friend dance of
ilist's sentinel (which I'll point out is *also* in play for
MCFragment!) and seems to be using a nicely provided facility in
MCFragment to establish just such dummy nodes.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256552
2015-12-29 09:32:18 +00:00
Chandler Carruth e0115344e6 [ptr-traits] Sink a constructor definition to the .cpp file and add
missing includes so that the pointee types for DenseMap pointer keys and
such are complete prior to us querying the pointer traits for them.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256550
2015-12-29 09:24:39 +00:00
Chandler Carruth 8d736236d3 [ptr-traits] Split the MCFragment type hierarchy out of the MCAssembler
header to its own header, allowing users of fragments to have a narrower
header file, and avoid circular header dependencies when getting the
definition of MCSection prior to inspecting traits on MCSection
pointers.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

Note that this doesn't in any way change the design of MC, it is just
moving code around to allow the *header files* to be more fine grained.
Without this, it is impossible to get a complete type for MCSection
where it is needed.

If anyone would prefer a different slicing of the header files, I'm
happy to oblige of course. =]

llvm-svn: 256548
2015-12-29 09:06:16 +00:00
Craig Topper d270501a6e [TableGen] Remove MnemonicContainsDot from AsmParser. It isn't used. NFC
llvm-svn: 256542
2015-12-29 07:03:30 +00:00
Craig Topper 3294966ed7 [X86] Remove declaration of ATTAsmParser. It's equivalent to the DefaultAsmParser. NFC
llvm-svn: 256541
2015-12-29 07:03:27 +00:00
Chandler Carruth a7dc087b36 [ptr-traits] Merge the MetadataTracking helpers into the Metadata
header.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

The MetadataTracking helpers aren't actually independent. They rely on
constructing a PointerUnion between Metadata and MetadataAsValue
pointers, which requires knowing the alignment of pointers to those types,
which in turn requires them to be complete.

The .cpp file even defined a method declared in Metadata.h! These really
don't seem like something that is separable, and there is no real
layering problem with just placing them together.

llvm-svn: 256531
2015-12-29 02:14:50 +00:00
Eric Christopher 2ae7180b24 Accept dwarf version 5 for CIE versions.
llvm-svn: 256527
2015-12-28 23:02:42 +00:00
Artyom Skrobov 2aca0c622a [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
Summary:
* avoid generating POP {LR} in Thumb1 epilogues
* combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass
* combine POP {LR} + B + BX LR -> POP {PC} on v5T+

Test cases by Ana Pazos

Differential Revision: http://reviews.llvm.org/D15707

llvm-svn: 256523
2015-12-28 21:40:45 +00:00
Sanjay Patel b3c53e512f [x86] lower calls to fmin and llvm.minnum.* using minss/minsd/minps/minpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454
http://reviews.llvm.org/rL256510

llvm-svn: 256522
2015-12-28 21:16:55 +00:00
Easwaran Raman b9f7120e7a Refactor inline costs analysis by removing the InlineCostAnalysis class
InlineCostAnalysis is an analysis pass without any need for it to be one.
Once it stops being an analysis pass, it doesn't maintain any useful state
and the member functions inside can be made free functions. NFC.

Differential Revision: http://reviews.llvm.org/D15701

llvm-svn: 256521
2015-12-28 20:28:19 +00:00
Manuel Jacob 9db5b93ffc [RS4GC] Fix rematerialization of bitcast of bitcast.
Summary:
Previously, only the outer (last) bitcast was rematerialized, resulting in a
use of the unrelocated inner (first) bitcast after the statepoint.  See the
test case for an example.

Reviewers: igor-laevsky, reames

Subscribers: reames, alex, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15789

llvm-svn: 256520
2015-12-28 20:14:05 +00:00
Elena Demikhovsky 5494698828 Implemented cost model for masked gather and scatter operations
The cost is calculated for all X86 targets. When a gather/scatter instruction
is not supported, we calculate the cost of the scalar sequence.

Differential revision: http://reviews.llvm.org/D15677

llvm-svn: 256519
2015-12-28 20:10:59 +00:00
Sanjay Patel 9da2b647c7 [x86] lower calls to fmax and llvm.maxnum.* using maxps/maxpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454

llvm-svn: 256510
2015-12-28 19:20:19 +00:00
Sanjay Patel cc4c71b4fb tidy up; NFC
llvm-svn: 256506
2015-12-28 18:18:22 +00:00
Roman Divacky 73fc84761f Support clrex instruction on ARMv6k. Patch by Andrew Turner.
llvm-svn: 256505
2015-12-28 17:47:23 +00:00
Alexander Kornienko d0af3b3178 Refactor: Simplify boolean conditional return statements in lib/Transforms/ObjCARC
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: craig.topper, bkramer, chandlerc, gottesmm

Subscribers: llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9999

llvm-svn: 256502
2015-12-28 16:19:08 +00:00
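The "simplify boolean conditional return" refactorings in this log are mechanical. The sketch below shows the before and after shape on a made-up function; the function names and the threshold are illustrative only and do not come from the patched files. The clang-tidy check that performs this rewrite is, as far as I know, readability-simplify-boolean-expr.

```cpp
// Before: a conditional whose branches only return true or false.
bool isSmallBefore(int Size) {
  if (Size < 8)
    return true;
  return false;
}

// After: return the condition itself. Behavior is unchanged.
bool isSmallAfter(int Size) {
  return Size < 8;
}
```

Both functions return the same value for every input, which is why such commits can be treated as having no functional change.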
Alexander Kornienko 66da20a6f2 Refactor: Simplify boolean conditional return statements in llvm/lib/Support
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: rafael, bkramer, ddunbar, Bigcheese, chandlerc, chapuni, nicholas, alexfh

Subscribers: alexfh, craig.topper, llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9978

llvm-svn: 256500
2015-12-28 15:46:15 +00:00
Michael Kuperstein 2ea81baf3a [X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.

The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.

Differential Revision: http://reviews.llvm.org/D15054

llvm-svn: 256494
2015-12-28 14:39:21 +00:00
Alexander Kornienko 175a7cbf3f Refactor: Simplify boolean conditional return statements in lib/Target/PowerPC
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: uweigand, rafael, wschmidt

Subscribers: craig.topper, llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9984

llvm-svn: 256493
2015-12-28 13:38:42 +00:00
Asaf Badouh fba562004b [X86][AVX512] Lower broadcast sub vector to vector intrinsics
Lower broadcast<type>x<vector> to shuffles.
There are two cases:
1. src is 128 bits and dest is 512 bits: in this case we will lower it to a shuffle with imm = 0.
2. src is 256 bits and dest is 512 bits: in this case we will lower it to a shuffle with imm = 01000100b (0x44), which broadcasts the 256-bit source: ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3]. The result is then masked with the passthru value (in case it's a mask op).

Differential Revision: http://reviews.llvm.org/D15790

llvm-svn: 256490
2015-12-28 08:26:26 +00:00
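The two immediates in the broadcast lowering above are easier to read once decoded. The sketch below assumes the usual encoding for 512-bit lane shuffles, where each 2-bit field of the immediate picks one of four 128-bit lanes for the corresponding destination lane. It is a simplified model (it ignores which source operand each destination lane reads from), and `decodeLaneSelectors` is a hypothetical name.

```cpp
#include <array>
#include <cstdint>

// Simplified model: destination lane I takes the source lane named by
// bits [2*I+1 : 2*I] of the immediate.
std::array<unsigned, 4> decodeLaneSelectors(std::uint8_t Imm) {
  std::array<unsigned, 4> Lanes{};
  for (unsigned I = 0; I < 4; ++I)
    Lanes[I] = (Imm >> (2 * I)) & 0x3;
  return Lanes;
}
```

Under this model, imm = 0 selects lane 0 four times (the 128-bit broadcast), and imm = 0x44 selects lanes 0, 1, 0, 1, which repeats the two lanes of the 256-bit source and matches the ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3] pattern in the commit message.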
Asaf Badouh 5546f51011 [X86][AVX512] add fp scalar broadcast intrinsics
Differential Revision: http://reviews.llvm.org/D15790

llvm-svn: 256489
2015-12-28 08:09:25 +00:00
Craig Topper 401675ce5b [AVX512] Remove VEX_LIG from vmovd/vmovq instructions. From what I can tell from the Intel docs these instructions require the L-bit to be 0.
llvm-svn: 256486
2015-12-28 06:32:47 +00:00
Craig Topper af88afb214 [AVX512] Fix some places that used FR64 instead of FR64X.
llvm-svn: 256484
2015-12-28 06:11:45 +00:00
Craig Topper c648c9b92d [AVX512] Bring vmovq instruction names into alignment with the AVX and SSE names. Add a missing encoding to disassembler and assembler.
I believe this also fixes a case where a 64-bit memory form that is documented as being unsupported in 32-bit mode was able to be selected there.

llvm-svn: 256483
2015-12-28 06:11:42 +00:00
Craig Topper b4c56624eb [X86] Move address for store target from outs to ins on a couple instructions.
llvm-svn: 256482
2015-12-28 06:11:39 +00:00
Craig Topper cd4621a8ab [X86] Add proper Uses/Defs/mayLoad flags for AAA/AAD/AAM/AAS/DAA/DAS/XLAT instructions.
llvm-svn: 256481
2015-12-28 06:11:37 +00:00
Chandler Carruth 9153911551 [lcg] Fix a few more formatting goofs found by clang-format. NFC.
llvm-svn: 256480
2015-12-28 01:54:20 +00:00
Craig Topper f3ed5c115c [AVX512] Remove separate instruction and patterns for lowering ctlz_zero_undef. Change the operation for CTLZ_ZERO_UNDEF to Expand so SelectionDAG will convert them to CTLZ before lowering.
llvm-svn: 256477
2015-12-27 21:33:50 +00:00
Craig Topper 4b1808d8e7 [SelectionDAG] Teach LegalizeVectorOps to not unroll CTLZ_ZERO_UNDEF and CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
llvm-svn: 256476
2015-12-27 21:33:47 +00:00
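The two CTLZ commits above rely on a simple observation: a count-leading-zeros operation that is defined for a zero input is always a valid implementation of the variant whose result is undefined at zero. A minimal scalar sketch, with `ctlz32` as a hypothetical stand-in for the fully defined operation:

```cpp
#include <cstdint>

// Fully defined count-leading-zeros: returns 32 for X == 0.
// Any caller that only needs the "zero is undefined" variant can use this
// directly, since every nonzero input gives the same answer.
unsigned ctlz32(std::uint32_t X) {
  unsigned Count = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++Count;
  return Count;
}
```

Because the defined form already covers every input, expanding the ZERO_UNDEF node to the plain node needs no extra instructions or patterns, which is the simplification the commits describe.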
Craig Topper c48fa89e44 [AVX512] Remove alternate data type versions of VALIGND, VALIGNQ, VMOVSHDUP and VMOVSLDUP. They don't have any tests and I don't think they can be selected. If they are truly needed they should be implemented with patterns against the normal instructions and not separate instructions.
llvm-svn: 256475
2015-12-27 19:45:21 +00:00
Igor Breger 756c289dd8 AVX512: Change VPMOVB2M DAG lowering, use CVT2MASK node instead of TRUNCATE.
Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.

Differential Revision: http://reviews.llvm.org/D15675

llvm-svn: 256470
2015-12-27 13:56:16 +00:00
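The LSB versus MSB distinction in the commit above can be shown on a single byte. Truncating an integer to one bit keeps the least significant bit, whereas a sign-style mask conversion looks at the most significant bit, so the two cannot share one lowering. The helper names below are hypothetical and model one vector element only.

```cpp
#include <cstdint>

// Truncation to i1 keeps bit 0 of the value.
bool truncToI1(std::uint8_t V) { return (V & 0x01u) != 0; }

// A "convert to mask" style operation tests the top bit instead
// (assumption: this is the behavior the CVT2MASK node stands for).
bool topBitToMask(std::uint8_t V) { return (V & 0x80u) != 0; }
```

For the byte 0x80 the two disagree: `truncToI1` gives false while `topBitToMask` gives true. That difference is why using the MSB for a TRUNCATE would be a miscompile.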
Asaf Badouh b0d91fa42a [X86][AVX512] change broadcast to use maskable pattern
Differential Revision: http://reviews.llvm.org/D15786

llvm-svn: 256469
2015-12-27 12:14:34 +00:00
Chandler Carruth 3a040e6d47 [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

llvm-svn: 256466
2015-12-27 08:41:34 +00:00
Chandler Carruth f49f1a87ef [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FunctionAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

llvm-svn: 256465
2015-12-27 08:13:45 +00:00
Craig Topper f8423c05ee [AVX-512] Remove alternate integer forms for VPERMILPS and VPERMILPD. There are no tests for them and I don't see any way to select them anyway. If they are really needed they should be implemented as patterns and not full fledged instructions.
llvm-svn: 256462
2015-12-27 06:55:08 +00:00
David Majnemer 334676355a [X86, Win64] Use a frame pointer if pushf is emitted
A frame pointer must be used if stack pointer is modified after the
prologue.  LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.

There is a small twist: this sequence might exist in user code via
inline-assembly.  For now, conservatively assume that such functions
require a frame pointer.  For real world justification, please see
clang's implementation of __readeflags.

This fixes PR25945.

llvm-svn: 256456
2015-12-27 06:07:26 +00:00
David Majnemer 081e8fe4c0 [WinEH] Add comments explaining the EH tables
This aids in debugging WinEH; similar functionality is present for
DWARF EH.

llvm-svn: 256455
2015-12-27 06:07:12 +00:00
Sanjay Patel bcff3f7d92 [x86] lower calls to llvm.maxnum.v4f32 using maxps
This is a follow-on to:
http://reviews.llvm.org/rL255700

llvm-svn: 256454
2015-12-26 21:44:55 +00:00
Craig Topper 5ce29aa307 [X86] Fix an unused variable warning in release builds.
llvm-svn: 256453
2015-12-26 20:13:33 +00:00
Craig Topper 7e3ba15529 [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions.
llvm-svn: 256452
2015-12-26 19:48:43 +00:00
Craig Topper fa5f35e6ad [X86] Fold some variable declarations and initializations into if statements. NFC
llvm-svn: 256451
2015-12-26 19:48:37 +00:00
Chen Li d71999ef1b [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types prevents LLVM from merging different gc.statepoint nodes into PHI nodes, which would cause further problems with gc relocations. The patch also changes the way gc.relocate and gc.result look for their corresponding gc.statepoint on the unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint.

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

llvm-svn: 256443
2015-12-26 07:54:32 +00:00
Craig Topper d400019447 [X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct.
llvm-svn: 256433
2015-12-26 04:50:07 +00:00
Craig Topper 53bd5cac86 [X86] Fix copy and paste typo from pasting from another Makefile to restore code.
llvm-svn: 256431
2015-12-25 23:27:57 +00:00
Craig Topper 96c985169b [X86] Put back the include path to the main X86 sources in the AsmParser library to fix the bots.
llvm-svn: 256430
2015-12-25 22:22:16 +00:00
Craig Topper 95e5596228 [X86] Remove X86CodeGen dependency from the AsmParser library.
llvm-svn: 256429
2015-12-25 22:10:11 +00:00
Craig Topper c0453e87dc [X86] Move getX86SubSuperRegisterOrZero to X86MCTargetDesc.cpp so it can be used by AsmParser library without depending on X86CodeGen library.
llvm-svn: 256428
2015-12-25 22:10:08 +00:00
Craig Topper daf2e3ff7a Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC
llvm-svn: 256427
2015-12-25 22:10:01 +00:00
Craig Topper c7277d9485 [X86] Move AVX512 STATIC_ROUNDING enum to X86BaseInfo.h to fix a layering violation in AsmParser.
llvm-svn: 256426
2015-12-25 22:09:49 +00:00
Craig Topper 91dab7baee [X86] Replace MVT::SimpleValueType in the AsmParser library and getX86SubSuperRegister with just an unsigned representing size.
This is a step towards fixing a layering violation so the X86 AsmParser won't depend on CodeGen types.

llvm-svn: 256425
2015-12-25 22:09:45 +00:00
Craig Topper 2c7d7c2584 [X86] Don't pass the default value to the High argument of getX86SubSuperRegister. Most places don't care about this argument. NFC
llvm-svn: 256424
2015-12-25 19:44:16 +00:00
Craig Topper d59bc5188d [X86] getX86SubSuperRegisterOrZero shouldn't call getX86SubSuperRegister recursively. It should call itself instead. Otherwise it might fire an assertion when it was designed not to.
llvm-svn: 256422
2015-12-25 17:07:32 +00:00
Craig Topper 3453a43da9 [X86] Add missing X86II::MRM_C4, MRM_C5, etc. encodings to getMemoryOperandNo. These aren't used by any instructions, but could be someday. NFC
llvm-svn: 256421
2015-12-25 17:07:30 +00:00
Craig Topper f804af209d [X86] Use assert instead of if and llvm_unreachable. NFC
llvm-svn: 256420
2015-12-25 17:07:27 +00:00
Craig Topper 3fb423ef8b [X86] Minor indentation fixes. NFC
llvm-svn: 256419
2015-12-25 17:07:24 +00:00
David Majnemer 11234ed7d3 [CodeGen] Use generic printAsOperand machinery instead of hand rolling it
We already know how to properly print out basic blocks in
printAsOperand, we should not roll it ourselves in
AsmPrinter::EmitBasicBlockStart.  No functionality change is intended.

llvm-svn: 256413
2015-12-25 09:37:26 +00:00
Craig Topper 370c8d6c6b [IR] Mark the Type subclass helper methods 'inline' and move their definitions to DerivedTypes.h so they can be inlined by the compiler.
llvm-svn: 256406
2015-12-25 04:06:20 +00:00
Craig Topper 582d8ecf6a [Transforms] Use asserts instead of ifs around llvm_unreachable. NFC
llvm-svn: 256405
2015-12-25 02:04:17 +00:00
Dan Gohman 8887d1faed [WebAssembly] Fix handling of COPY instructions in WebAssemblyRegStackify.
Move RegStackify after coalescing and teach it to use LiveIntervals instead
of depending on SSA form. This avoids a problem where a register in a COPY
instruction is stackified and then subsequently coalesced with a register
that is not stackified.

This also puts it after the scheduler, which allows us to simplify the
EXPR_STACK constraint, as we no longer have instructions being reordered
after stackification and before coloring.

llvm-svn: 256402
2015-12-25 00:31:02 +00:00
Sanjay Patel ae945e7927 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2

And x86 SSE output improves from:

movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
movdqa	%xmm1, %xmm2
shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
retq

To the almost optimal:

movhpd	(%rdi), %xmm0

Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases. 
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.

Differential Revision: http://reviews.llvm.org/D15096

llvm-svn: 256394
2015-12-24 21:17:56 +00:00
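The two shufflevector instructions in the InstCombine commit above can be traced by hand. The sketch below models shufflevector on small vectors, using an empty optional for undef and a mask index of -1 for an undef mask element. It is an illustration of the semantics only, not LLVM code, and `shuffleVector` is a hypothetical helper.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Elt = std::optional<float>; // std::nullopt models an undef element

// Mask indices 0..A.size()-1 pick from A, larger indices pick from B,
// and -1 produces undef.
std::vector<Elt> shuffleVector(const std::vector<Elt> &A,
                               const std::vector<Elt> &B,
                               const std::vector<int> &Mask) {
  std::vector<Elt> Result;
  for (int Idx : Mask) {
    if (Idx < 0)
      Result.push_back(std::nullopt);
    else if (static_cast<std::size_t>(Idx) < A.size())
      Result.push_back(A[Idx]);
    else
      Result.push_back(B[Idx - static_cast<int>(A.size())]);
  }
  return Result;
}
```

Widening a two-element load {l0, l1} with the mask <0, 1, undef, undef> gives {l0, l1, undef, undef}. Shuffling a four-element %A with that result under the mask <0, 1, 4, 5> then gives {A0, A1, l0, l1}, that is, the low half of %A followed by the loaded pair, which is what a single movhpd from memory produces.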
Dave Bartolomeo a779f5a401 Remove unused constants from TypeTableBuilder.cpp.
llvm-svn: 256389
2015-12-24 19:15:56 +00:00
Bill Seurer 8771bbfbe2 Fix case of path name
llvm-svn: 256388
2015-12-24 18:54:35 +00:00
Dave Bartolomeo dd38b1bf12 Fix CodeView library name and non-CMake builds
llvm-svn: 256387
2015-12-24 18:51:35 +00:00
Dave Bartolomeo 89ba802b92 LLVM CodeView library
Summary: This diff is the initial implementation of the LLVM CodeView library. There is much more work to be done, namely a CodeView dumper and tests. This patch should help others make progress on the LLVM->CodeView debug info emission while I continue with the implementation of the dumper and tests.

This library implements support for emitting debug info in the CodeView format. This phase of the implementation only includes support for CodeView type records. Clients that need to emit type records will use a class derived from TypeTableBuilder. TypeTableBuilder provides member functions for writing each kind of type record; each of these functions eventually calls the writeRecord virtual function to emit the actual bits of the record. Derived classes override writeRecord to implement the folding of duplicate records and the actual emission to the appropriate destination. LLVMCodeView provides MemoryTypeTableBuilder, which creates the table in memory. In the future, other classes derived from TypeTableBuilder will write to other destinations, such as the type stream in a PDB.

The rest of the types in LLVMCodeView define the actual CodeView type records and all of the supporting enums and other types used in the type records. The TypeIndex class is of particular interest, because it is used by clients as a handle to a type in the type table.

The library provides a relatively low-level interface based on the actual on-disk format of CodeView. For example, type records refer to other type records by TypeIndex, rather than by an actual pointer to the referent record. This allows clients to emit type records one at a time, rather than having to keep the entire transitive closure of type records in memory until everything has been emitted. At some point, having a higher-level interface layered on top of this one may be useful for debuggers and other tools that want a more holistic view of the debug info. The lower-level interface should be sufficient for compilers and linkers to do the debug info manipulation that they need to do efficiently.

Reviewers: rnk, majnemer

Subscribers: silvas, rnk, jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14961

llvm-svn: 256385
2015-12-24 18:12:38 +00:00
Marina Yatsina 8dfd5cbb73 [X86][ms-inline asm] Add support for memory operands that include structs
Add ability to reference struct symbols in memory operands.
Test case will be added on the clang side (review http://reviews.llvm.org/D15749)

Differential Revision: http://reviews.llvm.org/D15748

llvm-svn: 256381
2015-12-24 12:09:51 +00:00
Benjamin Kramer 7a5c8c8fe3 [ProfileData] Make helper function static.
No functional change.

llvm-svn: 256375
2015-12-24 10:03:37 +00:00
Benjamin Kramer fe2b541546 [FunctionImport] Move pass into anonymous namespace.
No functional change.

llvm-svn: 256374
2015-12-24 10:03:35 +00:00
Chandler Carruth 85dbea99ee Add a missing const qualifier on the context instruction. This somehow
has always been missing. =/

llvm-svn: 256371
2015-12-24 09:08:08 +00:00
Asaf Badouh 9a5a83a518 [X86][PKU] Add {RD,WR}PKRU encoding
Differential Revision: http://reviews.llvm.org/D15711

llvm-svn: 256366
2015-12-24 08:25:00 +00:00
Elena Demikhovsky 9e225a2f52 AVX-512: Kreg set 0/1 optimization
The patterns that set a mask register to 0/1
KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn
are replaced with
KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization.

KNL does not recognize dependency-breaking idioms for mask registers,
so kxnor %k1, %k1, %k2 has a RAW dependence on %k1.
Using %k0 as the undef input register is a performance heuristic based
on the assumption that %k0 is used less frequently than the other mask
registers, since it is not usable as a write mask.

Differential Revision: http://reviews.llvm.org/D15739

llvm-svn: 256365
2015-12-24 08:12:22 +00:00
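The Kreg set 0/1 optimization above works because the result of XOR or XNOR of a register with itself does not depend on the register's contents. A minimal sketch on 16-bit masks, with `kxor` and `kxnor` as hypothetical scalar stand-ins for the mask instructions:

```cpp
#include <cstdint>

using Mask16 = std::uint16_t;

// Scalar models of the mask-register operations.
Mask16 kxor(Mask16 A, Mask16 B) { return static_cast<Mask16>(A ^ B); }
Mask16 kxnor(Mask16 A, Mask16 B) { return static_cast<Mask16>(~(A ^ B)); }
```

For any value K, `kxor(K, K)` is 0 and `kxnor(K, K)` is 0xFFFF, so the choice of input register is free. The commit uses that freedom to read %k0, which it expects to be written less often, and so reduces the chance of stalling on a false read-after-write dependence.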
Igor Breger 268f6f53c5 AVX512: VPMOVM2B/W/D/Q intrinsic implementation.
Differential Revision: http://reviews.llvm.org//D15747

llvm-svn: 256364
2015-12-24 07:11:53 +00:00
Craig Topper 73275a2951 Use range-based for loops. NFC
llvm-svn: 256363
2015-12-24 05:20:40 +00:00
Matt Arsenault 4339b3ff35 AMDGPU: Fix getRegisterBitWidth for vectors
llvm-svn: 256362
2015-12-24 05:14:55 +00:00
Nico Weber 95cc9d5f14 Revert r256336, it caused PR25939
llvm-svn: 256361
2015-12-24 04:01:06 +00:00
Tom Stellard 5ebdfbe562 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

llvm-svn: 256360
2015-12-24 03:18:18 +00:00
Tom Stellard 668f793049 AMDGPU/SI: Remove non-existent flat instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15734

llvm-svn: 256357
2015-12-24 02:41:55 +00:00
Philip Reames cb0f947a2a [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spill inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintain information about stack slots. I originally considered (and prototyped in http://reviews.llvm.org/D15632) the idea of using the existing isSpillSlot flag, but ended up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us, in the future, to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from (deopt, gc, regalloc).

Differential Revision: http://reviews.llvm.org/D15759

llvm-svn: 256352
2015-12-23 23:44:28 +00:00
Philip Reames 4e66c84722 [MemOperands] Clarify code around dropping memory operands [NFC]
Clarify a comment about what it means to drop memory operands from an instruction.  While I'm at it, change the name of the method slightly to make it a bit more clear what's going on when reading calling code.

llvm-svn: 256346
2015-12-23 19:16:04 +00:00
Keno Fischer 9bc46b117b [Function] Properly remove use when clearing personality
Summary:
We need to actually remove the use of the personality function,
otherwise we can run into trouble if we want to e.g. delete
the personality function because there's no way to get rid of
its uses. Do this by resetting to the ConstantPointerNull value
that the operands are set to when first allocated.

Reviewers: vsk, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15752

llvm-svn: 256345
2015-12-23 18:27:23 +00:00
JF Bastien 61ad8b3907 Fix SCEV r256338.
llvm-svn: 256344
2015-12-23 18:18:53 +00:00
Sanjoy Das 2fbfb25ad6 [SCEV] Fix getLoopBackedgeTakenCounts
The way `getLoopBackedgeTakenCounts` is written right now isn't
correct. It will try to compute and store the BE counts of a Loop
 #{child loop} number of times (which may be zero).

llvm-svn: 256338
2015-12-23 17:48:14 +00:00
Chad Rosier fba65d2fd3 [LIR] General refactoring to simplify code and ease future code review.
Move several checks into isLegalStores. Also, delineate between those stores
that are memset-able and those that are memcpy-able.

http://reviews.llvm.org/D15683
Patch by Haicheng Wu <haicheng@codeaurora.org>!

llvm-svn: 256336
2015-12-23 17:29:33 +00:00
Philip Reames 42bd26f29d [MachineLICM] Fix handling of memoperands
As far as I can tell, the correct interpretation of an empty memoperands list is that we didn't have sufficient room to store information about the MachineInstr, NOT that the MachineInstr doesn't access any particular bit of memory. This appears to be fairly consistent in a number of places, but I'm not 100% sure of this interpretation. I'd really appreciate someone more knowledgeable confirming my reading of the code.

This patch fixes two latent bugs in MachineLICM - given the above assumption - and adds comments to document the meaning and required handling. I don't have test cases; these were noticed by inspection.

Differential Revision: http://reviews.llvm.org/D15730

llvm-svn: 256335
2015-12-23 17:05:57 +00:00
Simon Pilgrim 17377bdd45 [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

Differential Revision: http://reviews.llvm.org/D15477

llvm-svn: 256332
2015-12-23 13:10:07 +00:00
David Majnemer 2bc2538470 [OperandBundles] Have GlobalsModRef play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256329
2015-12-23 09:58:46 +00:00
David Majnemer 63ad9e0543 [OperandBundles] Have TailCallElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

This fixes PR25928.

llvm-svn: 256328
2015-12-23 09:58:43 +00:00
David Majnemer 02f4787e45 [OperandBundles] Have InstCombine play nice with operand bundles
Don't assume a call's use corresponds to an argument operand, it might
correspond to a bundle operand.

llvm-svn: 256327
2015-12-23 09:58:41 +00:00
David Majnemer 464be3724a [OperandBundles] Have DeadArgElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256326
2015-12-23 09:58:36 +00:00
Igor Breger 7b46b4e798 AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immediate shift v64i8. Fix predicate for AVX1/2 shifts.
Differential Revision: http://reviews.llvm.org/D15713

llvm-svn: 256324
2015-12-23 08:06:50 +00:00
David Majnemer c640f863e0 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

llvm-svn: 256317
2015-12-23 03:59:04 +00:00
Paul Robinson 22d0d31a72 Form reform for MCDwarf.
MCDwarf emits a canned abbreviation table, but was not emitting proper
forms for DWARF version 4, which is the default after r249655.

Differential Revision: http://reviews.llvm.org/D15732

llvm-svn: 256313
2015-12-23 01:57:31 +00:00
Philip Reames ee8f055327 [GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC]
Reasons:
1) The existing form was a form of false generality.  None of the implemented GCStrategies use anything other than a type.  It's becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point.
2) The API was awkward when applied to vectors-of-pointers.  The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives.  
3) The rewriting implementation effectively assumes the type based predicate as well.  We should be consistent.

llvm-svn: 256312
2015-12-23 01:42:15 +00:00
Dan Gohman 08d58bcf6a [WebAssembly] Add a TODO comment for a possible future optimization.
llvm-svn: 256306
2015-12-23 00:22:04 +00:00
Manuel Jacob a4efd8ac2e [RS4GC] Fix base pair printing for constants.
Previously, "%" + name of the value was printed for each derived and base
pointer.  This is correct for instructions, but wrong for e.g. globals.

llvm-svn: 256305
2015-12-23 00:19:45 +00:00
Akira Hatanaka 1cb242eb13 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r256277 with two changes:

- In emitFnAttrCompatCheck, change FuncName's type to std::string to fix
  a use-after-free bug.
- Remove an unnecessary install-local target in lib/IR/Makefile. 

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256304
2015-12-22 23:57:37 +00:00
Cong Hou 6a2c71af0b [BPI] Fix two potential divide-by-zero operations that are introduced in r256263.
llvm-svn: 256303
2015-12-22 23:45:55 +00:00
Dan Gohman a2b2cdc813 [WebAssembly] Trim unneeded #includes. NFC.
llvm-svn: 256301
2015-12-22 23:45:21 +00:00
Dan Gohman cc38ba1954 [WebAssembly] Minor code simplification. NFC.
llvm-svn: 256300
2015-12-22 23:39:16 +00:00
Changpeng Fang b41574a961 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason, executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256282
2015-12-22 20:55:23 +00:00
Rafael Espindola 10d9a033db Also add unnamed_addr to functions.
llvm-svn: 256281
2015-12-22 20:43:30 +00:00
Akira Hatanaka 9c05cc5670 Revert r256277 and r256279.
Some of the bots failed again.

llvm-svn: 256280
2015-12-22 20:29:09 +00:00
Akira Hatanaka 3f1bf25db1 Add a .td file I forgot to add in r256277.
llvm-svn: 256279
2015-12-22 20:06:50 +00:00
Akira Hatanaka a61deb249b Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252990 and r252949. I've added member function getKind
to the Attr classes which returns the enum or string of the attribute.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256277
2015-12-22 20:00:05 +00:00
Rafael Espindola 5349d87a69 Delete dead GlobalAliases.
llvm-svn: 256276
2015-12-22 19:50:22 +00:00
Rafael Espindola 4b0d24c00a Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

llvm-svn: 256275
2015-12-22 19:46:44 +00:00
Rafael Espindola 2cc46b3701 Merge duplicated code.
The code for deleting dead global variables and functions was
duplicated.

This is in preparation for also deleting dead global aliases.

llvm-svn: 256274
2015-12-22 19:38:07 +00:00
Changpeng Fang 9b8a9be058 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason, executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256273
2015-12-22 19:32:28 +00:00
Rafael Espindola 9f0bebc3da Use early continue to reduce indentation.
llvm-svn: 256272
2015-12-22 19:26:18 +00:00
Rafael Espindola e4ed0e56ce Simplify iterator management. NFC.
Not passing an iterator to processGlobal will allow it to work with
other GlobalValues.

llvm-svn: 256271
2015-12-22 19:16:50 +00:00
Cong Hou e93b8e1539 [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replaces
them with probability versions. With this patch, we won't use edge weights
anymore in either IR or MC passes. Edge probability is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Manuel Jacob 4e4f60ded0 Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics.
Summary:
These were deprecated 11 months ago when a generic
llvm.experimental.gc.result intrinsic, which works for all types, was added.

Reviewers: sanjoy, reames

Subscribers: sanjoy, chenli, llvm-commits

Differential Revision: http://reviews.llvm.org/D15719

llvm-svn: 256262
2015-12-22 18:44:45 +00:00
Vedant Kumar d167586a28 [Support] Allow multiple paired calls to {start,stop}Timer()
Differential Revision: http://reviews.llvm.org/D15619

Reviewed-by: rafael
llvm-svn: 256258
2015-12-22 17:36:17 +00:00
Manuel Jacob 990dfa6fe5 [RS4GC] Fix crash in the case that a live variable has a constant base.
Summary:
Previously, RS4GC crashed in CreateGCRelocates() because it assumed
that every base is also in the array of live variables, which isn't true if a
live variable has a constant base.

This change fixes the crash by making sure CreateGCRelocates() won't try to
relocate a live variable with a constant base.  This would be unnecessary
anyway because anything with a constant base won't move.

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15556

llvm-svn: 256252
2015-12-22 16:50:44 +00:00
Jun Bum Lim 6755c3bc5f [AArch64] Promote loads from stores
This is a recommit of r256004 which was reverted in r256160. The issue was the
incorrect promotion for half and byte loads transformed into mov instructions.
This fix will replace half and byte type loads only with bit field extracts.

Original commit message:

This change promotes load instructions which directly read from stores by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31
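
The equivalence in the example above can be checked with a small host-side
model. This is only a sketch, not the pass itself: it assumes a little-endian
host, assumes that STRWui scales its immediate by 4 and LDRHHui by 2, and the
helper names storeThenLoadHalf and bitfieldExtract are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Models "STRWui %W1, %X0, 1" followed by "%W0 = LDRHHui %X0, 3":
// a 32-bit store at byte offset 4, then a 16-bit load at byte offset 6.
// Assumption: little-endian memory layout.
inline uint32_t storeThenLoadHalf(uint32_t w1) {
  unsigned char mem[8] = {};
  std::memcpy(mem + 1 * 4, &w1, sizeof(w1));      // word store, offset 1 * 4
  uint16_t half = 0;
  std::memcpy(&half, mem + 3 * 2, sizeof(half));  // halfword load, offset 3 * 2
  return half;                                    // zero-extended result
}

// Models "UBFMWri %W1, 16, 31": move bits [31:16] of W1 into bits [15:0].
inline uint32_t bitfieldExtract(uint32_t w1) {
  return (w1 >> 16) & 0xFFFFu;
}
```

Both helpers return the upper half of the stored word, which is why the load
can be replaced without touching memory.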

llvm-svn: 256249
2015-12-22 16:36:16 +00:00
Chad Rosier a108010385 Typo. NFC.
llvm-svn: 256242
2015-12-22 15:06:47 +00:00
Asaf Badouh 13ffa4bf7c [X86][AVX512] Add rcp14 and rsqrt14 intrinsics
Differential Revision: http://reviews.llvm.org/D15414

llvm-svn: 256237
2015-12-22 11:40:04 +00:00
Keno Fischer 4eccf11373 [ASMPrinter] Fix missing handling of DW_OP_bit_piece
In r256077, I added printing for DIExpressions in DEBUG_VALUE comments,
but neglected to handle DW_OP_bit_piece operands. Thanks to
Mikael Holmen and Joerg Sonnenberger for spotting this.

llvm-svn: 256236
2015-12-22 07:14:50 +00:00
Kostya Serebryany b0fb6e8508 [libFuzzer] add AFL-style dictionary for C++, remove the old file with tokens
llvm-svn: 256229
2015-12-22 01:50:51 +00:00
David Majnemer ff1d084aa2 [MC] Don't use the architecture to govern which object file format to use
InitMCObjectFileInfo was trying to override the triple in awkward ways.
For example, a triple specifying COFF but not Windows was forced as ELF.
This makes it easy for internal invariants to get violated, such as
those which triggered PR25912.

This fixes PR25912.

llvm-svn: 256226
2015-12-22 01:39:04 +00:00
Teresa Johnson d213aa469e Handle empty Subprogram list when linking metadata.
Use an iterator that handles an empty subprogram list.

Fixes PR25915.

llvm-svn: 256224
2015-12-22 01:17:19 +00:00
Easwaran Raman bdb6f1dcc3 Determine callee's hotness and adjust threshold based on that. NFC.
This uses the same criteria used in CFE's CodeGenPGO to identify hot and cold
callees and uses values of inlinehint-threshold and inlinecold-threshold
respectively as the thresholds for such callees.

Differential Revision: http://reviews.llvm.org/D15245

llvm-svn: 256222
2015-12-22 00:32:35 +00:00
Evgeniy Stepanov 8827f2db85 [safestack] Add option for non-TLS unsafe stack pointer.
This patch adds an option, -safe-stack-no-tls, for using normal
storage instead of thread-local storage for the unsafe stack pointer.
This can be useful when SafeStack is applied to an operating system
kernel.

http://reviews.llvm.org/D15673

Patch by Michael LeMay.

llvm-svn: 256221
2015-12-22 00:13:11 +00:00
Xinliang David Li 5fe0455563 [PGO] Fix another comdat related issue for COFF
The linker requires that a comdat section must be associated
with another comdat section that precedes it. This
means the comdat section's name needs to use the profile name
var's name.

Patch tested by Johan Engelen.

llvm-svn: 256220
2015-12-22 00:11:15 +00:00
Vedant Kumar 11dc6dc71e [Support] Timer: Use emplace_back() and range-based loops (NFC)
llvm-svn: 256217
2015-12-21 23:41:38 +00:00
Vedant Kumar 3f79e32593 [Support] Timer: simplify the init() method
llvm-svn: 256215
2015-12-21 23:27:44 +00:00
Dylan McKay 751a449e2f [AVR] Added configuration file and machine function information class
This commit adds the 'AVRMachineFunctionInfo' class, which simply stores
basic properties about generated machine functions.

llvm-svn: 256213
2015-12-21 23:13:15 +00:00
Eric Christopher 213a5daab7 Fix line endings after r256155. NFC.
llvm-svn: 256211
2015-12-21 23:04:27 +00:00
Evgeniy Stepanov fda72c52a2 [cfi] Fix LowerBitSets on 32-bit targets.
This code attempts to truncate IntPtrTy to i32, which may be the same
type.

llvm-svn: 256205
2015-12-21 22:14:04 +00:00
David Majnemer 03e2cc3007 [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initializing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Jun Bum Lim a23e5f7516 Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
This is a recommit of r256028 with minor fixes in unittests:
  CodeGen/Mips/eh.ll
  CodeGen/Mips/insn-zero-size-bb.ll

Original commit message:

When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

llvm-svn: 256202
2015-12-21 22:00:51 +00:00
Xinliang David Li ab361efee7 Resubmit r256193 with test fix: assertion failure analyzed
llvm-svn: 256201
2015-12-21 21:52:27 +00:00
Xinliang David Li 13da1f149e Revert r256193: build bot failure triggered
llvm-svn: 256198
2015-12-21 21:00:33 +00:00
Cong Hou 8df93ce455 [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
This patch transforms truncation between vectors of integers into
X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in
lowering phase because after type legalization, the original truncation
will be turned into a BUILD_VECTOR with each element that is extracted
from a vector and then truncated, and from them it is difficult to do
this optimization. This greatly improves the performance of truncations
on some specific types.

Cost table is updated accordingly.


Differential revision: http://reviews.llvm.org/D14588

llvm-svn: 256194
2015-12-21 20:42:43 +00:00
Xinliang David Li 6c494cd0df [PGO] Fix profile var comdat generation problem with COFF
When targeting COFF, a comdat section is required to
have a global object with the same name as the comdat (except for
comdats whose selection kind is associative). This fix makes
sure that the comdat is keyed on the data variable for COFF.

Also improved test coverage for this.

llvm-svn: 256193
2015-12-21 20:41:20 +00:00
Michael Zolotukhin 0c97988e54 [ValueTracking] Properly handle non-sized types in isAligned function.
Reviewers: apilipenko, reames, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15597

llvm-svn: 256192
2015-12-21 20:38:18 +00:00
Adrian Prantl ce8581389b Fix PR24563 (LiveDebugVariables unconditionally propagates all DBG_VALUEs)
LiveDebugVariables unconditionally propagates all DBG_VALUE down the
dominator tree, which happens to work fine if there already is another
DBG_VALUE or the DBG_VALUE happens to describe a single-assignment vreg
but is otherwise wrong if the DBG_VALUE is coming from only one of the
predecessors.

In r255759 we introduced a proper data flow analysis scheduled after
LiveDebugVariables that correctly propagates DBG_VALUEs across basic block
boundaries. With the new pass in place, the incorrect propagation in
LiveDebugVariables can be retired without losing any of the benefits
where LiveDebugVariables happened to do the right thing.

llvm-svn: 256188
2015-12-21 20:03:00 +00:00
Adrian Prantl 5d9acc2443 Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging
instructions.

As noted in PR24563.
rdar://problem/23963293

llvm-svn: 256183
2015-12-21 19:25:03 +00:00
Tom Stellard 2b65ed306d AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI
Summary:
These registers have different encodings on CI and VI, so we add pseudo
FLAT_SCRATCH registers to be used before MC, and subtarget specific
registers to be used by the MC layer.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15661

llvm-svn: 256178
2015-12-21 18:44:27 +00:00
Tom Stellard 9da8620cdb AMDGPU/SI: Change assembly name for flat scratch registers to flat_scratch
This matches what the assembler accepts.

llvm-svn: 256177
2015-12-21 18:44:21 +00:00
Matthew Simpson 11c4de6054 [AArch64] Add additional extract-extend patterns for smov
This patch adds to the target description two additional patterns for matching
extract-extend operations to SMOV. The patterns catch the v16i8-to-i64 and
v8i16-to-i64 cases. The existing patterns miss these cases because the
extracted elements must first be legalized to i32, resulting in any_extend
nodes.

This was originally implemented as a DAG combine (r255895), but was reverted
due to failing out-of-tree tests.

llvm-svn: 256176
2015-12-21 18:31:25 +00:00
Chad Rosier 353d71914a Remove extra whitespace. NFC.
llvm-svn: 256173
2015-12-21 18:08:05 +00:00
Teresa Johnson 4f04d85fa6 [ThinLTO] Rename variable to reflect bulk importing change (NFC)
llvm-svn: 256171
2015-12-21 17:33:24 +00:00
Dan Gohman d544e0c100 [WebAssembly] Convert a regular for loop to a range-based for loop.
llvm-svn: 256169
2015-12-21 17:22:02 +00:00
Dan Gohman d9b4cdb68d [WebAssembly] Clean up comments and fix a missing #include dependency.
llvm-svn: 256168
2015-12-21 17:19:31 +00:00
Dan Gohman 979b766fef [WebAssembly] Remove an unneeded empty destructor.
llvm-svn: 256167
2015-12-21 17:12:40 +00:00
Dan Gohman d587aa5917 [WebAssembly] Enclose the operand variables for load and store instructions in braces.
This allows the AsmMatcherEmitter to properly tokenize the AsmStrings for
load and store instructions. This is a step towards asm parsing.

llvm-svn: 256166
2015-12-21 16:58:49 +00:00
Dan Gohman a783f10c16 [WebAssembly] Mark the ARGUMENT pseudo-instructions as CodeGenOnly.
llvm-svn: 256165
2015-12-21 16:53:29 +00:00
Dan Gohman dd20c70b61 [WebAssembly] Add some comments and make some minor source cleanups.
llvm-svn: 256164
2015-12-21 16:50:41 +00:00
Dan Gohman 216e0c2ffe Teach MCOperand::print how to print FPImm operands.
llvm-svn: 256163
2015-12-21 16:47:10 +00:00
Teresa Johnson 4034d55158 Remove unused functions from ModuleLinker (NFC)
Remove a couple ModuleLinker methods and a related static function that
are no longer used after the linker split.

llvm-svn: 256162
2015-12-21 15:49:59 +00:00
Teresa Johnson 3470295967 Remove overly strict new assert in BitcodeReader.
This fixes a bug introduced by the ThinLTO metadata linking patch
r255909. The assert is overly-strict and while useful in development of
the patch, doesn't seem interesting to keep.

Fixes PR25907.

llvm-svn: 256161
2015-12-21 15:38:13 +00:00
Jun Bum Lim 4bb171c8da Revert "[AArch64] Promote loads from stores"
This reverts commit r256004 due to a failure in cortex-a53.

llvm-svn: 256160
2015-12-21 15:36:49 +00:00
Chad Rosier 94274fb1ad [LIR] Refactor code to enable future patch. NFC.
llvm-svn: 256159
2015-12-21 14:49:32 +00:00
Chad Rosier d016574df8 [AArch64] Enable PostRAScheduler for AArch64 generic build.
Disable post-ra scheduler for perturbed tests to appease the bots and to
preserve the history of the tests.

http://reviews.llvm.org/D15652

llvm-svn: 256158
2015-12-21 14:43:45 +00:00
Igor Breger 44b60a3687 AVX512BW: Enable AND/OR/XOR vector byte/word packed operations by promoting to qword, which is natively supported.
llvm-svn: 256157
2015-12-21 14:40:36 +00:00
Amjad Aboud 60b5e1b6c0 Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

llvm-svn: 256155
2015-12-21 14:07:14 +00:00
Zlatko Buljan 5da2f6cd03 [mips][microMIPS] Implement DERET and DI instructions and check size operand for EXT and DEXT* instructions
Differential Revision: http://reviews.llvm.org/D15570

llvm-svn: 256152
2015-12-21 13:08:58 +00:00
David Majnemer 18663f8787 [MC, COFF] Unbreak support for COFF timestamps
Support for COFF timestamps was unintentionally broken in r246905 when
it was conditionally available depending on whether or not LLVM was
configured with LLVM_ENABLE_TIMESTAMPS.  However, Config/config.h was
never included which essentially broke the feature.  Due to lax testing,
the breakage was never identified until we observed strange failures
during incremental links of Chromium.

This issue is resolved by simply including Config/config.h in
WinCOFFObjectWriter and teaching lit that the MC/COFF/timestamp.s test
is conditionally supported depending on LLVM_ENABLE_TIMESTAMPS.  With
this in place, we can strengthen the test to ensure that it will not
accidentally get broken in the future.

This fixes PR25891.

llvm-svn: 256137
2015-12-21 08:03:07 +00:00
NAKAMURA Takumi 9ec6a826dd [Cygwin] Enable TLS as emutls.
It resolves clang selfhosting with std::once() for Cygwin.

FIXME: It may be EmulatedTLS-generic also for X86-Android.
FIXME: Pass EmulatedTLS to LLVM CodeGen from Clang with -femulated-tls.
llvm-svn: 256134
2015-12-21 02:37:23 +00:00
Manuel Jacob 8050a49737 [RS4GC] Add an assert which fails if there is a (yet unsupported) addrspacecast.
The slightly strange indentation comes from clang-format.

llvm-svn: 256132
2015-12-21 01:26:46 +00:00
Craig Topper eafbd57ebc [InstCombine] Fix indentation. NFC.
llvm-svn: 256131
2015-12-21 01:02:28 +00:00
Dylan McKay f061e9b7b2 [AVR] Added AVRCallingConv.td
llvm-svn: 256130
2015-12-20 23:17:44 +00:00
Craig Topper ca66fc5473 [X86] Use range-based for loop. NFC
llvm-svn: 256127
2015-12-20 18:41:57 +00:00
Craig Topper 074e845260 [X86] Prevent constant hoisting for a couple compare immediates that the selection DAG knows how to optimize into a shift.
This allows "icmp ugt %a, 4294967295" and "icmp uge %a, 4294967296" to be optimized into right shifts by 32 which can fold the immediate into the shift instruction. These patterns show up with some regularity in real code.

Unfortunately, since getImmCost can't see the icmp predicate, we can't tell whether we're only catching these specific cases.
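
The two compares named above are both equivalent to asking whether any bit
above bit 31 is set, which is what makes the shift form possible. A minimal
sketch of that equivalence on 64-bit values (the function names are
hypothetical and this is not the selection DAG code):

```cpp
#include <cstdint>

// "icmp ugt %a, 4294967295": a is larger than the biggest 32-bit value.
inline bool cmpUgt(uint64_t a) { return a > 4294967295ULL; }

// "icmp uge %a, 4294967296": a is at least 2^32.
inline bool cmpUge(uint64_t a) { return a >= 4294967296ULL; }

// Shift form: a right shift by 32 leaves a nonzero value exactly when
// some bit in positions 32..63 is set, so no 64-bit immediate is needed.
inline bool highBitsSet(uint64_t a) { return (a >> 32) != 0; }
```

All three predicates agree for every 64-bit input, including the boundary
values 4294967295 and 4294967296.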

llvm-svn: 256126
2015-12-20 18:41:54 +00:00
Dylan McKay 029346f438 Add AVR.td and AVRRegisterInfo.td
Summary:
This adds the core AVR TableGen file, along with the register descriptions.

Lines in AVR.td which require other TableGen files which haven't been committed
yet are commented out.

This is a fairly trivial patch, and should only require a quick review.

I kept the line width smaller than 80 columns, but there are a few exceptions
because I'm not sure how to split a string over several lines.

Reviewers: stoklund

Subscribers: dylanmckay, agnat

Differential Revision: http://reviews.llvm.org/D14684

llvm-svn: 256120
2015-12-20 12:16:20 +00:00
Xinliang David Li 6005728843 Fix a latent UAF bug in profwriter
llvm-svn: 256116
2015-12-20 08:46:18 +00:00
Weiming Zhao 613c6862fa Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for Thumb2
Summary:
r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb.

r250697: http://reviews.llvm.org/rL250697

Reviewers: rmaprath

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15653

llvm-svn: 256115
2015-12-20 06:41:44 +00:00
Xinliang David Li a716cc5c33 [PGO] Improve Indexed Profile Reader efficiency
With the support of value profiling added, the Indexed prof
reader gets less efficient. The prof reader initialization
used to be just reading the file header, but with VP support
added, initialization needs to walk through all profile keys
of ondisk hash table resulting in very poor locality and large
memory increase (keys are stored together with the profile data
in the mapped profile buffer). Even worse, when the reader is 
used by the compiler (not the llvm-profdata tool), the penalty becomes
very high as compilation of each single module requires touching
profile data buffer for the whole program. 

In this patch, the icall target values (MD5 hashes) are no longer eagerly
converted back to name strings when the data is read into memory. A new
interface is added to the profile reader so that InstrProfSymtab can be
lazily created for the Indexed profile reader on demand. Creation of the
symtab is intended to be used by the llvm-profdata tool for symbolic dumping
of VP data. It can be used with the compiler (for legacy out-of-tree uses)
too, but this is not recommended due to the compile time and memory reasons
mentioned above.

Some other cleanups are also included: Function Addr to md5 map is now
consolidated into InstrProfSymtab. InstrProfStringtab is no longer used and
eliminated.

llvm-svn: 256114
2015-12-20 06:22:13 +00:00
Xinliang David Li 5c24da5d8e Minor clean up -- move large single use method out of header(NFC)
llvm-svn: 256113
2015-12-20 05:15:45 +00:00
Sanjoy Das ab0626e35f Nonnull elements in OperandBundleCallSites are not all Instructions
`CloneAndPruneIntoFromInst` sometimes RAUW's dead instructions with
`undef` before erasing them (to avoid deleting instructions that still
have uses).  This changes the `WeakVH` in `OperandBundleCallSites` to
hold an `undef`, and we need to guard against this eventuality
in `llvm::InlineFunction`.

llvm-svn: 256110
2015-12-19 22:40:28 +00:00
Rafael Espindola 30941d264b Delete APIs that have been deprecated since 2010.
llvm-svn: 256107
2015-12-19 21:42:07 +00:00
Rafael Espindola e01e363fd9 Assert that we have all use/users in the getters.
An error that is pretty easy to make is to use the lazy bitcode reader
and then do something like

if (V.use_empty())

The problem is that uses in unmaterialized functions are not accounted
for.

This patch adds asserts that all uses are known.

llvm-svn: 256105
2015-12-19 20:03:23 +00:00
Manuel Jacob 5b90b147d4 Remove unnecessary casts. NFC.
llvm-svn: 256101
2015-12-19 18:38:42 +00:00
Matt Arsenault d206d6cc54 SelectionDAG: Cleanup integer bin op promotion functions.
SDIV and UDIV had special handling, but this is the same handling
that min/max need.

llvm-svn: 256098
2015-12-19 17:18:43 +00:00
Vedant Kumar 3a63fb316c Re-reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences. Does not
use binary literals.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256095
2015-12-19 08:52:49 +00:00
Vedant Kumar 44dd9871e8 Revert "Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist""
This reverts commit r256093.

This broke lld-x86_64-win7 because of -Werror,-Wc++1y-extensions.

llvm-svn: 256094
2015-12-19 08:48:43 +00:00
Vedant Kumar d481752e68 Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256093
2015-12-19 08:29:51 +00:00
Vedant Kumar e069c4b6d1 Revert "[IR] Move optional data in llvm::Function into a hungoff uselist"
This reverts commit r256090.

This broke llvm-clang-lld-x86_64-debian-fast.

llvm-svn: 256091
2015-12-19 07:30:44 +00:00
Vedant Kumar be7525d4fa [IR] Move optional data in llvm::Function into a hungoff uselist
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256090
2015-12-19 07:08:56 +00:00
Kostya Serebryany 550e9c80a6 [libFuzzer] deprecate -save_minimized_corpus, -merge can be used instead
llvm-svn: 256086
2015-12-19 03:42:16 +00:00
Kostya Serebryany bf65644c97 [libFuzzer] split the tests to run them in parallel, remove one redundant test
llvm-svn: 256085
2015-12-19 03:35:30 +00:00
Tom Stellard ffc1a5aef7 AMDGPU/SI: Fix implementation of isSourceOfDivergence() for graphics shaders
Summary:
The analysis of shader inputs was completely wrong.  We were passing the
wrong index to AttributeSet::hasAttribute() and the logic for which
inputs were in SGPRs was wrong too.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15608

llvm-svn: 256082
2015-12-19 02:54:15 +00:00
Kostya Serebryany 27ab2d759f [libFuzzer] make CrossOver just one of the other mutations
llvm-svn: 256081
2015-12-19 02:49:09 +00:00
Philip Reames 5d54689bca [RS4GC] Remove an overly strong assertion
As shown by the included test case, it's reasonable to end up with constant references during base pointer calculation.  The code actually handled this case just fine; we only had the assert to help isolate problems under the belief that constant references shouldn't be present in IR generated by managed frontends. This turned out to be wrong on two fronts: 1) Manuel Jacob is working on a language with constant references, and 2) we found a case where the optimizer does create them in practice.

llvm-svn: 256079
2015-12-19 02:38:22 +00:00
Keno Fischer 00cbf9a69a Clean up the processing of dbg.value in various places
Summary:
First up is instcombine, where in the dbg.declare -> dbg.value conversion,
the llvm.dbg.value needs to be called on the actual loaded value, rather
than the address (since the whole point of this transformation is to be
able to get rid of the alloca). Further, now that that's cleaned up, we
can remove a hack in the backend, that would add an implicit OP_deref if
the argument to dbg.value was an alloca. This stems from before the
existence of DIExpression and is no longer necessary since the deref can
be expressed explicitly.

Now, in order to make sure that the tests pass with this change, we need to
correct the printing of DEBUG_VALUE comments to take into account the
expression, which wasn't taken into account before.

Unfortunately, for both these changes, there were a number of incorrect
test cases (mostly the wrong number of DW_OP_derefs, but also a couple
where the test itself was broken more badly). aprantl and I have gone
through and adjusted these test cases in order to make them pass with
these fixes and in some cases to make sure they're actually testing
what they are meant to test.

Reviewers: aprantl

Subscribers: dsanders

Differential Revision: http://reviews.llvm.org/D14186

llvm-svn: 256077
2015-12-19 02:02:44 +00:00
Matt Arsenault 2aed6ca1d3 AMDGPU: Switch barrier intrinsics to using convergent
noduplicate prevents unrolling of small loops that happen to have
barriers in them. If a loop has a barrier in it, it is OK to duplicate
it for the unroll.

llvm-svn: 256075
2015-12-19 01:46:41 +00:00
Matt Arsenault 10a509292c Fix broken type legalization of min/max
This was using an anyext when promoting the type
when zext/sext is required.

llvm-svn: 256074
2015-12-19 01:39:48 +00:00
Nicolai Haehnle 6bcf8b2890 AMDGPU/SI: use S_MOV_B64 for larger copies in copyPhysReg
Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15629

llvm-svn: 256073
2015-12-19 01:36:26 +00:00
Nicolai Haehnle dd58705af6 AMDGPU: fix overlapping copies in copyPhysReg
Summary:
When copying aggregate registers within the same register class, there may
be an overlap between source and destination that forces us to do the copy
backwards.

Do the simplest possible thing that guarantees the correct order of moves
when there are overlaps, and does whatever when there is no overlap. (The
last part forces some trivial adjustments to test cases.)
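
The ordering problem described above is the same one memmove has to solve. A
minimal sketch, assuming the aggregate register is modeled as a run of
consecutive slots in a vector (copyRegs is a hypothetical helper, not the
copyPhysReg implementation):

```cpp
#include <cstddef>
#include <vector>

// Copies `count` consecutive slots from index `src` to index `dst`, one slot
// at a time, the way an aggregate copy is split into sub-register moves.
inline void copyRegs(std::vector<int> &regs, std::size_t dst, std::size_t src,
                     std::size_t count) {
  if (dst <= src) {
    // Destination starts at or below the source: copying forwards never
    // overwrites a slot that still has to be read.
    for (std::size_t i = 0; i < count; ++i)
      regs[dst + i] = regs[src + i];
  } else {
    // Destination starts above the source: copy backwards so that the
    // overlapping slots are read before they are overwritten.
    for (std::size_t i = count; i-- > 0;)
      regs[dst + i] = regs[src + i];
  }
}
```

When the ranges do not overlap, either direction gives the same result, which
matches the remark that the order does not matter in that case.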

Together with r255906, this fixes a VM fault in Unreal Elemental Demo.

While at it, change the generation of kill and def flags to something that
looks more reasonable. This method is used very late during compilation, so
it probably doesn't matter in practice, and to be honest, I don't know if
this change is actually correct because the semantics in connection with
aggregate registers vs. sub-registers are not clear to me.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93264

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15622

llvm-svn: 256072
2015-12-19 01:16:06 +00:00
Kostya Serebryany 14c50288cc [libFuzzer] print successful mutations sequences
llvm-svn: 256071
2015-12-19 01:09:49 +00:00
Rafael Espindola 2339ffed97 Deprecate a few C APIs.
This deprecates:
* LLVMParseBitcode
* LLVMParseBitcodeInContext
* LLVMGetBitcodeModuleInContext
* LLVMGetBitcodeModule

They are replaced with the functions with a 2 suffix which do not record
a diagnostic.

llvm-svn: 256065
2015-12-18 23:46:42 +00:00
Xinliang David Li 020f22d810 [PGO] Cleanup: Move large member functions out of line (NFC)
llvm-svn: 256058
2015-12-18 23:06:37 +00:00
Xinliang David Li 49ee76d082 [PGO] Simplify computehash interface (NFC)
llvm-svn: 256047
2015-12-18 22:22:12 +00:00
Alexey Samsonov 1eaae4c3b1 [Symbolize] Improve the ownership of parsed objects.
This code changes the way Symbolize handles parsed binaries: now
parsed OwningBinary<Binary> is not broken into (binary, memory buffer)
pair, and is just stored as-is in a cache. ObjectFile components
of Mach-O universal binaries are also stored explicitly in a
separate cache.

Additionally, this change:
* simplifies the code that parses/caches binaries: it's now done
  in a single place, not three different functions.
* makes flush() method behave as expected, and actually clear
  the cached parsed binaries and objects.
* fixes a dangling pointer issue described in
  http://reviews.llvm.org/D15638

llvm-svn: 256041
2015-12-18 22:02:14 +00:00
Cong Hou fd0d62b87e Use getEdgeProbability() instead of getEdgeWeight() in BFI and remove getEdgeWeight() interfaces from MBPI.
This patch removes all getEdgeWeight() interfaces from CodeGen directory. As
getEdgeProbability() is a little more expensive than getEdgeWeight(), I will
compose a patch soon in which BPI only stores probabilities instead of edge
weights so that getEdgeProbability() will have O(1) time.
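As a rough illustration of the cost difference, assuming a probability is derived from weights by dividing one edge's weight by the sum over all successor edges (the function name and the use of double are hypothetical, not the MBPI interface):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: derive the probability of edge `idx` from per-edge
// weights. The loop over all successors is what makes this more expensive
// than reading a single stored weight.
double edgeProbability(const std::vector<std::uint32_t> &weights,
                       std::size_t idx) {
  std::uint64_t sum = 0;
  for (std::uint32_t w : weights)
    sum += w;
  if (sum == 0)
    return 0.0; // No information: treat the edge as never taken.
  return static_cast<double>(weights[idx]) / static_cast<double>(sum);
}
```

Storing the probabilities directly would remove the summation from every query.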


Differential revision: http://reviews.llvm.org/D15489

llvm-svn: 256039
2015-12-18 21:53:24 +00:00
Jingyue Wu 3f422280f5 [DivergenceAnalysis] fix a bug in computing influence regions
Fixes PR25864

llvm-svn: 256036
2015-12-18 21:44:26 +00:00
Jingyue Wu ba3ca76ed2 [NaryReassociate] allow candidate to have a different type
Summary:
If Candidate has a different type from GEP, we should bitcast or
pointer cast it to GEP's type so that the later RAUW doesn't complain.

Added a test in nary-gep.ll

Reviewers: tra, meheff

Subscribers: mcrosier, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D15618

llvm-svn: 256035
2015-12-18 21:36:30 +00:00
Rafael Espindola 708a91a103 Revert "Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst"
This reverts commit r256028.

It broke:

    LLVM :: CodeGen/Mips/eh.ll
    LLVM :: CodeGen/Mips/insn-zero-size-bb.ll

llvm-svn: 256032
2015-12-18 21:23:32 +00:00
Rafael Espindola 79753a07a6 Remove redundant argument. NFC.
llvm-svn: 256031
2015-12-18 21:18:57 +00:00
Jun Bum Lim 51a247065e Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

llvm-svn: 256028
2015-12-18 20:53:47 +00:00
Krzysztof Parzyszek 21dc8bdd9e [Hexagon] Add PIC support
llvm-svn: 256025
2015-12-18 20:19:30 +00:00
Rafael Espindola c4a03483f4 Drop materializeAllPermanently.
This inlines materializeAll into the only caller
(materializeAllPermanently) and renames materializeAllPermanently to
just materializeAll.

llvm-svn: 256024
2015-12-18 20:13:39 +00:00
Changpeng Fang c9963936e7 AMDGPU/SI: Test commit
Summary: This is just my first commit. Test!

Reviewers: none

Subscribers: none

Differential Revision: none

llvm-svn: 256022
2015-12-18 20:04:28 +00:00
Changpeng Fang ef735b74c1 Revert "AMDGPU/SI: Test commit"
This reverts commit a493cb636e0152ad28210934a47c6c44b1437193.

llvm-svn: 256021
2015-12-18 20:04:26 +00:00
Changpeng Fang 7fdf674c2e AMDGPU/SI: Test commit
Summary: This is just my first commit. Test!

Reviewers: none

Subscribers: none

Differential Revision: none

llvm-svn: 256020
2015-12-18 19:57:41 +00:00
Rafael Espindola 18c63b0f18 Drop support for dematerializing.
It was only used on lib/Linker and the use was "dead" since it was used on a
function the IRMover had just moved.

llvm-svn: 256019
2015-12-18 19:57:26 +00:00
Pete Cooper 98052537f0 Revert "Improve DWARFDebugFrame::parse to also handle __eh_frame."
This reverts commit r256008.

It's breaking multiple buildbots, although it works for me locally.

llvm-svn: 256013
2015-12-18 19:45:38 +00:00
Teresa Johnson bef543635a Rename variables to reflect linker split (NFC)
Renamed variables to be more reflective of whether they are
an instance of Linker, IRLinker or ModuleLinker. Also fix a stale
comment.

llvm-svn: 256011
2015-12-18 19:28:59 +00:00
Eric Christopher 9a8b5e7ece Convert Arg, ArgList, and Option to dump() to dbgs() rather than errs().
Also add print() functions.

Patch by Justin Lebar!

llvm-svn: 256010
2015-12-18 18:55:26 +00:00
Eric Christopher 42b56eefd8 Add a dump method for ArgList.
Patch by Justin Lebar!

llvm-svn: 256009
2015-12-18 18:55:22 +00:00
Pete Cooper 6c97f4c7d7 Improve DWARFDebugFrame::parse to also handle __eh_frame.
LLVM MC has single methods which can handle the output of EH frame and DWARF CIEs and FDEs.

This code improves DWARFDebugFrame::parse to do the same for parsing.

This also allows llvm-objdump to support the --dwarf=frames option which objdump supports.  This
option dumps the .eh_frame section using the new code in DWARFDebugFrame::parse.

http://reviews.llvm.org/D15535

Reviewed by Rafael Espindola.

llvm-svn: 256008
2015-12-18 18:51:08 +00:00
Krzysztof Parzyszek a45c0e0d4e Recognize strings for Hexagon-specific variant kinds
llvm-svn: 256007
2015-12-18 18:47:27 +00:00
Andrew Kaylor 123048d26a [WinEH] Update LCSSA to handle catchswitch with handlers inside and outside a loop
Differential Revision: http://reviews.llvm.org/D15630

llvm-svn: 256005
2015-12-18 18:12:35 +00:00
Jun Bum Lim 3509d64c24 [AArch64] Promote loads from stores
This change promotes load instructions which directly read from stores by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31
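The value the bitfield extract produces can be modelled in plain C++. This sketch assumes little-endian byte order and that the immediates in the example are scaled by the access size (so the 4-byte store lands at byte offset 4 and the 2-byte load reads byte offset 6); forwardedLoad is a hypothetical helper, not part of the patch.

```cpp
#include <cstdint>

// Hypothetical helper: the value a little-endian load of `loadBytes` bytes at
// byte offset `loadOff` would read from a 4-byte store of `stored` at byte
// offset `storeOff`. Assumes the load lies entirely inside the stored word.
std::uint32_t forwardedLoad(std::uint32_t stored, unsigned storeOff,
                            unsigned loadOff, unsigned loadBytes) {
  unsigned shift = (loadOff - storeOff) * 8;
  std::uint32_t mask =
      loadBytes >= 4 ? 0xFFFFFFFFu : ((1u << (loadBytes * 8)) - 1u);
  return (stored >> shift) & mask;
}
```

Under those assumptions, forwardedLoad(w1, 4, 6, 2) selects bits 16 to 31 of the stored word, which matches the bit range named in the UBFMWri operands above.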

llvm-svn: 256004
2015-12-18 18:08:30 +00:00
Teresa Johnson 0e7c82cb69 [ThinLTO/LTO] Don't link in unneeded metadata
Summary:
Third patch split out from http://reviews.llvm.org/D14752.

Only map in needed DISubroutine metadata (imported or otherwise linked
in functions and other DISubroutine referenced by inlined instructions).
This is supported for ThinLTO, LTO and llvm-link --only-needed, with
associated tests for each one.

Depends on D14838.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14843

llvm-svn: 256003
2015-12-18 17:51:37 +00:00
Rafael Espindola 7a36355b21 Handle archives with paths in the names.
We always create archives with just the filename as the member name, but
other archives can put a more complicated path in there.

This patch handles it by computing just the filename as we do when
adding a new member.
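A minimal sketch of reducing a member name to its filename, assuming '/' is the only path separator of interest; memberFileName is a hypothetical name and this is not the code from the patch:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strip any leading directories from an archive member
// name, keeping only the part after the last '/'.
std::string memberFileName(const std::string &name) {
  std::size_t slash = name.find_last_of('/');
  return slash == std::string::npos ? name : name.substr(slash + 1);
}
```

A name with no separator is returned unchanged, so members created with a bare filename compare equal to themselves.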

If storing the path is important for some reason, we should probably
have an orthogonal option for doing that and do it for both old and new
members.

Fixes pr25877.

llvm-svn: 256001
2015-12-18 16:07:17 +00:00
Rafael Espindola d7f9c250df clang-format to reduce diff in another patch.
llvm-svn: 255999
2015-12-18 14:06:34 +00:00
Rafael Espindola f382b8836a Fix error handling in LLVMGetBitcodeModuleInContext.
It was not setting OutMessage.

llvm-svn: 255998
2015-12-18 13:58:05 +00:00
Vaivaswatha Nagaraj ed237938da GlobalsAA: Take advantage of ArgMemOnly, InaccessibleMemOnly and InaccessibleMemOrArgMemOnly attributes
Summary:
1. Modify AnalyzeCallGraph() to retain function info for external functions
if the function has [InaccessibleMemOr]ArgMemOnly flags.
2. When analyzing the use of a global as a function parameter at a call site,
mark the callee also as modifying the global appropriately.
3. Add additional test cases.

Depends on D15499

Reviewers: hfinkel, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15605

llvm-svn: 255994
2015-12-18 11:02:52 +00:00
Zlatko Buljan 252cca555f [mips][microMIPS][DSP] Implement PACKRL.PH, PICK.PH, PICK.QB, SHILO, SHILOV and WRDSP instructions
Differential Revision: http://reviews.llvm.org/D14429

llvm-svn: 255991
2015-12-18 08:59:37 +00:00
Philip Reames dd0948a1b6 [RS4GC] Use a value handle to help isolate errors quickly
Inspired by the bug reported in 25846.  Whatever we end up doing about that one, the value handle change is a generally good one since it will help catch this type of mistake more quickly.

Patch by: Manuel Jacob

llvm-svn: 255984
2015-12-18 03:53:28 +00:00
Vedant Kumar 2892a4a302 Revert "[Option] Introduce Arg::print(raw_ostream&) and use llvm::dbgs"
This reverts commit r255977. This is part of
http://reviews.llvm.org/D15634.

llvm-svn: 255978
2015-12-18 02:30:45 +00:00
Vedant Kumar a1e51fd968 [Option] Introduce Arg::print(raw_ostream&) and use llvm::dbgs
llvm-svn: 255977
2015-12-18 02:27:52 +00:00
Eric Christopher a6b96004b5 Reorganize the C API headers to improve build times.
Type specific declarations have been moved to Type.h and error handling
routines have been moved to ErrorHandling.h. Both are included in Core.h
so nothing should change for projects directly including the headers,
but transitive dependencies may be affected.

llvm-svn: 255965
2015-12-18 01:46:52 +00:00
Eric Christopher 8c2adf6b49 Remove unused class variables.
llvm-svn: 255939
2015-12-17 23:43:40 +00:00
Hans Wennborg a6a2e512cf [X86] Use push-pop for materializing small constants under 'minsize'
Use the 3-byte (4 with REX prefix) push-pop sequence for materializing
small constants. This is smaller than using a mov (5, 6 or 7 bytes
depending on size and REX prefix), but it's likely to be slower, so
only used for 'minsize'.

This is a follow-up to r255656.

Differential Revision: http://reviews.llvm.org/D15549

llvm-svn: 255936
2015-12-17 23:18:39 +00:00
Philip Reames d7a6cc859a [InstCombine] Extend peephole DSE to handle unordered atomics
This extends the same line of reasoning used in EarlyCSE with http://reviews.llvm.org/D15352 to the DSE implementation in InstCombine.

Key points:
 * We only remove unordered or simple stores.
 * The loads producing values consumed by dead stores don't influence whether the store is dead.

Differential Revision: http://reviews.llvm.org/D15354

llvm-svn: 255932
2015-12-17 22:19:27 +00:00
JF Bastien d1fb58538f Polish atomic pointers
Summary:
I didn't realize that we already allowed atomic load/store of pointers,
it was added in 2012 by r162146. This patch updates the documentation
and tightens the verifier by using DataLayout to make sure that the
stored size is byte-sized and power-of-two. DataLayout is also used for
integers, and while I'm here I updated the corresponding code for
cmpxchg and rmw.
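The size rule can be written as a small predicate. This is an illustrative reading of "byte-sized and power-of-two", not the verifier code; isValidAtomicSize is a hypothetical name, and in the real check the size would come from DataLayout.

```cpp
#include <cstdint>

// Hypothetical predicate: true if a value of `sizeInBits` bits occupies a
// whole number of bytes and that byte count is a power of two.
bool isValidAtomicSize(std::uint64_t sizeInBits) {
  if (sizeInBits == 0 || sizeInBits % 8 != 0)
    return false;
  std::uint64_t bytes = sizeInBits / 8;
  return (bytes & (bytes - 1)) == 0; // Exactly one bit set.
}
```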

See the following discussion for context and upcoming changes to
add floating-point and vector atomics:
  https://groups.google.com/forum/#!topic/llvm-dev/Nh0P_E3CRoo/discussion

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15512

llvm-svn: 255931
2015-12-17 22:09:19 +00:00
Matthew Simpson 13dddb0799 Revert "[AArch64] Add DAG combine for extract extend pattern"
This reverts commit r255895. The patch breaks internal tests. Reverting until a
fix is ready.

llvm-svn: 255928
2015-12-17 21:29:47 +00:00
Rafael Espindola 776e458d81 Drop functions that have been deprecated since 2010.
These functions were deprecated in r97608.

llvm-svn: 255927
2015-12-17 21:16:12 +00:00
Dave Bartolomeo ea039c121b Test commit
llvm-svn: 255926
2015-12-17 20:54:16 +00:00
Dan Gohman 670a60ed52 [WebAssembly] Switch WebAssemblyMCAsmInfo.h from MCAsmInfo to MCAsmInfoELF.
llvm-svn: 255925
2015-12-17 20:50:45 +00:00
Sanjoy Das 0de2feceb1 [SCEV] Add and use SCEVConstant::getAPInt; NFCI
llvm-svn: 255921
2015-12-17 20:28:46 +00:00
Weiming Zhao 24fbef55f9 [InstCombine] Adding "\n" to debug output. NFC.
Summary:
[InstCombine] Adding '\n' to debug output. NFC.

Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>

Reviewers: apazos, majnemer, weimingz

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15403

llvm-svn: 255920
2015-12-17 19:53:41 +00:00
Philip Reames 15145fb7b1 [EarlyCSE] DSE of atomic unordered stores
The rules for removing trivially dead stores are a lot less complicated than those for loads. Since we know the later store post-dominates the former and the former dominates the later, unless the former has side effects other than the actual store, we can remove it. One slightly surprising thing is that we can freely remove atomic stores, even if the later one isn't atomic. There's no guarantee the atomic one was ever visible.

For the moment, we don't handle DSE of ordered atomic stores. We could extend the same chain of reasoning to them, but the catch is we'd then have to model the ordering effect without a store instruction. Since our fences are stronger than our operation orderings, simply using a fence isn't an obvious win. This arguably calls for a refinement in our fence specification, but that's (much) later work.

Differential Revision: http://reviews.llvm.org/D15352

llvm-svn: 255914
2015-12-17 18:50:50 +00:00
Teresa Johnson e5a6191732 [ThinLTO] Metadata linking for imported functions
Summary:
Second patch split out from http://reviews.llvm.org/D14752.

Maps metadata as a post-pass from each module once importing is complete,
suturing up final metadata to the temporary metadata left on the
imported instructions.

This entails saving the mapping from bitcode value id to temporary
metadata in the importing pass, and from bitcode value id to final
metadata during the metadata linking postpass.

Depends on D14825.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14838

llvm-svn: 255909
2015-12-17 17:14:09 +00:00
Tom Stellard caaa3aa07c AMDGPU/SI: Reserve appropriate number of sgprs for flat scratch init.
Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15583

Patch by: Changpeng Fang

llvm-svn: 255908
2015-12-17 17:05:09 +00:00
Nicolai Haehnle 87323da6eb AMDGPU: Fix off-by-one in SIRegisterInfo::eliminateFrameIndex
Summary:
The method insertNOPs expected the number of wait states to be passed as
parameter, while eliminateFrameIndex passed the immediate argument for the
S_NOP, leading to an off-by-one error. Rename the method to make the
meaning of its parameter clearer. The number of 4 / 5 wait states (which
is what the method has always _tried_ to do according to the comment) is
correct according to the hardware docs.
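The mismatch can be illustrated with two conversions. This sketch assumes, as the description above implies, that an S_NOP with immediate N provides N + 1 wait states; both function names are hypothetical.

```cpp
// Assumption: S_NOP with immediate N yields N + 1 wait states.

// Hypothetical helper: wait states provided by an S_NOP with this immediate.
unsigned waitStatesForNopImmediate(unsigned imm) { return imm + 1; }

// Hypothetical helper: immediate needed for `waitStates` wait states.
// Requires waitStates >= 1.
unsigned nopImmediateForWaitStates(unsigned waitStates) {
  return waitStates - 1;
}
```

Passing an immediate where a wait-state count is expected, or the reverse, skips one of these conversions and shifts the result by one.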

I stumbled upon this while trying to track down the cause of
https://bugs.freedesktop.org/show_bug.cgi?id=93264. While clearly needed,
this patch unfortunately does not fix that bug...

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15542

llvm-svn: 255906
2015-12-17 16:46:42 +00:00
Andy Gibbs 33a0eb740e Revert r254592 (virtual dtor in SCEVPredicate).
Clang has better diagnostics in this case.  It is not necessary therefore
to change the destructor to avoid what is effectively an invalid warning
in gcc.  Instead, better handle the warning flags given to the compiler.

llvm-svn: 255905
2015-12-17 16:43:53 +00:00
Teresa Johnson 4a9bf5872c Mark a couple ModuleLinker member functions as const (NFC)
llvm-svn: 255903
2015-12-17 16:34:53 +00:00
Rafael Espindola f44db24e1f Avoid explicit relocation sorting most of the time.
These days relocations are created and stored in a deterministic way.
The order they are created is also suitable for the .o file, so we don't
need an explicit sort.

The last remaining exception is MIPS.

llvm-svn: 255902
2015-12-17 16:22:06 +00:00
Rafael Espindola 9e1cae510f Revert "[AArch64] Enable PostRAScheduler for AArch64 generic build"
This reverts commit r255896. It broke the tests.

llvm-svn: 255899
2015-12-17 15:12:26 +00:00
Rafael Espindola d0e16522c7 Always sort by offset first. NFC.
Every target changing sortRelocs was first calling the parent
implementation. Just run that first.

llvm-svn: 255898
2015-12-17 15:08:24 +00:00
Diego Novillo 8561841875 Fix unused variable warning in release builds. NFC.
llvm-svn: 255897
2015-12-17 14:58:34 +00:00
MinSeong Kim d05e9fd194 [AArch64] Enable PostRAScheduler for AArch64 generic build
This patch enables PostRAScheduler specifically for AArch64 generic build,
which is beneficial from the performance perspective.
Speedups of 2 to 7% are observed for some benchmarks on A57 and A53.
Also, benchmarks from the LLVM test-suite did not regress.

Differential Revision: http://reviews.llvm.org/D15557

llvm-svn: 255896
2015-12-17 14:51:22 +00:00
Matthew Simpson 4355e404d5 [AArch64] Add DAG combine for extract extend pattern
This patch adds a DAG combine for (any_extend (extract_vector_elt v, i)) ->
(extract_vector_elt v, i). The combine enables us to better match some SMOV
patterns.

Differential Revision: http://reviews.llvm.org/D15515

llvm-svn: 255895
2015-12-17 14:30:55 +00:00
Rafael Espindola 850ba46dd6 Simplify. NFC.
llvm-svn: 255894
2015-12-17 14:19:52 +00:00
Alexey Bataev 7b72b658cc [X86] Add option for enabling LEA optimization pass, by Andrey Turetsky
Add option to enable/disable LEA optimization pass. By default the pass is disabled.
Differential Revision: http://reviews.llvm.org/D15573

llvm-svn: 255881
2015-12-17 07:34:39 +00:00
Dan Gohman 5bf22fc84a [WebAssembly] Convert WebAssemblyTargetObjectFile to TargetLoweringObjectFileELF
llvm-svn: 255877
2015-12-17 04:55:44 +00:00
Matthias Braun 454192917b AArch64: Simplify emitEpilogue() and related code; NFC
This is in preparation to an upcoming patch.

llvm-svn: 255872
2015-12-17 03:18:47 +00:00
Dan Gohman 05ac43fec3 [WebAssembly] Experimental ELF writer support
This creates the initial infrastructure for writing ELF output files. It
doesn't yet have any implementation for encoding instructions.

Differential Revision: http://reviews.llvm.org/D15555

llvm-svn: 255869
2015-12-17 01:39:00 +00:00
Cong Hou b9e8d483b5 Fix PR25838.
This is a quick fix to PR25838. The issue comes from the restriction that we
cannot normalize probabilities containing both known and unknown ones. A patch
that removes this restriction is under review now:

http://reviews.llvm.org/D15548

llvm-svn: 255867
2015-12-17 01:29:08 +00:00
Xinliang David Li 50de45dcc1 [PGO] InstrPGO and coverage code refactoring (NFC)
Introduce a new class InstrProfSymtab to abstract
the PGO symbol table for prof and coverage reader.
The symtab is used to look up a function's PGO name
using function keys. The first user of the class
is CoverageMapping Reader. More will follow.

llvm-svn: 255862
2015-12-17 00:53:37 +00:00
JF Bastien eefff9ccc5 WebAssembly: update expected torture test failures
We now have 240 expected failures.

llvm-svn: 255858
2015-12-17 00:12:06 +00:00
Rafael Espindola c49ac5e7c2 Use std::unique_ptr. NFC.
llvm-svn: 255852
2015-12-16 23:49:14 +00:00
Dan Gohman 4172953813 [WebAssembly] Fix legalization of shift operators on large integer types.
llvm-svn: 255847
2015-12-16 23:25:51 +00:00
Derek Schuff 8bb5f2927a [WebAssembly] Implement eliminateCallFramePseudo
Summary:
Implement eliminateCallFramePseudo to handle ADJCALLSTACKUP/DOWN
pseudo-instructions. Add a test calling a vararg function which causes non-0
adjustments. This revealed an issue with RegisterCoalescer wherein it
eliminates a COPY from SP32 to a vreg but fails to update the live ranges
of EXPR_STACK, causing a machineinstr verifier failure (so this test
is commented out).

Also add a dynamic alloca test, which causes a callseq_end dag node with
a 0 (instead of undef) second argument to be generated. We currently fail to
select that, so adjust the ADJCALLSTACKUP tablegen code to handle it.

Differential Revision: http://reviews.llvm.org/D15587

llvm-svn: 255844
2015-12-16 23:21:30 +00:00
Rafael Espindola 434e956181 Change linkInModule to take a std::unique_ptr.
Passing in a std::unique_ptr should help find errors when the module
is used after being linked into another module.
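The effect of the signature change can be sketched with stand-in types. ToyModule and ToyLinker below are hypothetical and only mimic the ownership transfer; they are not the LLVM classes.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a module.
struct ToyModule {
  std::string name;
};

// Hypothetical stand-in for a linker that consumes its source modules.
class ToyLinker {
  std::vector<std::string> linked;

public:
  // Taking the source by std::unique_ptr makes the ownership transfer
  // explicit: the caller must std::move its pointer in, and the module is
  // destroyed when this function returns.
  void linkInModule(std::unique_ptr<ToyModule> src) {
    linked.push_back(src->name);
  }

  std::size_t size() const { return linked.size(); }
};
```

After the call, the caller's std::unique_ptr is null, so an accidental use of the linked-in module shows up as a null dereference instead of a silent use of stale data.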

llvm-svn: 255842
2015-12-16 23:16:33 +00:00
Eric Christopher bfba572425 Fix funciton->function typo.
llvm-svn: 255841
2015-12-16 23:10:53 +00:00
Rafael Espindola 3f210fc0c8 Drop an unnecessary use of writev.
It looks like the code this patch deletes is based on a misunderstanding of
what guarantees writev provides. In particular, writev with 1 iovec is
not "more atomic" than a write.

Testing on OS X shows that both write and writev from multiple processes
can be intermixed.

llvm-svn: 255837
2015-12-16 22:59:06 +00:00
Ahmed Bougacha 66834ec6e1 [AArch64] Simplify some TRI/TII getters. NFC.
We don't need static_casts when we use the right Subtarget.

llvm-svn: 255836
2015-12-16 22:54:06 +00:00
Rafael Espindola b94ab5ffbd Simplify memory management with std::unique_ptr.
llvm-svn: 255831
2015-12-16 22:28:34 +00:00
Ahmed Bougacha cecb6b0865 [CodeGen] Make MachineInstrBuilder::copyImplicitOps const. NFC.
This matches the other MIB methods, none of which modify the builder.
Without this, we can't chain copyImplicitOps.
Also reformat the few users, in PPCEarlyReturn.

llvm-svn: 255828
2015-12-16 22:15:30 +00:00
Nathan Slingerland 48dd080c77 [PGO] Handle and report overflow during profile merge for all types of data
Summary: Surface counter overflow when merging profile data. Merging still occurs on overflow but counts saturate to the maximum representable value. Overflow is reported to the user.
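The saturating behaviour described in the summary can be sketched as follows. This assumes 64-bit unsigned counters; saturatingAdd and its out-parameter are hypothetical and not the interface added by the patch.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: add two counters. On overflow, clamp the result to the
// maximum representable value and set `overflowed` so the caller can report it.
std::uint64_t saturatingAdd(std::uint64_t a, std::uint64_t b,
                            bool &overflowed) {
  std::uint64_t sum = a + b; // Unsigned wraparound is well defined.
  if (sum < a) {             // Wrapped around: the true sum does not fit.
    overflowed = true;
    return std::numeric_limits<std::uint64_t>::max();
  }
  return sum;
}
```

The flag is only ever set, never cleared, so one variable can accumulate the overflow status across a whole merge.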

Reviewers: davidxl, dnovillo, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15547

llvm-svn: 255825
2015-12-16 21:45:43 +00:00