Commit Graph

97308 Commits

Author SHA1 Message Date
Simon Dardis 8fe36cd77c [mips][ias] N32/N64 must not sort the relocation table.
Doing so changes the evaluation order for relocation composition.

Patch By: Daniel Sanders

Reviewers: vkalintiris, atanasyan

Differential Revision: https://reviews.llvm.org/D26401

llvm-svn: 288666
2016-12-05 12:55:19 +00:00
Simon Pilgrim b08c98f125 [X86][SSE] Add support for combining target shuffles to UNPCKL/UNPCKH.
llvm-svn: 288663
2016-12-05 11:25:13 +00:00
Simon Pilgrim 20b1409f35 [X86][SSE] Add helper function to create UNPCKL/UNPCKH shuffle masks. NFCI.
llvm-svn: 288659
2016-12-05 11:00:25 +00:00
Diana Picus f11f042ecb [GlobalISel] Extract handleAssignments out of AArch64CallLowering
This function seems target-independent so far: all the target-specific behaviour
is isolated in the CCAssignFn and the ValueHandler (which we're also extracting
into the generic CallLowering).

The intention is to use this in the ARM backend.

Differential Revision: https://reviews.llvm.org/D27045

llvm-svn: 288658
2016-12-05 10:40:33 +00:00
Sam Kolton 83102d99ce [AMDGPU] Disassembler: fix s_buffer_store_dword instructions
Summary: s_buffer_store_dword instructions sdata operand was called sdst in encoding. This caused disassembler to fail.

Reviewers: tstellarAMD, vpykhtin, artem.tamazov

Subscribers: arsenm, nhaehnle, rampitec

Differential Revision: https://reviews.llvm.org/D27100

llvm-svn: 288657
2016-12-05 09:58:51 +00:00
Matthias Braun 215ff84b40 TableGen: Some more std::string->StringInit* replacements
llvm-svn: 288653
2016-12-05 07:35:13 +00:00
Matthias Braun 99f8937029 TableGen: TableGenStringKey is no longer necessary as of r288642
llvm-svn: 288651
2016-12-05 07:04:19 +00:00
Matthias Braun ca151317e8 TableGen: Use range based for; reserve vectors where possible
llvm-svn: 288650
2016-12-05 07:00:44 +00:00
Matthias Braun c66e75572e TableGen/TGParser: Prefer SmallVector/ArrayRef over std::vector
llvm-svn: 288649
2016-12-05 06:41:54 +00:00
Matthias Braun 1ddb78cd5f TableGen/Record: Replace std::vector with SmallVector/ArrayRef
llvm-svn: 288648
2016-12-05 06:41:51 +00:00
Matthias Braun dbe6e7de9e ListInit::convertInitializerTo: avoid foldingset lookup if nothing changed
llvm-svn: 288647
2016-12-05 06:41:47 +00:00
Craig Topper 088ba17f88 [X86] Remove unnecessary explicit uses of .SimpleTy just to do an equality comparison. MVT's operator== already takes care of this. NFCI
llvm-svn: 288646
2016-12-05 06:09:55 +00:00
Matthias Braun bb05316441 TableGen: Use StringInit instead of std::string for DagInit arg names
llvm-svn: 288644
2016-12-05 06:00:46 +00:00
Matthias Braun 7cf3b11224 TableGen: Use StringInit instead of std::string for DagInit name
llvm-svn: 288643
2016-12-05 06:00:41 +00:00
Matthias Braun 6a441839a6 TableGen: Use more StringInit instead of StringRef
This forces the code to call StringInit::get on the string early and
avoids storing duplicates in std::string and sometimes allows pointer
comparisons instead of string comparisons.

llvm-svn: 288642
2016-12-05 06:00:36 +00:00
Craig Topper db8467ae26 [AVX-512] Teach fast isel to handle 512-bit vector bitcasts.
llvm-svn: 288641
2016-12-05 05:50:51 +00:00
Matthias Braun 6e074de48e TableGen: Factor out STRCONCAT constructor, add shortcut.
Introduce new constructor for STRCONCAT binop with a shortcut that
immediately concatenates if the two arguments are StringInits.
Makes the QualifyName code more readable and tablegen 2-3% faster.

llvm-svn: 288639
2016-12-05 05:21:18 +00:00
Matthias Braun b1627ff0c8 TableGen/Record: Move PointerIntPair to less used field of RecordVal
llvm-svn: 288638
2016-12-05 05:21:13 +00:00
Colin LeMahieu 5d19862b22 [Hexagon] Adding additional tokenization characters in preparation for removing spacing from syntax.
llvm-svn: 288637
2016-12-05 04:52:28 +00:00
Craig Topper 7ef6ea324a [AVX-512] Teach fast isel to use masked compare and movss for handling scalar cmp and select sequence when AVX-512 is enabled. This matches the behavior of normal isel.
llvm-svn: 288636
2016-12-05 04:51:31 +00:00
Colin LeMahieu 8170754919 [Hexagon] Changing from literal numeric value to argument since #-1 will not parse when '-' is converted to a token.
llvm-svn: 288634
2016-12-05 04:29:00 +00:00
Matthias Braun d0edb0dfa7 TableGen: Store Records on a BumpPtrAllocator
All these records are internalized and will live until exit.  This makes
them perfect candidates for a fast BumpPtrAllocator.

llvm-svn: 288613
2016-12-04 05:48:20 +00:00
Matthias Braun 4a86d456d3 TableGen: Use StringRef instead of const std::string& in return vals.
This will allow to switch to a different string storage in an upcoming
commit.

llvm-svn: 288612
2016-12-04 05:48:16 +00:00
Matthias Braun 84bac184ea TableGen: Optimize common string concatenation with SmallString
llvm-svn: 288611
2016-12-04 05:48:06 +00:00
Matthias Braun 5ce9057666 TableGen: Use StringRef instead of const std::string& for parameters
This avoid an extra construction of a std::string (and a heap
allocation) when the caller only has a StringRef but no std::string at
hand.

llvm-svn: 288610
2016-12-04 05:48:03 +00:00
Lang Hames 697e7cd761 [Object][MachO] Reference-ify some helper function arguments. NFC.
Changes all static helper functions in MachOObjectFile.cpp that expect a
non-null MachOObjectFile pointer to take a reference instead.

llvm-svn: 288608
2016-12-04 01:56:10 +00:00
Dan Gohman abceeeee5d [MC] Generalize MCContext's SectionSymbols field.
Change SectionSymbols so that it doesn't hard-code ELF types, so that
it can be used for non-ELF targets.

llvm-svn: 288607
2016-12-03 23:55:57 +00:00
Matt Arsenault 92fede361f DAG: Fold out out of bounds insert_vector_elt
getNode already prevents formation of out of bounds constant
extract_vector_elts. Do the same for insert_vector_elt.

llvm-svn: 288603
2016-12-03 23:03:26 +00:00
Dan Gohman 66caac5735 [WebAssembly] Eliminate an ad-hoc command-line argument.
Use the target triple to determine whether to run the explicit-locals
pass, rather than using a separate command-line argument.

llvm-svn: 288602
2016-12-03 23:00:12 +00:00
Saleem Abdulrasool 9c89ba7fa7 AMDGPU: remove a couple of unused variables
lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::spillSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
	lib/Target/AMDGPU/SIRegisterInfo.cpp:572:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
	   const TargetRegisterClass *SubRC = nullptr;
	                              ^
	lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::restoreSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
	lib/Target/AMDGPU/SIRegisterInfo.cpp:723:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
	   const TargetRegisterClass *SubRC = nullptr;
	                              ^

The variable was assigned to, but never used.  The functions called did not
mutate state.  Simplify the logic and remove the variable.  Identified by gcc
5.4.0.

llvm-svn: 288601
2016-12-03 22:25:21 +00:00
Craig Topper 9d16bfa0f5 [AVX-512] Add many of the VPERM instructions to the load folding table. Move VPERMPDZri to the correct table.
llvm-svn: 288591
2016-12-03 19:37:39 +00:00
Matt Arsenault b55f620ebc AMDGPU: Clean up struct initializers
llvm-svn: 288590
2016-12-03 18:22:49 +00:00
Sanjay Patel 9d5b5e38bb [InstSimplify] add more helper functions for SimplifyICmpInst; NFCI
llvm-svn: 288589
2016-12-03 18:03:53 +00:00
Sanjay Patel dc65a27a10 [InstSimplify] add helper functions for SimplifyICmpInst; NFCI
llvm-svn: 288588
2016-12-03 17:30:22 +00:00
Craig Topper c210827b53 [AVX-512] Add EVEX VPMADDUBSW and VPMADDWD to the load folding tables.
llvm-svn: 288587
2016-12-03 17:19:15 +00:00
Sanjay Patel b7f8cb698c [InstCombine] change select type to eliminate bitcasts
This solves a secondary problem seen in PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137#c6

This is similar to the bitwise logic op fold added with:
https://reviews.llvm.org/rL287707

And like that patch, I'm artificially restricting the
transform from vector <-> scalar types until we're sure
that the backend can handle that. 

llvm-svn: 288584
2016-12-03 15:25:16 +00:00
Craig Topper 8e7498976a [X86] Fix VEX encoded VPMADDUBSW to not be marked commutable.
This was accidentallly broken in r285515 when we started lowering the intrinsic to an ISD node. Should fix PR31241.

llvm-svn: 288578
2016-12-03 05:35:44 +00:00
Michael Kuperstein 997dac8709 Remove stale comment. NFC.
llvm-svn: 288572
2016-12-03 01:59:13 +00:00
Kostya Serebryany 520753a321 [sanitizer-coverage] use IRB.SetCurrentDebugLocation after IRB.SetInsertPoint
llvm-svn: 288568
2016-12-03 01:43:30 +00:00
Matthias Braun 1fbb0f6dd9 AArch64CollectLOH: Rewrite as block-local analysis.
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.

This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.

Differential Revision: https://reviews.llvm.org/D27329

llvm-svn: 288561
2016-12-03 00:52:56 +00:00
Guozhi Wei 835de1f3ab [ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.

Differential Revision: https://reviews.llvm.org/D26713

llvm-svn: 288560
2016-12-03 00:41:43 +00:00
Ivan Krasin 75453b057b Support escaping in TrigramIndex.
Summary:
This is a follow up to r288303, where I have introduced TrigramIndex
to speed up SpecialCaseList for the cases when all rules are
simple wildcards, like *hello*wor.d*.

Here, I add support for escaping, so that it's possible to
specify rules like *c\+\+abi*.

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27318

llvm-svn: 288553
2016-12-02 23:30:16 +00:00
Zachary Turner 6fa57ad9bd Resubmit "[LibFuzzer] Split FuzzerUtil for Posix and Windows."
This resubmits r288529, which was resubmitted because it broke a
fuzzer bot.  According to kcc@ the test that broke was flakey
and it is unlikely to be a result of this patch.

llvm-svn: 288549
2016-12-02 23:02:01 +00:00
Jacques Pienaar 3bec3ef6cd [lanai] Custom lowering of SHL_PARTS
Summary: Implement custom lowering of SHL_PARTS to enable lowering of left shift with larger than 32-bit shifts.

Reviewers: eliben, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27232

llvm-svn: 288541
2016-12-02 22:01:28 +00:00
Zachary Turner 3cfeab7059 Revert "[LibFuzzer] Split FuzzerUtil for Posix and Windows."
This reverts commit r288529, as it seems to introduce some
problems on the Linux bots.

llvm-svn: 288533
2016-12-02 20:54:56 +00:00
Dan Gohman f295cc8fb5 [WebAssembly] Fix a compiler warning. NFC.
Fix a warning about a comparison between signed and unsigned integer
expressions.

llvm-svn: 288532
2016-12-02 20:13:05 +00:00
Zachary Turner d755e4f587 [LibFuzzer] Introduce a portable WeakAlias implementation.
Windows doesn't really support weak aliases, but with some
linker magic we can get something that's pretty close on
Windows.  This introduces an interface to accessing weakly
aliased symbols that will work on any platform.  Linker
magic changes to come in a separate patch.

Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27235

llvm-svn: 288530
2016-12-02 19:41:17 +00:00
Zachary Turner 34dcfb9294 [LibFuzzer] Split FuzzerUtil for Posix and Windows.
Pave the way for separating out platform specific
utility functions into separate files.

Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27234

llvm-svn: 288529
2016-12-02 19:38:19 +00:00
Rong Xu a5b5745a62 [PGO] Fix PGO use ICE when there are unreachable BBs
For -O0 there might be unreachable BBs, which breaks the assumption that all the
BBs have an auxiliary data structure.  In this patch, we add another interface
called findBBInfo() so that a nullptr can be returned for the unreachable BBs
(and the callers can ignore those BBs).

This fixes the bug reported
https://llvm.org/bugs/show_bug.cgi?id=31209

Differential Revision: https://reviews.llvm.org/D27280

llvm-svn: 288528
2016-12-02 19:10:29 +00:00
Ulrich Weigand 612d24badf [SystemZ] Support remaining atomic instructions
Add assembler support for all atomic instructions that weren't already
supported.  Some of those could be used to implement codegen for 128-bit
atomic operations, but this isn't done here yet.

llvm-svn: 288526
2016-12-02 18:24:16 +00:00
Ulrich Weigand 1c5a5c42de [SystemZ] Support floating-point control register instructions
Add assembler support for instructions manipulating the FPC.

Also add codegen support via the GCC compatibility builtins:
  __builtin_s390_sfpc
  __builtin_s390_efpc

llvm-svn: 288525
2016-12-02 18:21:53 +00:00
Ulrich Weigand da951d3bdc [SystemZ] Refactor hasSideEffects setting
Move setting of hasSideEffects out of SystemZInstrFormats.td,
to allow use of the format classes for instructions where this
flag shouldn't be set.  NFC.

llvm-svn: 288524
2016-12-02 18:19:22 +00:00
Matt Arsenault d4da0edd98 AMDGPU: Implement isCheapAddrSpaceCast
llvm-svn: 288523
2016-12-02 18:12:53 +00:00
Adam Nemet 4c207a6a1f [LTOs] Allow generation of hotness information
The flag is passed by the clang driver.

Differential Revision: https://reviews.llvm.org/D27331

llvm-svn: 288519
2016-12-02 17:53:56 +00:00
Renato Golin 5b8e7ecdb3 Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."
This reverts commit r288497, as it broke the AArch64 build of Compiler-RT's
builtins (twice: once in r288412 and once in r288497). We should investigate
this offline.

llvm-svn: 288508
2016-12-02 16:56:26 +00:00
Nicolai Haehnle 33ca182c91 [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default
Summary:
When X = 0 and Y = inf, the original code produces inf, but the transformed
code produces nan. So this transform (and its relatives) should only be
used when the no-infs-fp-math flag is explicitly enabled.

Also disable the transform using fmad (intermediate rounding) when unsafe-math
is not enabled, since it can reduce the precision of the result; consider this
example with binary floating point numbers with two bits of mantissa:

  x = 1.01
  y = 111

  x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step)

  x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero)

The example relies on rounding towards zero at least in the second step.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578

Reviewers: RKSimon, tstellarAMD, spatel, arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26602

llvm-svn: 288506
2016-12-02 16:06:18 +00:00
Simon Pilgrim 9cb74267ac Tidyup code with indentation and clang-format. NFCI.
llvm-svn: 288505
2016-12-02 15:44:30 +00:00
Daniel Cederman ef62c59dd6 [Sparc] Fix parsing of double-precision %f18, %f20, and %f22
Summary: They are currently being parsed as %f14, %f16, and %f18.

Reviewers: venkatra, jyknight

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27342

llvm-svn: 288503
2016-12-02 15:05:26 +00:00
Simon Pilgrim cbf5f97018 [X86][SSE] Add support for extracting constant bit data from broadcasted constants
llvm-svn: 288499
2016-12-02 13:16:08 +00:00
Alexey Bataev e8e94a7176 [SLP] Fix for PR6246: vectorization for scalar ops on vector elements.
When trying to vectorize trees that start at insertelement instructions
function tryToVectorizeList() uses vectorization factor calculated as
MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree
cost for this fixed vectorization factor is too high.
Patch tries to improve the situation. It tries different vectorization
factors from max(PowerOf2Floor(NumberOfVectorizedValues),
MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries
to choose the best one.

Differential Revision: https://reviews.llvm.org/D27215

llvm-svn: 288497
2016-12-02 12:20:22 +00:00
Simon Pilgrim b3ae416839 [X86] Refactored getTargetConstantBitsFromNode to allow for expansion. NFCI.
getTargetConstantBitsFromNode currently only extracts constant pool vector data, but it will need to be generalized to support broadcast and scalar constant pool data as well.

Converted Constant bit extraction and Bitset splitting to helper lambda functions.

llvm-svn: 288496
2016-12-02 11:58:05 +00:00
Craig Topper 4961fa9bba [AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding tables.
llvm-svn: 288484
2016-12-02 07:57:11 +00:00
Craig Topper 17ddb521ef [AVX-512] Add EVEX PSHUFB instructions to load folding tables.
llvm-svn: 288482
2016-12-02 07:06:30 +00:00
Craig Topper f7866fad54 [AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables.
llvm-svn: 288481
2016-12-02 06:24:38 +00:00
Peter Collingbourne bc0705240e IR: Move NumElements field from {Array,Vector}Type to SequentialType.
Now that PointerType is no longer a SequentialType, all SequentialTypes
have an associated number of elements, so we can move that information to
the base class, allowing for a number of simplifications.

Differential Revision: https://reviews.llvm.org/D27122

llvm-svn: 288464
2016-12-02 03:20:58 +00:00
Dehao Chen c3be225895 Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)
llvm-svn: 288463
2016-12-02 03:17:07 +00:00
Peter Collingbourne 4568158c4d IR: Change PointerType to derive from Type rather than SequentialType.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html

This is for a couple of reasons:

- Values of type PointerType are unlike the other SequentialTypes (arrays
  and vectors) in that they do not hold values of the element type. By moving
  PointerType we can unify certain aspects of how the other SequentialTypes
  are handled.
- PointerType will have no place in the SequentialType hierarchy once
  pointee types are removed, so this is a necessary step towards removing
  pointee types.

Differential Revision: https://reviews.llvm.org/D26595

llvm-svn: 288462
2016-12-02 03:05:41 +00:00
Peter Collingbourne 25a40759c1 Fix GlobalISel build.
llvm-svn: 288460
2016-12-02 02:55:30 +00:00
Matt Arsenault 47a4b39646 ConstantFolding: Factor code into helper function
llvm-svn: 288459
2016-12-02 02:26:02 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Paul Robinson dad4907bc1 [DWARF] Put linkage-name on abstract origin even when there's a declaration.
In r266692, we made it possible to emit linkage names for just inlined
functions, putting the attribute on the abstract origin. Make sure we
don't think the linkage-name was already emitted on a declaration.

Differential Revision: http://reviews.llvm.org/D27320

llvm-svn: 288450
2016-12-02 01:55:17 +00:00
Teresa Johnson 185b4ab6d4 [ThinLTO] Stop importing constant global vars as copies in the backend
Summary:
We were doing an optimization in the ThinLTO backends of importing
constant unnamed_addr globals unconditionally as a local copy (regardless
of whether the thin link decided to import them). This should be done in
the thin link instead, so that resulting exported references are marked
and promoted appropriately, but will need a summary enhancement to mark
these variables as constant unnamed_addr.

The function import logic during the thin link was trying to handle
this proactively, by conservatively marking all values referenced in
the initializer lists of exported global variables as also exported.
However, this only handled values referenced directly from the
initializer list of an exported global variable. If the value is itself
a constant unnamed_addr variable, we could end up exporting its
references as well. This caused multiple issues. The first is that the
transitively exported references weren't promoted. Secondly, some could
not be promoted/renamed (e.g. they had a section or other constraint).
recursively, instead of just adding the first level of initializer list
references to the ExportList directly.

Remove this optimization and the associated handling in the function
import backend. SPEC measurements indicate we weren't getting much
from it in any case.

Fixes PR31052.

Reviewers: mehdi_amini

Subscribers: krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26880

llvm-svn: 288446
2016-12-02 01:02:30 +00:00
Matt Arsenault c47701c0e9 AMDGPU: Use wider scalar spills for SGPR spilling
Since the spill is for the whole wave, these
don't have the swizzling problems that vector stores do
and a single 4-byte allocation is enough to spill a 64 element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.

This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.

llvm-svn: 288445
2016-12-02 00:54:45 +00:00
Wolfgang Pieb 42f92a7225 When instructions are hoisted out of loops by MachineLICM, remove their debug loc.
This prevents erratic stepping behavior as well as incorrect source attribution
for sample profiling.

Reviewers: dblakie

Subscribers: llvm-commit

Differential Revision: https://reviews.llvm.org/D27290

llvm-svn: 288442
2016-12-02 00:37:57 +00:00
Justin Bogner 35c5e58f8c SDAG: Avoid a large, usually empty SmallVector in a recursive function
This SmallVector is using up 128 bytes on the stack every time despite
almost always being empty[1], and since this function can recurse quite
deeply that adds up to a lot of overhead. We've seen this run afoul of
ulimits in some cases with ASAN on.

Replacing the SmallVector with a std::vector trades an occasional heap
allocation for vastly less stack usage.

[1]: I gathered some stats on an internal test suite and the vector
was non-empty in only 45,000 of 10,000,000 calls to this function.

llvm-svn: 288441
2016-12-02 00:11:01 +00:00
Geoff Berry 7ffce7be0c [AArch64] Fold more spilled/refilled COPYs.
Summary:
Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all
full COPYs between register classes of the same size that are either
spilled or refilled.

Reviewers: MatzeB, qcolombet

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27271

llvm-svn: 288439
2016-12-01 23:43:55 +00:00
Dan Gohman 734c59d501 [MC] Refactor emitELFSize to make usage more consistent. NFC.
Move the cast<MCSymbolELF> inside emitELFSize, so that: 
 - it's done in one place instead of at each call
 - it's more consistent with similar functions like EmitCOFFSafeSEH
 - ambiguity between cast<> and dyn_cast<> is avoided (which also
   eliminates an unnecessary dyn_cast call)

This also makes it easier to experiment with using ".size" directives on
non-ELF targets.

llvm-svn: 288437
2016-12-01 23:39:08 +00:00
Oleg Ranevskyy e2ae41519f [ARM] Fix for 64-bit CAS expansion on ARM32 with -O0
Summary:
This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion.

Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality.

Example of the broken C++ code:
```
std::atomic<long long> at(2);

long long ll = 1;
std::atomic_compare_exchange_strong(&at, &ll, 3);
```
Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3.

The patch replaces SBC with CMPEQ.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, llvm-commits, asl

Differential Revision: https://reviews.llvm.org/D27315

llvm-svn: 288433
2016-12-01 22:58:35 +00:00
Artem Belevich 704395a25a Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."
This reverts r288412 which causes severe compile-time regression.

llvm-svn: 288431
2016-12-01 22:52:15 +00:00
Matthias Braun 709a4cc238 RegisterCoalscer: Only coalesce complete reserved registers.
The coalescer eliminates copies from reserved registers of the form:
   %vregX = COPY %rY
in the case where %rY is a reserved register. However this turns out to
be invalid if only some of the subregisters are reserved (see also
https://reviews.llvm.org/D26648).

Differential Revision: https://reviews.llvm.org/D26687

llvm-svn: 288428
2016-12-01 22:39:51 +00:00
David Blaikie e40caaee99 [debug info] Minor cleanup from D27170/r288399
llvm-svn: 288421
2016-12-01 21:59:09 +00:00
Tim Northover 5bb87b6769 AArch64: fix 128-bit cmpxchg at -O0 (again, again).
This time the issue is fortunately just a simple mistake rather than a horrible
design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for
comparing two 64-bit values, but they don't.

The fix is slightly clunkier in AArch64 because we can't use conditional
execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to
an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a
CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway.

Thanks to Anton Korobeynikov for pointing out the issue.

llvm-svn: 288418
2016-12-01 21:31:59 +00:00
Benjamin Kramer 215b22e612 Fix unused variable warning in Release builds. NFC.
llvm-svn: 288416
2016-12-01 20:49:34 +00:00
Philip Reames 89e92d21b4 [PR29121] Don't fold if it would produce atomic vector loads or stores
The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal.

Differential Revision: https://reviews.llvm.org/D24365

llvm-svn: 288415
2016-12-01 20:17:06 +00:00
Philip Reames 4d00af1bde Factor out common parts of LVI and Float2Int into ConstantRange [NFCI]
This just extracts out the transfer rules for constant ranges into a single shared point. As it happens, neither bit of code actually overlaps in terms of the handled operators, but with this change that could easily be tweaked in the future.

I also want to have this separated out to make experimenting with a eager value info implementation and possibly a ValueTracking-like fixed depth recursion peephole version. There's no reason all four of these can't share a common implementation which reduces the chances of bugs.

Differential Revision: https://reviews.llvm.org/D27294

llvm-svn: 288413
2016-12-01 20:08:47 +00:00
Alexey Bataev 2c01af5904 [SLP] Fix for PR6246: vectorization for scalar ops on vector elements.
When trying to vectorize trees that start at insertelement instructions
function tryToVectorizeList() uses vectorization factor calculated as
MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree
cost for this fixed vectorization factor is too high.
Patch tries to improve the situation. It tries different vectorization
factors from max(PowerOf2Floor(NumberOfVectorizedValues),
MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries
to choose the best one.

Differential Revision: https://reviews.llvm.org/D27215

llvm-svn: 288412
2016-12-01 20:06:53 +00:00
David L Kreitzer 0e3ae305b6 Refactored X86InterleavedAccess into a class. NFCI.
Patch by Farhana Aleen

Differential Revision: https://reviews.llvm.org/D25986

llvm-svn: 288410
2016-12-01 19:56:39 +00:00
Matthias Braun d0ee66c2e9 Move most EH from MachineModuleInfo to MachineFunction
Recommitting r288293 with some extra fixes for GlobalISel code.

Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288405
2016-12-01 19:32:15 +00:00
Benjamin Kramer 6a8704c1c0 Fix unused variable warning in Release builds. NFC.
llvm-svn: 288401
2016-12-01 19:10:10 +00:00
Greg Clayton 35630c3357 This change removes the dependency on DwarfDebug that was used for DW_FORM_ref_addr by making a new DIEUnit class in DIE.cpp.
The DIEUnit class represents a compile or type unit and it owns the unit DIE as an instance variable. This allows anyone with a DIE, to get the unit DIE, and then get back to its DIEUnit without adding any new ivars to the DIE class. Why was this needed? The DIE class has an Offset that is always the CU relative DIE offset, not the "offset in debug info section" as was commented in the header file (the comment has been corrected). This is great for performance because most DIE references are compile unit relative and this means most code that accessed the DIE's offset didn't need to make it into a compile unit relative offset because it already was. When we needed to emit a DW_FORM_ref_addr though, we needed to find the absolute offset of the DIE by finding the DIE's compile/type unit. This class did have the absolute debug info/type offset and could be added to the CU relative offset to compute the absolute offset. With this change we can easily get back to a DIE's DIEUnit which will have this needed offset. Prior to this is required having a DwarfDebug and required calling:

DwarfCompileUnit *DwarfDebug::lookupUnit(const DIE *CU) const;
Now we can use the DIEUnit class to do so without needing DwarfDebug. All clients now use DIEUnit objects (the DwarfDebug stack and the DwarfLinker). A follow on patch for the DWARF generator will also take advantage of this.

Differential Revision: https://reviews.llvm.org/D27170

llvm-svn: 288399
2016-12-01 18:56:29 +00:00
Alexey Bataev 62af7252f1 [SLP] Fixed cost model for horizontal reduction.
Currently when cost of scalar operations is evaluated the vector type is
used for scalar operations. Patch fixes this issue and fixes evaluation
of the vector operations cost.
Several test showed that vector cost model is too optimistic. It
allowed vectorization of 8 or less add/fadd operations, though scalar
code is faster. Actually, only for 16 or more operations vector code
provides better performance.

Differential Revision: https://reviews.llvm.org/D26277

llvm-svn: 288398
2016-12-01 18:42:42 +00:00
Mandeep Singh Grang 32360071a0 [llvm] Implement support for -defsym assembler option
Summary:
Changes to llvm-mc to move common logic to separate function.

Related clang patch: https://reviews.llvm.org/D26213

Reviewers: rafael, t.p.northover, colinl, echristo, rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26214

llvm-svn: 288396
2016-12-01 18:42:04 +00:00
Simon Pilgrim 17d5b6b493 [X86][SSE] Moved shuffle mask widening/narrowing helper functions earlier in the file.
Will be necessary for a future patch.

llvm-svn: 288395
2016-12-01 18:27:19 +00:00
Kostya Serebryany 09f4fa5200 [libFuzzer] add a test for r288389 (-rss_limit_mb=0 means no limit).
llvm-svn: 288392
2016-12-01 18:02:07 +00:00
Ulrich Weigand d36b31d03f [SystemZ] Fix fallout from r288374
Avoid undefined behavior due to too-large shift count.

llvm-svn: 288391
2016-12-01 18:00:50 +00:00
Weiming Zhao cf26d56390 [AsmParser] Diagnose empty symbol for .set directive
Summary: Diagnose empty symbol to avoid hitting assertion in MCContext::getOrCreateSymbol

Reviewers: eli.friedman, rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26728

llvm-svn: 288390
2016-12-01 18:00:36 +00:00
Kostya Serebryany dc6b8ca879 [libFuzzer] treat -rss_limit_mb=0 as no limit
llvm-svn: 288389
2016-12-01 17:56:15 +00:00
Adam Nemet 4ddb8c01b1 [GVN, OptDiag] Print the interesting instructions involved in missed load-elimination
[recommitting after the fix in r288307]

This includes the intervening store and the load/store that we're trying
to forward from in the optimization remark for the missed load
elimination.

This is hooked up under a new mode in ORE that allows for compile-time
budget for a bit more analysis to print more insightful messages.  This
mode is currently enabled for -fsave-optimization-record (-Rpass is
trickier since it is controlled in the front-end).

With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

Differential Revision: https://reviews.llvm.org/D26490

llvm-svn: 288381
2016-12-01 17:34:50 +00:00
Adam Nemet 8b5fba8081 [GVN, OptDiag] Include the value that is forwarded in load elimination
[recommitting after the fix in r288307]

This requires some changes to the opt-diag API.  Hal and I have
discussed this at the Dev Meeting and came up with a streaming delimiter
(setExtraArgs) to solve this.

Arguments after this delimiter are only included in the optimization
records and not in the remarks printed in the compiler output.  (Note,
how in the test the content of the YAML file changes but the remarks on
the compiler output don't.)

This implements the green GVN message with a bug fix at line
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

The fix is that now we properly include the constant value in the
message: "load of type i32 eliminated in favor of 7"

Differential Revision: https://reviews.llvm.org/D26489

llvm-svn: 288380
2016-12-01 17:34:44 +00:00
Ulrich Weigand 55082cddef [SystemZ] Fix applyFixup for 12-bit fixups
Now that we have fixups that only fill parts of a byte, it turns
out we have to mask off the bits outside the fixup area when
applying them.  Failing to do so caused invalid object code to
be emitted for bprp with a negative 12-bit displacement.

llvm-svn: 288374
2016-12-01 17:10:27 +00:00