Commit Graph

104393 Commits

Author SHA1 Message Date
Diana Picus 8445858a93 [ARM] GlobalISel: Support G_AND
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)

llvm-svn: 304885
2017-06-07 09:17:41 +00:00
Florian Hahn 1d38129b92 [Linker] Remove warning when linking ARM and Thumb IR modules.
Summary:
This patch updates Triple::isCompatibleWith to make armxx and thumbxx
triples compatible, as long as the subarch, vendor, os, envorionment and
object format match. Thumb/ARM code generation should be controlled
using the thumb-mode per-function target feature rather than by the
triple to allow mixing Thumb and ARM functions.

D33448 updates Clang's codegen to add thumb-mode for all functions with
armxx or thumbxx triples.

Reviewers: echristo, t.p.northover, rafael, kristof.beyls, rengolin, tejohnson

Reviewed By: tejohnson

Subscribers: rinon, eugenis, pcc, srhines, aemerson, mehdi_amini, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33287

llvm-svn: 304884
2017-06-07 09:17:01 +00:00
Florian Hahn 9afd9d9254 [ARM] Create relocations for unconditional branches.
Summary:
Relocations are required for unconditional branches to function symbols with
different execution mode. Without this patch, incorrect branches are
generated for tail calls between functions with different execution
mode.


Reviewers: peter.smith, rafael, echristo, kristof.beyls

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33898

llvm-svn: 304882
2017-06-07 08:54:47 +00:00
Craig Topper 73ba1c84be [InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare.

llvm-svn: 304876
2017-06-07 07:40:37 +00:00
Craig Topper 29c282eac8 [InstCombine] Fix two asserts that were accidentally checking that an APInt pointer is non-zero instead of checking that the APInt self is non-zero.
I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used.

llvm-svn: 304875
2017-06-07 07:40:29 +00:00
NAKAMURA Takumi 92c99cd6dc Update libdeps to add BinaryFormat, introduced in r304864.
llvm-svn: 304869
2017-06-07 04:48:49 +00:00
NAKAMURA Takumi ef9d9481b5 Reorder and reformat.
llvm-svn: 304868
2017-06-07 04:48:45 +00:00
Zachary Turner 830b6fd350 Add dependency from LibDriver to BinaryFormat.
llvm-svn: 304867
2017-06-07 04:39:50 +00:00
Zachary Turner 8a9e2c6bad Add dependency from AsmParser to BinaryFormat.
This breaks the MinGW build, but not other builds for some reason.

llvm-svn: 304866
2017-06-07 04:24:33 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Craig Topper 7945248267 [LazyValueInfo] Remove redundant calls to ConstantRange::contains. The same exact call was made in the if above and we already know it returned true. NFC
llvm-svn: 304857
2017-06-07 00:58:09 +00:00
Craig Topper 50b1e5135e [Constants] Use isUIntN/isIntN from MathExtras instead of reimplementing the same code. NFC
llvm-svn: 304856
2017-06-07 00:58:05 +00:00
Craig Topper 93ac6e14cd [Constants] Use APInt::isNullValue/isOneValue/uge to simplify some code and take advantage of APInt optimizations. NFC
llvm-svn: 304855
2017-06-07 00:58:02 +00:00
Quentin Colombet 9e9d638676 [InlineSpiller] Only account for real spills in the hoisting logic
Spills of undef values shouldn't impact the placement of the relevant
spills. Drive by review.

llvm-svn: 304850
2017-06-07 00:22:07 +00:00
Sanjay Patel f57015d4cc [CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the 
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.

I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time 
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But 
regardless of that, I think this is a win for the more general cases.

http://rise4fun.com/Alive/cbQ

Differential Revision: https://reviews.llvm.org/D33963

llvm-svn: 304849
2017-06-07 00:17:08 +00:00
Zachary Turner 1bfb9f47af Fix uninitialized read.
llvm-svn: 304846
2017-06-06 23:54:23 +00:00
Adrian Prantl 318d1195f2 Introduce -brief command line option to llvm-dwarfdump
This patch introduces a new command line option, called brief, to
llvm-dwarfdump.  When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.

Patch by Spyridoula Gravani!

rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867

llvm-svn: 304844
2017-06-06 23:28:45 +00:00
Chandler Carruth abd32bad37 Fix the includes in lib/Fuzzer on Windows that have ordering
dependencies and add comments to tell future maintainers about those
requirements.

llvm-svn: 304843
2017-06-06 23:28:01 +00:00
Davide Italiano c88f2c712f [CFLAA] Remove unused include. NFCI.
llvm-svn: 304842
2017-06-06 23:16:19 +00:00
Eugene Zelenko fb69e66cff [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304839
2017-06-06 22:22:41 +00:00
Dimitry Andric bc3feaaa88 Allow VersionPrinter to print to arbitrary raw_ostreams
Summary:
I would like to add printing of registered targets to clang's version
information.  For this to work correctly, the VersionPrinter logic in
CommandLine.cpp should support printing to arbitrary raw_ostreams,
instead of always defaulting to outs().

Add a raw_ostream& parameter to the function pointer type used for
VersionPrinter, and while doing so, introduce a typedef for convenience.

Note that VersionPrinter::print() will still default to using outs(),
the clang part will necessarily go into a separate review.

Reviewers: beanz, chandlerc, dberris, mehdi_amini, zturner

Reviewed By: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33899

llvm-svn: 304835
2017-06-06 21:54:04 +00:00
David Blaikie c662b50150 GlobalsModRef+OptNone: Don't prove readnone/other properties from an optnone function
Seems like at least one reasonable interpretation of optnone is that the
optimizer never "looks inside" a function. This fix is consistent with
that interpretation.

Specifically this came up in the situation:

f3 calls f2 calls f1
f2 is always_inline
f1 is optnone

The application of readnone to f1 (& thus to f2) caused the inliner to
kill the call to f2 as being trivially dead (without even checking the
cost function, as it happens - not sure if that's also a bug).

llvm-svn: 304833
2017-06-06 20:51:15 +00:00
Sanjay Patel b4b7df95de [CGP] fix formatting/typos in MemCmpExpansion; NFC
llvm-svn: 304830
2017-06-06 20:30:47 +00:00
Matthias Braun 7e23fc05c1 llc: Add ability to parse mir from stdin
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.

llvm-svn: 304825
2017-06-06 20:06:57 +00:00
Evgeny Stupachenko 3b88291581 Fix PR23384 (part 3 of 3)
Summary:
The patch makes instruction count the highest priority for
 LSR solution for X86 (previously registers had highest priority).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30562

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
2017-06-06 20:04:16 +00:00
Sanjay Patel 2726ea2ac0 [DAG] remove duplicated code for isOnlyUsedInZeroEqualityComparison(); NFCI
llvm-svn: 304822
2017-06-06 19:40:09 +00:00
Anna Thomas 4acfc7e16e [LVI Printer] Rely on the LVI analysis functions rather than the LVI cache
Summary:
LVIPrinter pass was previously relying on the LVICache. We now directly call the
the LVI functions which solves the value if the LVI information is not already
available in the cache. This has 2 benefits over the printing of LVI cache:
1. higher coverage (i.e. catches errors) in LVI code when cache value is
invalidated.
2. relies on the core functions, and not dependent on the LVI cache (which may
be scrapped at some point).
It would still catch any cache invalidation errors, since we first go through
the cache.

Reviewers: reames, dberlin, sanjoy

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32135

llvm-svn: 304819
2017-06-06 19:25:31 +00:00
Sam Clegg acd7d2b00b [WebAssembly] MC: Refactor relocation handling
The change cleans up and unifies the handling of relocation
entries in WasmObjectWriter.  Type index relocation no longer
need to be handled separately.

The only externally visible change should be that type
index relocations are no longer grouped at the end.

Differential Revision: https://reviews.llvm.org/D33918

llvm-svn: 304816
2017-06-06 19:15:05 +00:00
Matthias Braun 8b5f9e4438 MIRPrinter: Avoid assert() when printing empty INLINEASM strings.
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.

This fixes https://llvm.org/PR33317

llvm-svn: 304815
2017-06-06 19:00:58 +00:00
Konstantin Zhuravlyov 1e2b87893b AMDGPU/NFC: Move amdgpu code object metadata to support
Differential Revision: https://reviews.llvm.org/D31437

llvm-svn: 304812
2017-06-06 18:35:50 +00:00
Daniel Berlin eafdd862e5 NewGVN: Fix PR/33187. This is a bug caused by two things:
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.

llvm-svn: 304807
2017-06-06 17:15:28 +00:00
Anna Thomas b2a212c070 [Atomics][LoopIdiom] Recognize unordered atomic memcpy
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.

Background:  http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

Patch by Daniel Neilson!

Reviewers: reames, anna, skatkov

Reviewed By: reames, anna

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33243

llvm-svn: 304806
2017-06-06 16:45:25 +00:00
Stanislav Mekhanoshin e4cda7417c [AMDGPU] Return correct value from SDWA pass
Differential Revision: https://reviews.llvm.org/D33927

llvm-svn: 304805
2017-06-06 16:42:30 +00:00
Sam Clegg 6dc65e9105 [WebAssembly] Remove unused methods from MCWasmObjectTargetWriter
These methods looks like they were originally came from
MCELFObjectTargetWriter but they are never called by the
WasmObjectWriter.

Remove these methods meant the declaration of WasmRelocationEntry
could also move into the cpp file.

Differential Revision: https://reviews.llvm.org/D33905

llvm-svn: 304804
2017-06-06 16:38:59 +00:00
Petar Jovanovic 64fb7a8ebd [mips] Add madd4 subtarget feature
Addition of a feature and a predicate used to control generation of madd.fmt
and similar instructions.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33400

llvm-svn: 304801
2017-06-06 15:33:01 +00:00
Anna Thomas 7218032019 [IRCE] Canonicalize pre/post loops after the blocks are added into parent loop
Summary:
We were canonizalizing the pre loop (into loop-simplify form) before
the post loop blocks were added into parent loop. This is incorrect when IRCE is
done on a subloop. The post-loop blocks are created, but not yet added to the
parent loop. So, loop-simplification on the pre-loop incorrectly updates
LoopInfo.

This patch corrects the ordering so that pre and post loop blocks are added to
parent loop (if any), and then the loops are canonicalized to LCSSA and
LoopSimplifyForm.

Reviewers: reames, sanjoy, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33846

llvm-svn: 304800
2017-06-06 14:54:01 +00:00
Simon Pilgrim 3446ff4df5 Fix spelling mistake in getRThroughput static function names. NFCI.
llvm-svn: 304799
2017-06-06 14:25:34 +00:00
Simon Pilgrim f7113fd270 [X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it non-temporal (PR32744)
Extension to D33728

llvm-svn: 304798
2017-06-06 14:18:39 +00:00
Tom Stellard 8cd60a5067 AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33890

llvm-svn: 304797
2017-06-06 14:16:50 +00:00
Chandler Carruth aaeada6c75 Fix another ordering constraint with windows.h and comment about
a revers constraint that we got right (by chance).

llvm-svn: 304792
2017-06-06 12:43:20 +00:00
Chandler Carruth 185ddeffd4 Fix one place where I missed a commented requirement for a particular
include ordering.

I've changed the structure so that clang-format will preserve this going
forward.

llvm-svn: 304788
2017-06-06 12:11:24 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Peter Smith d16c55de6d [ARM] Add curly braces around switch case [NFC]
My previous commit r304702 introduced a new case into a switch statement.
This case defined a variable but I forgot to add the curly brackets around the
case to limit the scope.

This change puts the curly braces back in so that the next person that adds a
case doesn't get a build failure. Thanks to avieira for the spot.

Differential Revision: https://reviews.llvm.org/D33931

llvm-svn: 304785
2017-06-06 10:22:49 +00:00
Joey Gouly 61eaa63b65 [InstSimplify] Constant fold the new GEP in SimplifyGEPInst.
llvm-svn: 304784
2017-06-06 10:17:14 +00:00
Vivek Pandya 56d87ef5d7 [Improve CodeGen Testing] This patch renables MIRPrinter print fields which have value equal to its default.
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D32304

llvm-svn: 304779
2017-06-06 08:16:19 +00:00
Craig Topper aa9a24bd8b [InstSimplify] Remove some redundant code from InstSimplify now that llvm::isKnownNonEqual handles vectors.
isKnownNonEqual is called a little earlier in this function and can handle the case that we were checking here as well as more complex cases.

llvm-svn: 304775
2017-06-06 07:13:17 +00:00
Craig Topper 3002d5b0bf [ValueTracking] Remove scalar only restriction from isKnownNonEqual. The computeKnownBits and isKnownNonZero calls this code relies on should work fine for vectors.
This will be used by another commit to remove some code from InstSimplify that is redundant for scalars, but was needed for vectors due to this issue.

llvm-svn: 304774
2017-06-06 07:13:15 +00:00
Craig Topper 2dfb4804f2 [InstSimplify] Use the getTrue/getFalse helpers and make sure we use the computed result type instead of hardcoding to i1. NFC
Currently, isKnownNonEqual punts on vectors so the hardcoding to i1 doesn't matter. But I plan to fix that in a future patch.

llvm-svn: 304773
2017-06-06 07:13:13 +00:00
Craig Topper 8e662f7f81 [ValueTracking] Use the computeKnownBits version that returns a KnownBits object instead of taking one by reference. NFC
llvm-svn: 304772
2017-06-06 07:13:11 +00:00
Craig Topper 8365df825e [ValueTracking] Use APInt::intersects to avoid some temporary APInts. NFC
llvm-svn: 304771
2017-06-06 07:13:09 +00:00
Craig Topper c2790ecda8 [InstSimplify] Use ICmpInst::isEquality predicate method. NFC
llvm-svn: 304770
2017-06-06 07:13:04 +00:00
Mandeep Singh Grang 5e1697ef28 [llvm] Remove double semicolons
Reviewers: craig.topper, arsenm, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33924

llvm-svn: 304767
2017-06-06 05:08:36 +00:00
Xin Tong 9d6f08a8d4 Add a dominanance check interface that uses caching for instructions within same basic block.
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.

This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.

This is a preparation patch for https://reviews.llvm.org/D32720

Reviewers: dberlin, hfinkel, davide

Subscribers: davide, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33380

llvm-svn: 304764
2017-06-06 02:34:41 +00:00
Chandler Carruth 41ed4034dd [x86] Revert the X86FoldTablesEmitter due to more miscompiles.
In testing, we've found yet another miscompile caused by the new tables.
And this one is even less clear how to fix (we could teach it to fold
a 16-bit load instead of the 32-bit load it wants, or block folding
entirely).

Also, the approach to excluding instructions seems increasingly to not
scale well.

I have left a more detailed analysis on the review log for the original
patch (https://reviews.llvm.org/D32684) along with suggested path
forward. I will land an additional test case that I wrote which covers
the code that was miscompiling (folding into the output of `pextrw`) in
a subsequent commit to keep this a pure revert.

For each commit reverted here, I've restricted the revert to the
non-test code touching the x86 fold table emission until the last commit
where I did revert the test updates. This means the *new* test cases
added for `insertps` and `xchg` remain untouched (and continue to pass).

Reverted commits:
r304540: [X86] Don't fold into memory operands into insertps in the ...
r304347: [TableGen] Adapt more places to getValueAsString now ...
r304163: [X86] Don't fold away the memory operand of an xchg.
r304123: Don't capture a temporary std::string in a StringRef.
r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..."

Original commit was in r304088, and after a string of fixes was reverted
previously in r304121 to fix build bots, and then re-landed in r304122.

llvm-svn: 304762
2017-06-06 02:15:31 +00:00
Wolfgang Pieb 77d3e938f8 [DWARF] Adding support for the DWARF v5 string offsets table (consumer/reader part only).
Reviewers: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D32779

llvm-svn: 304759
2017-06-06 01:22:34 +00:00
Matthias Braun 7bda195812 CodeGen: Refactor MIR parsing
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.

This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.

Differential Revision: https://reviews.llvm.org/D33809

llvm-svn: 304758
2017-06-06 00:44:35 +00:00
Matthias Braun c7c06f158c CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function

Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.

llvm-svn: 304754
2017-06-06 00:26:13 +00:00
Quentin Colombet c668935d85 [InlineSpiller] Don't spill fully undef values
Althought it is not wrong to spill undef values, it is useless and harms
both code size and runtime. Before spilling a value, check that its
content actually matters.

http://www.llvm.org/PR33311

llvm-svn: 304752
2017-06-05 23:51:27 +00:00
Evgeny Stupachenko f2b3b467e5 Fix PR23384 (part 2 of 3) NFC
Summary:
The patch moves LSR cost comparison to target part.

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30561

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304750
2017-06-05 23:37:00 +00:00
Matt Arsenault 540d133138 Remove double semicolon
llvm-svn: 304749
2017-06-05 23:01:31 +00:00
Matthias Braun e89461a400 Remove some #include from StackProtector.h; NFC
llvm-svn: 304748
2017-06-05 22:59:21 +00:00
Matt Arsenault 9e5b5053d1 RenameIndependentSubregs: Fix handling of undef tied operands
If a tied source operand was undef, it would be replaced but not
update the other tied operand, which would end up using different
virtual registers.

llvm-svn: 304747
2017-06-05 22:58:57 +00:00
Evgeny Stupachenko 4d94e99446 LSR: Calculate instruction cost only if InsnsCost is set to true (NFC)
Summary:

The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option.
Currently even if the option set to false we calculate and print (in debug mode) instruction costs.

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D33914

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304746
2017-06-05 22:44:18 +00:00
Volkan Keles ebe6bb9006 [GlobalISel] IRTranslator: Add MachineMemOperand to target memory intrinsics
Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33724

llvm-svn: 304743
2017-06-05 22:17:17 +00:00
Davide Italiano fb4d5c095b [SelectionDAG] Update the dominator after splitting critical edges.
Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.

i.e.

  DominatorTree is not up to date!
  Computed:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}

  Actual:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
        [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}

This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).

We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.

Differential Revision:  https://reviews.llvm.org/D33800

llvm-svn: 304742
2017-06-05 22:16:41 +00:00
Zachary Turner 88101dadcc [CodeView] Fix endianness bug.
We should be outputting in little endian, but we were writing
in host endianness.

llvm-svn: 304741
2017-06-05 22:12:23 +00:00
Zachary Turner 349c18f837 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

llvm-svn: 304738
2017-06-05 21:40:33 +00:00
Konstantin Zhuravlyov 5b0bf2ff0d AMDGPU: Remove deprecated and unused elf definitions
Differential Revision: https://reviews.llvm.org/D33689

llvm-svn: 304737
2017-06-05 21:33:40 +00:00
Saleem Abdulrasool 4c47434b25 CodeGen: add support for emitting ObjC image info
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well.  The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.

llvm-svn: 304736
2017-06-05 21:26:39 +00:00
Craig Topper f6e138d794 [ConstantRange] Remove costly udivrem from ConstantRange::truncate
Truncate currently uses a udivrem call which is going to be slow particularly for larger than 64-bit widths.

As far as I can tell all we were trying to do was modulo LowerDiv by (MaxValue+1) and make sure whatever value was effectively subtracted from LowerDiv was also subtracted from UpperDiv.

This patch recognizes that MaxValue+1 is a power of 2 so we can just use a bitwise AND to accomplish a modulo operation or isolate the upper bits.

Differential Revision: https://reviews.llvm.org/D32672

llvm-svn: 304733
2017-06-05 20:48:05 +00:00
Mark Searles 602ee930bf [AMDGPU] Fix uninit'ed var (RevisitLoop)
Differential Revision: https://reviews.llvm.org/D33907

llvm-svn: 304729
2017-06-05 19:29:01 +00:00
Sanjay Patel 6350de76fa [DAGCombine] Fix unchecked calls to DAGCombiner::*ExtPromoteOperand
Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode. 
Falling back to any extend in this case instead of failing outright seems correct to me.

No test case because:
The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload 
TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked 
illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks 
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16  , Legal);.

Patch by Jacob Young!

Differential Revision: https://reviews.llvm.org/D33633

llvm-svn: 304723
2017-06-05 17:01:10 +00:00
Simon Pilgrim 807b708d13 [X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)
Missed SSE41 non-temporal load case in previous commit

Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304722
2017-06-05 16:45:32 +00:00
Adam Nemet 4ef096b0c2 Handle non-unique edges in edge-dominance
This removes a quadratic behavior in assert-enabled builds.

GVN propagates the equivalence from a condition into the blocks guarded by the
condition.  E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7.  It does this by replacing all the uses of 'a' that are dominated by
the true edge.

For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'.  Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.

The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.

This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.

Differential Revision: https://reviews.llvm.org/D33584

llvm-svn: 304721
2017-06-05 16:27:09 +00:00
Frederich Munch ad12580012 Close DynamicLibraries in reverse order they were opened.
Summary: Matches C++ destruction ordering better and fixes possible problems of loaded libraries having inter-dependencies.

Reviewers: efriedma, v.g.vassilev, chapuni

Reviewed By: efriedma

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33652

llvm-svn: 304720
2017-06-05 16:26:58 +00:00
Dmitry Mikulin db3b87b2c0 Symbols re-defined with -wrap and -defsym need to be excluded from inter-
procedural optimizations to prevent dropping symbols and allow the linker
to process re-directs.

PR33145: --wrap doesn't work with lto.
Differential Revision: https://reviews.llvm.org/D33621

llvm-svn: 304719
2017-06-05 16:24:25 +00:00
Simon Pilgrim b2ef948628 [X86][AVX1] Split 256-bit vector non-temporal loads to keep it non-temporal (PR32744)
Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304718
2017-06-05 16:02:01 +00:00
Simon Pilgrim a25bf0b6b9 [X86][SSE] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)
Differential Revision: https://reviews.llvm.org/D33728

llvm-svn: 304717
2017-06-05 15:43:03 +00:00
Diana Picus 0091cc3528 [ARM] GlobalISel: Constrain callee register on indirect calls
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.

Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.

Fixes (part of) PR32146.

llvm-svn: 304712
2017-06-05 12:54:53 +00:00
whitequark f6059fdc54 [LLVM-C] [OCaml] Expose Type::subtypes.
The C functions added are LLVMGetNumContainedTypes and
LLVMGetSubtypes.

The OCaml function added is Llvm.subtypes.

Patch by Ekaterina Vaartis.

Differential Revision: https://reviews.llvm.org/D33677

llvm-svn: 304709
2017-06-05 11:49:52 +00:00
Dimitry Andric f5d486f43d Fix building DynamicLibrary.cpp with musl libc
Summary:
The workaround added in rL301240 for stderr/out/in symbols being both
macros and globals is only necessary for glibc, and it does not compile
with musl libc. Alpine Linux has had the following fix for it:

https://git.alpinelinux.org/cgit/aports/plain/main/llvm4/llvm-fix-DynamicLibrary-to-build-with-musl-libc.patch

Adapt the fix in our DynamicLibrary.inc for Unix.

Reviewers: marsupial, chandlerc, krytarowski

Reviewed By: krytarowski

Subscribers: srhines, krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33883

llvm-svn: 304707
2017-06-05 11:22:18 +00:00
Javed Absar b16d146838 Add support for #pragma clang section
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""

Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413

llvm-svn: 304704
2017-06-05 10:09:13 +00:00
Peter Smith adde667007 [ARM] Support fixup for Thumb2 modified immediate
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
 sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
    
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
    
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
    
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
    
Fixes PR28647

Differential Revision: https://reviews.llvm.org/D33492

llvm-svn: 304702
2017-06-05 09:37:12 +00:00
Sven van Haastregt 78819e0fd4 [InstCombine] Fix extractelement use before def
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.

Patch by Karl Hylen.

Differential Revision: https://reviews.llvm.org/D33449

llvm-svn: 304701
2017-06-05 09:18:10 +00:00
Renato Golin cdf840fd38 Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet."
This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days.

llvm-svn: 304698
2017-06-05 07:35:52 +00:00
Stanislav Mekhanoshin 286a4225b9 [AMDGPU] Fix SIFoldOperands crash with clamp
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.

Differential Revision: https://reviews.llvm.org/D33884

llvm-svn: 304696
2017-06-05 01:03:04 +00:00
Craig Topper da8037f299 [InstSimplify] Use llvm::all_of instead of a manual loop. NFC
llvm-svn: 304692
2017-06-04 22:41:56 +00:00
Peter Collingbourne 2b9e9e474c IR: When creating a global variable, assert that its type is valid.
llvm-svn: 304690
2017-06-04 22:12:03 +00:00
Simon Pilgrim 46dd55f1e1 [X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==>    <3, 2, 1, 0>

The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.

Instead, this patch unpacks progressively larger sequential vector elements together:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==>    <3, 2, 1, 0>

This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.

Differential Revision: https://reviews.llvm.org/D33864

llvm-svn: 304688
2017-06-04 20:12:04 +00:00
Ayal Zaks ab32aff838 [LV] Make scalarizeInstruction() non-virtual. NFC.
Following the request made in https://reviews.llvm.org/D32871,
scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is
hereby made non-virtual in InnerLoopVectorizer.

Should have been part of r297580 originally.

llvm-svn: 304685
2017-06-04 13:29:51 +00:00
Craig Topper d470d73c2d [ConstantFolding] Combine an if statement into an earlier one that checked the same condition. NFC
llvm-svn: 304681
2017-06-04 08:21:53 +00:00
Craig Topper 0dd29e2256 [ConstantFolding][X86] Replace an LLVM_FALLTHROUGH with a break because it really shouldn't fallthrough.
This is actually NFC because the next case starts with the same if statement as this case did. So the result will be the same and it will fallthrough to the end of the switch. But there's no reason to rely on that so we should just break.

llvm-svn: 304680
2017-06-04 08:21:51 +00:00
Craig Topper fe9ad82e44 [ConstantFolding] Properly support constant folding of vector powi intrinsic. The second argument is not a vector so needs special treatment.
llvm-svn: 304679
2017-06-04 07:30:28 +00:00
Davide Italiano be1b6a963e [PM] Add GVNSink to the pipeline.
With this, the two pipelines should be in sync again (modulo
LoopUnswitch, but Chandler is actively working on that).

Differential Revision:  https://reviews.llvm.org/D33810

llvm-svn: 304671
2017-06-03 23:18:29 +00:00
Saleem Abdulrasool ac76ec7ce8 ADT: handle special case of ARM environment for SUSE
SUSE treats "gnueabi" as "gnueabihf" so make sure that we normalise the
environment.

llvm-svn: 304670
2017-06-03 22:31:06 +00:00
Craig Topper 0799ff9e64 [InstCombine] Add support for simplifying ctlz/cttz intrinsics based on known bits.
llvm-svn: 304669
2017-06-03 18:50:32 +00:00
Craig Topper 7c553edced [ConstantFolding] Fix constant folding for vector cttz and ctlz intrinsics to understand that the second argument is still a scalar.
llvm-svn: 304668
2017-06-03 18:50:29 +00:00
Stanislav Mekhanoshin 0330660403 [AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.

Also added handling to preserve original src modifiers.

Differential Revision: https://reviews.llvm.org/D33860

llvm-svn: 304665
2017-06-03 17:39:47 +00:00
Simon Pilgrim f93debb40c [X86][SSE] Add SCALAR_TO_VECTOR(PEXTRW/PEXTRB) support to faux shuffle combining
Generalized existing SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT) code to support AssertZext + PEXTRW/PEXTRB cases as well. 

llvm-svn: 304659
2017-06-03 11:12:57 +00:00
Craig Topper a803d5b8b0 [LazyValueInfo] Use Type::getIntegerBitWidth instead of casting to IntegerType to call getBitWidth. NFC
llvm-svn: 304656
2017-06-03 07:47:14 +00:00
Craig Topper 0e5f1093ee [LazyValueInfo] Make solveBlockValueCast take a CastInst* instead of Instruction*. Makes getOpcode return the appropriate enum without a cast. NFC
llvm-svn: 304655
2017-06-03 07:47:08 +00:00
Galina Kistanova e9cacb6ae8 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304638
2017-06-03 05:19:32 +00:00
Galina Kistanova 55344aba7e Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304637
2017-06-03 05:19:10 +00:00
Galina Kistanova 96d51f5bcb Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304636
2017-06-03 05:18:46 +00:00
Galina Kistanova bd79f73f02 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304635
2017-06-03 05:11:14 +00:00
Sam Clegg 9e15f3592a [WebAssembly] Refactor WasmObjectWriter::writeObject
The size of this function was getting a little out of.
control.  Split code for writing each section type into
seperate functions.

Differential Revision: https://reviews.llvm.org/D33792

llvm-svn: 304634
2017-06-03 02:01:24 +00:00
Kostya Serebryany f7db346cdf [sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet.
llvm-svn: 304630
2017-06-03 01:35:47 +00:00
Tom Stellard e042412ef1 AMDGPU/GlobalISel: Mark 1-bit integer constants as legal
Summary:
These are mostly legal, but will probably need special lowering for some
cases.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D33791

llvm-svn: 304628
2017-06-03 01:13:33 +00:00
Eugene Zelenko c85638b29d [CodeGen] Fix Windows builds which treat warnings as errors, broken in r304621.
llvm-svn: 304627
2017-06-03 01:04:06 +00:00
Evgeniy Stepanov 704003ea3d Revert "[CFI] Remove LinkerSubsectionsViaSymbols."
This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin.

llvm-svn: 304626
2017-06-03 00:46:27 +00:00
Stanislav Mekhanoshin f154b4f52c [AMDGPU] Preserve operand order in SIFoldOperands
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.

Differential Revision: https://reviews.llvm.org/D33802

llvm-svn: 304625
2017-06-03 00:41:52 +00:00
Zachary Turner 5b74ff33e7 [PDB] Fix use after free.
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary.  This
way it could still return the user a contiguous buffer of the
requested size.  However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.

Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view.  And this buffer would get destroyed
as soon as the strema was closed.

The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.

Differential Revision: https://reviews.llvm.org/D33858

llvm-svn: 304623
2017-06-03 00:33:35 +00:00
Matthias Braun 4e8624d138 LiveRegUnits: Port recent LivePhysRegs bugfixes
Adjust code to look more like the code in LivePhysRegs and port over the
fix for LivePhysRegs from r304001 and adapt to the new CSR management in
MachineRegisterInfo.

llvm-svn: 304622
2017-06-03 00:26:35 +00:00
Eugene Zelenko 167595ab51 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304621
2017-06-03 00:22:41 +00:00
Stanislav Mekhanoshin ca5d2efe5a [AMDGPU] V_DIV_FIXUP_F16 is not a commutable operation
Differential Revision: https://reviews.llvm.org/D33808

llvm-svn: 304619
2017-06-03 00:16:44 +00:00
Alexey Bataev e4e5923ef1 [SLP] Improve comments and naming of functions/variables/members, NFC.
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.

Differential revision: https://reviews.llvm.org/D33320

llvm-svn: 304616
2017-06-03 00:08:21 +00:00
Sanjay Patel e737cf8500 [x86] simplify code for vector icmp pred transforms; NFCI
Organizing by transform is smaller and easier to read than a squashed switch with fall-throughs.

llvm-svn: 304611
2017-06-02 23:21:53 +00:00
Kostya Serebryany aed6ba770c [sanitizer-coverage] refactor the code to make it easier to add more sections in future. NFC
llvm-svn: 304610
2017-06-02 23:13:44 +00:00
Alexey Bataev 03ca396b95 Revert "[SLP] Improve comments and naming of functions/variables/members, NFC."
This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a.

llvm-svn: 304609
2017-06-02 23:09:15 +00:00
Philip Reames b70cecd60a [Statepoint] Be consistent about using deopt naming [NFCI]
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag.  Fix a few cases we'd missed.

llvm-svn: 304607
2017-06-02 23:03:26 +00:00
Matthias Braun 0021d46a1c RegisterScavenging: Add ScavengerTest pass
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.

Also adds some basic register scavenging tests.

llvm-svn: 304606
2017-06-02 23:01:42 +00:00
Quentin Colombet 2145cf3f07 [RABasic] Properly update the LiveRegMatrix when LR splitting occur
Prior to this patch we used to not touch the LiveRegMatrix while doing
live-range splitting. In other words, when live-range splitting was
occurring, the LiveRegMatrix was not reflecting the changes.
This is generally fine because it means the query to the LiveRegMatrix
will be conservately correct. However, when decisions are taken based on
what is going to happen on the interferences (e.g., when we spill a
register and know that it is going to be available for another one), we
might hit an assertion that the color used for the assignment is still
in use.

This patch makes sure the changes on the live-ranges are properly
reflected in the LiveRegMatrix, so the assertions don't break.
An alternative could have been to remove the assertion, but it would
make the invariants of the code and the general reasoning more
complicated in my opnion.

http://llvm.org/PR33057

llvm-svn: 304603
2017-06-02 22:46:31 +00:00
Quentin Colombet ebbaed6d3c [RABasic] Properly initialize the pass
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.

llvm-svn: 304602
2017-06-02 22:46:26 +00:00
Xinliang David Li 5fdc75aea1 Fix debug build test failure
llvm-svn: 304600
2017-06-02 22:38:48 +00:00
Xinliang David Li 0b7d858fa3 [PartialInlining] Minor cost anaysis tuning
Also added a test option and 2 cost analysis related tests.

llvm-svn: 304599
2017-06-02 22:08:04 +00:00
David Blaikie 6aeacaa527 FunctionAttrs: Skip it if the effective SCC (ignoring optnone functions) is empty
Minor optimization but mostly simplifies my debugging so I'm not dealing
with empty SCCNodeSets while investigating issues in this optimization.

llvm-svn: 304597
2017-06-02 21:24:17 +00:00
Matthias Braun dfa892139c RegisterScavenging: Move scavenging logic from PEI to RegisterScavenging; NFC
These parts do not depend on any PrologEpilogInserter logic and
therefore better fits RegisterScaveging.cpp.

llvm-svn: 304596
2017-06-02 21:02:03 +00:00
Zachary Turner 64726f2269 Fix build error on gcc.
llvm-svn: 304595
2017-06-02 21:00:22 +00:00
Jun Bum Lim 2960d41e68 [InlineCost] Enable the new switch cost heuristic
Summary:
This is to enable the new switch inline cost heuristic (r301649) by removing the
old heuristic as well as the flag itself.
In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and
8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64.
No significant code size / performance regression was found in O3/O2/Os. No
significant complain was reported from the llvm-dev thread.

Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo

Reviewed By: echristo

Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini

Differential Revision: https://reviews.llvm.org/D32653

llvm-svn: 304594
2017-06-02 20:42:54 +00:00
Alexey Bataev 2c08fde9e5 [SLP] Improve comments and naming of functions/variables/members, NFC.
Summary:
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.

Reviewers: anemet

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33320

llvm-svn: 304593
2017-06-02 20:39:27 +00:00
Ahmed Bougacha 018a68f9e4 [X86] Correctly broadcast NaN-like integers as float on AVX.
Since r288804, we try to lower build_vectors on AVX using broadcasts of
float/double.  However, when we broadcast integer values that happen to
have a NaN float bitpattern, we lose the NaN payload, thereby changing
the integer value being broadcast.

This is caused by ConstantFP::get, to which we pass the splat i32 as
a float (by bitcasting it using bitsToFloat).  ConstantFP::get takes
a double parameter, so we end up lossily converting a single-precision
NaN to double-precision.

Instead, avoid any kinds of conversions by directly building an APFloat
from the splatted APInt.

Note that this also fixes another piece of code (broadcast of
subvectors), that currently isn't susceptible to the same problem.

Also note that we could really just use APInt and ConstantInt
throughout: the constant pool type doesn't matter much.  Still, for
consistency, use the appropriate type.

llvm-svn: 304590
2017-06-02 20:02:59 +00:00
Zachary Turner 4bedb5fd00 Fix build error with clang and gcc.
llvm-svn: 304589
2017-06-02 20:00:10 +00:00
Zachary Turner 92dcdda623 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

llvm-svn: 304588
2017-06-02 19:49:14 +00:00
Keno Fischer 514a6a54e7 [SROA] Fix crash due to bad bitcast
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.

Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D32593

llvm-svn: 304585
2017-06-02 19:04:17 +00:00
Evgeniy Stepanov 63f056327d [CFI] Remove LinkerSubsectionsViaSymbols.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

llvm-svn: 304582
2017-06-02 18:45:14 +00:00
David Blaikie 358c012db2 BitcodeWriter: Removing unnecessary std::function in favor of template
More cleanup from post-commit discussion on r304516

llvm-svn: 304579
2017-06-02 18:25:29 +00:00
Evgeniy Stepanov b933ad3a77 Skip CFI for dead functions.
Differential Revision: https://reviews.llvm.org/D33805

llvm-svn: 304578
2017-06-02 18:24:23 +00:00
Evgeniy Stepanov 659b3bc77d Move summary dead stripping before regular LTO.
This way dead stripping results are recorded in combined summary and
can be used in regular LTO passes.

Differential Revision: https://reviews.llvm.org/D33615

llvm-svn: 304577
2017-06-02 18:24:17 +00:00
Sanjay Patel 469014ada4 [x86] fix formatting; NFCI
llvm-svn: 304576
2017-06-02 18:14:31 +00:00
Matt Arsenault 746e065716 AMDGPU: Register AMDGPUAlwaysInline
llvm-svn: 304574
2017-06-02 18:02:42 +00:00
Reid Kleckner 146eb7a65f Re-land "COFF: migrate def parser from LLD to LLVM"
This reverts commit r304561 and re-lands r303490 & co.

The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:

@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
   std::vector<COFFShortExport> Exports;
   for (Export &E1 : Config->Exports) {
     COFFShortExport E2;
-    E2.Name = E1.Name;
+    // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+    E2.Name = E1.SymbolName;
     E2.ExtName = E1.ExtName;
     E2.Ordinal = E1.Ordinal;
     E2.Noname = E1.Noname;

llvm-svn: 304573
2017-06-02 17:53:06 +00:00
Konstantin Zhuravlyov be6c0ca5e2 AMDGPU: Make auto waitcnt before barrier a feature
Differential Revision: https://reviews.llvm.org/D33793

llvm-svn: 304571
2017-06-02 17:40:26 +00:00
Sanjay Patel cdb5dad4cc [TargetLowering] fix formatting; NFC
llvm-svn: 304569
2017-06-02 17:35:02 +00:00
Craig Topper 9277a86f03 [LazyValueInfo] Fix formatting NFC.
llvm-svn: 304567
2017-06-02 17:28:12 +00:00
David Blaikie b6b42e018a Tidy up a bit of r304516, use SmallVector::assign rather than for loop
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).

The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and
don't have it - so, workaround it for now rather than bothering with the
added complexity.

(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)

llvm-svn: 304566
2017-06-02 17:24:26 +00:00
Philip Reames 0f02bbc6f4 Verify a couple more fields in STATEPOINT instructions
While doing so, clarify the comments and update them to reflect current reality.

Note: I'm going to let this sit for a week or so before adding further verification.  I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
llvm-svn: 304565
2017-06-02 17:02:33 +00:00
Philip Reames 94cc4a29ed Add placeholder for more extensive verification of psuedo ops
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.

For those curious, the more complicated version of this patch already found one very suspicious thing.

Differential Revision: https://reviews.llvm.org/D33819

llvm-svn: 304564
2017-06-02 16:36:37 +00:00
Craig Topper 3778c8943b [LazyValueInfo] Make solveBlockValueBinaryOp take a BinaryOperator* instead of Instruction*. This removes a cast of getOpcode to BinaryOps.
llvm-svn: 304563
2017-06-02 16:33:13 +00:00
Sanjay Patel ce241f48c5 [InstCombine] fix icmp with not op and constant to work with splat vector constant
llvm-svn: 304562
2017-06-02 16:29:41 +00:00
Reid Kleckner d249e4a188 Revert "COFF: migrate def parser from LLD to LLVM"
This reverts commits r303490, r303491, r303493, and r303494.

This caused http://crbug.com/728726. Essentially, exporting stdcall
functions doesn't appear to work after this change. Reduced test case
soon.

llvm-svn: 304561
2017-06-02 16:26:24 +00:00
Craig Topper 84a9f168f1 [LazyValueInfo] Fix typo in comment. NFC
llvm-svn: 304560
2017-06-02 16:21:13 +00:00
Craig Topper b23e7c78a5 [InstSimplify][ConstantFolding] Teach constant folding how to handle icmp null, (inttoptr x) as well as it handles icmp (inttoptr x), null
Summary:
The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify.

This patch adds support to ConstantFolding to detect the reversed case.

Reviewers: spatel, dberlin, majnemer, davide, joey

Reviewed By: joey

Subscribers: joey, llvm-commits

Differential Revision: https://reviews.llvm.org/D33801

llvm-svn: 304559
2017-06-02 16:17:32 +00:00
Sanjay Patel 4dc85eb75a [InstCombine] improve perf by not creating a known non-canonical instruction
Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp
an extra time to canonicalize it:

INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp sgt i8 -43, %x
IC: ADD:   %cmp = icmp sgt i8 -43, %x
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 -43, %x
IC: Mod =   %cmp = icmp sgt i8 -43, %x
    New =   %cmp = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp

If we create the swapped ICmp directly, we go faster:

INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp

llvm-svn: 304558
2017-06-02 16:11:14 +00:00
Alexander Timofeev 3f70b619a9 AMDGPUAnnotateUniformValue should always treat volatile loads as divergent
llvm-svn: 304554
2017-06-02 15:25:52 +00:00
Geoff Berry 57d8a417e7 [AArch64][Falkor] Model immediate forwarding.
llvm-svn: 304552
2017-06-02 14:27:41 +00:00
Mark Searles 70359ac60d [AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests.
-enable-si-insert-waitcnts=1 becomes the default
-enable-si-insert-waitcnts=0 to use old pass

Differential Revision: https://reviews.llvm.org/D33730

llvm-svn: 304551
2017-06-02 14:19:25 +00:00
Zoran Jovanovic 2aae0649a1 [mips][microMIPS] Extending size reduction pass with LBU16, LHU16, SB16 and SH16
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
LBU instruction is transformed into 16-bit instruction LBU16
LHU instruction is transformed into 16-bit instruction LHU16
SB instruction is transformed into 16-bit instruction SB16
SH instruction is transformed into 16-bit instruction SH16
Differential Revision: https://reviews.llvm.org/D33091

llvm-svn: 304550
2017-06-02 14:14:21 +00:00
Krzysztof Parzyszek 066e8b56a0 [Hexagon] Return 0 from getDotNewPredOp when .new opcode does not exist
This allows using this function to test if an instruction can be converted
to a .new form.

llvm-svn: 304549
2017-06-02 14:07:06 +00:00
Benjamin Kramer c1f5ae236c [OrderedBasicBlock] Return false for comesBefore(A, A)
So far it would return true for the first uncached query, then cached
queries return false.

llvm-svn: 304545
2017-06-02 13:10:31 +00:00
John Brawn 6671616cde [GlobalMerge] Don't merge globals that may be preempted
When a global may be preempted it needs to be accessed directly, instead of
indirectly through a MergedGlobals symbol, for the preemption to work.

This fixes PR33136.

Differential Revision: https://reviews.llvm.org/D33727

llvm-svn: 304537
2017-06-02 10:24:14 +00:00
Diana Picus e7aa90987d [ARM] GlobalISel: Support struct params/returns
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.

The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.

llvm-svn: 304536
2017-06-02 10:16:48 +00:00
Amaury Sechet 437f7060fe nits in TargetLowering.cpp . NFC
llvm-svn: 304532
2017-06-02 09:18:18 +00:00
Javed Absar 4ae7e81233 [ARM] Cortex-A57 scheduling model for ARM backend (AArch32)
This patch implements the Cortex-A57 scheduling model.
The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td.
Small changes in cpp,.h files to support required scheduling predicates.

Scheduling model implemented according to:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.

Patch by : Andrew Zhogin (submitted on his behalf, as requested).
Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.
Differential Revision: https://reviews.llvm.org/D28152

llvm-svn: 304530
2017-06-02 08:53:19 +00:00
Max Kazantsev 4d8748a987 [SelectionDAG] Get rid of recursion in findNonImmUse
The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D33775

llvm-svn: 304522
2017-06-02 07:11:00 +00:00
Gor Nishanov 053d2d24f7 [coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplit
Summary:
Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned.
Make sure we clean up orphaned coro.saves.  The bug manifested with a crash similar to this:

```
    llvm_unreachable("Unknown type!");
    llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false)
    llvm::EVT::getEVT
    llvm::TargetLoweringBase::getValueType
    llvm::ComputeValueVTs
    llvm::SelectionDAGBuilder::visitTargetIntrinsic
```

Reviewers: GorNishanov

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33817

llvm-svn: 304518
2017-06-02 02:18:36 +00:00
Xinliang David Li 621e8dcf1f [Profile] Enhance expect lowering to handle correlated branches
builtin_expect applied on && or || expressions were not
handled properly before. With this patch, the problem is fixed.

Differential Revision: http://reviews.llvm.org/D33164

llvm-svn: 304517
2017-06-02 02:09:31 +00:00
Teresa Johnson 7a27b132a8 [ThinLTO] Efficiency improvement when writing module path string table
Summary:
When writing the combined index, we are walking the entire module
path StringMap in the full index, and checking whether each one should be
included in the index being written. For distributed backends, where we
write an individual combined index for each file, each with only a few
module paths, this is incredibly inefficient. Add a method that takes
a callback and hides the details of whether we are writing the full
combined index, or just a slice, and in the latter case it walks the set
of modules to include instead of the entire index.

For a huge application with around 23K files (i.e. where we were iterating
through the 23K-entry modulePath StringMap 23K times), this change improved
the thin link time by a whopping 48%.

Reviewers: pcc

Subscribers: Prazek, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33813

llvm-svn: 304516
2017-06-02 01:56:02 +00:00
Philip Reames ae80045deb [RS4GC] Comment clarification
llvm-svn: 304514
2017-06-02 01:52:06 +00:00
Jacob Gravelle 26115924a2 Revert r304117 - WebAssembly object format isn't ready to be the default
Summary: Wasm object format has some functionality regressions from the ELF format, and doesn't play nicely with the rest of the toolchain. It should eventually be the default, but not yet.

Reviewers: sunfish, sbc100

Subscribers: jfb, dschuff, llvm-commits

Differential Revision: https://reviews.llvm.org/D33811

llvm-svn: 304512
2017-06-02 01:26:17 +00:00
Sam Clegg c38e947e50 [WebAssembly] MC: Fix references to undefined externals in data section
Undefined externals don't need to have a size or an offset.
This was broken by r303915.  Added a test for this case.

This fixes the "Compile LLVM Torture (o)" step on the wasm
waterfall.

Differential Revision: https://reviews.llvm.org/D33803

llvm-svn: 304505
2017-06-02 01:05:24 +00:00
Davide Italiano 1dd5558e52 [PM] GVNSink is off by default, fix an obvious typo.
llvm-svn: 304497
2017-06-01 23:47:53 +00:00
Eugene Zelenko 7ea692373c [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304495
2017-06-01 23:25:02 +00:00
Zachary Turner afb81a83a9 Fix 2 more -Wreorder warnings.
llvm-svn: 304494
2017-06-01 23:24:50 +00:00
Tim Shen 4e912aa5af [ThinLTO] Move -lto-use-new-pm to llvm-lto2, and change it to -use-new-pm.
Summary:
As we teach Clang to use ThinkLTO + new PM, it's good for the users to
inject through Config, instead of setting a flag in the LTOBackend
library. Move the flag to llvm-lto2.

As it moves to llvm-lto2, a new name -use-new-pm seems simpler and as
clear.

Reviewers: davide, tejohnson

Subscribers: mehdi_amini, Prazek, inglorion, eraman, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D33799

llvm-svn: 304492
2017-06-01 23:13:44 +00:00
Davide Italiano c368831580 Move GVNHoist to the right position in the new pass manager pipeline.
GVNHoist was moved as part of simplification passes for the current
pass manager (but not for the new), so they're out-of-sync.

Differential Revision:  https://reviews.llvm.org/D33806

llvm-svn: 304490
2017-06-01 23:08:14 +00:00
Xinliang David Li d6cfba2a02 Fix compiler_rt buildbot failure
llvm-svn: 304489
2017-06-01 23:05:11 +00:00
Keno Fischer fa635d730f Reapply "[Cloning] Take another pass at properly cloning debug info"
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.

llvm-svn: 304488
2017-06-01 23:02:12 +00:00
Zachary Turner ebd3ae8371 [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

llvm-svn: 304484
2017-06-01 21:52:41 +00:00
Yaxun Liu a618acf923 [AMDGPU] Fix kernel arg segment size for amdgizcl
Differential Revision: https://reviews.llvm.org/D33307

llvm-svn: 304482
2017-06-01 21:31:53 +00:00
Eli Friedman 0d823d610d Add opt-bisect support for region passes.
This is necessary to get opt-bisect working with polly.

Differential Revision: https://reviews.llvm.org/D33751

llvm-svn: 304476
2017-06-01 21:22:26 +00:00
Adrian Prantl d9cd4d52e3 DbgValueHistoryCalculator: Ignore call instructions that claim to clobber SP.
The AArch64 backend marks calls that involve aggregate function
arguments as having an implicit def of SP. We already have the same
workaround in LiveDebugValues and in DbgValueHistoryCalculator for SP
clobbers in register masks. This adds register defs to the list.

Fixes rdar://problem/30361929 and Swift SR-3851.

llvm-svn: 304471
2017-06-01 21:14:58 +00:00
Teresa Johnson 596b2e7ab2 [PGO] Adjust indirect call promotion threshold
Summary:
Reduce min percent required for indirect call promotion from 33% to 30%,
which matches gcc's threshold and catches the same hot opportunities.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33798

llvm-svn: 304469
2017-06-01 21:10:10 +00:00
Keno Fischer 3cdd4935cd [DIBuilder] Add a more fine-grained finalization method
Summary:
Clang wants to clone a function before it is done building the entire
compilation unit. As of now, there is no good way to do that, because
CloneFunction doesn't like dealing with temporary metadata. However,
as long as clang doesn't want to add any variables to this SP, it
should be fine to just prematurely finalize it. Add an API to allow this.

This is done in preparation of a clang commit to fix the assertion that
necessitated the revert of D33655.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33704

llvm-svn: 304467
2017-06-01 20:42:44 +00:00
Evgeniy Stepanov 56584bbf16 (NFC) Track global summary liveness in GVFlags.
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of
all the DeadSymbols sets. This is refactoring in order to make
liveness information available in the RegularLTO pipeline.

llvm-svn: 304466
2017-06-01 20:30:06 +00:00
Nirav Dave 4952871630 [SDAG] Fix CombineTo ordering in visitZERO_EXTEND and visitSIGN_EXTEND
Reorder CombineTo Calls to prevent references to stale/deleted SDNodes which caused undue assertions.

Reviewers: dbabokin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31625

llvm-svn: 304460
2017-06-01 19:33:50 +00:00
Xinliang David Li ee8d6acb1f [Profile] Fix builtin_expect lowering bug
The lowerer wrongly assumes the ICMP instruction 
 1) always has a constant operand;
 2) the operand has value 0.

It also assumes the expected value can only be one, thus
other values other than one will be considered 'zero'.

This leads to wrong profile annotation when other integer values
are used other than 0, 1 in the comparison or in the expect intrinsic.

Also missing is handling of equal predicate.

This patch fixes all the above problems.

Differential Revision: http://reviews.llvm.org/D33757

llvm-svn: 304453
2017-06-01 19:05:55 +00:00
Xinliang David Li 0a0acbcf78 [PartialInlining] Emit branch info and profile data as remarks
This allows us to collect profile statistics to tune static 
branch prediction.

Differential Revision: http://reviews.llvm.org/D33746

llvm-svn: 304452
2017-06-01 18:58:50 +00:00
Mandeep Singh Grang 33a1b73600 [PredicateInfo] Fix non-determinism in codegen uncovered by reverse iterating SmallPtrSet
Summary:
Sort OpsToRename before iterating to make iteration order deterministic.

Thanks to Daniel Berlin for the sorting logic.

Reviewers: dberlin, RKSimon, efriedma, davide

Reviewed By: dberlin, davide

Subscribers: sanjoy, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33265

llvm-svn: 304447
2017-06-01 18:36:24 +00:00
Adrian Prantl f4bc1f77b7 [DWARF] Introduce Dump Options
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.

Patch by Spyridoula Gravani!

Differential Revision: https://reviews.llvm.org/D33749

llvm-svn: 304446
2017-06-01 18:18:23 +00:00
Krzysztof Parzyszek 3cf16576d5 [Hexagon] Fix dependence check in the packetizer
An incorrect check in the packetizer lead to an attempt to convert
an unconditional branch to a .new (conditional) form.

llvm-svn: 304442
2017-06-01 18:02:40 +00:00
Krzysztof Parzyszek 51fd5405d5 [Hexagon] Handle long-running simplification loop in idiom recognition
The initial assumption was that the simplification would converge to a
fixed point relatvely quickly. Turns out that there are legitimate situa-
tions where the complexity of the code causes it to take a large number
of iterations.

Two main changes:
- Instead of aborting upon hitting the limit, simply return nullptr.
- Reduce the limit to 10,000 from 100,000.

llvm-svn: 304441
2017-06-01 18:00:47 +00:00
Amaury Sechet 2adb7bdbca Remove ADDC, ADDE, SUBC, SUBE and SETCCE support from the X86 backend, use the CARRY ops instead.
Summary:
As per title. This cleanup some technical debt.

Depends on D33374

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33390

llvm-svn: 304435
2017-06-01 16:33:08 +00:00
Matt Arsenault 3416b8c874 AMDGPU: Remove error on call in AsmPrinter
Partial revert of r301938 which is making it harder
to split patches up.

llvm-svn: 304418
2017-06-01 15:05:15 +00:00
Matt Arsenault b083570532 DAG: Remove pointless type check
These are only integer operations.

llvm-svn: 304417
2017-06-01 14:49:46 +00:00
Matt Arsenault 50f43e4168 AMDGPU: Set high getCSRFirstUseCost
llvm-svn: 304416
2017-06-01 14:38:02 +00:00
Florian Hahn fca7b8348f [ARM] Create relocations for Thumb functions calling ARM fns in ELF.
Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.

Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls

Reviewed By: peter.smith

Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33436

llvm-svn: 304413
2017-06-01 13:50:57 +00:00
Kamil Rytarowski 07c81b1856 [Solaris] Fix PR33228 - llvm::sys::fs::is_local_impl done right
Summary:
Solaris-specific implementation for llvm::sys::fs::is_local_impl.
FStype pattern matching might be a bit unreliable, but at least it fixes the build failure.



Reviewers: mgorny, nlopes, llvm-commits, krytarowski

Reviewed By: krytarowski

Subscribers: voskresensky.vladimir, krytarowski

Differential Revision: https://reviews.llvm.org/D33695

llvm-svn: 304412
2017-06-01 12:57:00 +00:00
Amaury Sechet c84cc230b3 Only generate addcarry node when it is legal.
Summary:
This is a problem uncovered by stage2 testing. ADDCARRY end up being generated on target that do not support it.

The patch that introduced the problem has other patches layed on top of it, so we want to fix the issue rather than revert it to avoid creating a lor of churn.

A regression test will be added shortly, but this is committed as this in order to get the build back to green promptly.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33770

llvm-svn: 304409
2017-06-01 12:03:16 +00:00
Chandler Carruth 8b3be4e59d [PM/ThinLTO] Port the ThinLTO pipeline (both components) to the new PM.
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.

This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.

I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:

1) I sunk the PGO indirect call promotion to only be run when we have
   PGO enabled (or as part of the special ThinLTO pipeline).

2) I made the extra GlobalOpt run in ThinLTO just happen all the time
   and at a slightly more powerful place (before we remove available
   externaly functions). This seems like general goodness and not a big
   compile time sink, so it didn't make sense to *only* use it in
   ThinLTO. Fewer differences in the pipeline makes everything simpler
   IMO.

3) I hoisted the ThinLTO stop point pre-link above the the RPO function
   attr inference. The RPO inference won't infer anything terribly
   meaningful pre-link (recursiveness?) so it didn't make a lot of
   sense. But if the placement of RPO inference starts to matter, we
   should move it to the canonicalization phase anyways which seems like
   a better place for it (and there is a FIXME to this effect!). But
   that seemed a bridge too far for this patch.

If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.

I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.

Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.

Differential Revision: https://reviews.llvm.org/D33540

llvm-svn: 304407
2017-06-01 11:39:39 +00:00
Zvi Rackover 7693733e80 [X86] Match bitcast of vxi1 to pmovmsk
Summary:
Add an early combine to match patterns such as:
  (i16 bitcast (v16i1 x))
  ->
  (i16 movmsk (v16i8 sext (v16i1 x)))

This combine needs to happen early enough before
type-legalization scalarizes the result of the setcc.

Reviewers: igorb, craig.topper, RKSimon

Subscribers: delena, llvm-commits

Differential Revision: https://reviews.llvm.org/D33311

llvm-svn: 304406
2017-06-01 11:27:57 +00:00
Amaury Sechet 251ea8a4f8 Do not legalize large setcc with setcce, introduce setcccarry and do it with usubo/setcccarry.
Summary:
This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process.

This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33374

llvm-svn: 304404
2017-06-01 11:14:17 +00:00
Amaury Sechet 6506a90a70 Remove ISD::SETCC match from combineX86ADD. It's done improperly and doesn't work.
llvm-svn: 304403
2017-06-01 11:13:10 +00:00
Amaury Sechet 9c5d1e966b [DAGCombine] Refactor common addcarry pattern.
Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32756

llvm-svn: 304402
2017-06-01 10:48:04 +00:00
Amaury Sechet 2e43cb6d03 [DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry)
Summary:
This enables further transforms.

Depends on D32916

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32925

llvm-svn: 304401
2017-06-01 10:42:39 +00:00
Craig Topper f441226085 [TableGen] Remove RecordVal constructor that takes a StringRef and Record::setName(StringRef). Leave just the versions that take an Init.
They weren't used often enough to justify having two different interfaces. Push the responsiblity of creating a StringInit up to the caller.

llvm-svn: 304388
2017-06-01 06:56:16 +00:00
Tim Shen 6b41141863 [ThinLTO] Migrate ThinLTOBitcodeWriter to the new PM.
Summary: Also see D33429 for other ThinLTO + New PM related changes.

Reviewers: davide, chandlerc, tejohnson

Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D33525

llvm-svn: 304378
2017-06-01 01:02:12 +00:00
Xinliang David Li 32c5e809be [PartialInlining] Reduce outlining overhead by removing unneeded live-out(s)
Differential Revision: http://reviews.llvm.org/D33694

llvm-svn: 304375
2017-06-01 00:12:41 +00:00
Dehao Chen 6b737ddce7 Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

llvm-svn: 304371
2017-05-31 23:25:25 +00:00
Reid Kleckner fc7ba565ed [EH] Recognize __(gxx|gcc)_personality_seh0 as the GNU EH personalities
These are no-ops when there are no invokes. We don't need to emit LSDAs
for them.

Fixes PR33220.

llvm-svn: 304367
2017-05-31 22:35:52 +00:00
Matthias Braun 605f779516 ImplicitNullChecks: Clear kill/dead flags when moving instructions around
The values are marked as livein in the successor blocks so marking them
as killed or dead was wrong.

llvm-svn: 304366
2017-05-31 22:23:08 +00:00
Reid Kleckner 57ac61e005 Check hasPersonalityFn before calling getPersonalityFn
llvm-svn: 304365
2017-05-31 22:21:20 +00:00
Reid Kleckner c2f1bbfe4f [EH] Fix the LSDA that we emit for unknown EH personalities
We should have a single call site entry with no landing pad. This
indicates that no EH action should be taken and the unwinder should
unwind to the next frame.

We currently don't recognize __gxx_personality_seh0 as a known
personality, so we forcibly emit a table, and that table was wrong. This
was filed as PR33220. Now we emit a correct table for that personality.
The next step is to recognize that we can completely skip the table for
this personality.

llvm-svn: 304363
2017-05-31 22:18:49 +00:00
Steven Wu 97e2cf87e1 [MachOObject] Fix bind opcode parser error on valid opcode sequence
BIND_OPCODE_SET_DYLIB_SPECIAL_IMM(0) is a valid way to setp library
ordinal. MachOObject should set LibraryOrdinalSet even when IMM is zero.

llvm-svn: 304362
2017-05-31 22:17:43 +00:00
Galina Kistanova 244621faad Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304361
2017-05-31 22:16:24 +00:00
Galina Kistanova 8514dd540d Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304358
2017-05-31 22:09:46 +00:00
Galina Kistanova 0b69e363f6 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304356
2017-05-31 22:02:05 +00:00
Galina Kistanova c752c4bf56 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304355
2017-05-31 21:50:45 +00:00
Wei Mi 0bd3f41588 Revert rL304050. It may break sanitizer bootstrap. Revert it for now while investigating.
llvm-svn: 304350
2017-05-31 21:29:33 +00:00
Matthias Braun e2e65911a2 Try to fix buildbots
It seems not all of our bots have a std::vector::erase() taking a
const_iterator (even though that seems to be part of C++11) attempt to
workaround.

llvm-svn: 304349
2017-05-31 21:25:03 +00:00
Matthias Braun ac4beccaca X86FloatingPoint: Fix livein lists
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
  we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
  uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
   the FP registers still being present in the livein list.)

llvm-svn: 304342
2017-05-31 20:30:22 +00:00
Matthias Braun 43692a2245 X86FloatingPoint: Add some static assert, cleanup; NFC
llvm-svn: 304341
2017-05-31 20:30:17 +00:00
Galina Kistanova c2b642d009 Added missing break; added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304340
2017-05-31 20:25:13 +00:00
Kostya Serebryany 2e98c045cb [libFuzzer] fix a test to match the new sanitizer run-time
llvm-svn: 304333
2017-05-31 19:47:11 +00:00
Galina Kistanova b2c0116e71 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304332
2017-05-31 19:41:33 +00:00
Reid Kleckner 5fbdd17714 [IR] Add additional addParamAttr/removeParamAttr to AttributeList API
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern to access these attributes to within the
AttributeList.

Patch by Daniel Neilson!

Reviewers: rnk, chandlerc, pete, javed.absar, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33355

llvm-svn: 304329
2017-05-31 19:23:09 +00:00
Craig Topper 2b8419a22d [TableGen] Make Record::getValueAsString and getValueAsListOfStrings return StringRefs instead of std::string
Internally both these methods just return the result of getValue on either a StringInit or a CodeInit object. In both cases this returns a StringRef pointing to a string allocated in the BumpPtrAllocator so its not going anywhere. So we can just pass that StringRef along.

This is a fairly naive patch that targets just the build failures caused by this change. There's additional work that can be done to avoid creating std::string at call sites that still think getValueAsString returns a std::string. I'll try to clean those up in future patches.

Differential Revision: https://reviews.llvm.org/D33710

llvm-svn: 304325
2017-05-31 19:01:11 +00:00
Craig Topper fa5dc09292 [BPF] Correct the file name of the -gen-asm-matcher output file to not start with X86.
llvm-svn: 304324
2017-05-31 19:01:05 +00:00
Teresa Johnson a6a3fb57a1 [ThinLTO] Reduce unnecessary map lookups during combined summary write
Summary:
Don't assign values to undefined references, simply don't emit those
reference edges as they are not useful (we were already not emitting
call edges to undefined refs).

Also, streamline the later lookup of value ids when writing the
summaries, by combining the check for value id existence with the access
of that value id.

Reviewers: pcc

Subscribers: Prazek, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D33634

llvm-svn: 304323
2017-05-31 18:58:11 +00:00
Nirav Dave 3424373f30 [ScheduleDAG] Deal with already scheduled loads in ScheduleDAG.
Summary:
If we attempt to unfold an SUnit in ScheduleDAG that results in
finding an already scheduled load, we must should abort the
unfold as it will not improve scheduling.

This fixes PR32610.

Reviewers: jmolloy, sunfish, bogner, spatel

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D32911

llvm-svn: 304321
2017-05-31 18:43:17 +00:00
Matthias Braun d6a36ae282 TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Kostya Serebryany 53b34c8443 [sanitizer-coverage] remove stale code (old coverage); llvm part
llvm-svn: 304319
2017-05-31 18:27:33 +00:00
Sean Fertile 457ddd311a [PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.

Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656

llvm-svn: 304317
2017-05-31 18:20:17 +00:00
Anna Thomas 777bb90bdc Revert "[Atomics][LoopIdiom] Recognize unordered atomic memcpy"
This reverts commit r304310.

It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.

llvm-svn: 304315
2017-05-31 17:20:51 +00:00
Zaara Syeda 3a7578c658 [PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
2017-05-31 17:12:38 +00:00
Galina Kistanova 6ad77845e2 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304312
2017-05-31 17:10:03 +00:00
Mark Searles 11d0a04050 [AMDGPU] Fix bugs in new waitcnt pass. Add test.
- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
- fix handling of PERMUTE ops
- fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass )
- add new test

  Differential Revision: https://reviews.llvm.org/D33114

llvm-svn: 304311
2017-05-31 16:44:23 +00:00
Anna Thomas 056c009f1b [Atomics][LoopIdiom] Recognize unordered atomic memcpy
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background:  http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

Patch by Daniel Neilson!

Reviewers: reames, anna, skatkov

Reviewed By: reames

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33243

llvm-svn: 304310
2017-05-31 16:39:52 +00:00
Dmitry Preobrazhensky 793c592652 [AMDGPU][MC] New syntax for ds_swizzle_b32 offset
See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33542

llvm-svn: 304309
2017-05-31 16:26:47 +00:00
Florian Hahn ff25b6d8f6 [AArch64] Enable FeatureFuseAES on Cortex-A53.
It improves performance on Cortex-A53.

llvm-svn: 304307
2017-05-31 15:50:03 +00:00
Florian Hahn 064a2f9222 [AArch64] Enable FeatureFuseAES on Cortex-A73.
It improves performance on Cortex-A73.

llvm-svn: 304304
2017-05-31 15:25:25 +00:00
Reid Kleckner 1d7cbdfc3d Fix assertion when merging multiple empty AttributeLists
Patch by Nicholas Wilson!

Differential Revision: https://reviews.llvm.org/D33627

llvm-svn: 304300
2017-05-31 14:24:06 +00:00
Nirav Dave 7c70fddba6 [DAG] Avoid use of stale store.
Correct references to alignment of store which may be deleted in a
previous iteration of merge. Instead use first store that would be
merged.

Corrects pr33172's use-after-poison caught by ASan.

Reviewers: spatel, hfinkel, RKSimon

Reviewed By: RKSimon

Subscribers: thegameg, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33686

llvm-svn: 304299
2017-05-31 13:36:17 +00:00
Tony Jiang 60c247de18 [PowerPC] Fix a performance bug for PPC::XXPERMDI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI
Instruction, this patch recognizes them and does the selection to improve
the PPC performance.

Differential Revision: https://reviews.llvm.org/D33404

llvm-svn: 304298
2017-05-31 13:09:57 +00:00
Nemanja Ivanovic accab033c9 [PowerPC] Eliminate integer compare instructions - vol. 3
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for the 64-bit SETEQ patterns.

Differential Revision: https://reviews.llvm.org/D33369

llvm-svn: 304286
2017-05-31 08:04:07 +00:00
Dylan McKay 043fa4b3d6 [AVR] Fix a big in shift operator lowering; Authored by Dr. Gergo Erdi
When generating code for a shift loop, check the shift
 amount against the literal value 0, not R0

llvm-svn: 304284
2017-05-31 06:27:46 +00:00
Dylan McKay 48614d4a2c [AVR] CPIRdK can only work with r16..r31; Authored by Dr. Gergo Erdi
(https://github.com/avr-rust/rust/issues/50)

llvm-svn: 304283
2017-05-31 06:10:59 +00:00
Nemanja Ivanovic e597bd8230 [PowerPC] Eliminate integer compare instructions - vol. 2
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for bitwise logical operations in general purpose registers.
The idea is to keep the values in GPRs as long as possible - only
extracting them to a condition register bit when no further operations
are to be done.

Differential Revision: https://reviews.llvm.org/D31851

llvm-svn: 304282
2017-05-31 05:40:25 +00:00
Craig Topper 01197f686f [TableGen] Make one of RecordVal's constructors delegate to the other to reduce duplicate code.
llvm-svn: 304280
2017-05-31 05:12:33 +00:00
Zachary Turner 1b88f4f33a [ObjectYAML] Split CodeViewYAML into 3 pieces.
The code was a mess and disorganized due to the sheer amount
of it being in one file.  So I'm splitting this into three files.
One for CodeView types, one for CodeView symbols, and one for
CodeView debug subsections.  NFC.

llvm-svn: 304278
2017-05-31 04:17:13 +00:00
Gor Nishanov 2bc782d8da [coroutines] Call initializePass in coroutine pass constructors
Summary:

Fixes: https://bugs.llvm.org/show_bug.cgi?id=33226

Reviewers: chandlerc, davide, majnemer, dblaikie

Reviewed By: chandlerc

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33701

llvm-svn: 304277
2017-05-31 03:12:42 +00:00
George Burgess IV 0a7b989036 [CFLAA] Add missing break; note things are broken.
Thanks to Galina Kistanova for finding the missing break!

When trying to make a test for this, I realized our logic for handling
extractvalue/insertvalue/... is somewhat broken. This makes constructing
a test-case for this missing break nontrivial.

llvm-svn: 304275
2017-05-31 02:35:26 +00:00
Matthias Braun bcd4c68233 X86FrameLowering: No need to mark FP as live-in everywhere
The frame pointer (when used as frame pointer) is a reserved register.
We do not track liveness of reserved registers and hence do not need to
add them to the basic block livein lists.

llvm-svn: 304274
2017-05-31 02:11:10 +00:00
Daniel Berlin be3e7ba45e NewGVN: Fix PR 33185 by checking whether we need to recursively
generate a phi of ops, which we don't currently support.

llvm-svn: 304272
2017-05-31 01:47:32 +00:00
Daniel Berlin 71ff663e1b InstructionSimplify: Remove now-redundant reachability tests, as dominates() already does them
llvm-svn: 304270
2017-05-31 01:47:24 +00:00
Matthias Braun 05eeadbfd1 ARM: Fix cmpxchg O0 expansion
This is the equivalent of r304048 for ARM:

- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304267
2017-05-31 01:21:35 +00:00
Matthias Braun 0dba4e3509 ARM: Do not add reserved registers to block livein lists; NFC
llvm-svn: 304266
2017-05-31 01:21:30 +00:00
Eugene Zelenko 4e9736b1c9 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304265
2017-05-31 01:10:10 +00:00
Zachary Turner 083342bd34 [ObjectYAML] Clean up the CodeView headers a bit.
CodeViewYAML.h attempts to hide the details of many of the
CodeView yaml structures and types, but at the same time it
exposes the mapping traits for them to external users of the
header.

This patch just hides these in the implementation files so that
the interface is kept as simple as possible.

llvm-svn: 304263
2017-05-31 01:08:36 +00:00
Abderrazek Zaafrani 855411566b Add latency info for Exynos interleaved Load/Store instructions.
llvm-svn: 304259
2017-05-31 00:20:55 +00:00
Zachary Turner 7a75bc05b7 Try to fix build again.
llvm-svn: 304257
2017-05-30 23:57:46 +00:00
Zachary Turner 1e4d3693c4 [CodeView] Move CodeView symbol yaml logic to ObjectYAML.
This continues the effort to get the CodeView YAML parsing logic
into ObjectYAML.  After this patch, the only missing piece will
be the CodeView debug symbol subsections.

llvm-svn: 304256
2017-05-30 23:50:44 +00:00
Eric Beckmann 025e82bac1 Fix bug on Big-Endian system, due to reference to vector out of scope.
llvm-svn: 304255
2017-05-30 23:10:57 +00:00
Matthias Braun bc09894d6a MachineInstr: Do not skip dead def operands when printing.
This was introduced a long time ago in r86583 when regmask operands
didn't exist. Nowadays the behavior hurts more than it helps. This
removes it.

llvm-svn: 304254
2017-05-30 23:09:21 +00:00
Eric Beckmann ba395ef491 This patch should fix various clang warnings and a use of to_string
which isn't support before c++11.

llvm-svn: 304252
2017-05-30 22:29:06 +00:00
Tim Shen 0bd0aa8f07 [AntiDepBreaker] Revert r299124 and add a test.
Summary:
AntiDepBreaker intends to add all live-outs, including the implicit
CSRs, in StartBlock. r299124 was done without understanding that
intention.

Now with the live-ins propagated correctly (D32464), we can revert this change.

Reviewers: MatzeB, qcolombet

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33697

llvm-svn: 304251
2017-05-30 22:26:52 +00:00
Zachary Turner 9c1ba225a9 Try to fix build.
llvm-svn: 304249
2017-05-30 22:00:37 +00:00
Zachary Turner d427383cb8 [CodeView] Move CodeView YAML code to ObjectYAML.
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files.  Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.

By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.

This patch only adds support for CodeView type information
to ObjectYAML.  Subsequent patches will add support for
CodeView symbol information.

llvm-svn: 304248
2017-05-30 21:53:05 +00:00
Matthias Braun 5e394c3d6f TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.

While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.

llvm-svn: 304247
2017-05-30 21:36:41 +00:00
Tim Northover fb26d9a286 MIR: remove explicit "noVRegs" property.
We can infer this from the incoming MIR, so there's no reason to
represent it with a special flag.

llvm-svn: 304246
2017-05-30 21:28:57 +00:00
Xinliang David Li 74480adafd [PartialInlining] Shrinkwrap allocas with live range contained in outline region.
Differential Revision: http://reviews.llvm.org/D33618

llvm-svn: 304245
2017-05-30 21:22:18 +00:00
Quentin Colombet 73141d5b4b [Localizer] Don't trick to be smart for the insertion point
There is no guarantee that the first use of a constant that is traversed
is actually the first in the related basic block. Thus, if we use that
as the insertion point we may end up with definitions that don't
dominate there use.

llvm-svn: 304244
2017-05-30 20:53:06 +00:00
Matthew Simpson 646475a9bc [LV] Reapply r303763 with fix for PR33193
r303763 caused build failures in some out-of-tree tests due to an assertion in
TTI. The original patch updated cost estimates for induction variable update
instructions marked for scalarization. However, it didn't consider that the
incoming value of an induction variable phi node could be a cast instruction.
This caused queries for cast instruction costs with a mix of vector and scalar
types. This patch includes a fix for cast instructions and the test case from
PR33193.

The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>.

Reference: https://bugs.llvm.org/show_bug.cgi?id=33193
Original Differential Revision: https://reviews.llvm.org/D33457

llvm-svn: 304235
2017-05-30 19:55:57 +00:00
Benjamin Kramer c69fe9cc62 [Object] Remove unused field + constructor.
llvm-svn: 304233
2017-05-30 19:37:02 +00:00
Benjamin Kramer 14ea122e6e [Object] Fix pessimizing move.
Returning the Error by value triggers copy elision, the move is more
expensive. Clang rightfully warns about it.

llvm-svn: 304232
2017-05-30 19:36:58 +00:00
Vedant Kumar 87aefe9042 Revert "This patch closes PR28513: an optimization of multiplication by different constants. It's implemented on DAG combiner level."
This reverts commit r304209.

I think this change is responsible for a tablgen failure in stage2 builds:

  http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/2171/

I reproduced the failure locally (without ThinLTO), reverted the commit, rebuilt the stage1 clang, rebuilt the stage2 llvm-tblgen tool, and found that the crash disappears when the commit is reverted. Here is the stack trace:

FAILED: lib/Target/ARM/ARMGenRegisterBank.inc.tmp
cd /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM && /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes
/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
0  llvm-tblgen              0x0000000106fc9568 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  llvm-tblgen              0x0000000106fc9be6 SignalHandler(int) + 422
2  libsystem_platform.dylib 0x00000001076a7fba _sigtramp + 26
3  libsystem_platform.dylib 0x00007fff58deb468 _sigtramp + 1366570184
4  llvm-tblgen              0x0000000106e89cc7 llvm::CodeGenRegBank::getCompositeSubRegIndex(llvm::CodeGenSubRegIndex*, llvm::CodeGenSubRegIndex*) + 615
5  llvm-tblgen              0x0000000106e88be6 llvm::CodeGenRegister::computeSubRegs(llvm::CodeGenRegBank&) + 2182
6  llvm-tblgen              0x0000000106e8e9f0 llvm::CodeGenRegBank::CodeGenRegBank(llvm::RecordKeeper&) + 2192
7  llvm-tblgen              0x0000000106f384a1 llvm::EmitRegisterBank(llvm::RecordKeeper&, llvm::raw_ostream&) + 65
8  llvm-tblgen              0x0000000106f72c64 (anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&) + 1172
9  llvm-tblgen              0x0000000106fcb15f llvm::TableGenMain(char*, bool (*)(llvm::raw_ostream&, llvm::RecordKeeper&)) + 3599
10 llvm-tblgen              0x0000000106f727a6 main + 134
11 libdyld.dylib            0x000000010733c6a5 start + 1
Stack dump:
0.      Program arguments: /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
/bin/sh: line 1: 41986 Segmentation fault: 11  /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz
-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp

llvm-svn: 304231
2017-05-30 19:25:22 +00:00
Galina Kistanova 8c1e2f9108 Added missing break.
llvm-svn: 304230
2017-05-30 19:02:49 +00:00
Keno Fischer 3fa5db4c04 Revert "[Cloning] Take another pass at properly cloning debug info"
At least one build bot is complaining. Will investigate after lunch.

llvm-svn: 304228
2017-05-30 18:56:26 +00:00
Matthias Braun 700603555a ARM: Add missing flags to TBB_[JH]T pseudo instructions
NFC except for calming down the machine verifier in some cases.

llvm-svn: 304227
2017-05-30 18:52:33 +00:00
Keno Fischer 945dc1d2d1 [Cloning] Take another pass at properly cloning debug info
Summary:
In rL302576, DISubprograms gained the constraint that a !dbg attachments to functions must
have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support
was adjusted to attempt to enforce this invariant during cloning. However, there
were several problems with the implementation. Part of these were fixed in rL304079.
However, there was a more fundamental problem with these changes, namely that it
bypasses the matadata value map, causing the cloned metadata to be a mix of metadata
pointing to the new suprogram (where manual code was added to fix those up) and the
old suprogram (where this was not the case). This mismatch could cause a number of
different assertion failures in the DWARF emitter. Some of these are given at
https://github.com/JuliaLang/julia/issues/22069, but some others have been observed
as well. Attempt to rectify this by partially reverting the manual DI metadata fixup,
and instead using the standard value map approach. To retain the desired semantics
of not duplicating the compilation unit and inlined subprograms, explicitly freeze
these in the value map.

Reviewers: dblaikie, aprantl, GorNishanov, echristo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33655

llvm-svn: 304226
2017-05-30 18:28:30 +00:00
Eric Beckmann 72fb6a87fb Adding parsing ability for .res file.
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33566

llvm-svn: 304225
2017-05-30 18:19:06 +00:00
Krzysztof Parzyszek ef58017b35 [Hexagon] Improve code generation for 32x32-bit multiplication
For multiplications of 64-bit values (giving 64-bit result), detect
cases where the arguments are sign-extended 32-bit values, on a per-
operand basis. This will allow few patterns to match a wider variety
of combinations in which extensions can occur.

llvm-svn: 304223
2017-05-30 17:47:51 +00:00
Zachary Turner 591312c5c1 [CodeView] Add more DebugSubsection implementations.
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.

llvm-svn: 304222
2017-05-30 17:13:33 +00:00
Craig Topper 5fd588be34 [SelectionDAG] Remove special case for ISD::FPOWI from the strict FP intrinsic handling.
This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary.

llvm-svn: 304221
2017-05-30 17:12:18 +00:00
Stanislav Mekhanoshin 56ea488d8b [AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.

Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.

Differential Revision: https://reviews.llvm.org/D33583

llvm-svn: 304219
2017-05-30 16:49:24 +00:00
Zachary Turner 8c099fe06e [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

llvm-svn: 304218
2017-05-30 16:36:15 +00:00
Mark Searles 00ce96f6ee [AMDGPU] Require waitcnt before barrier for all targets; adjust tests.
Differential Revision: https://reviews.llvm.org/D33576

llvm-svn: 304217
2017-05-30 16:22:43 +00:00
Craig Topper f6d4dc5b4a [SelectionDAG] Set ISD::FPOWI to Expand by default
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
2017-05-30 15:27:55 +00:00
Andrew V. Tischenko 8b04826663 This patch closes PR28513: an optimization of multiplication by different constants.
It's implemented on DAG combiner level.

llvm-svn: 304209
2017-05-30 13:00:44 +00:00
Max Kazantsev d8fe3eb9cb [SCEV][NFC] Remove redundant params from isAvailableAtLoopEntry
Params DT and LI are redundant, because these values are contained in fields anyways.

Differential Revision: https://reviews.llvm.org/D33668

llvm-svn: 304204
2017-05-30 10:54:58 +00:00
Ulrich Weigand 3f484e68cc [SystemZ] Add decimal floating-point instructions
This adds assembler / disassembler support for the decimal
floating-point instructions.  Since LLVM does not yet have
support for decimal float types, these cannot be used for
codegen at this point.

llvm-svn: 304203
2017-05-30 10:15:16 +00:00
Ulrich Weigand f32adf6944 [SystemZ] Add hexadecimal floating-point instructions
This adds assembler / disassembler support for the hexadecimal
floating-point instructions.  Since the Linux ABI does not use
any hex float data types, these are not useful for codegen.

llvm-svn: 304202
2017-05-30 10:13:23 +00:00
Zoran Jovanovic 375b60de74 [mips] Expansion of LI.S and LI.D
Author: smaksimovic
Reviewers: dsanders sdardis
Introduces LI.S and LI.D pseudo instructions with floating point operands.
Differential Revision: https://reviews.llvm.org/D14390

llvm-svn: 304198
2017-05-30 09:33:43 +00:00
Kristof Beyls 2af1e90eb2 Fix PR33031: correct the estimate of maximum offset for instructions spilling/filling the stack.
llvm-svn: 304196
2017-05-30 06:58:41 +00:00
Daniel Berlin 2aa5dc1589 NewGVN: Compute hash value of expression on demand and use it in inequality testing.
llvm-svn: 304195
2017-05-30 06:58:18 +00:00
Daniel Berlin c8ed40400c NewGVN: Fix PR33194, memory corruption by putting temporary instructions in tables sometimes.
llvm-svn: 304194
2017-05-30 06:42:29 +00:00
Galina Kistanova 5c4f1a9b02 Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304187
2017-05-30 03:30:34 +00:00
Joerg Sonnenberger 9375a25342 Revert r303763, results in asserts i.e. while building Ruby.
llvm-svn: 304179
2017-05-29 22:52:17 +00:00
Craig Topper 638b1021bf [TableGen] Use StringMap instead of DenseMap<StringRef> to unique CodeInit and StringInit objects. Override the allocator to keep using the BumpPtrAllocator. NFCI
StringMap is better suited to mapping strings than a DenseMap.

llvm-svn: 304178
2017-05-29 21:49:37 +00:00
Craig Topper 481ff7087f [TableGen] Introduce DagInit::getArgs that returns an ArrayRef. Use it to fix 80 column violations in arg_begin/arg_end. Remove DagInit::args and use getArgs instead. NFC
llvm-svn: 304177
2017-05-29 21:49:34 +00:00
Benjamin Kramer 74de08031f [ManagedStatic] Avoid putting function pointers in template args.
This is super awkward, but GCC doesn't let us have template visible when
an argument is an inline function and -fvisibility-inlines-hidden is
used.

llvm-svn: 304175
2017-05-29 20:56:27 +00:00
Davide Italiano af66659d6b [GlobalIsel] Fix a warning with GCC 7 -Wpedantic. NFCI.
llvm-svn: 304174
2017-05-29 20:13:22 +00:00
Benjamin Kramer 2a441a52df Try to work around MSVC being buggy. Attempt #1.
error C2971: 'llvm::ManagedStatic': template parameter 'Creator': 'CreateDefaultTimerGroup': a variable with non-static storage duration cannot be used as a non-type argument

llvm-svn: 304157
2017-05-29 14:28:04 +00:00
Benjamin Kramer 351779e972 [Timer] Move DefaultTimerGroup into a ManagedStatic.
This used to be just leaked. r295370 made it use magic statics. This adds
a global destructor, which is something we'd like to avoid. It also creates
a weird situation where the mutex used by TimerGroup is re-created during
global shutdown and leaked.

Using a ManagedStatic here is also subtle as it relies on the mutex
inside of ManagedStatic to be recursive. I've added a test for that
in a previous change.

llvm-svn: 304156
2017-05-29 14:05:29 +00:00
Sanjay Patel 51152a3727 [DAGCombiner] fix load narrowing transform to exclude loads with extension
The extending load possibility was missed in:
https://reviews.llvm.org/rL304072

We might want to handle this cases as a follow-up, but bailing out for now
to avoid miscompiling.

llvm-svn: 304153
2017-05-29 13:24:58 +00:00
Jonas Paulsson fe0c0935c8 [SystemZ] Improve buildVector() in SystemZISelLowering.cpp.
Use VLREP when inserting one or more loads into a vector. This is more
efficient than to first load and then use a VLVGP.

Review: Ulrich Weigand
llvm-svn: 304152
2017-05-29 13:22:23 +00:00
Nikolai Bozhenov 82f0801c1b [Nios2] Target registration
Reviewers: craig.topper, hfinkel, joerg, lattner, zvi

Reviewed By: craig.topper

Subscribers: oren_ben_simhon, igorb, belickim, tvvikram, mgorny, llvm-commits, pavel.v.chupin, DavidKreitzer

Differential Revision: https://reviews.llvm.org/D32669
Patch by AndreiGrischenko <andrei.l.grischenko@intel.com>

llvm-svn: 304144
2017-05-29 09:48:30 +00:00
Diana Picus 0c05cce4e0 [ARM] GlobalISel: Extract helper. NFCI.
Create a helper to deal with the common code for merging incoming values
together after they've been split during call lowering. There's likely
more stuff that can be commoned up here, but we'll leave that for later.

llvm-svn: 304143
2017-05-29 09:09:54 +00:00
Hiroshi Inoue ac9cd3080d [trivial] fix a typo in comment, NFC
llvm-svn: 304139
2017-05-29 08:37:42 +00:00
Diana Picus bf4aed2c38 [ARM] GlobalISel: Support array returns
These are a bit rare in practice, but they don't require anything
special compared to array parameters, so support them as well.

llvm-svn: 304137
2017-05-29 08:19:19 +00:00
Hiroshi Inoue e3c14ebbfa [PPC] Fix assertion failure during binary encoding with -mcpu=pwr9
Summary
clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding.
The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate.

This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test).
The fix is from Nemanja Ivanovic @nemanjai.

Differential Revision: https://reviews.llvm.org/D33482

llvm-svn: 304133
2017-05-29 07:12:39 +00:00
Diana Picus 8cca8cb0ce [ARM] GlobalISel: Support array parameters/arguments
Clang coerces structs into arrays, so it's a good idea to support them.
Most of the support boils down to getting the splitToValueTypes helper
to actually split types. We then use G_INSERT/G_EXTRACT to deal with the
parts.

llvm-svn: 304132
2017-05-29 07:01:52 +00:00
Mehdi Amini 96ab48f9da DebugInfo: Include .dwo file name when hashing multiple CUs in a single file
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.

An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.

But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)

Reapply of r304119 after adding a triple to the test and moving it
to the X86 directory.

llvm-svn: 304130
2017-05-29 06:32:34 +00:00
Mehdi Amini 4181205563 DebugInfo: Omit an empty CU when a subprogram was moved into its use
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.

Reapply of r304111 after adding a triple to the test and moving it
to the X86 directory.

llvm-svn: 304129
2017-05-29 06:25:30 +00:00
Tobias Grosser 8cf785f6b1 Revert "[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle"
The reverted change introdued assertions ala:

"MachineBasicBlock::succ_iterator
llvm::MachineBasicBlock::removeSuccessor(succ_iterator, bool): Assertion
`I != Successors.end() && "Not a current successor!"'

Mikael, the original committer, wrote me that he is working on a fix, but that
it likely will take some time to get this resolved. As this bug is one of the
last two issues that keep the AOSP buildbot from turning green, I revert the
original commit r302876.

I am looking forward to see this recommitted after the assertion has been
resolved.

llvm-svn: 304128
2017-05-29 06:12:18 +00:00
Mehdi Amini e161ced16a Revert "DebugInfo: Omit an empty CU when a subprogram was moved into its use"
This reverts commit r304111.
GreenDragon is broken.

llvm-svn: 304126
2017-05-29 05:17:57 +00:00
Mehdi Amini d8056bb7d8 Revert "DebugInfo: Include .dwo file name when hashing multiple CUs in a single file"
This reverts commit r304119 and r304118. GreenDragon is broken.

llvm-svn: 304125
2017-05-29 05:17:54 +00:00
Zachary Turner df1832cf86 Resubmit "[X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables."
This was reverted due to buildbot breakages and I was not familiar
with this code to investigate it.  But while trying to get a
useful backtrace for the author, it turns out the fix was very
obvious.  Resubmitting this patch as is, and will submit the
fix in a followup so that the fix is not hidden in the larger
CL.

llvm-svn: 304122
2017-05-29 02:19:37 +00:00
Zachary Turner 5b199be769 Revert "[X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables."
This reverts commit 28cb1003507f287726f43c771024a1dc102c45fe as well
as all subsequent followups.  llvm-tblgen currently segfaults with
this change, and it seems it has been broken on the bots all
day with no fixes in preparation.  See, for example:

http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/

llvm-svn: 304121
2017-05-29 01:48:53 +00:00
Galina Kistanova 229c9c1159 Disabled implicit-fallthrough warnings for ConvertUTF.cpp.
ConvertUTF.cpp has a little dependency on LLVM, and since the code extensively uses fall-through switches,
I prefer disabling the warning for the whole file, rather than adding attributes for each case.

llvm-svn: 304120
2017-05-29 01:34:26 +00:00
David Blaikie ce0c205813 DebugInfo: Include .dwo file name when hashing multiple CUs in a single file
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.

An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.

But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)

llvm-svn: 304119
2017-05-29 00:48:45 +00:00
Saleem Abdulrasool f122423ace Support: adjust the default obj format for wasm
WebAssemly uses a custom object file format.  For the wasm targets,
default to the `Wasm` object file format.

llvm-svn: 304117
2017-05-29 00:14:57 +00:00
Dylan McKay 74fc1ce0c2 [AVR] Remove SREG from CPI's Uses; authored by Florian Zeitz
Summary: CPI does not read the status register, but only writes it.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33223

llvm-svn: 304116
2017-05-29 00:10:14 +00:00
Erik Pilkington de83eea576 [ItaniumDemangle] Fix a exponential string copying bug
This is a port of libcxxabi's r304113.

llvm-svn: 304114
2017-05-28 23:24:52 +00:00
NAKAMURA Takumi a288ec412f Prune trailing whitespace. (To regenerate makefiles)
llvm-svn: 304112
2017-05-28 22:54:25 +00:00
David Blaikie f2f898a044 DebugInfo: Omit an empty CU when a subprogram was moved into its use
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.

llvm-svn: 304111
2017-05-28 22:51:37 +00:00
Geoff Berry 2739ebafb6 [AArch64][Falkor] Combine sched details files into one. NFC.
llvm-svn: 304109
2017-05-28 22:20:44 +00:00
Geoff Berry b542fb3817 [AArch64][Falkor] Fix some sched details.
- Remove all uses of base sched model entries and set them all to
  Unsupported so all the opcodes are described in
  AArch64SchedFalkorDetails.td.
- Remove entries for unsupported half-float opcodes.
- Remove entries for unsupported LSE extension opcodes.
- Add entry for MOVbaseTLS (and set Sched in base td file entry to
  WriteSys) and a few other pseudo ops.
- Fix a few FP load/store with reg offset entries to use the LSLfast
  predicates.
- Add Q size BIF/BIT/BSL entries.
- Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entires.
- Fix pre/post increment address register latency (this operand is
  always dest 0).
- Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries.
- Fix XYZ resource over usage on LD[1-4] opcodes.

llvm-svn: 304108
2017-05-28 21:48:31 +00:00
Benjamin Kramer 9d8ed2653f [InstrProf] Use more ArrayRef/StringRef.
No functional change intended.

llvm-svn: 304089
2017-05-28 13:23:02 +00:00
Ayman Musa d9f1fe43a8 [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables.
X86 backend holds huge tables in order to map between the register and memory forms of each instruction.
This TableGen Backend automatically generated all these tables with the appropriate flags for each entry.

Differential Revision: https://reviews.llvm.org/D32684

llvm-svn: 304088
2017-05-28 12:55:36 +00:00
Ayman Musa 0b4f97d5e9 [X86] Adding FoldGenRegForm helper field (for memory folding tables tableGen backend) to X86Inst class and set its value for the relevant instructions.
Some register-register instructions can be encoded in 2 different ways, this happens when 2 register operands can be folded (separately). 
For example if we look at the MOV8rr and MOV8rr_REV, both instructions perform exactly the same operation, but are encoded differently. Here is the relevant information about these instructions from Intel's 64-ia-32-architectures-software-developer-manual:

Opcode  Instruction  Op/En  64-Bit Mode  Compat/Leg Mode  Description
8A /r   MOV r8,r/m8  RM     Valid        Valid            Move r/m8 to r8.
88 /r   MOV r/m8,r8  MR     Valid        Valid            Move r8 to r/m8.
Here we can see that in order to enable the folding of the output and input registers, we had to define 2 "encodings", and as a result we got 2 move 8-bit register-register instructions.

In the X86 backend, we define both of these instructions, usually one has a regular name (MOV8rr) while the other has "_REV" suffix (MOV8rr_REV), must be marked with isCodeGenOnly flag and is not emitted from CodeGen.

Automatically generating the memory folding tables relies on matching encodings of instructions, but in these cases where we want to map both memory forms of the mov 8-bit (MOV8rm & MOV8mr) to MOV8rr (not to MOV8rr_REV) we have to somehow point from the MOV8rr_REV to the "regular" appropriate instruction which in this case is MOV8rr.

This field enable this "pointing" mechanism - which is used in the TableGen backend for generating memory folding tables.

Differential Revision: https://reviews.llvm.org/D32683

llvm-svn: 304087
2017-05-28 12:39:37 +00:00
Oren Ben Simhon f3aab2fa33 [X86] Fixing VPOPCNTDQ feature set lookup.
llvm-svn: 304086
2017-05-28 11:26:11 +00:00
Gor Nishanov ffbeb22b6f Cloning: Fix debug info cloning
Summary:
I believe https://reviews.llvm.org/rL302576 introduced two bugs:

1) it produces duplicate distinct variables for every: dbg.value describing the same variable.
    To fix the problme I switched form getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable *Var) {
    return DILocalVariable::getDistinct(

2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in:

 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
#
(Edit: reproducer added)

Here how https://reviews.llvm.org/rL302576 broke coroutine debug info.
Coroutine body of the original function is split into several parts by cloning and removing unneeded code.
All parts describe the original function and variables present in the original function.

For a simple case, prior to Split, original function has these two blocks:

```
PostSpill:                                        ; preds = %AllocaSpillBB
  call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13
  store i32 %x, i32* %x.addr, align 4
  ...
and

sw.epilog:                                        ; preds = %sw.bb
  %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20
  %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20
  call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13

!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

```

Note that in two blocks different expression represent the same original user variable X.

Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in:

```
define i8* @f(i32 %x) #0 !dbg !6 {
  ...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
...
!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2)
!28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11)
```
After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable.
This was causing asserts in VerifyDebugInfo and AssemblyPrinter.

Example:

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
!29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
```

Second problem:

Prior to rL302576, all clones were described by DISubprogram referring to original function.

```
define i8* @f(i32 %x) #0 !dbg !6 {
...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
```

After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone.

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
```

I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not:

```
 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
```
After this fix, behavior (with respect to coroutines) reverts to exactly as it was before and therefore making them debuggable again, or even more importantly, compilable, with "-g"

Reviewers: dblaikie, echristo, aprantl

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33614

llvm-svn: 304079
2017-05-27 19:41:09 +00:00
George Rimar a25d329b33 Recommit "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of uninitialized variable.

Original commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304078
2017-05-27 18:10:23 +00:00
Craig Topper a568c72b7d [TableGen] Prevent DagInit from leaking its Args and ArgNames when they exceed the size of the SmallVector.
DagInits are allocated in a BumpPtrAllocator so they are never destructed. This means the destructor for the SmallVector never runs.

To fix this we now allocate the vectors in the BumpPtrAllocator too using TrailingObjects.

llvm-svn: 304077
2017-05-27 17:36:50 +00:00
Tobias Grosser e3684d0b84 [SCEV] Assume parameters coming from function calls contain IVs
The optimistic delinearization implemented in LLVM detects array sizes by
looking for non-linear products between parameters and induction variables.
In OpenCL code, such products often look like:

  A[get_global_id(0) * N + get_global_id(1)]

Hence, the IV is hidden in the get_global_id() call and consequently
delinearization would fail as no induction variable is available that helps
us to identify N as array size parameter.

We now use a very simple heuristic to change this. We assume that each parameter
that comes directly from a function call is a hidden induction variable. As
a result, we can delinearize the access above to:

  A[get_global_id(0)][get_global_id(1]

llvm-svn: 304073
2017-05-27 15:17:49 +00:00
Sanjay Patel 33f4a97287 [DAGCombiner] use narrow load to avoid vector extract
If we have (extract_subvector(load wide vector)) with no other users, 
that can just be (load narrow vector). This is intentionally conservative.
Follow-ups may loosen the one-use constraint to account for the extract cost
or just remove the one-use check.

The memop chain updating is based on code that already exists multiple times
in x86 lowering, so that should be pulled into a helper function as a follow-up.

Background: this is a potential improvement noticed via regressions caused by
making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see 
comments in D33137).

Differential Revision: https://reviews.llvm.org/D33578

llvm-svn: 304072
2017-05-27 14:07:03 +00:00
Craig Topper b8ff353fc6 [TableGen] Remove all the static vectors named TheActualPool.
These used to hold std::unique_ptrs that managed the allocation for the various *Init object so that they would be deleted on exit. Everything is allocated in a BumpPtrAllocator name so there is no reason for these to still exist.

llvm-svn: 304066
2017-05-27 06:14:12 +00:00
Gor Nishanov 9c6ac6138d [coroutines] Define getPassName() for coroutine passes
Reviewers: GorNishanov

Reviewed By: GorNishanov

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33622

llvm-svn: 304065
2017-05-27 05:54:30 +00:00
Vitaly Buka a637489ef1 [PartialInlining] Replace delete with unique_ptr in computeCallsiteToProfCountMap
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D33220

llvm-svn: 304064
2017-05-27 05:32:09 +00:00
Matthias Braun 88c8c9847d AArch64/PEI: Do not add reserved regs to liveins
We do not track liveness for reserved registers. It is unnecessary to
add them to block livein lists.

llvm-svn: 304059
2017-05-27 03:38:02 +00:00
Keno Fischer 090f1959c1 [SCEVExpander] Try harder to avoid introducing inttoptr
Summary:
This fixes introduction of an incorrect inttoptr/ptrtoint pair in
the included test case which makes use of non-integral pointers. I
suspect there are more cases like this left, but this takes care of
the one I was seeing at the moment.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D33129

llvm-svn: 304058
2017-05-27 03:22:55 +00:00
Matthias Braun 868bbd4022 ScheduleDAGInstrs: Fix fixupKills()
Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code
and fixes a bug where the CSR registers in return blocks where missed
leading to invalid kill flags. Also remove the unnecessary rule that we
wouldn't set kill flags on tied operands.

No tests as I have an upcoming commit improving MachineVerifier checks
to catch these cases in multiple existing lit tests.

llvm-svn: 304055
2017-05-27 02:50:50 +00:00
Erik Pilkington cbc82b3ca9 [Demangler] copy changes made in libcxxabi's r303718 to ItaniumDemangle
llvm-svn: 304053
2017-05-27 01:48:34 +00:00
Quentin Colombet 7a43eddf28 [AArch64][GlobalISel] Add the Localizer pass for the O0 pipeline
This should fix most of the issue we have right now with constants being
spilled all over the place.

llvm-svn: 304052
2017-05-27 01:34:07 +00:00
Quentin Colombet bece442bd8 [GlobalISel] Add a localizer pass for target to use
This reverts commit r299287 plus clean-ups.

The localizer pass is a helper pass that could be run at O0 in the GISel
pipeline to work around the deficiency of the fast register allocator.
It basically shortens the live-ranges of the constants so that the
allocator does not spill all over the place.

Long term fix would be to make the greedy allocator fast.

llvm-svn: 304051
2017-05-27 01:34:00 +00:00
Wei Mi 5bbb5aafc1 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit is to fix a bug about ExtractValue and InsertValue ops. For those
ops, some varargs inside GVN::Expression are not value numbers but raw index
numbers. It is wrong to do phi-translate for raw index numbers, and the fix is
to stop doing that.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {
  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 304050
2017-05-27 00:54:19 +00:00
Matthias Braun 24dc63a9b9 BranchRelaxation: computeLiveIns() after creating new block
One case in BranchRelaxation did not compute liveins after creating a
new block. This is catched by existing tests with an upcoming commit
that will improve MachineVerifier checking of livein lists.

llvm-svn: 304049
2017-05-27 00:53:48 +00:00
Matthias Braun b4f74224ff AArch64: Fix cmpxchg O0 expansion
- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304048
2017-05-26 23:48:59 +00:00
Peter Collingbourne 2c26a18501 Bitcode: Remove some dead code. Spotted by Teresa.
Differential Revision: https://reviews.llvm.org/D33609

llvm-svn: 304046
2017-05-26 23:21:40 +00:00
Craig Topper 348314dfb8 [InstSimplify] Push commuted op checks for and/or of icmp further down to avoid duplicate work
Previously, we called simplifyPossiblyCastedAndOrOfICmps twice with the operands commuted, but the call to simplifyAndOrOfICmpsWithConstants further down already handles commuting and doesn't need to be called both ways.

This patch pushes double calls further down to just the individual routines that need to be called twice.

Differential Revision: https://reviews.llvm.org/D33603

llvm-svn: 304044
2017-05-26 22:42:34 +00:00
Alexei Starovoitov 3c585d3a8f [bpf] disallow global_addr+off folding
Wrong assembly code is generated for a simple program with
clang. If clang only produces IR and llc is used
for IR lowering and optimization, correct assembly
code is generated.

The main reason is that clang feeds default Reloc::Static
to llvm and llc feeds no RelocMode to llvm, where
for llc case, BPF backend picks up Reloc::PIC_ mode.
This leads different IR lowering behavior and clang
permits global_addr+off folding while llc doesn't.

This patch introduces isOffsetFoldingLegal function into
BPF backend and the function always return false.
This will make clang and llc behave the same for
the lowering.

Bug https://bugs.llvm.org//show_bug.cgi?id=33183
has more detailed explanation.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 304043
2017-05-26 22:32:41 +00:00
Davide Italiano ef9bfe9531 [Mips] Placate GCC's -Wmisleading-indentation. NFCI.
llvm-svn: 304041
2017-05-26 21:56:19 +00:00
Davide Italiano d4db116af8 [lib/LTO] Don't reinvent the code for switching linkage.
Differential Revision:  https://reviews.llvm.org/D33582

llvm-svn: 304040
2017-05-26 21:56:14 +00:00
Matthias Braun ac4307c41e LivePhysRegs: Rework constructor + documentation; NFC
- Take reference instead of pointer to a TRI that cannot be nullptr.
- Improve documentation comments.

llvm-svn: 304038
2017-05-26 21:51:00 +00:00
Matthias Braun 61cf1a9e85 LivePhysRegs: Add default for removeRegsInMask(Clobbers); NFC
llvm-svn: 304036
2017-05-26 21:50:51 +00:00
Matthias Braun d8f4e99933 MachineVerifier: Remove unused set; NFC
llvm-svn: 304035
2017-05-26 21:50:48 +00:00
Sumanth Gundapaneni a6cf2fd5ec [Hexagon] Cleanup of unused function isCalleeSaveReg (NFC)
llvm-svn: 304034
2017-05-26 21:09:54 +00:00
Konstantin Zhuravlyov b2ff8dfea0 Resubmit r303859 with test fixed.
[AMDGPU] add intrinsic for s_getpc

Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.

Patch by Tim Corringham

llvm-svn: 304031
2017-05-26 20:38:26 +00:00
Benjamin Kramer debb3c35e0 Make helper functions static. NFC.
llvm-svn: 304029
2017-05-26 20:09:00 +00:00
Frederich Munch 8c3735e597 Fix the ManagedStatic list ordering when using DynamicLibrary::addPermanentLibrary.
Summary:
r295737 included a fix for leaking libraries loaded via. DynamicLibrary::addPermanentLibrary.
This created a problem where static constructors in a library could insert llvm::ManagedStatic objects before DynamicLibrary would register it's own ManagedStatic, meaning a crash could occur at shutdown.

r301562 exasperated this problem by cleaning up the DynamicLibrary ManagedStatic during llvm_shutdown.

Reviewers: v.g.vassilev, lhames, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33581

llvm-svn: 304027
2017-05-26 19:43:23 +00:00
Craig Topper 9bce1ad232 [InstSimplify] Move a variable declaration to make simplifyAndOfICmps look more like simplifyOrOfICmps. NFC
llvm-svn: 304023
2017-05-26 19:04:02 +00:00
Craig Topper c8bebb1e84 [InstSimplify] Use commutable matchers to shorten some code
This code was replicated two additional times to handle commuted cases, but I think a commutable matcher can take care of it.

Differential Revision: https://reviews.llvm.org/D33585

llvm-svn: 304022
2017-05-26 19:03:59 +00:00
Craig Topper 1da22c3244 [InstSimplify] Use m_APInt instead of m_ConstantInt in ((V + N) & C1) | (V & C2) handling in order to support splat vectors.
The tests here are have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations

Differential Revision: https://reviews.llvm.org/D33599

llvm-svn: 304021
2017-05-26 19:03:53 +00:00
David Blaikie 07963bd1d1 DebugInfo: Do not emit empty CUs
Consistent with GCC and addresses a shortcoming with ThinLTO where many
imported CUs may end up being empty (because the functions imported from
them either ended up not being used (and were then discarded, since
they're imported as available_externally) or optimized away entirely).

Test cases previously testing empty CUs (either intentionally, or
because they didn't need anything more complicated) had a trivial 'int'
or similar basic type added to their retained types list.

This is a first order approximation - a deeper implementation could do
things like:

1) Be more lazy about construction of the CU - for example if two CUs
containing a single identical retained type are linked together, with
this change one of the two CUs will be produced but empty (since a
duplicate type won't be produced).

2) Go further and invert all the CU links the same way the subprogram
link is inverted - keep named CU lists of retained types, macros, etc,
and have those link back to the CU. Then if they're emitted, the CU is
emitted, but never otherwise - this would allow the metadata itself to
be dropped earlier too, though it seems unlikely that's an important
optimization as there shouldn't be many CUs relative to the number of
other entities.

llvm-svn: 304020
2017-05-26 18:52:56 +00:00
Peter Collingbourne 7730b24448 PMB: Run the whole-program-devirt pass during LTO at --lto-O0.
The whole-program-devirt pass needs to run at -O0 because only it
knows about the llvm.type.checked.load intrinsic: it needs to both
lower the intrinsic itself and handle it in the summary.

Differential Revision: https://reviews.llvm.org/D33571

llvm-svn: 304019
2017-05-26 18:27:13 +00:00
Craig Topper d45185f231 [InstCombine] Pass the DominatorTree, AssumptionCache, and context instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero
Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent.

Differential Revision: https://reviews.llvm.org/D33567

llvm-svn: 304018
2017-05-26 18:23:57 +00:00
Dmitry Preobrazhensky 6a2431df0b [AMDGPU][MC][GFX9] Corrected encoding of flat_scratch* for SDWA opcodes
See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171

Reviewers: Sam Kolton

Differential Revision: https://reviews.llvm.org/D33553

llvm-svn: 304015
2017-05-26 18:01:29 +00:00
George Rimar 1f9cab6b1c Revert r304002 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Revert it again. Now another bot unhappy: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/8750

llvm-svn: 304011
2017-05-26 17:36:23 +00:00
David Blaikie 7f2b717b52 DebugInfo: Don't include locations for debug-having code inlined into nodebug functions
This produced 'strange' DWARF anyway - the CU would have no ranges (or
at least not a range including the inlined code) nor any subprogram or
inlined_subroutine - yet the line table would have entries for these
instructions.

(this actually becomes more relevant with changes coming after this,
where a CU without any contents will be omitted entirely - so there
would be no line table to put this on anyway)

llvm-svn: 304004
2017-05-26 17:05:15 +00:00
Tom Stellard dde28a8c92 AMDGPU/GlobalISel: Mark 32-bit float constants as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33212

llvm-svn: 304003
2017-05-26 16:40:03 +00:00
George Rimar bc223c63cc [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304002
2017-05-26 16:26:18 +00:00
Matthias Braun eec1f3672a LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI
Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal
addPristines() function must be called on an empty set or it may
accidentally reset saved registers.

- addLiveOutsNoPristines() needs to add callee saved registers that are
  actually saved and restored somewhere to the set (they are not
  pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().

This fixes the problem from D32156.

Differential Revision: https://reviews.llvm.org/D32464

llvm-svn: 304001
2017-05-26 16:23:08 +00:00
Sam Kolton 363f47a2c7 [AMDGPU] SDWA: add disassembler support for GFX9
Summary: Added decoder methods and tests

Reviewers: vpykhtin, artem.tamazov, dp

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D33545

llvm-svn: 303999
2017-05-26 15:52:00 +00:00
Sanjay Patel ec13ebf2c8 [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790)
In the best case:
extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN
...we kill all of the extract/concat and just have narrow binops remaining.

If only one of the binop operands is amenable, this transform is still
worthwhile because we kill some of the extract/concat.

Optional bitcasting makes the code more complicated, but there doesn't
seem to be a way to avoid that.

The TODO about extending to more than bitwise logic is there because we really
will regress several x86 tests including madd, psad, and even a plain
integer-multiply-by-2 or shift-left-by-1. I don't think there's anything
fundamentally wrong with this patch that would cause those regressions; those
folds are just missing or brittle.

If we extend to more binops, I found that this patch will fire on at least one
non-x86 regression test. There's an ARM NEON test in
test/CodeGen/ARM/coalesce-subregs.ll with a pattern like:

            t5: v2f32 = vector_shuffle<0,3> t2, t4
          t6: v1i64 = bitcast t5
          t8: v1i64 = BUILD_VECTOR Constant:i64<0>
        t9: v2i64 = concat_vectors t6, t8
      t10: v4f32 = bitcast t9
    t12: v4f32 = fmul t11, t10
  t13: v2i64 = bitcast t12
t16: v1i64 = extract_subvector t13, Constant:i32<0>

There was no functional change in the codegen from this transform from what I
could see though.

For the x86 test changes:

1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case,
   but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops,
   there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op.
   SSE/AVX2/AXV512 are not affected which is expected because only AVX1 has the extract/concat
   ops to match the pattern.
2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract.
   Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026
3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT.
4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction
   count by one in each case by eliminating two insert/extract while adding one narrower logic op.

https://bugs.llvm.org/show_bug.cgi?id=32790

Differential Revision: https://reviews.llvm.org/D33137

llvm-svn: 303997
2017-05-26 15:33:18 +00:00
Nirav Dave 689709c928 [DAG] Move legal type checks in store merge to be checked only
on non-legal cases. NFC.

llvm-svn: 303994
2017-05-26 14:37:27 +00:00
John Brawn 9009d2905d [ARM] Fix lowering of misaligned memcpy/memset
Currently getOptimalMemOpType returns i32 for large enough sizes without
checking for alignment, leading to poor code generation when misaligned accesses
aren't permitted as we generate a word store then later split it up into byte
stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for
memset we splat the memset value into a word then immediately split it up
again.

Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
to use, but also fix a bug there where it wasn't correctly checking if
misaligned memory accesses are allowed.

Differential Revision: https://reviews.llvm.org/D33442

llvm-svn: 303990
2017-05-26 13:59:12 +00:00
Andrew V. Tischenko fdb264e263 The fix for PR22004: X86AsmParser.cpp asserts: OperandStack.size() > 1 && "Too few operands."
llvm-svn: 303985
2017-05-26 13:23:34 +00:00
George Rimar a8403a64ea Revert "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Broked BB again:

TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED
...
LLVM ERROR: Section was outside of section table.

llvm-svn: 303984
2017-05-26 13:20:09 +00:00
George Rimar 655b7b63f6 Recommit r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of test compilation.

Initial commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303983
2017-05-26 13:13:50 +00:00
George Rimar 7d5f12185a Revert r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
It failed BB.

llvm-svn: 303981
2017-05-26 12:53:41 +00:00
Nirav Dave 6ff50bf242 Fix signedness of constant. NFC.
llvm-svn: 303980
2017-05-26 12:53:10 +00:00
George Rimar 732f268aa0 [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303978
2017-05-26 12:46:41 +00:00
Max Kazantsev 41450329f7 Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on
many non-X86 configs with this patch. The reason of this is that the patch
makes a correctless fix that changes optimizer's behavior for this test.
Without the change, LSR was making an overconfident simplification basing on a
wrong SCEV. Apparently it did not need the IV analysis to do this. With the
change, it chose a different way to simplify (that wasn't so confident), and
this way required the IV analysis. Now, following the right execution path,
LSR tries to make a transformation relying on IV Users analysis. This analysis
is target-dependent due to this code:

  // LSR is not APInt clean, do not touch integers bigger than 64-bits.
  // Also avoid creating IVs of non-native types. For example, we don't want a
  // 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
  uint64_t Width = SE->getTypeSizeInBits(I->getType());
  if (Width > 64 || !DL.isLegalInteger(Width))
    return false;

To make a proper transformation in this test case, the type i32 needs to be
legal for the specified data layout. When the test runs on some non-X86
configuration (e.g. pure ARM 64), opt gets confused by the specified target
and does not use it, rejecting the specified data layout as well. Instead,
it uses some default layout that does not treat i32 as a legal type
(currently the layout that is used when it is not specified does not have
legal types at all). As result, the transformation we expect to happen does
not happen for this test.

This re-enabling patch does not have any source code changes compared to the
original patch rL303730. The only difference is that the failing test is
moved to X86 directory and now has requirement of running on x86 only to comply
with the specified target triple and data layout.

Differential Revision: https://reviews.llvm.org/D33543

llvm-svn: 303971
2017-05-26 06:47:04 +00:00
Matthias Braun e51c435c07 LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI
Re-commit r303937 + r303949 as they were not the cause for the build
failures.

We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.

llvm-svn: 303970
2017-05-26 06:32:31 +00:00
Wei Mi 3250ae3f7c Revert rL303923 since it broke the sanitizer bootstrap build bot.
llvm-svn: 303969
2017-05-26 05:42:50 +00:00
Craig Topper 25d9ba9a12 [InstSimplify] Use APInt::isMask isntead of manually implementing it. NFC
llvm-svn: 303968
2017-05-26 05:16:22 +00:00
Craig Topper 50500d5054 [InstSimplify] Use m_ConstantInt matchers to short some code. NFC
llvm-svn: 303967
2017-05-26 05:16:20 +00:00
Chandler Carruth 8fa1e37342 [IR] Add an iterator and range accessor for the PHI nodes of a basic
block.

This allows writing much more natural and readable range based for loops
directly over the PHI nodes. It also takes advantage of the same tricks
for terminating the sequence as the hand coded versions.

I've replaced one example of this mostly to showcase the difference and
I've added a unit test to make sure the facilities really work the way
they're intended. I want to use this inside of SimpleLoopUnswitch but it
seems generally nice.

Differential Revision: https://reviews.llvm.org/D33533

llvm-svn: 303964
2017-05-26 03:10:00 +00:00
Matthias Braun c93c063993 Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"
Tentatively revert this to see if it fixes the buildbot stage2
breakages.

This reverts commit r303938.
This reverts commit r303954.

llvm-svn: 303960
2017-05-26 02:25:20 +00:00
Matthias Braun f56a6d84b6 Revert "LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI"
Tentatively revert, suspecting that it caused breakage in stage2
buildbots.

This reverts commit r303949.
This reverts commit r303937.

llvm-svn: 303955
2017-05-26 01:29:32 +00:00
Chandler Carruth 86248d5632 [PM] Enable the new simple loop unswitch pass in the new pass manager
(where it is the only realistic option).

This passes the LLVM test suite for me, but I'm clearly still hammering
on this.

llvm-svn: 303952
2017-05-26 01:24:11 +00:00
Matthias Braun daea6f1e84 LivePhysRegs: Follow-up to r303937
We may have situations in which a superregister is reserved and not
added to liveins, so we have to add the subregisters.

llvm-svn: 303949
2017-05-26 00:54:24 +00:00
Zachary Turner f2110283c6 Remove unused member.
llvm-svn: 303942
2017-05-25 23:47:56 +00:00
Tim Shen a76f20c364 [PPC] Add text for assert.
llvm-svn: 303940
2017-05-25 23:40:46 +00:00
Peter Collingbourne f87197ad91 LTO: Do summary-based prevailing symbol resolution at --lto-O0.
Prevailing symbol resolution is necessary for correctness. Without
this we can end up dropping a referenced linkonce symbol from the link.

Differential Revision: https://reviews.llvm.org/D33570

llvm-svn: 303939
2017-05-25 23:40:11 +00:00
Matthias Braun e2133d5b42 LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI
- addLiveOutsNoPristines() needs to add callee saved registers that are
  actually saved and restored somewhere to the set (they are not
  pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().

This fixes the problem from D32156.

Differential Revision: https://reviews.llvm.org/D32464

llvm-svn: 303938
2017-05-25 23:39:40 +00:00
Matthias Braun 9512dd5ffd LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.

llvm-svn: 303937
2017-05-25 23:39:33 +00:00
Zachary Turner fed467eefb [CV Type Merging] Find nested type indices faster.
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible.  The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.

This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level.  This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out".  Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps.  We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.

This is worth a 50-60% performance increase.  On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb

MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s

So this is a huge performance win.

Differential Revision: https://reviews.llvm.org/D33564

llvm-svn: 303935
2017-05-25 23:36:16 +00:00
David Blaikie 2c78f183fe DebugInfo: Simplify scopes+subprogram handling since the subprogram<>cu link inversion
Previously this code was defensive to the situation in which the debug
info scopes would lead to a different subprogram from the subprogram in
the CU's subprogram list (this could've happened with linkonce
functions, etc as per the comment being removed). Since the CU<>SP link
reversal this is no longer possible.

llvm-svn: 303933
2017-05-25 23:11:28 +00:00
Tim Shen a2b85da879 [PPC] Fix atomics lowering in DAG lowering.
I forgot to forward the chain, causing some missing instruction
dependencies. The test crashes the compiler without this patch.

Inspired by the test case, D33519 also tries to remove the extra sync.

Differential Revision: https://reviews.llvm.org/D33573

llvm-svn: 303931
2017-05-25 22:58:35 +00:00
Craig Topper d4039f7283 [InstCombine] Add an InstCombine specific wrapper around isKnownToBeAPowerOfTwo to shorten code. NFC
We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo.

llvm-svn: 303924
2017-05-25 21:51:12 +00:00
Wei Mi fd257fa7bf [GVN] Add phi-translate support in scalarpre.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

  long a[100], b[100], g1, g2, g3;
  __attribute__((pure)) long goo();

  void foo(long a, long b, long c, long d) {
    g1 = a * b;
    if (__builtin_expect(g2 > 3, 0)) {
      a = c;
      b = d;
      g2 = a * b;
    }
    g3 = a * b;      // fully redundant.
  }

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 303923
2017-05-25 21:49:02 +00:00
Andrew Kaylor f466001eef Add constrained intrinsics for some libm-equivalent operations
Differential revision: https://reviews.llvm.org/D32319

llvm-svn: 303922
2017-05-25 21:31:00 +00:00
Matthias Braun 1527baab0c CodeGen: Rename DEBUG_TYPE to match passnames
Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.

llvm-svn: 303921
2017-05-25 21:26:32 +00:00
Zachary Turner 2897e0306e [lld] Fix a bug where we continually re-follow type servers.
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.

Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.

This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.

llvm-svn: 303920
2017-05-25 21:16:03 +00:00
Zachary Turner 7f97c362a4 [CodeView Type Merging] Don't keep re-allocating temp serializer.
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.

This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.

Differential Revision: https://reviews.llvm.org/D33506

llvm-svn: 303919
2017-05-25 21:15:37 +00:00
Zachary Turner 95c625ecc9 Make BinaryStreamReader::readCString a bit faster.
Previously it would do a character by character search for a null
terminator, to account for the fact that an arbitrary stream need not
store its data contiguously so you couldn't just do a memchr. However, the
stream API has a function which will return the longest contiguous chunk
without doing a copy, and by using this function we can do a memchr on the
individual chunks. For certain types of streams like data from object
files etc, this is guaranteed to find the null terminator with only a
single memchr, but even with discontiguous streams such as
MappedBlockStream, it's rare that any given string will cross a block
boundary, so even those will almost always be satisfied with a single
memchr.

This optimization is worth a 10-12% reduction in link time (4.2 seconds ->
3.75 seconds)

Differential Revision: https://reviews.llvm.org/D33503

llvm-svn: 303918
2017-05-25 21:12:27 +00:00
Bob Haarman 55256ada25 [pdb] pad source file name buffer at the end instead of the beginning
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.

Reviewers: amccarth, rnk, zturner

Reviewed By: amccarth, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33475

llvm-svn: 303917
2017-05-25 21:12:15 +00:00
Zachary Turner c4e4b7e31e Fix a bug in MappedBlockStream.
It was using the number of blocks of the entire PDB file as the number
of blocks of each stream that was created.  This was only an issue in
the readLongestContiguousChunk function, which  was never called prior.
This bug surfaced when I updated an algorithm to use this function and
the algorithm broke.

llvm-svn: 303916
2017-05-25 21:12:00 +00:00
Sam Clegg 1c154a6107 [WebAssembly] MC: Include unnamed data when writing wasm files
Also, include global entries for all data symbols, not
just external ones, since these are referenced by the
relocation records.

Add a test case that includes unnamed data.

Differential Revision: https://reviews.llvm.org/D33079

llvm-svn: 303915
2017-05-25 21:08:07 +00:00
Zachary Turner dda25b128c [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

llvm-svn: 303914
2017-05-25 21:06:28 +00:00
Kyle Butt 13379d7c99 PPC: Correct Size for GETtlsADDR
PPC::GETtlsADDR is lowered to a branch and a nop, by the assembly
printer. Its size was incorrectly marked as 4, correct it to 8. The
incorrect size can cause incorrect branch relaxation in
PPCBranchSelector under the right conditions.

llvm-svn: 303904
2017-05-25 19:37:41 +00:00
Nico Weber b3d83a092a Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.
llvm-svn: 303902
2017-05-25 19:19:29 +00:00
Manoj Gupta d536180fdc [AArch64]: add 'a' inline asm operand modifier.
Summary:
This is used in the Linux kernel, and effectively just means "print an
address". This brings back r193593.

Reviewed by: Renato Golin

Reviewers: t.p.northover, rengolin, richard.barton.arm, kristof.beyls

Subscribers: aemerson, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D33558

llvm-svn: 303901
2017-05-25 19:07:57 +00:00
Adrian Prantl f062192632 Fix SelectionDAGBuilder::getDbgValue to not expect DW_OP_deref on FI vars
This fixes an oversight in r300522, which changed alloca
dbg.values to no longer emit a DW_OP_deref.

The array.ll testcase was regenerated from source.

Fixes PR33166:
https://bugs.llvm.org/show_bug.cgi?id=33166

llvm-svn: 303897
2017-05-25 18:54:10 +00:00
David Blaikie b3cee2fb42 DebugInfo: Produce debug_{gnu_}pub{names,types} entries when explicitly requested, even in -gmlt or when empty
Turns out gold doesn't use the DW_AT_GNU_pubnames to decide whether to
parse the rest of the DIEs when building gdb-index. This causes gold to
trip over LLVM's output when there are DW_FORM_ref_addr present.

Gold does use the presence of a debug_gnu_pub{names,types} entry for the
CU to skip parsing the debug_info portion, so make sure that's included
even when empty (technically, when empty there couldn't be any ref_addr
anyway - it only came up when gmlt didn't produce any (even non-empty)
pubnames - but given what that reveals about gold's implementation, this
seems like a good thing to do for consistency).

llvm-svn: 303894
2017-05-25 18:50:28 +00:00
Daniel Berlin e67c322260 NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling
Fix PR33120 and others by eliminating self-cycles a different way.

llvm-svn: 303875
2017-05-25 15:44:20 +00:00
Artur Pilipenko 315eafc339 [InstCombine] Teach isAllocSiteRemovable to look through addrspacecasts
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D28565

llvm-svn: 303870
2017-05-25 15:14:48 +00:00
Sanjay Patel 5150612012 [InstCombine] make icmp-mul fold more efficient
There's probably a lot more like this (see also comments in D33338 about responsibility), 
but I suspect we don't usually get a visible manifestation.

Given the recent interest in improving InstCombine efficiency, another potential micro-opt
that could be repeated several times in this function: morph the existing icmp pred/operands
instead of creating a new instruction.

llvm-svn: 303860
2017-05-25 14:13:57 +00:00
Tim Corringham 32d0d38679 [AMDGPU] add intrinsic for s_getpc
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D32862

llvm-svn: 303859
2017-05-25 14:04:14 +00:00
Oren Ben Simhon 7bf27f03f2 [X86] Adding vpopcntd and vpopcntq instructions
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq).

Differential Revision: https://reviews.llvm.org/D33169

llvm-svn: 303858
2017-05-25 13:45:23 +00:00
James Molloy dc2d64bc35 [GVNSink] Pacify MSVC
Don't convert an unsigned to a pointer for a sentinel, use a size_t instead.

llvm-svn: 303855
2017-05-25 13:14:10 +00:00
James Molloy 2a237f19f1 [GVNSink] Don't define operator<< in NDEBUG
Without debug macros enabled, the raw_ostream operator<< overload
is unused.

llvm-svn: 303852
2017-05-25 13:11:18 +00:00
James Molloy a929063233 [GVNSink] GVNSink pass
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts.

This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:

[ %a1 = add i32 %b, 1  ]   [ %c1 = add i32 %d, 1  ]
[ %a2 = xor i32 %a1, 1 ]   [ %c2 = xor i32 %c1, 1 ]
                 \           /
           [ %e = phi i32 %a2, %c2 ]
           [ add i32 %e, 4         ]

GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.

What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.

This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG.

Differential revision: https://reviews.llvm.org/D24805

llvm-svn: 303850
2017-05-25 12:51:11 +00:00
Chandler Carruth f4d62c480c [PM] Teach the PGO instrumentation pasess to run GlobalDCE before
instrumenting code.

This is important in the new pass manager. The old pass manager's
inliner has a small DCE routine embedded within it. The new pass manager
relies on the actual GlobalDCE pass for this.

Without this patch, instrumentation profiling with the new PM results in
massive code bloat in the object files because the instrumentation
itself ends up preventing DCE from working to remove the code.

We should probably change the instrumentation (and/or DCE) so that we
can eliminate dead code even if instrumented, but we shouldn't even
spend the time generating instrumentation for that code so this still
seems like a good patch.

Differential Revision: https://reviews.llvm.org/D33535

llvm-svn: 303845
2017-05-25 07:15:09 +00:00
Chandler Carruth dd2e275a47 [PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch
pass.

The original logic only considered direct successors of the hoisted
domtree nodes, but that isn't really enough. If there are other basic
blocks that are completely within the subtree, their successors could
just as easily be impacted by the hoisting.

The more I think about it, the more I think the correct update here is
to hoist every block on the dominance frontier which has an idom in the
chain we hoist across. However, this is subtle enough that I'd
definitely appreciate some more eyes on it.

Sadly, if this is the correct algorithm, it requires computing a (highly
localized) dominance frontier. I've done this in the simplest (IE, least
code) way I could come up with, but that may be too naive. Suggestions
welcome here, dominance update algorithms are not an area I've studied
much, so I don't have strong opinions.

In good news, with this patch, turning on simple unswitch passes the
LLVM test suite for me with asserts enabled.

Differential Revision: https://reviews.llvm.org/D32740

llvm-svn: 303843
2017-05-25 06:33:36 +00:00
Chandler Carruth 29c22d2835 [LegacyPM] Make the 'addLoop' method accept a loop to add rather than
having it internally allocate the loop.

This is a much more flexible API and necessary in the new loop unswitch
to reasonably support both new and old PMs in common code. It also just
seems like a cleaner separation of concerns.

NFC, this should just be a pure refactoring.

Differential Revision: https://reviews.llvm.org/D33528

llvm-svn: 303834
2017-05-25 03:01:31 +00:00
Vitaly Buka bf40f1b6dd [libFuzzer] Don't replace custom signal handlers.
Summary:
This allows to keep handlers installed by sanitizers.
In other cases third-party code can replace handlers after libFuzzer
initialization anyway.

Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33522

llvm-svn: 303828
2017-05-25 01:43:13 +00:00
George Karpenkov a1c532784d Fix coverage check for full post-dominator basic blocks.
Coverage instrumentation which does not instrument full post-dominators
and full-dominators may skip valid paths, as the reasoning for skipping
blocks may become circular.
This patch fixes that, by only skipping
full post-dominators with multiple predecessors, as such predecessors by
definition can not be full-dominators.

llvm-svn: 303827
2017-05-25 01:41:46 +00:00
Gor Nishanov 1fbc01f70f [coroutines] CoroFrame.cpp conform to coding convention (s/repeat/Repeat) (NFC)
llvm-svn: 303826
2017-05-25 01:07:10 +00:00
Gor Nishanov 0ea1863b27 [coroutines] Relocate instructions that maybe spilled after coro.begin
Summary:
Frontend generates store instructions after allocas, for example:

```
define i8* @f(i64 %this) "coroutine.presplit"="1" personality i32 0 {
entry:
  %this.addr = alloca i64
  store i64 %this, i64* %this.addr
  ..
  %hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc)

```
Such instructions may require spilling into coro.frame, but, coro-frame address is only available after coro.begin and thus needs to be moved after coro.begin.
The only instructions that should not be moved are the arguments of coro.begin and all of their operands.

Reviewers: GorNishanov, majnemer

Reviewed By: GorNishanov

Subscribers: llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D33527

llvm-svn: 303825
2017-05-25 00:46:20 +00:00
Tony Jiang 0a429f040e [PowerPC] Fix a performance bug for PPC::XXSLDWI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI
instruction, this patch recognizes them and does the selection to improve the
PPC performance.

llvm-svn: 303822
2017-05-24 23:48:29 +00:00
Eugene Zelenko 75480cce12 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 303820
2017-05-24 23:10:29 +00:00
Gor Nishanov 1f72d75714 [coroutines] Allow rematerialization upto 4 times. Remove incorrect assert
Reviewers: majnemer

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D33524

llvm-svn: 303819
2017-05-24 23:01:02 +00:00
Sanjay Patel 07b1ba54b5 [InstCombine] use m_APInt to allow icmp-mul-mul vector fold
The swapped operands in the first test is a manifestation of an 
inefficiency for vectors that doesn't exist for scalars because 
the IRBuilder checks for an all-ones mask for scalars, but not 
vectors.

llvm-svn: 303818
2017-05-24 22:58:17 +00:00
Nirav Dave 7a8717d216 [DAG] Prevent crashes when merging constant stores with high-bit set. NFC.
llvm-svn: 303802
2017-05-24 19:56:39 +00:00
Nirav Dave bb20b5d5c3 [AArch64] Prevent nested ADDs from address calc in splitStoreSplat. NFC
In preparation for late-stage store merging.

llvm-svn: 303800
2017-05-24 19:55:49 +00:00
Craig Topper 2f9c6dafe3 [InstCombine] Merge together the SimplifyDemandedUseBits implementations for ZExt and Trunc. NFC
While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths.

With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zero of the upper bits for ZExt. This is similar to how its done in computeKnownBits in ValueTracking.

llvm-svn: 303791
2017-05-24 18:40:25 +00:00
Teresa Johnson cd2aa0d2e4 Fix a couple of typos in memory intrinsic optimization output (NFC)
s/instrinsic/intrinsic

llvm-svn: 303782
2017-05-24 17:55:25 +00:00
Zaara Syeda 932978315b P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248
llvm-svn: 303780
2017-05-24 17:50:37 +00:00
Craig Topper 1c660dbea6 [InstCombine] Use less bitwise operations to handle Instruction::SExt in SimplifyDemandedUseBits. Other improvements.
The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits.

We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better.

Differential Revision: https://reviews.llvm.org/D32098

llvm-svn: 303779
2017-05-24 17:33:30 +00:00
Craig Topper 77e07cc010 [InstSimplify] Simplify uadd/sadd/umul/smul with overflow intrinsics when the Zero or Undef is on the LHS.
Summary: This code was migrated from InstCombine a few years ago. InstCombine had nearby code that would move Constants to the RHS for these, but InstSimplify doesn't have such code on this path.

Reviewers: spatel, majnemer, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33473

llvm-svn: 303774
2017-05-24 17:05:28 +00:00
Craig Topper 8205a1a9b6 [ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.

Differential Revision: https://reviews.llvm.org/D33431

llvm-svn: 303773
2017-05-24 16:53:07 +00:00
Craig Topper a2025eaaef [ValueTracking] Add OptimizationRemarkEmitter to the other signature for commuteKnownBits.
This is needed for an upcoming patch.

llvm-svn: 303772
2017-05-24 16:53:03 +00:00
Matthew Simpson 6349380fa4 Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor
The default vector insert/extract cost is more profitable on Falkor than the
reduced cost.

llvm-svn: 303771
2017-05-24 16:48:39 +00:00
Nirav Dave d20066cbad [AMDGPU] Prevent too large store merges in AMDGPU Subtargets. NFCI.
Various address spaces on the SI and R600 subtargets have stricter
limits on memory access size that other address spaces. Use
canMergeStoresTo predicate to prevent the DAGCombiner from creating
these stores as they will be split up during legalization.

llvm-svn: 303767
2017-05-24 15:59:09 +00:00
Matthew Simpson d6f179cad6 [LV] Update type in cost model for scalarization
For non-uniform instructions marked for scalarization, we should update
`VectorTy` when computing instruction costs to reflect the scalar type. In
addition to determining instruction costs, this type is also used to signal
that all instructions in the loop will be scalarized. This currently affects
memory instructions and non-pointer induction variables and their updates. (We
also mark GEPs scalar after vectorization, but their cost is computed together
with memory instructions.) For scalarized induction updates, this patch also
scales the scalar cost by the vectorization factor, corresponding to each
induction step.

llvm-svn: 303763
2017-05-24 15:26:15 +00:00
Vadzim Dambrouski b07351f4f8 [MSP430] Fix PR33050: Don't use ADD16ri to lower FrameIndex.
Use ADDframe pseudo instruction instead.
This will fix machine verifier error, and will help to fix PR32146.

Differential Revision: https://reviews.llvm.org/D33452

llvm-svn: 303758
2017-05-24 15:08:30 +00:00
Marek Olsak 8973a0a22c Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"
This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa.

It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of
the patterns, so it was putting 32-bit literals into the 8-bit field.

llvm-svn: 303754
2017-05-24 14:53:50 +00:00
Diana Picus 183863fc3b Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
This reverts commit r303730 because it broke all the buildbots.

llvm-svn: 303747
2017-05-24 14:16:04 +00:00
Krzysztof Parzyszek e3ec97b031 [Hexagon] Fix comment in HexagonPacketizer::runOnMachineFunction
Patch by Wei-Ren Chen.

Differential Revision: https://reviews.llvm.org/D33439

llvm-svn: 303745
2017-05-24 13:43:42 +00:00
Jonas Paulsson 8624b7e1ce [LoopVectorizer] Let target prefer scalar addressing computations.
The loop vectorizer usually vectorizes any instruction it can and then
extracts the elements for a scalarized use. On SystemZ, all elements
containing addresses must be extracted into address registers (GRs). Since
this extraction is not free, it is better to have the address in a suitable
register to begin with. By forcing address arithmetic instructions and loads
of addresses to be scalar after vectorization, two benefits result:

* No need to extract the register
* LSR optimizations trigger (LSR isn't handling vector addresses currently)

Benchmarking show improvements on SystemZ with this new behaviour.

Any other target could try this by returning false in the new hook
prefersVectorizedAddressing().

Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand
https://reviews.llvm.org/D32422

llvm-svn: 303744
2017-05-24 13:42:56 +00:00
Jonas Paulsson 081b5a1e9d [SystemZ] Fix register modelling in expandLoadStackGuard()
EXPENSIVE_CHECKS found this bug (https://bugs.llvm.org/show_bug.cgi?id=33047), which
this patch fixes. The EAR instruction defines a GR32, not a GR64.

Review: Ulrich Weigand
llvm-svn: 303743
2017-05-24 13:15:48 +00:00
Tamas Berghammer e3852fa405 Demangler: Fix constructor cv qualifier handling
Previously if we parsed a constructor then we set parsed_ctor_dtor_cv
to true and never reseted it. This causes issue when a template argument
references a constructor (e.g. type of lambda defined inside a
constructor) as we will have the parsed_ctor_dtor_cv flag set what will
cause issues when parsing later arguments.

Differential Revision: https://reviews.llvm.org/D33385
libcxxabi change: https://reviews.llvm.org/rL303737

llvm-svn: 303738
2017-05-24 11:29:02 +00:00
Simon Pilgrim 9f46d1d479 Strip trailing whitespace. NFCI.
llvm-svn: 303736
2017-05-24 11:02:27 +00:00
Florian Hahn d211fe7c26 [ARM] Remove ThumbTargetMachines. (NFC)
Summary:
Thumb code generation is controlled by ARMSubtarget and the concrete
ThumbLETargetMachine and ThumbBETargetMachine are not needed.

Eric Christopher suggested removing the unneeded target machines in
https://reviews.llvm.org/D33287.

I think it still makes sense to keep separate TargetMachines for big and
little endian as we probably do not want to have different endianess for
difference functions in a single compilation unit. The MIPS backend has
two separate TargetMachines for big and little endian as well. 

Reviewers: echristo, rengolin, kristof.beyls, t.p.northover

Reviewed By: echristo

Subscribers: aemerson, javed.absar, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33318

llvm-svn: 303733
2017-05-24 10:18:57 +00:00
Mikael Holmen 2676f8269a MachineCSE: Respect interblock physreg liveness
Summary:
This is a fix for PR32538. MachineCSE first looks at MO.isDead(), but
if it is not marked dead, MachineCSE still wants to do its own check
to see if it is trivially dead. This check for the trivial case
assumed that physical registers cannot be live out of a block.

Patch by Mattias Eriksson.

Reviewers: qcolombet, jbhateja

Reviewed By: qcolombet, jbhateja

Subscribers: jbhateja, llvm-commits

Differential Revision: https://reviews.llvm.org/D33408

llvm-svn: 303731
2017-05-24 09:35:23 +00:00
Max Kazantsev 13e016bf48 [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrency is the bottom-lost in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.

Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike
other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:

for (...) { // loop
  phi = {A,+,B}
}
X = load ...
Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may be existant.

This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.

llvm-svn: 303730
2017-05-24 08:52:18 +00:00
Craig Topper e6a2318573 [APInt] Use std::end to avoid mentioning the size of a local buffer repeatedly.
llvm-svn: 303726
2017-05-24 07:00:55 +00:00
Javed Absar a32e3a1acf [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI
This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.
Reviewed by: Renato Golin, Diana Picus
Differential Revision: https://reviews.llvm.org/D33120

llvm-svn: 303717
2017-05-24 05:32:48 +00:00
Davide Italiano fd9100e056 [NewGVN] Update additionalUsers when we simplify to a value.
Otherwise we don't revisit an instruction that could be simplified,
and when we verify, we discover there's something that changed, i.e.
what we had wasn't a maximal fixpoint.

Fixes PR32836.

llvm-svn: 303715
2017-05-24 02:30:24 +00:00
George Karpenkov 018472c34a Revert "Disable coverage opt-out for strong postdominator blocks."
This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef.
Buildbots do not like this on Linux.

llvm-svn: 303710
2017-05-24 00:29:12 +00:00
Zachary Turner bb64231d2d Don't do a full scan of the type stream before processing records.
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types.  In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted.  Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.

llvm-svn: 303707
2017-05-24 00:26:27 +00:00
Davide Italiano c4861adad9 [SCCP] Use the `hasAddressTaken()` version defined in `Function`.
Instead of using the SCCP homegrown one. We should eventually
make the private SCCP version disappear, but that wont' be today.
PR33143 tracks this issue.

Add braces for consistency while here. No functional change intended.

llvm-svn: 303706
2017-05-23 23:59:23 +00:00
Davide Italiano 7bf95b964f [LIR] Use the newly `getRecurrenceVar()` helper. NFCI.
llvm-svn: 303704
2017-05-23 23:51:54 +00:00
Davide Italiano 4bc91190ea [LIR] Strengthen the check for recurrence variable in popcnt/CTLZ.
Fixes PR33114.
Differential Revision:  https://reviews.llvm.org/D33420

llvm-svn: 303700
2017-05-23 22:32:56 +00:00
George Karpenkov 9017ca290a Disable coverage opt-out for strong postdominator blocks.
Coverage instrumentation has an optimization not to instrument extra
blocks, if the pass is already "accounted for" by a
successor/predecessor basic block.
However (https://github.com/google/sanitizers/issues/783) this
reasoning may become circular, which stops valid paths from having
coverage.
In the worst case this can cause fuzzing to stop working entirely.

This change simplifies logic to something which trivially can not have
such circular reasoning, as losing valid paths does not seem like a
good trade-off for a ~15% decrease in the # of instrumented basic blocks.

llvm-svn: 303698
2017-05-23 21:58:54 +00:00
Tim Northover 8c605c0eda Revert LLVM changes for "Sema: allow imaginary constants via GNU extension if UDL overloads not present."
The changes accidentally crept into a Clang commit I was making.

llvm-svn: 303697
2017-05-23 21:53:11 +00:00
Vadzim Dambrouski 49dd6e68c2 [MSP430] Add subtarget features for hardware multiplier.
Also add more processors to make -mcpu option behave similar to gcc.

Differential Revision: https://reviews.llvm.org/D33335

llvm-svn: 303695
2017-05-23 21:49:42 +00:00
Tim Northover 6b5eceac2e Sema: allow imaginary constants via GNU extension if UDL overloads not present.
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.

This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.

llvm-svn: 303694
2017-05-23 21:41:49 +00:00
Reid Kleckner 26450bf579 Silence MSVC warning about unsigned integer overflow, which has defined behavior
llvm-svn: 303693
2017-05-23 21:35:32 +00:00
Simon Pilgrim c910a70b21 [AMDGPU] Add INDIRECT_BASE_ADDR to R600_Reg32 class (PR33045)
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045

Differential Revision: https://reviews.llvm.org/D33451

llvm-svn: 303691
2017-05-23 21:27:15 +00:00
Francis Visoiu Mistrih 1c98701e57 AsmPrinter: mark the beginning and the end of a function in verbose mode
llvm-svn: 303690
2017-05-23 21:22:16 +00:00
Changpeng Fang 1dbace195d AMDGPU/SI: Move the local memory usage related checking after calling convention checking in PromoteAlloca
Summary:
  Promoting Alloca to Vector and Promoting Alloca to LDS are two independent handling of Alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage
related checking out and replace it after the calling convention checking.

Reviewer:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D33139

llvm-svn: 303684
2017-05-23 20:25:41 +00:00
Geoff Berry d6ac96f953 [AArch64][Falkor] Refine sched details for LSLfast/ASRfast.
llvm-svn: 303682
2017-05-23 19:57:45 +00:00
Stanislav Mekhanoshin 53a21292f8 [AMDGPU] Combine and (srl) into shl (bfe)
Perform DAG combine:
and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
Where nb is a number of trailing zeroes in mask.

It replaces two instructions with two and BFE is generally a more
expensive one. However this is only done if we are selecting a byte
or word at an aligned boundary which results in a proper SDWA
operand pattern. It is only done if SDWA is supported.

TODO: improve SDWA pass to actually convert this pattern. It is not
done now because we have an immediate in the instruction, which has
be moved into a VGPR.

Differential Revision: https://reviews.llvm.org/D33455

llvm-svn: 303681
2017-05-23 19:54:48 +00:00
Geoff Berry e6366f505f [AArch64][Falkor] Fix sched details for FMOV of WZR/XZR.
llvm-svn: 303680
2017-05-23 19:54:28 +00:00
Oleg Ranevskyy 09df0020fc [ARM] Temporarily disable globals promotion to constant pools to prevent miscompilation
Summary:
A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries.

The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed.

Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1.

Reviewers: efriedma, jmolloy

Reviewed By: efriedma

Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar

Differential Revision: https://reviews.llvm.org/D33446

llvm-svn: 303679
2017-05-23 19:38:37 +00:00
Zachary Turner 7daf62e743 [CodeView] Eliminate redundant hashes and allocations.
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever.  Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.

These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other.  So all of this hashing was
unnecessary.

To solve this, two fixes are made:

1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one.  When it's finished, all of
its memory is freed.

2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed.  This way the hash table will not grow as other
records from the same type stream get inserted.  Further improvements
could eliminate hashing entirely from this codepath.

This reduces the link time by 85% in my test, from 1 minute to 9
seconds.

llvm-svn: 303676
2017-05-23 18:56:23 +00:00
Nirav Dave 6c910c0dd8 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
Yuka Takahashi c8068dbb07 [GSoC] Shell autocompletion for clang
Summary:
This is a first patch for GSoC project, bash-completion for clang.
To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.

Simple flag completion and path completion is available in this patch.

Reviewers: teemperor, v.g.vassilev, ruiu, Bigcheese, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33237

llvm-svn: 303670
2017-05-23 18:39:08 +00:00
David Blaikie 7b0a6aa642 Fix DIEHash refactoring that dropped the DW_AT_name from the hash
llvm-svn: 303669
2017-05-23 18:36:07 +00:00
Nirav Dave 3b4f7cc0b3 [DAG] Add canMergeStoresTo predicate checks. NFCI.
Propagate canMergeStoresTo checks to missing cases in StoreMerge.

llvm-svn: 303668
2017-05-23 18:33:09 +00:00
Reid Kleckner 36238b15d7 Speculative build fix for non-Windows
llvm-svn: 303667
2017-05-23 18:28:13 +00:00
David Blaikie 74fa80399a Refactor DWARF hashing to use a .def file to avoid repetition
llvm-svn: 303666
2017-05-23 18:27:09 +00:00
Reid Kleckner ded38803c5 [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

llvm-svn: 303665
2017-05-23 18:23:59 +00:00
Sanjay Patel d3106add77 [InstCombine] allow icmp-xor folds for vectors (PR33138)
This fixes the first part of:
https://bugs.llvm.org/show_bug.cgi?id=33138

More work is needed for the bitcasted variant.

llvm-svn: 303660
2017-05-23 17:29:58 +00:00
Marek Olsak 7dadd86a35 AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.

Reviewers: arsenm, nhaehnle, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28994

llvm-svn: 303658
2017-05-23 17:14:34 +00:00
Reid Kleckner 545aa4f4dd Commit AttributeList change that was supposed to be part of r303654
llvm-svn: 303656
2017-05-23 17:03:28 +00:00
Ulrich Weigand fe9fcb8af3 [RuntimeDyld, PowerPC] Fix regression from r303637
Actually, to identify external symbols, we need to check for
*either* non-null Value.SymbolName *or* a SymType of
Symbol::ST_Unknown.

The former may happen for symbols not known to the JIT at all
(e.g. defined in a native library), while the latter happens
for symbols known to the JIT, but defined in a different module.

Fixed several regressions on big-endian ppc64.

llvm-svn: 303655
2017-05-23 17:03:23 +00:00
Reid Kleckner 8bf67fe98f [IR] Switch AttributeList to use an array for O(1) access
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.

This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995

Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
    13,274,181,184      cycles                    #    3.047 GHz                      ( +-  0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
    12,906,927,263      cycles                    #    3.043 GHz                      ( +-  0.51% )
```

This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.

Reviewers: pete, chandlerc

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D32819

llvm-svn: 303654
2017-05-23 17:01:48 +00:00
Stanislav Mekhanoshin a96ec3f360 [AMDGPU] Convert shl (add) into add (shl)
shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1)
This allows to fold a constant into an address in some cases as
well as to eliminate second shift if the expression is used as
an address and second shift is a result of a GEP.

Differential Revision: https://reviews.llvm.org/D33432

llvm-svn: 303641
2017-05-23 15:59:58 +00:00
Zachary Turner bf35e6ab2a Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

llvm-svn: 303640
2017-05-23 15:50:37 +00:00
Simon Atanasyan 57253043a4 [mips] Remove unused class field. NFC
llvm-svn: 303639
2017-05-23 15:00:30 +00:00
Simon Atanasyan 039b02ec78 [mips] Change type of MipsSubtarget ctor arguments s/std::string/StringRef/. NFC
llvm-svn: 303638
2017-05-23 15:00:26 +00:00
Ulrich Weigand 7f02d67fce [RuntimeDyld, PowerPC] Fix check for external symbols when detecting reloction overflow
The PowerPC part of processRelocationRef currently assumes that external
symbols can be identified by checking for SymType == SymbolRef::ST_Unknown.
This is actually incorrect in some cases, causing relocation overflows to
be mis-detected. The correct check is to test whether Value.SymbolName
is null.

Includes test case. Note that it is a bit tricky to replicate the exact
condition that triggers the bug in a test case. The one included here
seems to fail reliably (before the fix) across different operating
system versions on Power, but it still makes a few assumptions (called
out in the test case comments).

Also add ppc64le platform name to the supported list in the lit.local.cfg
files for the MCJIT and OrcMCJIT directories, since those tests were
currently not run at all.

Fixes PR32650.

Reviewer: hfinkel

Differential Revision: https://reviews.llvm.org/D33402

llvm-svn: 303637
2017-05-23 14:51:18 +00:00
Anna Thomas c07d5544dd [JumpThreading] Safely replace uses of condition
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.

We should not blindly RAUW the condition if experimental.guard or assume
is a use of that
condition. This is because LVI may have used the guard/assume to
identify the
value of the condition, and RUAWing will fold the guard/assume and uses
before the guards/assumes.

Reviewers: sanjoy, reames, trentxintong, mkazantsev

Reviewed by: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33257

llvm-svn: 303633
2017-05-23 13:36:25 +00:00
Ulrich Weigand b6d40cceee [RuntimeDyld, PowerPC] Fix relocation detection overflow
Code in RuntimeDyldELF currently uses 32-bit temporaries to detect
whether a PPC64 relocation target is out of range. This is incorrect,
and can mis-detect overflow where the distance between relocation site
and target is close to a multiple of 4GB. Fixed by using 64-bit
temporaries.

Noticed while debugging PR32650.

Reviewer: hfinkel

Differential Revision: https://reviews.llvm.org/D33403

llvm-svn: 303632
2017-05-23 12:43:57 +00:00
Sam Kolton f7659d71eb [AMDGPU] SDWA: Add assembler support for GFX9
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493

Reviewers: vpykhtin, artem.tamazov

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D33132

llvm-svn: 303620
2017-05-23 10:08:55 +00:00
Florian Hahn abb4218b98 [AArch64] Make instruction fusion more aggressive.
Summary:
This patch makes instruction fusion more aggressive by
* adding artificial edges between the successors of FirstSU and
  SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps.
* updating PostGenericScheduler::tryCandidate to keep clusters together,
   similar to GenericScheduler::tryCandidate.

This change increases the number of AES instruction pairs generated on
 Cortex-A57 and Cortex-A72. This doesn't change code at all in
 most benchmarks or general code, but we've seen improvement on kernels
 using AESE/AESMC and AESD/AESIMC. 

Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB

Reviewed By: evandro

Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33230

llvm-svn: 303618
2017-05-23 09:33:34 +00:00
Igor Breger 617be6e475 [GlobalISel][X86] G_LOAD/G_STORE vec256/512 support
Summary: mark G_LOAD/G_STORE vec256/512 legal for AVX/AVX512. Implement instruction selection.

Reviewers: zvi, guyblank

Reviewed By: zvi

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33268

llvm-svn: 303617
2017-05-23 08:23:51 +00:00
Craig Topper 7e0aeeb884 [KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or similar. NFC
llvm-svn: 303614
2017-05-23 07:18:37 +00:00
Ayal Zaks 589e1d9610 [LV] Report multiple reasons for not vectorizing under allowExtraAnalysis
The default behavior of -Rpass-analysis=loop-vectorizer is to report only the
first reason encountered for not vectorizing, if one is found, at which time the
vectorizer aborts its handling of the loop. This patch allows multiple reasons
for not vectorizing to be identified and reported, at the potential expense of
additional compile-time, under allowExtraAnalysis which can currently be turned
on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed.

Removed from LoopVectorizationLegality::canVectorize() the redundant checking
and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also
does that. This redundancy is caught by a lit test once multiple reasons are
reported.

Patch initially developed by Dror Barak.

Differential Revision: https://reviews.llvm.org/D33396

llvm-svn: 303613
2017-05-23 07:08:02 +00:00
David Blaikie 15d85fc537 libDebugInfo: Support symbolizing using DWP files
llvm-svn: 303609
2017-05-23 06:48:53 +00:00
Akira Hatanaka e8ae3346a3 [AArch64] Fix PRR33100.
This commit fixes a bug introduced in r301019 where optimizeLogicalImm
would replace a logical node's immediate operand that was CSE'd and
was also an operand of another node.

This commit fixes the bug by replacing the logical node instead of its
immediate operand.

rdar://problem/32295276

llvm-svn: 303607
2017-05-23 06:08:37 +00:00
Galina Kistanova 5e6c542ae3 Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303597
2017-05-23 01:20:52 +00:00
Galina Kistanova fb9476ee6c Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303595
2017-05-23 01:07:19 +00:00
David Blaikie 37d1cff491 FIX: Remove debugging assert left in previous commit
Sorry for the bot noise.

llvm-svn: 303592
2017-05-23 00:31:24 +00:00
David Blaikie f9803fb4bb libDebugInfo: Avoid independently parsing the same .dwo file for two separate CUs residing there
NFC, just an optimization. Will be building on this for DWP support
shortly.

llvm-svn: 303591
2017-05-23 00:30:42 +00:00
Teresa Johnson 2db1369c1f Support for taking the max of module flags when linking, use for PIE/PIC
Summary:
Add Max ModFlagBehavior, which can be used to take the max of two
module flag values when merging modules. Use it for the PIE and PIC
levels.

This avoids an error when we try to import from a module built -fpic
into a module built -fPIC, for example. For both PIE and PIC levels,
this will be legal, since the code generation gets more conservative
as the level is increased. Therefore we can take the max instead of
somehow trying to block importing between modules compiled with
different levels.

Reviewers: tmsriram, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33418

llvm-svn: 303590
2017-05-23 00:08:00 +00:00
Davide Italiano 8e7d11ab2b [NewPM] Fix an innocent but silly typo. Reported by Craig Topper.
llvm-svn: 303587
2017-05-22 23:47:11 +00:00
Davide Italiano 8a09b8eba9 [NewPM] Add a temporary cl::opt() to test NewGVN.
llvm-svn: 303586
2017-05-22 23:41:40 +00:00
Galina Kistanova 6fa60f5e8b Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303585
2017-05-22 22:46:31 +00:00
Vitaly Buka b238cb8fbc [CodeGen] Fix uninitialized variables exposed by r303084
All other calls of analyzeBranch reset PredTBB and PredFBB, so I assume it's
expected behavior.

llvm-svn: 303581
2017-05-22 21:33:54 +00:00
Tim Northover 997f5f10c6 InstructionSimplify: don't speculate about Constants changing.
When presented with an icmp/select pair, we can end up asking what would happen
if we replaced one constant with another in an instruction. This is a mistake,
while non-constant Values could become a constant, constants cannot change and
trying to do so can lead to completely invalid IR (a GEP referencing a
non-existant field in the original case).

llvm-svn: 303580
2017-05-22 21:28:08 +00:00
Evgeniy Stepanov b9f1b014e1 Infer relocation model from module flags in relocatable LTO link.
Fix for PR33096.

llvm-svn: 303578
2017-05-22 21:11:35 +00:00
Zachary Turner d4136e945e Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

llvm-svn: 303577
2017-05-22 21:07:43 +00:00
Zachary Turner 12f8c31c04 Make TypeSerializer's StringMap use the same allocator.
llvm-svn: 303576
2017-05-22 21:07:14 +00:00
Adrian Prantl fb31da1306 Don't generate line&scope debug info for meta-instructions.
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.

Fixes PR33107.

https://bugs.llvm.org/show_bug.cgi?id=33107

This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.

llvm-svn: 303575
2017-05-22 20:47:09 +00:00
Teresa Johnson 525dcb617b Fix update VP metadata after inlining for instrumentation PGO
Summary:
With instrumentation profiling, when updating the VP metadata after
an inline, VP metadata on the inlined copy was inadvertantly having
all counts zeroed out. This was causing indirect calls from code inlined
during the call step to be marked as cold in the ThinLTO summaries and
not imported.

The CallerBFI needs to be passed down so that the CallSiteCount can be
computed from the profile summary info. With Sample PGO this was working
since the count is extracted from the branch weight metadata on the
call being inlined (even before we stopped looking at metadata for
non-sample PGO in r302844 this largely wasn't working for instrumentation
PGO since only promoted indirect calls would be getting inlined and have
the metadata).

Added an instrumentation PGO test and renamed the sample PGO test.

Reviewers: danielcdh, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D33389

llvm-svn: 303574
2017-05-22 20:28:18 +00:00
Krzysztof Parzyszek 9a23d40ee8 [Hexagon] Fix definitions of vector predicate loads and stores
This fixes http://llvm.org/PR33048.

llvm-svn: 303572
2017-05-22 20:02:53 +00:00
Craig Topper 64a65ec4fd [DataLayout] Add llvm_unreachable to the default of a nested switch statement that covers all values given to it by the outer switch. NFC
llvm-svn: 303571
2017-05-22 19:28:36 +00:00
Adrian Prantl 334a130a6f Revert "Don't generate line&scope debug info for meta-instructions."
This reverts commit r303566 while investigating a stage2 buildbot failure.

llvm-svn: 303570
2017-05-22 18:50:12 +00:00
Stanislav Mekhanoshin 5fa289f0d8 [AMDGPU] Narrow lshl from 64 to 32 bit if possible
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)

Differential Revision: https://reviews.llvm.org/D33367

llvm-svn: 303569
2017-05-22 16:58:10 +00:00
Xinliang David Li 126157c3b4 [PartialInlining] Add internal options to enable partial inlining in pass pipeline (off by default)
1. Legacy: -mllvm -enable-partial-inlining
2. New:  -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager

Differential Revision: http://reviews.llvm.org/D33382

llvm-svn: 303567
2017-05-22 16:41:57 +00:00
Adrian Prantl 4c047f8931 Don't generate line&scope debug info for meta-instructions.
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.

Fixes PR33107.

https://bugs.llvm.org/show_bug.cgi?id=33107

llvm-svn: 303566
2017-05-22 16:21:02 +00:00
Nirav Dave e00da22ef3 [DAG] Rework store merge to loop on load candidates. NFCI.
Continue to consider remaining candidate merges until all possible
merges have been considered.

llvm-svn: 303560
2017-05-22 15:33:47 +00:00
Valery Pykhtin 74cb9c8831 [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker
Differential revision: https://reviews.llvm.org/D33289

llvm-svn: 303548
2017-05-22 13:09:40 +00:00
Simon Atanasyan e0b726f2fa [mips] Support micromips attribute passed by front-end
This patch adds handling of the `micromips` and `nomicromips` attributes
passed by front-end. The patch depends on D33363.

Differential revision: https://reviews.llvm.org/D33364

llvm-svn: 303545
2017-05-22 12:47:41 +00:00
Artur Pilipenko edee25152b [LoopPredication] NFC. Add extra debug output in case we fail to parse the range check
llvm-svn: 303544
2017-05-22 12:06:57 +00:00
Artur Pilipenko c488dfabac [LoopPredication] NFC. Move a nested struct declaration before the fields, clang-format a bit
This will simplify the diff for an upcoming review.

llvm-svn: 303543
2017-05-22 12:01:32 +00:00
James Molloy 6110be9759 Re-apply r302416: [ARM] Clear the constant pool cache on explicit .ltorg directives
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.

Original commit message:
    Multiple ldr pseudoinstructions with the same constant value will
    reuse the same constant pool entry. However, if the constant pool
    is explicitly flushed with a .ltorg directive, we should not try
    to reference constants in the previous pool any longer, since they
    may be out of range.

    This fixes assembling hand-written assembler source which repeatedly
    loads the same constant value, across a binary size larger than the
    pc-relative fixup range for ldr instructions (4096 bytes). Such
    assembler source already uses explicit .ltorg instructions to emit
    constant pools with regular intervals. However if we try to reuse
    constants emitted in earlier pools, they end up out of range.

    This makes the output of the testcase match what binutils gas does
    (prior to this patch, it would fail to assemble).

    Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 303540
2017-05-22 09:42:07 +00:00
James Molloy 5193c80830 Re-apply r286006: Fix 24560: assembler does not share constant pool for same constants
Re-applying now that the open bug on this commit, PR32825, is known to be fixed.

Original commit message:
    Summary: This patch returns the same label if the CP entry with the same value has been created.

    Reviewers: eli.friedman, rengolin, jmolloy

    Subscribers: majnemer, jmolloy, llvm-commits

    Differential Revision: https://reviews.llvm.org/D25804

llvm-svn: 303539
2017-05-22 09:42:01 +00:00
Strahinja Petrovic ab9573f37c [MIPS] Add support to match more patterns for DINS instruction
This patch adds support for recognizing patterns to match
DINS instruction.

Differential Revision: https://reviews.llvm.org/D31465

llvm-svn: 303537
2017-05-22 09:06:44 +00:00
James Molloy 5cc75ae8f9 Revert "[ARM] Clear the constant pool cache on explicit .ltorg directives"
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).

This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.

llvm-svn: 303536
2017-05-22 08:49:28 +00:00
James Molloy 5a9cf2e22d Revert "Fix 24560: assembler does not share constant pool for same constants"
This reverts commit r286006. It caused PR32825 and wasn't fixed.

llvm-svn: 303535
2017-05-22 08:42:47 +00:00
David Blaikie d2f3a941e0 libDebugInfo/DWARF: Apply relocations for debug_addr addresses in object files
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.

Fix that so object files can be symbolized the same as executables.

llvm-svn: 303532
2017-05-22 07:02:47 +00:00
Sanjoy Das 036dda25a5 [SCEV] Clarify behavior around max backedge taken count
This is a re-application of a r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).

This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303531
2017-05-22 06:46:04 +00:00
Craig Topper 2b1fc32f22 [InstCombine] Cleanup the interface for overflow checks
Summary:
Fix naming conventions and const correctness.
This completes the changes made in rL303029.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33377

llvm-svn: 303529
2017-05-22 06:25:31 +00:00
Craig Topper e777fed152 [SimplifyCFG] Prevent a few APInt copies on method calls that return const reference. NFCI
llvm-svn: 303523
2017-05-22 00:49:35 +00:00
Craig Topper aaef41f71b [KnownBits] Use isNegative/isNonNegative to shorten some code. NFC
llvm-svn: 303522
2017-05-22 00:49:33 +00:00
Daniel Berlin d130b6c27d NewGVN: Fix PR 33116, the memoryphi version of bug 32838.
llvm-svn: 303521
2017-05-21 23:41:58 +00:00
Daniel Berlin 0207cca8e0 NewGVN: Cleanup some repeated code using some templated helpers
llvm-svn: 303520
2017-05-21 23:41:56 +00:00
Daniel Berlin 0193997b7e NewGVN: Fix printing of simplified expression
llvm-svn: 303519
2017-05-21 23:41:53 +00:00
Davide Italiano 21a49dcdf1 [InstCombine] Take in account the size in sext->lshr->trunc patterns.
Otherwise we end up miscompiling, transforming:

define i8 @tinky() {
  %sext = sext i1 1 to i16
  %hibit = lshr i16 %sext, 15
  %tr = trunc i16 %hibit to i8
  ret i8 %tr
}

into:

  %sext = sext i1 1 to i8
  ret i8 %sext

and the first get folded to ret i8 1, while the second gets folded
to ret i8 -1.

Eventually we should get rid of this transform entirely, but for now,
this at least fixes a know correctness bug.

Differential Revision:  https://reviews.llvm.org/D33338

llvm-svn: 303513
2017-05-21 20:30:27 +00:00
Igor Breger 014fc566e7 [GlobalISel][X86] Fix G_TRUNC instruction selection.
Updated tests with -verify-machineinstrs flag.
It fixes 3 tests failed with machine verifier enabled and listed
in PR27481

llvm-svn: 303502
2017-05-21 11:13:56 +00:00
Hiroshi Inoue 37e63b1b21 Summary
PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass.
This patch improves this optimization to eliminate more compare instructions in two types of common case.


- comparison against a constant 1 or -1

The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant.
This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0".
With this patch, compare can be eliminated in the following code sequence, as an example.

uint64_t a, b;
if ((a | b) & 0x8000000000000000ull) { ... }
else { ... }


- andi for 32-bit comparison on PPC64

Since record-form instructions execute 64-bit signed comparison and so we have limitation in eliminating 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check.

%1 = and i32 %a, 10
%2 = icmp ne i32 %1, 0
br i1 %2, label %foo, label %bar

In this simple example, LLVM generates andi. + cmplwi + beq on PPC64.
This patch make it possible to eliminate the cmplwi for this case.
I added andi. for optimization targets if it is safe to do so.

Differential Revision: https://reviews.llvm.org/D30081

llvm-svn: 303500
2017-05-21 06:00:05 +00:00
Sanjoy Das 8963650cfa Revert "[SCEV] Clarify behavior around max backedge taken count"
This reverts commit r303497 since it breaks the msan bootstrap bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/

llvm-svn: 303498
2017-05-21 05:02:12 +00:00
Sanjoy Das 5207168383 [SCEV] Clarify behavior around max backedge taken count
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303497
2017-05-21 01:47:50 +00:00
Xin Tong 9fbfeefadf Revert "Add pthread_self function prototype and make it speculatable."
This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21.

Build breaking

llvm-svn: 303496
2017-05-21 00:37:55 +00:00
Xin Tong 75af3af957 Add pthread_self function prototype and make it speculatable.
Summary: This allows pthread_self to be pulled out of a loop by LICM.

Reviewers: hfinkel, arsenm, davide

Reviewed By: davide

Subscribers: davide, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D32782

llvm-svn: 303495
2017-05-20 22:40:25 +00:00
Martell Malone 36af8f4d42 COFF: Fix another StringRef return error
This should appease the lld build bot regression
Following up on rL303493

llvm-svn: 303494
2017-05-20 21:54:15 +00:00
Martell Malone d1a5d9eee5 COFF: Fix single StringRef return error
This should appease the lld build bot regression
Intrroduced by rL303490

llvm-svn: 303493
2017-05-20 21:00:36 +00:00
Martell Malone 375dc90ebf COFF: migrate def parser from LLD to LLVM [1/2]
This is split up into two commits.
The will create the DEF parser in LLVM.
Check the following commit to see the removal from LLD

Reviewers: ruiu

Differential Revision: https://reviews.llvm.org/D32689

llvm-svn: 303490
2017-05-20 19:56:29 +00:00
David Blaikie f1c3beecb2 Fix -Wunneeded-internal-declaration by removing constant arrays only used in sizeof expressions, in favor of constants containing the size directly
llvm-svn: 303483
2017-05-20 03:32:51 +00:00
David Blaikie 8d039d40c5 llvm-symbolizer: Support multiple CUs in a single DWO file
llvm-svn: 303482
2017-05-20 03:32:49 +00:00
Eric Beckmann a6bdf751a2 Add functionality to cvtres to parse all entries in res file.
Summary: Added the new modules in the Object/ folder.  Updated the
llvm-cvtres interface as well, and added additional tests.

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33180

llvm-svn: 303480
2017-05-20 01:49:19 +00:00
Matthias Braun 57fd12db0c Fix breakage after r303461
- Improve wchar_t size predicitions based on target triple.
- Be less strict in wchar_t size verifier.

llvm-svn: 303477
2017-05-20 01:28:52 +00:00
Davide Italiano 9a0f542db6 [NewGVN] Create a StoreExpression instead of a VariableExpression.
In the case where we have an operand defined by a lod of the
same memory location. Historically this was a VariableExpression
because we wanted to make sure they ended up in the same class,
but if we create the right expression, they end up in the same
class anyway.

Fixes PR32897. Thanks to Dan for the detailed discussion and the
fix suggestion.

llvm-svn: 303475
2017-05-20 00:46:54 +00:00
Davide Italiano 888965c8a2 [NewGVN] Get rid of an assertion.
This was here because we don't want to switch leaders too much,
in order to avoid fixpoint(ing) issue, but it's not sure if it
matters in practice.

A first step towards fixing PR32897.

llvm-svn: 303473
2017-05-20 00:24:04 +00:00
Adrian Prantl 981a799896 Revert "Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator.""
This reapplies commit r303438 modified to not verify cross-imported
bitcode in FunctionImporter.

rdar://problem/31233625

Differential Revision: https://reviews.llvm.org/D33370

llvm-svn: 303470
2017-05-20 00:00:08 +00:00
Adrian Prantl 660437975b Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator."
This reverts commit r303438 while deliberating buildbot breakage.

llvm-svn: 303467
2017-05-19 23:32:21 +00:00
Matthias Braun 50ec0b5dce SimplifyLibCalls: Optimize wcslen
Refactor the strlen optimization code to work for both strlen and wcslen.

This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.

This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.

Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.

The refactoring "accidentally" fixes http://llvm.org/PR32124

Differential Revision: https://reviews.llvm.org/D32839

llvm-svn: 303461
2017-05-19 22:37:09 +00:00
Matthias Braun 89f3bcf0b5 Verifier: Check wchar_size module flag.
Differential Revision: https://reviews.llvm.org/D32974

llvm-svn: 303460
2017-05-19 22:37:01 +00:00
Reid Kleckner bf6b3b1564 Fix off-by-one bug in AttributeList::addAttributes index handling
getParamAlignment expects an argument number, not an AttributeList
index.

Johan Englan, who works on LDC, found this bug and told me about it off
list.

llvm-svn: 303458
2017-05-19 22:23:47 +00:00
Galina Kistanova 78706a3dae Added LLVM_FALLTHROUGH to address gcc warning: this statement may fall through.
llvm-svn: 303457
2017-05-19 21:08:28 +00:00
Evgeniy Stepanov 2acea2786b [safestack] Disable stack coloring by default.
Workaround for apparent miscompilation of PR32143.

llvm-svn: 303456
2017-05-19 20:58:48 +00:00
Galina Kistanova f525c76ba1 Added missing break.
llvm-svn: 303454
2017-05-19 20:31:51 +00:00
Daniel Berlin e021d2d629 NewGVN: Fix PR32838.
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argumnet?

llvm-svn: 303453
2017-05-19 20:22:20 +00:00
Zachary Turner 526f4f2aa8 Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

llvm-svn: 303446
2017-05-19 19:26:58 +00:00
Daniel Berlin b527b2cf13 Last of the major pieces to NewGVN - yay!
Summary:
NewGVN: Handle equivalence between phi of ops and op of phis.

This makes our GVN mostly-complete. It would be complete, modulo some
deliberate choices we make.  This means it detects roughly all herband
equivalences in polynomial time, including cases notoriously hard for
other GVN's to detect.  It also detects a very large swath of the
cases we currently rely on instcombine to detect that involve folding
upwards through phis.

Fixes PR 31125, 31463, PR 31868

Reviewers: davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32151

llvm-svn: 303444
2017-05-19 19:01:27 +00:00
Daniel Berlin ff15200b1d NewGVN: Get rid of most dominating leader check
llvm-svn: 303443
2017-05-19 19:01:24 +00:00
Daniel Berlin a5130bbd12 BasicAA: Uninserted instructions have no parent, and notDifferentParent explicitly allows for this case, but getParent crashes when handed one.
llvm-svn: 303442
2017-05-19 19:01:21 +00:00
Amaury Sechet 77cfb4a85f [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X)
Summary:
While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation.

As per discussion in D32756 .

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32916

llvm-svn: 303441
2017-05-19 18:20:44 +00:00
Anna Thomas ae3f752f36 [NFC][loopIdiom] Clang format change rL303434
llvm-svn: 303439
2017-05-19 18:00:30 +00:00
Adrian Prantl f9ab9bfc39 ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator.
rdar://problem/31233625

Differential Revision: https://reviews.llvm.org/D33151

llvm-svn: 303438
2017-05-19 17:55:02 +00:00
Anna Thomas 5ecb8f7593 [LoopIdiom] Refactor return value of isLegalStore [NFC]
Summary:

This NFC simply refactors the return value of LoopIdiomRecognize::isLegalStore() from bool to an enumeration, and
removes the return-through-parameter mechanism that the function was using. This function is constructed such that it will
only ever recognize a single store idiom (memset, memset_pattern, or memcpy), and never a combination of these. As such it
makes much more sense for the return value to be the single idiom that the store matches, rather than
having a separate argument-return for each idiom -- it's cleaner, and makes it clearer that
only a single idiom can be matched.

Patch by Daniel Neilson!

Reviewers: anna, sanjoy, davide, haicheng

Reviewed By: anna, haicheng

Subscribers: haicheng, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D33359

llvm-svn: 303434
2017-05-19 17:05:36 +00:00
Craig Topper 9c913bfd49 [InstSimplify] Fix 80 column violation. NFC
llvm-svn: 303433
2017-05-19 16:56:53 +00:00
Craig Topper 8885f933b2 [APInt] Add support for dividing or remainder by a uint64_t or int64_t.
Summary:
This patch adds udiv/sdiv/urem/srem/udivrem/sdivrem methods that can divide by a uint64_t. This makes division consistent with all the other arithmetic operations.

This modifies the interface of the divide helper method to work on raw arrays instead of APInts. This way we can pass the uint64_t in for the RHS without wrapping it in an APInt. This required moving all the Quotient and Remainder allocation handling up to the callers. For udiv/urem this was as simple as just creating the Quotient/Remainder with the right size when they were declared. For udivrem we have to rely on reallocate not changing the contents of the variable LHS or RHS is aliased with the Quotient or Remainder APInts. We also have to zero the upper bits of Remainder and Quotient that divide doesn't write to if lhsWords/rhsWords is smaller than the width.

I've update the toString method to use the new udivrem.

Reviewers: hans, dblaikie, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33310

llvm-svn: 303431
2017-05-19 16:43:54 +00:00
Dmitry Preobrazhensky ce941c9c38 [AMDGPU][MC] Corrected disassembler to decode instructions with 2 literals
See bug 32922: https://bugs.llvm.org//show_bug.cgi?id=32922

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D32912

llvm-svn: 303428
2017-05-19 14:27:52 +00:00
Artur Pilipenko a6c278049a [LoopPredication] NFC. Extract LoopICmp struct and parseLoopICmp helper
llvm-svn: 303427
2017-05-19 14:02:46 +00:00
Artur Pilipenko 6780ba65b9 [LoopPredication] NFC. Extract LoopPredication::expandCheck helper
llvm-svn: 303426
2017-05-19 14:00:58 +00:00
Artur Pilipenko aab28666bc [LoopPredication] NFC. Extract CanExpand helper lambda
llvm-svn: 303425
2017-05-19 14:00:04 +00:00
Artur Pilipenko 46c4e0a4bf [LoopPredication] NFC. Add an early exit if there is no guards in the loop
llvm-svn: 303424
2017-05-19 13:59:34 +00:00
Dmitry Preobrazhensky 9321e8fcec [AMDGPU][MC] Fixed bugs in export instruction
See Bugs 33019, 33056:
  https://bugs.llvm.org//show_bug.cgi?id=33019
  https://bugs.llvm.org//show_bug.cgi?id=33056

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33288

llvm-svn: 303423
2017-05-19 13:36:09 +00:00
Guy Blank 548e22a1a7 [X86][AVX512] Make i1 illegal in the CodeGen
This patch defines the i1 type as illegal in the X86 backend for AVX512.
For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended.
This should produce better scalar code for i1 types since GPRs will be used instead of mask registers.

Differential Revision: https://reviews.llvm.org/D32273

llvm-svn: 303421
2017-05-19 12:35:15 +00:00
Daniel Sanders a1b2db7919 [globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to per-function predicates.
Summary:
This causes them to be re-computed more often than necessary but resolves
objections that were raised post-commit on r301750.

Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls

Reviewed By: qcolombet

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32861

llvm-svn: 303418
2017-05-19 11:08:33 +00:00
Amara Emerson 4d33c86359 Fix vector pass-through value being unused in IRBuilder::CreateMaskedGather
Also s/0/nullptr in the call site in LV.

llvm-svn: 303416
2017-05-19 10:40:18 +00:00
Volkan Keles 6a36c64720 [GlobalISel] IRTranslator: Translate ConstantStruct
Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33317

llvm-svn: 303412
2017-05-19 09:47:02 +00:00
Zachary Turner 1dfcf8d92c Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

llvm-svn: 303409
2017-05-19 05:57:45 +00:00
Zachary Turner 47fdc73771 Don't crash if someone tries to visit an empty type stream.
llvm-svn: 303408
2017-05-19 05:18:09 +00:00
Zachary Turner 59ab6a3816 [CodeView] Reduce memory usage in TypeSerializer.
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table.  If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary.  At the cost of an extra hash
computation, check first if it exists, and only if it does do
we allocate and insert.

llvm-svn: 303407
2017-05-19 04:56:48 +00:00
Davide Italiano ee49f4943c [NewGVN] Delete the old store when we find congruent to a load.
(or non-store, more in general). Fixes PR33086. Caught by the
store verifier.

llvm-svn: 303406
2017-05-19 04:06:10 +00:00
Zachary Turner 8f1d87a79a Fix crasher in CodeView test.
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid.  But this exposed a real bug, which is that
we weren't looking in the right place.  So fix that, and also
fix this crash at the same time.

llvm-svn: 303397
2017-05-19 00:56:39 +00:00
Matthias Braun d6e75ed93e LiveIntervalAnalysis: Fix missing case in pruneSubRegValues()
pruneSubRegValues() needs to remove subregister ranges starting at
instructions that later get removed by eraseInstrs(). It missed to check
one case in which eraseInstrs() would remove an instruction.

Fixes http://llvm.org/PR32688

llvm-svn: 303396
2017-05-19 00:18:03 +00:00
Zachary Turner 613c29e45f Fix another warning.
llvm-svn: 303394
2017-05-18 23:30:51 +00:00
Davide Italiano eab0de2b82 [NewGVN] Break infinite recursion in singleReachablePHIPath().
We can have cycles between PHIs and this causes singleReachablePhi()
to call itself indefintely (until we run out of stack). The proper
solution would be that of computing SCCs, but it's not worth for
now, so just keep a visited set and give up when we find a cycle.
Thanks to Dan for the discussion/help with this.

Fixes PR33014.

llvm-svn: 303393
2017-05-18 23:22:44 +00:00
Zachary Turner 7b62d7ccc0 Fix some build errors and warnings.
llvm-svn: 303391
2017-05-18 23:12:42 +00:00
Zachary Turner b32ec02b80 [CodeView] Raise the source to ID map out of the TypeStreamMerger.
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.

llvm-svn: 303390
2017-05-18 23:04:08 +00:00
Zachary Turner 8fb441ab9c [llvm-pdbdump] Add the ability to merge PDBs.
Merging PDBs is a feature that will be used heavily by
the linker.  The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools.  This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump.  It takes arbitrarily
many PDBs and outputs a single PDB.

Using this new functionality, a test is added for merging
type records.  Future patches will add the ability to merge
symbol records, module information, etc.

llvm-svn: 303389
2017-05-18 23:03:41 +00:00
Zachary Turner 0c60f269fc [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Davide Italiano a76e5fa111 [NewGVN] Replace predicate info leftovers.
This time with an additional fix, i.e. we remove the dead
@llvm.ssa.copy instruction.

llvm-svn: 303385
2017-05-18 21:43:23 +00:00
Sanjay Patel 5e456b943a [InstCombine] add helper to foldXorOfICmps(); NFCI
Also, fix the old-style capitalization of the related functions
and move them to the 'private' section of the class since they
are just helpers of the visit* functions.

As shown in the post-commit comments for D32143, we are missing
folds for xor-of-icmps. 

llvm-svn: 303381
2017-05-18 20:53:16 +00:00
Rui Ueyama d66d976efd Revert r303375 "LLVM_FALLTHROUGH instead of fall-through comment."
This reverts commit r303375 since it didn't compile.

llvm-svn: 303377
2017-05-18 20:18:24 +00:00
Galina Kistanova 871417446b LLVM_FALLTHROUGH instead of fall-through comment.
llvm-svn: 303375
2017-05-18 20:01:52 +00:00
Hans Wennborg b00ffd8cb7 Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB."
This also reverts follow-ups r303292 and r303298.

It broke some Chromium tests under MSan, and apparently also internal
tests at Google.

llvm-svn: 303369
2017-05-18 18:50:05 +00:00
Galina Kistanova f8355ccb77 Reduce gcc-7 warnings by fall-through comments.
llvm-svn: 303365
2017-05-18 17:53:47 +00:00
Reid Kleckner 96ab8726a3 [IR] De-virtualize ~Value to save a vptr
Summary:
Implements PR889

Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:

https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal.  However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue.  I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.

I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.

Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv

Reviewed By: chandlerc

Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31261

llvm-svn: 303362
2017-05-18 17:24:10 +00:00
Wei Mi 8848c1e3c7 [LSR] Call canonicalize after we generate a new Formula in GenerateTruncates. Fix PR33077.
The testcase in PR33077 generates a LSR Use Formula with two SCEVAddRecExprs for the same
loop. Such uncommon formula will become non-canonical after GenerateTruncates adds sign
extension to the ScaledReg of the Formula, and it will break the assertion that every
Formula to be inserted is canonical.

The fix is to call canonicalize for the raw Formula generated by GenerateTruncates
before inserting it.

llvm-svn: 303361
2017-05-18 17:21:22 +00:00
Francis Visoiu Mistrih 8b61764cbb [LegacyPassManager] Remove TargetMachine constructors
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.

The patterns replaced here are:

* Passes handling a null TargetMachine call
  `getAnalysisIfAvailable<TargetPassConfig>`.

* Passes not handling a null TargetMachine
  `addRequired<TargetPassConfig>` and call
  `getAnalysis<TargetPassConfig>`.

* MachineFunctionPasses now use MF.getTarget().

* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.

This fixes a crash when running `llc -start-before prologepilog`.

PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.

Related to PR30324.

Differential Revision: https://reviews.llvm.org/D33222

llvm-svn: 303360
2017-05-18 17:21:13 +00:00
Zachary Turner 5a83fb153f Fix some minor issues in PDB parsing library.
1) Until now I'd never seen a valid PDB where the DBI stream and
   the PDB Stream disagreed on the "Age" field.  Because of that,
   we had code to assert that they matched.  Recently though I was
   given a PDB where they disagreed, so this assumption has proven
   to be incorrect.  Remove this check.

2) We were walking the entire list of hash values for types up front
   and then throwing away the values.  For large PDBs this was a
   significant slow down.  Remove this.

With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.

llvm-svn: 303351
2017-05-18 15:14:44 +00:00
Anna Thomas 7bca59152a [JumpThreading] Dont RAUW condition incorrectly
Summary:
We have a bug when RAUWing the condition if experimental.guard or assumes is a use of that
condition. This is because LazyValueInfo may have used the guards/assumes to identify the
value of the condition at the end of the block. RAUW replaces the uses
at the guard/assume as well as uses before the guard/assume. Both of
these are incorrect.
For now, disable RAUW for conditions and fix the logic as a next
step: https://reviews.llvm.org/D33257

Reviewers: sanjoy, reames, trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33279

llvm-svn: 303349
2017-05-18 13:12:18 +00:00
Sam Kolton ebfdaf7394 [AMDGPU] SDWA operands should not intersect with potential MIs
Summary:
There should be no intesection between SDWA operands and potential MIs. E.g.:
```
v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0
v_add_u32 v3, v4, v2
```
In that example it is possible that we would fold 2nd instruction into 3rd (v_add_u32_sdwa) and then try to fold 1st instruction into 2nd (that was already destroyed). So if SDWAOperand is also a potential MI then do not apply it.

Reviewers: vpykhtin, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D32804

llvm-svn: 303347
2017-05-18 12:12:03 +00:00
Guy Blank d19632fa16 [MVT] add v1i1 MVT
Adds the v1i1 MVT as a preparation for another commit (https://reviews.llvm.org/D32273)

Differential Revision: https://reviews.llvm.org/D32540

llvm-svn: 303346
2017-05-18 11:29:41 +00:00
Igor Breger 842b5b36ba [GlobalISel][X86] G_ADD/G_SUB vector legalizer/selector support.
Summary: G_ADD/G_SUB vector legalizer/selector support.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33232

llvm-svn: 303345
2017-05-18 11:10:56 +00:00
Simon Pilgrim 6bba6068be [X86][AVX512] Add 512-bit vector ctpop costs + tests
llvm-svn: 303342
2017-05-18 10:42:34 +00:00
Daniel Sanders 89e9308623 Re-commit: [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.

Depends on D32275

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32278

The previous commit failed on test-suite/Bitcode/simd_ops/AArch64_halide_runtime.bc
because isImmOperandEqual() assumed MO was a register operand and that's not
always true.

llvm-svn: 303341
2017-05-18 10:33:36 +00:00
Max Kazantsev 627ad0fec3 [SCEV][NFC] Remove duplication of isLoopInvariant code
Replace two places that duplicate the code of isLoopInvariant method with
the invocation of this method.

Differential Revision: https://reviews.llvm.org/D33313

llvm-svn: 303336
2017-05-18 08:26:41 +00:00
George Rimar 47f84b1a3c [DWARF] - Simplify RelocVisitor implementation.
We do not need to store relocation width field.
Patch removes relative code, that simplifies implementation.

Differential revision: https://reviews.llvm.org/D33274

llvm-svn: 303335
2017-05-18 08:25:11 +00:00
Lama Saba 2ea271b54a [X86] Replace slow LEA instructions in X86
According to Intel's Optimization Reference Manual for SNB+:
  " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
    dispatch via port 1:
  - LEA that has all three source operands: base, index, and offset
  - LEA that uses base and index registers where the base is EBP, RBP,or R13
  - LEA that uses RIP relative addressing mode
  - LEA that uses 16-bit addressing mode "
  This patch currently handles the first 2 cases only.
 
Differential Revision: https://reviews.llvm.org/D32277

llvm-svn: 303333
2017-05-18 08:11:50 +00:00
George Rimar f98b9ac5da [lib/Object] - Minor API update for llvm::Decompressor.
I revisited Decompressor API (issue with it was triggered during D32865 review)
and found it is probably provides more then we really need.

Issue was about next method's signature:

Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible because sticks to SmallVector.

During reviews was suggested to use templating to simplify code. Patch do that.

Differential revision: https://reviews.llvm.org/D33200

llvm-svn: 303331
2017-05-18 08:00:01 +00:00
Serguei Katkov ba831f78fd [BPI] Reduce the probability of unreachable edge to minimal value greater than 0
The probability of edge coming to unreachable block should be as low as possible.
The change reduces the probability to minimal value greater than zero.

The bug https://bugs.llvm.org/show_bug.cgi?id=32214 show the example when
the probability of edge coming to unreachable block is greater than for edge
coming to out of the loop and it causes incorrect loop rotation.

Please note that with this change the behavior of unreachable heuristic is a bit different
than others. Specifically, before this change the sum of probabilities
coming to unreachable blocks have the same weight for all branches
(it was just split over all edges of this block coming to unreachable blocks).
With this change it might be slightly different but not to much due to probability of
taken branch to unreachable block is really small.

Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith
Reviewed By: chandlerc, dexonsmith
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D30633

llvm-svn: 303327
2017-05-18 06:11:56 +00:00
Akira Hatanaka b10bff1183 [ThinLTO] Do not assert when adding a module with a different but
compatible target triple

Currently, an assertion fails in ThinLTOCodeGenerator::addModule when
the target triple of the module being added doesn't match that of the
one stored in TMBuilder. This patch relaxes the constraint and makes
changes to allow target triples that only differ in their version
numbers on Apple platforms, similarly to what r228999 did.

rdar://problem/30133904

Differential Revision: https://reviews.llvm.org/D33291

llvm-svn: 303326
2017-05-18 03:52:29 +00:00
Davide Italiano 9ae69a75ec [Target/X86] Remove unneeded return. NFCI.
llvm-svn: 303323
2017-05-18 02:36:42 +00:00
Craig Topper 8a950275f7 [Statistics] Add a method to atomically update a statistic that contains a maximum
Summary:
There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways:

  MaxNumFoo = std::max(MaxNumFoo, NumFoo);

or

  MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo;

The first version reads from MaxNumFoo one time and uncontionally rwrites to it. The second version possibly reads it twice depending on the result of the first compare.  But we have no way of knowing if the value was changed by another thread between the reads and the writes.

This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again.

This spawned from an audit I'm trying to do of all places we uses the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ

Reviewers: dberlin, chandlerc, hfinkel, dblaikie

Reviewed By: chandlerc

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D33301

llvm-svn: 303318
2017-05-18 00:51:39 +00:00
Kyle Butt 0cf5b2f88a CodeGen: BlockPlacement: Add Message strings to asserts. NFC
Add message strings to all the unlabeled asserts in the file.

Differential Revision: https://reviews.llvm.org/D33078

llvm-svn: 303316
2017-05-17 23:44:41 +00:00
Craig Topper 48187cffe2 [Statistics] Use Statistic::operator+= instead of adding and assigning separately.
I believe this technically fixes a multithreaded race condition in this code. But my primary concern was as part of looking at removing the ability to treat Statistics like a plain unsigned. There are many weird operations on Statistics in the codebase.

llvm-svn: 303314
2017-05-17 23:22:10 +00:00
Sanjay Patel ba212c241a [InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern
The missing optimization for xor-of-icmps still needs to be added, but by
being more efficient (not generating unnecessary logic ops with constants)
we avoid the bug.

See discussion in post-commit comments:
https://reviews.llvm.org/D32143

llvm-svn: 303312
2017-05-17 22:29:40 +00:00
Sanjay Patel e5747e3cbd [InstCombine] move icmp bool canonicalizations to helper; NFC
As noted in the post-commit comments in D32143, we should be
catching the constant operand cases sooner to be more efficient
and less likely to expose a missing fold.

llvm-svn: 303309
2017-05-17 22:15:07 +00:00
Matt Arsenault 2b1f9aa577 AMDGPU: Start defining a calling convention
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.

llvm-svn: 303308
2017-05-17 21:56:25 +00:00
Kyle Butt f6c61ef64d CodeGen: Power: Add lowering for shifts of v1i128.
When legalizing vector operations on vNi128, they will be split to v1i128
because that is a legal type on ppc64, but then the compiler will crash in
selection dag because it fails to select for these operations. This patch fixes
shift operations. Logical shift right and left shift can be performed in the
vector unit, but algebraic shift right requires being split.

Differential Revision: https://reviews.llvm.org/D32774

llvm-svn: 303307
2017-05-17 21:54:41 +00:00
Michael Liao ab12984634 Fix PR33028
- '-verify-mahcineinstrs' starts to complain allocatable live-in physical
  registers on non-entry or non-landing-pad basic blocks.
- Refactor the XBEGIN translation to define EAX on a dedicated fallback code
  path due to XABORT. Add a pseudo instruction to define EAX explicitly to
  avoid add physical register live-in.

Differential Revision: https://reviews.llvm.org/D33168

llvm-svn: 303306
2017-05-17 21:48:00 +00:00
Matt Arsenault 2525e4e4c2 AMDGPU: Expand frame indexes to be relative to scratch wave offset
In order for an arbitrary callee to access an object
in a caller's stack frame, the 32-bit offset used as
the private pointer needs to be relative to the kernel's
scratch wave offset register.

Convert to this by finding the difference from the current
stack frame and scaling by the wavefront size.

llvm-svn: 303303
2017-05-17 21:23:14 +00:00
Matt Arsenault 156d3ae0b6 AMDGPU: Change mubuf soffset register when SP relative
Check the MachinePointerInfo for whether the access is
supposed to be relative to the stack pointer.

No tests because this is used in later commits implementing
calls.

llvm-svn: 303301
2017-05-17 21:02:58 +00:00
Simon Pilgrim 23ef26728a [X86][AVX512] Add 512-bit vector ctlz costs + tests
llvm-svn: 303300
2017-05-17 21:02:18 +00:00
Bob Haarman de33a63784 [llvm-pdbdump] in yaml2pdb, generate default output filename if none given
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.

Reviewers: amccarth, zturner

Reviewed By: zturner

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D33296

llvm-svn: 303299
2017-05-17 20:46:48 +00:00
Zachary Turner d2b418bfe2 Add some helpers for manipulating BinaryStreamRefs.
llvm-svn: 303297
2017-05-17 20:42:52 +00:00
Matt Arsenault 98f2946ab3 AMDGPU: Make better use of op_sel with high components
Handle more general swizzles.

llvm-svn: 303296
2017-05-17 20:30:58 +00:00
Sanjay Patel e2787b9a35 [InstSimplify] handle all icmp i1 X, C in one place; NFCI
We already handled all of the new tests identically, but several
of those went through a lot of unnecessary processing before
getting folded.

Another motivation for grouping these cases together is that
InstCombine needs a similar fold. Currently, it handles the
'not' cases inefficiently which can lead to bugs as described
in the post-commit comments of:
https://reviews.llvm.org/D32143 

llvm-svn: 303295
2017-05-17 20:27:55 +00:00
Zachary Turner d9a626332e [BinaryStream] Reduce the amount of boiler plate needed to use.
Often you have an array and you just want to use it.  With the current
design, you have to first construct a `BinaryByteStream`, and then create
a `BinaryStreamRef` from it.  Worse, the `BinaryStreamRef` holds a pointer
to the `BinaryByteStream`, so you can't just create a temporary one to
appease the compiler, you have to actually hold onto both the `ArrayRef`
as well as the `BinaryByteStream` *AND* the `BinaryStreamReader` on top of
that.  This makes for very cumbersome code, often requiring one to store a
`BinaryByteStream` in a class just to circumvent this.

At the cost of some added complexity (not exposed to users, but internal
to the library), we can do better than this.  This patch allows us to
construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from
source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc).  Not only
does this reduce the amount of code you have to type and make it more
obvious how to use it, but it solves real lifetime issues when it's
inconvenient to hold onto a `BinaryByteStream` for a long time.

The additional complexity is in the form of an added layer of indirection.
Whereas before we simply stored a `BinaryStream*` in the ref, we now store
both a `BinaryStream*` **and** a `std::shared_ptr<BinaryStream>`.  When
the user wants to construct a `BinaryStreamRef` directly from an
`ArrayRef` etc, we allocate an internal object that holds ownership over a
`BinaryByteStream` and forwards all calls, and store this in the
`shared_ptr<>`.  This also maintains the ref semantics, as you can copy it
by value and references refer to the same underlying stream -- the one
being held in the object stored in the `shared_ptr`.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303294
2017-05-17 20:23:31 +00:00
Simon Pilgrim d0365967c4 [X86][AVX512] Add 512-bit vector cttz costs + tests
llvm-svn: 303293
2017-05-17 20:22:54 +00:00
Dehao Chen 02828a93e8 Only enable LiveRangeShrink for x86.
Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for archtectures with great register pressure.

Reviewers: MatzeB, qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, jyknight, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33294

llvm-svn: 303292
2017-05-17 20:18:13 +00:00
Matt Arsenault 786eeea23e AMDGPU: Try to use op_sel when selecting packed instructions
Avoids instructions to pack a vector when the source is really
a scalar being broadcast.

Also be smarter and look for per-component fneg.

Doesn't yet handle scalar from upper half of register
or other swizzles.

llvm-svn: 303291
2017-05-17 20:00:00 +00:00
Jacob Gravelle c63fb00f13 [WebAssembly][NFC] Update expected testsuite failures for newly passing tests
Summary: r303050 fixes crashes when calling scalarizeMaskedMemIntrin pass from WebAssembly backend. This updates expected test failures for that.

Reviewers: sbc100

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: https://reviews.llvm.org/D33295

llvm-svn: 303288
2017-05-17 19:45:22 +00:00
Matt Arsenault ea8a4ed588 AMDGPU: Use appropriate soffset for spilling
This needs to be the frame offset register, and not the global
scratch wave offset register. For kernels, these are the same.

llvm-svn: 303287
2017-05-17 19:37:57 +00:00
Dimitry Andric ebc8779301 Revert r303015, because it has the unintended side effect of breaking
driver-mode recognition in clang (this is because the sysctl method
always returns one and only one executable path, even for an executable
with multiple links):

Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD

Summary:

After rL301562, on FreeBSD the DynamicLibrary unittests fail, because
the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since
the path does not contain any slashes, retrieving the main executable
will not work.

Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3),
which is more reliable than fiddling with relative or absolute paths.

Also add retrieval of the original argv[] from the GoogleTest framework,
to use as a fallback for other OSes.

Reviewers: emaste, marsupial, hans, krytarowski

Reviewed By: krytarowski

Subscribers: krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33171

llvm-svn: 303285
2017-05-17 19:33:10 +00:00
Matt Arsenault ee324ffc1f AMDGPU: Fix min3/max3 combines for f16/i16
Fix missing instruction definitions for min3/max3.

llvm-svn: 303284
2017-05-17 19:25:06 +00:00
Simon Pilgrim a9a92a1a6a [X86][AVX512] Add 512-bit vector bitreverse costs + tests
llvm-svn: 303283
2017-05-17 19:20:20 +00:00
Reid Kleckner 710c1cebb4 Re-land r303274: "[CrashRecovery] Use SEH __try instead of VEH when available"
We have to check gCrashRecoveryEnabled before using __try.

In other words, SEH works too well and we ended up recovering from
crashes in implicit module builds that we weren't supposed to. Only
libclang is supposed to enable CrashRecoveryContext to allow implicit
module builds to crash.

llvm-svn: 303279
2017-05-17 18:16:17 +00:00
Aditya Nandakumar be92993710 [GISel]: Fix undefined behavior in IRTranslator
Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't
outlive the DILocation. Clear it at the end of
IRTranslator::runOnMachineFunction

llvm-svn: 303277
2017-05-17 17:41:55 +00:00
Reid Kleckner 6f6f7d19f0 Revert "[CrashRecovery] Use SEH __try instead of VEH when available"
This reverts commit r303274, it appears to break some clang tests.

llvm-svn: 303275
2017-05-17 17:15:00 +00:00
Reid Kleckner 91fea018ee [CrashRecovery] Use SEH __try instead of VEH when available
Summary:
It avoids problems when other libraries raise exceptions. In particular,
OutputDebugString raises an exception that the debugger is supposed to
catch and suppress. VEH kicks in first right now, and that is entirely
incorrect.

Unfortunately, GCC does not support SEH, so I've kept the old buggy VEH
codepath around. We could fix it with SetUnhandledExceptionFilter, but
that is not per-thread, so a well-behaved library shouldn't set it.

Reviewers: zturner

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33261

llvm-svn: 303274
2017-05-17 17:02:16 +00:00
Zachary Turner 0daa7074bf Workaround for incorrect Win32 header on GCC.
llvm-svn: 303272
2017-05-17 16:39:33 +00:00
Zachary Turner 1d795c451e [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

llvm-svn: 303271
2017-05-17 16:39:06 +00:00
Sanjay Patel b2e7003103 [InstCombine] add isCanonicalPredicate() helper function and use it; NFCI
There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code.

The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring
should make it easier if we want to change any of that behavior.

1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to 
   rL32751 from what I can tell, but I'm not sure if there's a justification for that rule.
2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches 
   for selects?
3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. 
   Should one or both be changed? 

No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates.

Differential Revision: https://reviews.llvm.org/D33247

llvm-svn: 303261
2017-05-17 14:21:19 +00:00
Krzysztof Parzyszek 2b0533126e [PPC] Properly update register save area offsets
The variables MinGPR/MinG8R were not updated properly when resetting the
offsets, which in the included testcase lead to saving the CR register
in the same location as R30.

This fixes another issue reported in PR26519.

Differential Revision: https://reviews.llvm.org/D33017

llvm-svn: 303257
2017-05-17 13:25:09 +00:00
Igor Breger 28f290fab8 [GlobalISel][X86] Support add i64 in IA32.
Summary: support G_UADDE instruction selection.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D33096

llvm-svn: 303255
2017-05-17 12:48:08 +00:00
Jonas Paulsson 8722ade770 [SystemZ] Modelling of costs of divisions with a constant power of 2.
Such divisions will eventually be implemented with shifts which should
be reflected in the cost function.

Review: Ulrich Weigand
llvm-svn: 303254
2017-05-17 12:46:26 +00:00
Diana Picus eafa4aa910 Reland r303247: [ARM] GlobalISel: Remove dead instruction selection code
It only failed on llvm-clang-x86_64-expensive-checks-win, probably
because the TableGen stuff hasn't been regenerated.
Requires a clean build.

llvm-svn: 303252
2017-05-17 12:42:52 +00:00
George Rimar fed9f09f48 [DWARF] - Cleanup relocations proccessing.
RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), 
and width field was never used in code.

Relocations proccessing loop had checks for width field. Does not look like DWARF parser
should do that. There is probably no much sense to validate relocations during proccessing 
them in parser.

Patch removes relocation's width relative code from DWARFContext.

Differential revision: https://reviews.llvm.org/D33194

llvm-svn: 303251
2017-05-17 12:10:51 +00:00
Diana Picus 36e4ba0f6e Revert "[ARM] GlobalISel: Remove dead instruction selection code"
This reverts commit r303247 because the tests are failing on some bots.
Sorry!

llvm-svn: 303249
2017-05-17 11:56:07 +00:00
Diana Picus 68d21c864e [ARM] GlobalISel: Remove dead instruction selection code
We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove
the hand-written versions.

llvm-svn: 303247
2017-05-17 11:39:26 +00:00
Daniel Cederman 4af795b499 [Sparc] Remove execute permissions from non-executable text files
Reviewers: jyknight, lero_chris, venkatra

Reviewed By: jyknight

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27127

llvm-svn: 303245
2017-05-17 11:05:20 +00:00
Pavel Labath 859d302349 [RuntimeDyld] Fix debug section relocation (pr20457)
Summary:
Debug info sections, (or non-SHF_ALLOC sections in general) should be
linked as if their load address was zero to emulate the behavior of the
static linker.

This bug was discovered because it was breaking lldb expression evaluation on
linux.

Reviewers: lhames

Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D32899

llvm-svn: 303239
2017-05-17 08:47:28 +00:00
Jonas Paulsson 0f8678016f Make sure -optimize-regalloc=false is used correctly by user.
Don't allow -optimize-regalloc=false with -regalloc given for anything other
than 'fast'. The other register allocators depend on the supporting passes
added by addOptimizedRegAlloc().

Reviewers: Quentin Colombet, Matthias Braun
https://reviews.llvm.org/D33181

llvm-svn: 303238
2017-05-17 07:36:03 +00:00
Max Kazantsev 4c7f293d24 [SCEV] Always sort AddRecExprs from different loops by dominance
Sorting of AddRecExprs by loop nesting does not make sense since we only invoke
the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This
guarantees that there is always a dominance relationship between them. This
patch removes the sorting by nesting which is a dead code in current usage of
this function.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D33228

llvm-svn: 303235
2017-05-17 04:09:14 +00:00
Max Kazantsev b67d344850 [SCEV][NFC] Replace redundant dyn_cast with cast in getAddExpr
Replace dyn_cast which is ensured by isa just one line above with cast.

Differential Revision: https://reviews.llvm.org/D33231

llvm-svn: 303234
2017-05-17 03:58:42 +00:00
Gor Nishanov db38485588 [coroutines] Handle spills before catchswitch
If we need to spill the result of the PHI instruction, we insert the spill after
all of the PHIs and EHPads, however, in a catchswitch block there is no
room to insert the spill. Make room by splitting away catchswitch into a separate
block.

Before the fix:

    catch.dispatch:
       %val = phi i32 [ 1, %if.then ], [ 2, %if.else ]
       %switch = catchswitch within none [label %catch] unwind label %cleanuppad

After:

    catch.dispatch:
       %val = phi i32 [ 1, %if.then ], [ 2, %if.else ]
       %tok = cleanuppad within none []
       ; spill goes here
       cleanupret from %tok unwind label %catch.dispatch.switch
    catch.dispatch.switch:
       %switch = catchswitch within none [label %catch] unwind label %cleanuppad

https://reviews.llvm.org/D31846

llvm-svn: 303232
2017-05-17 03:09:22 +00:00
Francis Visoiu Mistrih b52e036600 BitVector: add iterators for set bits
Differential revision: https://reviews.llvm.org/D32060

llvm-svn: 303227
2017-05-17 01:07:53 +00:00
Eugene Zelenko a369a45746 [ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 303221
2017-05-16 23:10:25 +00:00
Zachary Turner c9c39291c7 Fix for compilers with older CRT header libraries.
llvm-svn: 303220
2017-05-16 22:59:34 +00:00
Zachary Turner 13e87f43d9 [Support] Ignore OutputDebugString exceptions in our crash recovery.
Since we use AddVectoredExceptionHandler, we get notified of
every exception that gets raised by a program.  Sometimes these
are not necessarily errors though, and this can be especially
true when linking against a library that we have no control
over, and may raise an exception internally which it intends
to catch.

In particular, the Windows API OutputDebugString does exactly
this.  It raises an exception inside of a __try / __except,
giving the debugger a chance to handle the exception to print
the message to the debug console.

But this doesn't interoperate nicely with our vectored exception
handler, which just sees another exception and decides that we
need to terminate the program.

Add a special case for this so that we ignore ODS exceptions
and continue normally.

Note that a better fix is to simply not use vectored exception
handlers and use SEH instead, but given that MinGW doesn't support
SEH, this is the only solution for MinGW.

Differential Revision: https://reviews.llvm.org/D33260

llvm-svn: 303219
2017-05-16 22:50:32 +00:00
Davide Italiano 79eb3b0366 [IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity.
llvm-svn: 303218
2017-05-16 22:38:40 +00:00
Sanjay Patel 877364ff99 [InstSimplify] add folds for constant mask of value shifted by constant
We would eventually catch these via demanded bits and computing known bits in InstCombine,
but I think it's better to handle the simple cases as soon as possible as a matter of efficiency.

This fold allows further simplifications based on distributed ops transforms. eg:
  %a = lshr i8 %x, 7
  %b = or i8 %a, 2
  %c = and i8 %b, 1

InstSimplify can directly fold this now:
  %a = lshr i8 %x, 7

Differential Revision: https://reviews.llvm.org/D33221

llvm-svn: 303213
2017-05-16 21:51:04 +00:00
Evgeny Stupachenko cc19560253 The patch exclude a case from zero check skip in
CTLZ idiom recognition (r303102).

Summary:

The following case:
i = 1;
if(n)
  while (n >>= 1)
    i++;
use(i);

Was converted to:

i = 1;
if(n)
  i += builtin_ctlz(n >> 1, false);
use(i);

Which is not correct. The patch make it:

i = 1;
if(n)
  i += builtin_ctlz(n >> 1, true);
use(i);

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303212
2017-05-16 21:44:59 +00:00
Amara Emerson c9916d7e97 Re-commit r302678, fixing PR33053.
The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions
which didn't have a lowering.

llvm-svn: 303211
2017-05-16 21:29:22 +00:00
Easwaran Raman 3cd1479c3f [Inliner] Do not mix callsite and callee hotness based updates.
Update threshold based on callee's hotness only when BFI is not available.
Otherwise use only callsite's hotness. This makes it easier to reason about
hotness related threshold updates.

Differential revision: https://reviews.llvm.org/D33157

llvm-svn: 303210
2017-05-16 21:18:09 +00:00
Tim Shen 3bef27cc6f [PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.
Summary:
This fixes pr32392.

The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.

The reason why expandPostRAPseudo is chosen is because previous passes
are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne-
7, .+4 (some branch pass(s)).

Differential Revision: https://reviews.llvm.org/D32763

llvm-svn: 303205
2017-05-16 20:18:06 +00:00
Easwaran Raman dadc0f11ad Add hasProfileSummary and has{Sample|Instrumentation}Profile methods
ProfileSummaryInfo already checks whether the module has sample profile
in determining profile counts. This will also be useful in inliner to
clean up threshold updates.

llvm-svn: 303204
2017-05-16 20:14:39 +00:00
Dmitry Mikulin fce148c568 In debug builds non-trivial amount of time is spent in InstCombine processing
@llvm.dbg.* calls in visitCallInst(). They can be safely ignored.

llvm-svn: 303202
2017-05-16 20:08:49 +00:00
Daniel Berlin 6c66e9a22a NewGVN: Only do something in verifyStoreExpressions if assertions are enabled, to avoid unused code warnings.
llvm-svn: 303201
2017-05-16 20:02:45 +00:00
Daniel Berlin 4540357240 NewGVN: Fix PR 33051 by making sure we remove old store expressions
from the ExpressionToClass mapping.

llvm-svn: 303200
2017-05-16 19:58:47 +00:00
Reid Kleckner 0ad69fc89f Revert "[X86] Replace slow LEA instructions in X86"
This reverts commit r303183, it broke various buildbots and introduced
sanitizer errors.

llvm-svn: 303199
2017-05-16 19:55:03 +00:00
Nirav Dave da8f221273 Elide stores which are overwritten without being observed.
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.

Test notes:

* Many testcases overwrite store addresses multiple times and needed
  minor changes, mainly making stores volatile to prevent the
  optimization from optimizing the test away.

* Many X86 test cases optimized out instructions associated with
  associated with va_start.

* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
  dependencies to check and can probably be removed and potentially
  replaced with another test.

Reviewers: rnk, john.brawn

Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33206

llvm-svn: 303198
2017-05-16 19:43:56 +00:00
Matthias Braun d625bedb40 ShrinkWrap: Add skipFunction() call
ShrinkWrapping is a performance optimization that can safely be skipped,
so we can add `if (!skipFunction()) return;`

llvm-svn: 303197
2017-05-16 18:43:30 +00:00
Davide Italiano 56a08b40d2 [MetadataLoader] Remove unused Vector. NFCI.
llvm-svn: 303196
2017-05-16 18:41:46 +00:00
Renato Golin d69570e017 Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
Revert "[ARM] Mark LEApcrel as not having side effects"

This reverts commit r303054 and r303053, as they broke the ARM
self-hosting buildbots:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845

Offline investigation on course.

llvm-svn: 303193
2017-05-16 17:59:07 +00:00
Stanislav Mekhanoshin acca0f5c02 [AMDGPU] Use GCNRPTracker dumper methods in scheduler
Differential Revision: https://reviews.llvm.org/D33244

llvm-svn: 303186
2017-05-16 16:31:45 +00:00
Stanislav Mekhanoshin b10860788f [AMDGPU] Cache live-ins and register pressure in scheduler
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:

1. Caches the info for the second stage when we schedule with
   decreased target occupancy.
2. Tracks the basic block from top to bottom thus eliminating the
   need to scan whole register file liveness at every region split
   in the middle of the block.

The scheduling is now done in 3 stages instead of two, with the first
one being really a no-op and only used to collect scheduling regions
as sent by the scheduler driver.

There is no functional change to the current behavior, only compilation
speed is affected. In general computeBlockPressure() could be simplified
if we switch to backward RP tracker, because scheduler sends regions
within a block starting from the last upward. We could use a natural
order of upward tracker to seamlessly change between regions of the same
block, since live reg set of a previous tracked region would become a
live-out of the next region. That however requires fixing upward tracker
to properly account defs and uses of the same instruction as both are
contributing to the current pressure. When we converge on the produced
pressure we should be able to switch between them back and forth. In
addition, backward tracker is less expensive as it uses LIS in recede
less often than forward uses it in advance.

At the moment the worst known case compilation time has improved from 26
minutes to 8.5.

Differential Revision: https://reviews.llvm.org/D33117

llvm-svn: 303184
2017-05-16 16:11:26 +00:00
Lama Saba 52e892577d [X86] Replace slow LEA instructions in X86
According to Intel's Optimization Reference Manual for SNB+:
  " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
    dispatch via port 1:
  - LEA that has all three source operands: base, index, and offset
  - LEA that uses base and index registers where the base is EBP, RBP,or R13
  - LEA that uses RIP relative addressing mode
  - LEA that uses 16-bit addressing mode "
  This patch currently handles the first 2 cases only.
 
Differential Revision: https://reviews.llvm.org/D32277

llvm-svn: 303183
2017-05-16 16:01:36 +00:00
Matthew Simpson af60af1ed5 Revert 303174, 303176, and 303178
These commits are breaking the bots. Reverting to investigate.

llvm-svn: 303182
2017-05-16 15:50:30 +00:00
Nirav Dave cfd357a61a [DAG] Prune deleted nodes in TokenFactor
Fix visitTokenFactor to correctly remove deleted nodes. NFC.

llvm-svn: 303181
2017-05-16 15:49:02 +00:00
Stanislav Mekhanoshin 464cecf81e [AMDGPU] Turn register pressure estimation into forward tracker
This factors register pressure estimation mechanism from the
GCNSchedStrategy into the forward tracker to unify interface
with other strategies and expose it to other interested phases.

Differential Revision: https://reviews.llvm.org/D33105

llvm-svn: 303179
2017-05-16 15:43:52 +00:00
Matthew Simpson b7b5d55c38 [LV] Avoid potentential division by zero when selecting IC
llvm-svn: 303174
2017-05-16 14:43:55 +00:00
Gor Nishanov 23453c11ff [coroutines] Handle unwind edge splitting
Summary:
RewritePHIs algorithm used in building of CoroFrame inserts a placeholder
```
%placeholder = phi [%val]
```
on every edge leading to a block starting with PHI node with multiple incoming edges,
so that if one of the incoming values was spilled and need to be reloaded, we have a
place to insert a reload. We use SplitEdge helper function to split the incoming edge.

SplitEdge function does not deal with unwind edges comping into a block with an EHPad.

This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge.

For landing pads, we clone the landing pad into every edge block and replace the original
landing pad with a PHI collection the values from all incoming landing pads.

For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleapret in the
edge blocks.

Reviewers: majnemer, rnk

Reviewed By: majnemer

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D31845

llvm-svn: 303172
2017-05-16 14:11:39 +00:00
George Rimar 41e656768d [DWARF] - Add RelocAddrEntry for cleanup. NFCi.
Was mentioned as possible cleanup during review of D33184.

llvm-svn: 303171
2017-05-16 14:05:45 +00:00
Chad Rosier 8b12a03215 Fix an improperly placed curly bracket. NFC.
llvm-svn: 303165
2017-05-16 12:43:23 +00:00
George Rimar 4671f2e08c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.

Suggested during review of D33184.

llvm-svn: 303163
2017-05-16 12:30:59 +00:00
George Rimar 3824cca7b3 Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector."
Something went wrong, it broke BB.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c

llvm-svn: 303162
2017-05-16 12:05:03 +00:00
George Rimar 8680b6ee9c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Suggested during review of D33184.

llvm-svn: 303159
2017-05-16 11:54:19 +00:00
James Henderson 852f6fde01 [LTO] Print time-passes information at conclusion of LTO codegen
The information collected when requested by -time-passes is only printed when
llvm_shutdown is called at the moment. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and thin LTO to explicitly print and
reset the timing information.

I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.

Reviewed by: mehdi_amini

Differential Revision: https://reviews.llvm.org/D32803

llvm-svn: 303152
2017-05-16 09:43:21 +00:00
Max Kazantsev b09b5db793 [SCEV] Fix sorting order for AddRecExprs
The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:

for (...) { // loop1
  op1 = {A,+,B}
}
for (...) { // loop2
  op2 = {A,+,B}
  S = add op1, op2
}

In this case there is no guarantee that in operand list of S the op2 comes
before op1 (loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.

This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrency
in the operands order, it will be the bottom-most in terms of domination tree.
The attached test set includes some tests that produce incorrect SCEV
estimations and crashes with oldlogic.

Reviewers: sanjoy, reames, apilipenko, anna

Reviewed By: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33121

llvm-svn: 303148
2017-05-16 07:27:06 +00:00
Craig Topper 064adc6bfa [CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC
llvm-svn: 303147
2017-05-16 07:05:38 +00:00
Daniel Berlin 629e1ff6e6 NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created
llvm-svn: 303144
2017-05-16 06:06:15 +00:00
Daniel Berlin abd632dfeb NewGVN: Formatting fixes
llvm-svn: 303143
2017-05-16 06:06:12 +00:00
Davide Italiano a641842845 Revert "[NewGVN] Replace predicate info leftovers."
It's breaking the bots.

llvm-svn: 303142
2017-05-16 05:51:21 +00:00
Davide Italiano 331058fcc4 [NewGVN] Replace predicate info leftovers.
Fixes PR32945.

Differential Revision:  https://reviews.llvm.org/D33226

llvm-svn: 303141
2017-05-16 05:23:23 +00:00
NAKAMURA Takumi 994a43d27a AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable]
llvm-svn: 303137
2017-05-16 04:01:23 +00:00
Peter Collingbourne 6f0ecca3b5 IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.

Differential Revision: https://reviews.llvm.org/D33162

llvm-svn: 303134
2017-05-16 00:39:01 +00:00
Francis Visoiu Mistrih ebbc7159e9 [ShrinkWrapping] Handle restores on no-return paths
Shrink-wrapping uses post-dominators to find a restore point that
post-dominates all the uses of CSR / stack.

The way dominator trees are modeled in LLVM today is that unreachable
blocks are not present in a generic dominator tree, so, an unreachable node is
dominated by anything: include/llvm/Support/GenericDomTree.h:467.

Since for post-dominators, a no-return block is considered
"unreachable", calling findNearestCommonDominator on an unreachable node
A and a non-unreachable node B, will return B, which can be false. If we
find such node, we bail out since there is no good restore point
available.

rdar://problem/30186931

llvm-svn: 303130
2017-05-15 23:13:35 +00:00
Kostya Serebryany cf50d43be9 [libFuzzer] fix tests on Windows
llvm-svn: 303128
2017-05-15 22:55:00 +00:00
Xinliang David Li 8726d91d29 Fix memory leak
llvm-svn: 303126
2017-05-15 22:43:52 +00:00
Kostya Serebryany 87813b1bf8 [libFuzzer] improve the afl driver and it's tests. Make it possible to run individual inputs with afl driver
llvm-svn: 303125
2017-05-15 22:38:29 +00:00
Davide Italiano 60d36c7506 [AMDGPU] Kill now unused phiInfoElementGetDebugLoc(). NFCI.
llvm-svn: 303122
2017-05-15 22:10:15 +00:00
Craig Topper 6a1d02024e [APInt] Simplify a for loop initialization based on the fact that 'n' is known to be 1 by an earlier 'if'.
llvm-svn: 303120
2017-05-15 22:01:03 +00:00
Eugene Zelenko d761e2c264 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 303119
2017-05-15 21:57:41 +00:00
Tim Northover 203c6f055d AArch64: use linker-private symbols for globals in MachO.
We don't use section-relative relocations on AArch64, so all symbols must be at
least visible to the linker (i.e. properly global or l_whatever, but not
L_whatever).

llvm-svn: 303118
2017-05-15 21:51:38 +00:00
David Blaikie 441cfee780 PR32288: Describe a bool parameter's DWARF location with a simple register
There's no need (& a bit incorrect) to mask off the high bits of the
register reference when describing a simple bool value.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D31062

llvm-svn: 303117
2017-05-15 21:34:01 +00:00
Adam Nemet e29686e5c1 [SLP] Enable 64-bit wide vectorization on AArch64
ARM Neon has native support for half-sized vector registers (64 bits).  This
is beneficial for example for 2D and 3D graphics.  This patch adds the option
to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer.

*** Performance Analysis

This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.

The results are with -O3 and PGO.  A negative percentage is an improvement.
The testsuite was run with a sample size of 4.

** SPEC

* CFP2006/482.sphinx3  -3.34%

A pretty hot loop is SLP vectorized resulting in nice instruction reduction.
This used to be a +22% regression before rL299482.

* CFP2000/177.mesa     -3.34%
* CINT2000/256.bzip2   +6.97%

My current plan is to extend the fix in rL299482 to i16 which brings the
regression down to +2.5%.  There are also other problems with the codegen in
this loop so there is further room for improvement.

** LLVM testsuite

* SingleSource/Benchmarks/Misc/ReedSolomon               -10.75%

There are multiple small SLP vectorizations outside the hot code.  It's a bit
surprising that it adds up to 10%.  Some of this may be code-layout noise.

* MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40%

The opt-viewer screenshot can be seen at F3218284.  We start at a colder store
but the tree leads us into the hottest loop.

* MultiSource/Applications/lambda-0.1.3/lambda            -2.68%
* MultiSource/Benchmarks/Bullet/bullet                    -2.18%

This is using 3D vectors.

* SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67%

Noise, binary is unchanged.

* MultiSource/Benchmarks/Ptrdist/anagram/anagram          +4.90%

There is an additional SLP in the cold code.  The test runs for ~1sec and
prints out over 2000 lines. This is most likely noise.

* MultiSource/Applications/aha/aha                        +1.63%
* MultiSource/Applications/JM/lencod/lencod               +1.41%
* SingleSource/Benchmarks/Misc/richards_benchmark         +1.15%

Differential Revision: https://reviews.llvm.org/D31965

llvm-svn: 303116
2017-05-15 21:15:01 +00:00
Hans Wennborg bd6e9e77a7 Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.

Original commit message:

> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 303115
2017-05-15 20:59:32 +00:00
Evgeniy Stepanov b56012b548 [asan] Better workaround for gold PR19002.
See the comment for more details. Test in a follow-up CFE commit.

llvm-svn: 303113
2017-05-15 20:43:42 +00:00
Jan Sjodin a06bfe054e Re-submit AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303111
2017-05-15 20:18:37 +00:00
Tim Northover 8b96c7e9b5 AArch64: diagnose unrecognized features in .cpu directive.
We were silently ignoring any features we couldn't match up, which led to
errors in an inline asm block missing the conventional "\n\t".

llvm-svn: 303108
2017-05-15 19:42:15 +00:00
Davide Italiano cff8a34716 [NewGVN] Remove unused setDefiningExpr(). NFCI.
llvm-svn: 303107
2017-05-15 19:35:40 +00:00
Sanjay Patel 878715f978 [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949)
This is the InstCombine counterpart to D32954. 
I added some comments about the code duplication in:
rL302436

Alive-based verification:
http://rise4fun.com/Alive/dPw

This is a 2nd fix for the problem reported in:
https://bugs.llvm.org/show_bug.cgi?id=32949

Differential Revision: https://reviews.llvm.org/D32970

llvm-svn: 303105
2017-05-15 19:27:53 +00:00
Sanjay Patel a23b141cd2 [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)
These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving:
https://bugs.llvm.org/show_bug.cgi?id=9343

As shown here:
http://rise4fun.com/Alive/C8
...however, the sdiv exact case needs a stronger predicate.

I opted for duplicated code instead of adding another fallthrough because I think that's 
easier to read (and edit in case we need/want to restrict/loosen the predicates any more).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=32949
https://bugs.llvm.org/show_bug.cgi?id=32948

Differential Revision: https://reviews.llvm.org/D32954

llvm-svn: 303104
2017-05-15 19:16:49 +00:00
Evgeny Stupachenko 2fecd38ab8 The patch adds CTLZ idiom recognition.
Summary:

The following loops should be recognized:
i = 0;
while (n) {
  n = n >> 1;
  i++;
  body();
}
use(i);

And replaced with builtin_ctlz(n) if body() is empty or
for CPUs that have CTLZ instruction converted to countable:

for (j = 0; j < builtin_ctlz(n); j++) {
  n = n >> 1;
  i++;
  body();
}
use(builtin_ctlz(n));

Reviewers: rengolin, joerg

Differential Revision: http://reviews.llvm.org/D32605

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303102
2017-05-15 19:08:56 +00:00
Davide Italiano 6e7a212748 [NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency().
verifyMemoryCongruency() filters out trivially dead MemoryDef(s),
as we find them immediately dead, before moving from TOP to a new
congruence class.
This fixes the same problem for PHI(s) skipping MemoryPhis if all
the operands are dead.

Differential Revision:  https://reviews.llvm.org/D33044

llvm-svn: 303100
2017-05-15 18:50:53 +00:00
Geoff Berry e369653bf3 [AArch64][Falkor] Fix sched details for FMOV
llvm-svn: 303099
2017-05-15 18:50:22 +00:00
Jan Sjodin 0e289822fa Revert 303091.
llvm-svn: 303098
2017-05-15 18:39:47 +00:00
Teresa Johnson 41db92f9ae Add support for handling ifuncs to GlobalValue::getBaseObject
Summary:
All GlobalIndirectSymbol types (not just GlobalAlias) should return
their base object.

Without this patch LTO would warn "Unable to determine comdat of
alias!" for an ifunc.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33202

llvm-svn: 303096
2017-05-15 18:28:29 +00:00
Craig Topper 716cad8bb7 [SCEV] Use copy initialization of APInts instead of direct initialization.
This is based on post commit feed back from r302769.

llvm-svn: 303092
2017-05-15 18:14:16 +00:00
Jan Sjodin e9d2ddc9dd Add AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303091
2017-05-15 18:13:56 +00:00
Sanjay Patel 941e8dfcbf [InstCombine] use m_OneUse to reduce code; NFCI
llvm-svn: 303090
2017-05-15 18:08:17 +00:00
Kostya Serebryany e8a49b3850 [libFuzzer] fix a warning from Wunreachable-code-loop-increment reported by Christian Holler. This also fixes a logical bug, which however does not affect the libFuzzer's ability too much (I wasn't able to create a differentiating test)
llvm-svn: 303087
2017-05-15 17:39:42 +00:00
Kyle Butt 7d531daece CodeGen: BlockPlacement: Increase tail duplication size for O3.
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.

Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon

2% slowdown in lpbench. Seems related, but couldn't completely diagnose.

Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.

Differential-Revision: https://reviews.llvm.org/D32324
llvm-svn: 303084
2017-05-15 17:30:47 +00:00
Simon Pilgrim 55ff57861a [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)
Follow up to D33147

NVPTXTargetLowering::LowerCall was trusting the default argument values.

Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146.

Differential Revision: https://reviews.llvm.org/D33189

llvm-svn: 303082
2017-05-15 17:17:44 +00:00
Florian Hahn af91e7e6d2 [AArch64] Enable FeatureFuseAES on Cortex-A72.
This patch enables fusing dependent AESE/AESMC and AESD/AESIMC
instruction pairs on Cortex-A72, as recommended in the Software
Optimization Guide, section 4.10.

llvm-svn: 303073
2017-05-15 15:15:22 +00:00
Dmitry Preobrazhensky 167f8b69e3 [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33123

llvm-svn: 303070
2017-05-15 14:28:23 +00:00
Dmitry Preobrazhensky 03852a9dca [AMDGPU][MC] Removed V_MQSAD_U16_U8
This instruction does not really exist

See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018

Reviewers: vpykhtin, artem.tamazov

Differential Revision: https://reviews.llvm.org/D33126

llvm-svn: 303055
2017-05-15 12:37:03 +00:00
John Brawn 9486becf09 [ARM] Mark LEApcrel instructions as isAsCheapAsAMove
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.

Differential Revision: https://reviews.llvm.org/D32858

llvm-svn: 303054
2017-05-15 11:57:54 +00:00
John Brawn 43132c46a6 [ARM] Mark LEApcrel as not having side effects
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable the same as the thumb1 and thumb2 counterparts.

It looks like it being marked as such was just a mistake, as the commit that
made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the
LEApcrelJT instructions were marked as having side-effects, so it looks like
the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was
accidentally marked as such also.

Differential Revision: https://reviews.llvm.org/D32857

llvm-svn: 303053
2017-05-15 11:50:21 +00:00
George Rimar 958b01aa69 [DWARF] - Speedup handling of relocations in DWARFContextInMemory.
I am working on a speedup of building .gdb_index in LLD and 
noticed that relocations that are proccessed in DWARFContextInMemory often uses
the same symbol in a row. This patch introduces caching to reduce the relocations
proccessing time.

For benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked it using LLD.

Link time without --gdb-index is about 4,45s.
Link time with --gdb-index: a) Without patch: 19,16s b) With patch: 15,52s
That means time spent on --gdb-index in this configuration is 
19,16s - 4,45s = 14,71s (without patch) vs 15,52s - 4,45s = 11,07s (with patch).

Differential revision: https://reviews.llvm.org/D31136

llvm-svn: 303051
2017-05-15 11:45:28 +00:00
Ayman Musa c5490e5a29 [X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional).

CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).

Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation.

Differential Revision: https://reviews.llvm.org/D32487

llvm-svn: 303050
2017-05-15 11:30:54 +00:00
Simon Pilgrim f8389656e3 [NVPTX] Don't rely on default arguments to SelectionDAG::getMemIntrinsicNode. NFC.
NFC followup to D33147, this explicitly sets all the arguments (instead of relying on the defaults) to SelectionDAG::getMemIntrinsicNode to help identify -verify-machineinstrs issues.

llvm-svn: 303047
2017-05-15 10:47:48 +00:00
Tom Stellard 049e7e0791 [RegisterBankInfo] Remove overly-agressive asserts
Summary:
We were asserting in RegisterBankInfo if RBI.copyCost() returns
UINT_MAX.  This is OK for RegBankSelect::Mode::Fast since we only
try one instruction mapping and can't recover from this, but for
RegBankSelect::Mode::Greedy we will be considering multiple
instruction mappings, so we can recover if we see a UNIT_MAX copy
cost.

The copy cost for one pair of register banks in the AMDGPU backend
will be UNIT_MAX, so this patch will prevent AMDGPU tests from
breaking.

Reviewers: ab, qcolombet, t.p.northover, dsanders

Reviewed By: qcolombet

Subscribers: tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D33144

llvm-svn: 303043
2017-05-15 09:52:33 +00:00
Arnaud A. de Grandmaison 6d2417924c MCObjectStreamer : fail with a diagnostic when emitting an out of range value.
We were previously silently emitting bogus data in release mode,
making it very hard to diagnose the error, or crashing with an
assert in debug mode. A proper diagnostic is now always emitted
when the value to be emitted is out of range.

llvm-svn: 303041
2017-05-15 08:43:27 +00:00
Craig Topper 1a36b7d836 [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.
This patch finishes off the conversion of ComputeSignBit to computeKnownBits.

Differential Revision: https://reviews.llvm.org/D33166

llvm-svn: 303035
2017-05-15 06:39:41 +00:00
Sanjoy Das f6f6fb903e Move some code into ScalarEvolution.cpp; NFC
I need to add some asserts to these constructors that are easier to
add once they're in the .cpp file.

llvm-svn: 303032
2017-05-15 04:22:09 +00:00
Craig Topper bb9737247a [InstCombine] Merge duplicate functionality between InstCombine and ValueTracking
Summary:
Merge overflow computation for signed add,
appearing both in InstCombine and ValueTracking.

As part of the merge,
cleanup the interface for overflow checks in InstCombine.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper, majnemer

Reviewed By: craig.topper

Subscribers: takuto.ikuta, llvm-commits

Differential Revision: https://reviews.llvm.org/D32946

llvm-svn: 303029
2017-05-15 02:44:08 +00:00
Craig Topper 26c4159956 [InstCombine] Remove 'return' of a called function that also returned void. NFC
llvm-svn: 303028
2017-05-15 02:30:27 +00:00
Zvi Rackover e6b278bc65 [X86] Utilize SelectionDAG::getSelect(). NFC.
Replace SelectionDAG::getNode(ISD::SELECT, ...)
and SelectionDAG::getNode(ISD::VSELECT, ...)
with SelectionDAG::getSelect(...)
Saves a few lines of code and in some cases saves the need to explicitly
check the type of the desired node.

llvm-svn: 303024
2017-05-14 21:30:38 +00:00
Simon Pilgrim d0ef9d8e93 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts
llvm-svn: 303023
2017-05-14 20:52:11 +00:00
Simon Pilgrim f96b4ab92d [X86][AVX2] Fix costs for v4i64 ashr by splat
llvm-svn: 303022
2017-05-14 20:25:42 +00:00
Simon Pilgrim de4467b182 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
llvm-svn: 303021
2017-05-14 20:02:34 +00:00
Craig Topper ceea1a76a1 [X86] Remove unused value from IntrinsicType enum. NFC
llvm-svn: 303018
2017-05-14 19:38:06 +00:00
Simon Pilgrim d3f0d03cc5 [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences
llvm-svn: 303017
2017-05-14 18:52:15 +00:00
Dimitry Andric 4043373e84 Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD
Summary:

After rL301562, on FreeBSD the DynamicLibrary unittests fail, because
the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since
the path does not contain any slashes, retrieving the main executable
will not work.

Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3),
which is more reliable than fiddling with relative or absolute paths.

Also add retrieval of the original argv[] from the GoogleTest framework,
to use as a fallback for other OSes.

Reviewers: emaste, marsupial, hans, krytarowski

Reviewed By: krytarowski

Subscribers: krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D33171

llvm-svn: 303015
2017-05-14 18:35:38 +00:00
Shoaib Meenai ee97c5f012 [COFF] Gracefully handle empty .drectve sections
Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error:

    Invalid data was encountered while parsing the file

This happened because some of the object files in the archive have empty
`.drectve` sections. These empty sections result in a `parse_failed` error being
returned from `COFFObjectFile::getSectionContents()`, which in turn caused
`llvm-readobj` to stop. With this change, `getSectionContents` now returns
success, and like before the resulting array is empty.

Patch by Dave Lee.

Differential Revision: https://reviews.llvm.org/D32652

llvm-svn: 303014
2017-05-14 18:34:56 +00:00
Simon Pilgrim 5bef9c627e [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask.
Tweak cost model to match what lowering actually does.

llvm-svn: 303013
2017-05-14 17:59:46 +00:00
Simon Pilgrim aa8dffb69b [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
llvm-svn: 303012
2017-05-14 17:36:07 +00:00
Simon Pilgrim 4599eaa09a [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
llvm-svn: 303010
2017-05-14 13:38:53 +00:00
Simon Pilgrim f3ee9c6997 [X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant splats for vXi64 shifts.
llvm-svn: 303009
2017-05-14 11:46:26 +00:00
Craig Topper 479daaf74c [InstSimplify] Add patterns for folding (A & B) | (~A ^ B) -> (~A ^ B) and its commuted variants.
We already had (A & ~B) | (A ^ B), but we missed the cases where the not was part of the xor.

llvm-svn: 303004
2017-05-14 07:54:43 +00:00
Craig Topper dfc8955ee6 [BasicAA] Alphabetize includes. NFC
llvm-svn: 303002
2017-05-14 06:18:34 +00:00
Xinliang David Li 392e975693 Fix test failure on windows -- do not return deleted func
llvm-svn: 302999
2017-05-14 02:54:02 +00:00
Simon Pilgrim 754c1618ec [SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits
llvm-svn: 302997
2017-05-13 22:10:58 +00:00
Peter Collingbourne d891d89ce8 Add missing files
llvm-svn: 302996
2017-05-13 22:10:13 +00:00
Peter Collingbourne c6f07c423d Move lib/LibDriver -> lib/ToolDrivers/llvm-lib. NFCI.
This reorganisation prevents us from cluttering up the top-level lib directory
with more driver libraries such as llvm-dlltool (see D29892).

llvm-svn: 302995
2017-05-13 22:06:46 +00:00
Simon Pilgrim 7666afd042 [SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits
llvm-svn: 302993
2017-05-13 19:57:10 +00:00
Craig Topper 9fe357971c [ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC
llvm-svn: 302991
2017-05-13 17:22:16 +00:00
Simon Pilgrim ef46c2762a [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)
Further perf tests on Jaguar indicate that:

vxorps  %ymm0, %ymm0, %ymm0
vcmpps  $15, %ymm0, %ymm0, %ymm0

is consistently faster (by about 9%) than:

vpcmpeqd  %xmm0, %xmm0, %xmm0
vinsertf128  $1, %xmm0, %ymm0, %ymm0

Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well.

Committed on behalf of @dtemirbulatov

Differential Revision: https://reviews.llvm.org/D32416

llvm-svn: 302989
2017-05-13 13:42:35 +00:00
Simon Pilgrim 7d62e4b455 [LoopOptimizer][Fix]PR32859, PR24738
The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form.
Here it is:

before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]

and after this change we have:

%e.0.ph = phi i32 [ 0, %for.inc.2.i ]
%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]

Committed on behalf of @dtemirbulatov

Differential Revision: https://reviews.llvm.org/D33055

llvm-svn: 302988
2017-05-13 13:25:57 +00:00
Vivek Pandya 1d12790f37 This reverts r302984
llvm-svn: 302985
2017-05-13 10:59:05 +00:00
Vivek Pandya d20de87fd5 Simplify MIR Output used for Codegen Testing
- MIRYamlMapping: Default value provided for fields which have optional
mappings. Implemented == operators for required classes. When a field's value is
same as default value specified YAML IO class will not print it.

- MIRPrinter: Above mentioned behaviour is not on by default. If -simplify-mir
option not specified, then make yaml::Output to print fields with default values
too.

Differential Revision: https://reviews.llvm.org/D32304

llvm-svn: 302984
2017-05-13 08:55:43 +00:00
Craig Topper 2c9a70661c [APInt] Use Lo_32/Hi_32/Make_64 in a few more places in the divide code. NFCI
llvm-svn: 302983
2017-05-13 07:14:17 +00:00
Craig Topper 935f7b050f [InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation
Summary:
If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop.

This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now.

This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shoudn't make any visible changes. Just wasted computation.

I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this.

This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that.

Reviewers: davide, majnemer, spatel, chandlerc

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32453

llvm-svn: 302982
2017-05-13 06:56:04 +00:00
Craig Topper 4b83b4d560 [APInt] Fix typo in comment. NFC
llvm-svn: 302974
2017-05-13 00:35:30 +00:00
Dylan McKay 0c4debc123 [AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot
Contributed by Dr. Gergő Érdi.

Fixes a bug.

Raised from (https://github.com/avr-rust/rust/issues/49).

llvm-svn: 302973
2017-05-13 00:22:34 +00:00
Dylan McKay 0c707da6ac [AVR] Remove an unused variable
llvm-svn: 302970
2017-05-13 00:00:26 +00:00
Xinliang David Li 66bdfca77a [PartialInlining] Profile based cost analysis
Implemented frequency based cost/saving analysis
and related options.

The pass is now in a state ready to be turne on
in the pipeline (in follow up).

Differential Revision: http://reviews.llvm.org/D32783

llvm-svn: 302967
2017-05-12 23:41:43 +00:00
Aditya Nandakumar 2a735421d1 [GISel]: Add a getConstantFPVRegVal utility
This might be useful across various GISel Passes

https://reviews.llvm.org/D33051

llvm-svn: 302964
2017-05-12 22:54:52 +00:00
Aditya Nandakumar 479ddd20fc [GISel]: Fix undefined behavior while accessing DefaultAction map
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case

llvm-svn: 302963
2017-05-12 22:43:58 +00:00
Eugene Zelenko 0cd7948876 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 302961
2017-05-12 22:25:07 +00:00
Andrew Kaylor b01e94ee8d [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31789

llvm-svn: 302957
2017-05-12 22:11:26 +00:00
Andrew Kaylor f7c864f89c [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31788

llvm-svn: 302956
2017-05-12 22:11:20 +00:00
Andrew Kaylor 3cd8c16d7f [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31787

llvm-svn: 302955
2017-05-12 22:11:12 +00:00
Craig Topper b1a71cac4b [APInt] Add early outs for a division by 1 to udiv/urem/udivrem
We already counted the number of bits in the RHS so its pretty cheap to just check if the RHS is 1.

Differential Revision: https://reviews.llvm.org/D33154

llvm-svn: 302953
2017-05-12 21:45:50 +00:00
Craig Topper 2579c7c69f [APInt] In udivrem, remember the bit width in a local variable so we don't reread it from the LHS which might be aliased with Quotient or Remainder.
This helped the compiler generate better code for the single word case. It was able to remember that the bit width was still a single word when it created the Remainder APInt and not create code for it possibly being multiword.

llvm-svn: 302952
2017-05-12 21:45:44 +00:00
Adrian Prantl 1fa362f811 LTO: Don't verify modules twice in verifyMergedModuleOnce
Differential Revision: https://reviews.llvm.org/D33140

llvm-svn: 302951
2017-05-12 21:38:32 +00:00
Changpeng Fang 161e8c39af AMDGPU/SI: Don't promote to vector if the load/store is volatile.
Summary:
  We should not change volatile loads/stores in promoting alloca to vector.

Reviewers:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D33107

llvm-svn: 302943
2017-05-12 20:31:12 +00:00
Simon Pilgrim a1978aaefd [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)
This fixes 47 of the 75 NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146.

Differential Revision: https://reviews.llvm.org/D33147

llvm-svn: 302942
2017-05-12 19:56:43 +00:00
Teresa Johnson 4cd12ce9a0 Remove ignore-empty-index-file option
Summary:
As discussed in the D32195 review thread and on IRC, remove this option
and replace with parameter, which will be set to true when invoked
from clang in the context of a ThinLTO distributed backend.

Reviewers: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D33133

llvm-svn: 302939
2017-05-12 19:32:11 +00:00
Dehao Chen 65dd23e273 Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

llvm-svn: 302938
2017-05-12 19:29:27 +00:00
Tim Shen 10c64e6aea [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC.
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.

However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.

The tests in shift_mask.ll still pass.

Reviewers: echristo, hfinkel, efriedma, iteratee

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33076

llvm-svn: 302937
2017-05-12 19:25:37 +00:00
Zachary Turner dd3a739d52 [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

llvm-svn: 302936
2017-05-12 19:18:12 +00:00
Geoff Berry ddbbf6416c [AArch64][Falkor] Refine modeling of multiply accumulate forwarding.
llvm-svn: 302933
2017-05-12 18:57:10 +00:00
Craig Topper 4bdd621e93 [APInt] Add an assert to check for divide by zero in udivrem. NFC
udiv and urem already had the same assert.

llvm-svn: 302931
2017-05-12 18:19:01 +00:00
Craig Topper 06da0816fd [APInt] Remove unnecessary checks of rhsWords==1 with lhsWords==1 from udiv and udivrem. NFC
At this point in the code rhsWords is guaranteed to be non-zero and less than or equal to lhsWords. So if lhsWords is 1, rhsWords must also be 1. urem alread had the check removed so this makes all 3 consistent.

llvm-svn: 302930
2017-05-12 18:18:57 +00:00
Simon Pilgrim b146e61828 Strip trailing whitespace. NFCI.
llvm-svn: 302927
2017-05-12 17:42:36 +00:00
Craig Topper 8df66c602a [KnownBits] Add bit counting methods to KnownBits struct and use them where possible
This patch adds min/max population count, leading/trailing zero/one bit counting methods.

The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.

Differential Revision: https://reviews.llvm.org/D32931

llvm-svn: 302925
2017-05-12 17:20:30 +00:00
Reid Kleckner 5bc8543a36 [codeview] Fix assertion failure introduced in r295354 refactoring
CodeViewDebug sets Asm to nullptr to disable debug info generation.  You
can get a .ll file like no-cus.ll from 'clang -gcodeview -g0', which
happens in the ubsan test suite.

llvm-svn: 302923
2017-05-12 17:02:40 +00:00
Tom Stellard a0d67c748a AMDGPU/GlobalISel: Mark 32-bit integer constants as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33115

llvm-svn: 302919
2017-05-12 16:46:46 +00:00
James Y Knight d4e1b00e7c [SPARC] Support 'f' and 'e' inline asm constraints.
Based on patch by Patrick Boettcher and Chris Dewhurst.

Differential Revision: https://reviews.llvm.org/D29116

llvm-svn: 302911
2017-05-12 15:59:10 +00:00
Davide Italiano c43a9f80ed [NewGVN] Improve debug output a bit. NFCI.
While debugging a predicate info problem, I noticed this was missing
a newline, making the debug output slightly less readable.

llvm-svn: 302908
2017-05-12 15:28:12 +00:00
Simon Pilgrim eabf6fc4b5 [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
llvm-svn: 302907
2017-05-12 15:26:50 +00:00
Davide Italiano b60f6e0550 [NewGVN] Format an assertion and fix a typo. NFCI.
llvm-svn: 302906
2017-05-12 15:25:56 +00:00
Davide Italiano 41f5c7bcba [NewGVN] Don't incorrectly reset the memory leader.
This code was missing a check for stores, so we were thinking the
congruency class didn't have any memory members, and reset the
memory leader.

Differential Revision:  https://reviews.llvm.org/D33056

llvm-svn: 302905
2017-05-12 15:22:45 +00:00
Simon Pilgrim f01c301f72 [DAGCombine] Use SelectionDAG::getZExtOrTrunc helper. NFCI.
llvm-svn: 302897
2017-05-12 13:22:12 +00:00
Simon Pilgrim a6ed1b2f12 Use SDValue::getOperand() helper. NFCI.
llvm-svn: 302896
2017-05-12 13:20:24 +00:00
Simon Pilgrim 7f03231cc6 Use SDValue::getOperand() helper. NFCI.
llvm-svn: 302894
2017-05-12 13:08:45 +00:00
Leslie Zhai a1149e01d2 [AVR] Migrate to new StructType::get owing to Supress all uses of LLVM_END_WITH_NULL
Reviewers: dylanmckay, jroelofs, RKSimon, serge-sans-paille

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D33119

llvm-svn: 302885
2017-05-12 09:08:03 +00:00
Serguei Katkov 63c9c81152 [BPI] Ignore remainder while distributing the remaining probability from unreachanble
This is a follow up patch for https://reviews.llvm.org/rL300440
to address a comment.

To make implementation to be consistent with other cases we just
ignore the remainder after distribution of remaining probability between
reachable edges.

If we reduced the probability of some edges coming to unreachable
blocks we should distribute the remaining part across other edges
coming to reachable blocks to satisfy the condition that sum of all
probabilities should be equal to one. If this remaining part is not
divided by number of "reachable" edges then we get this remainder.
This remainder probability should be pretty small. Other cases just ignore
if the sum of probabilities is not equal to one so we do the same.

Reviewers: chandlerc, sanjoy, vsk, junbuml, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D32124

llvm-svn: 302883
2017-05-12 07:50:06 +00:00
Craig Topper 8769403d49 [APInt] Fix a case where udivrem might delete and create a new allocation instead of reusing the original.
llvm-svn: 302882
2017-05-12 07:21:09 +00:00
Jonas Paulsson d1ec738502 Handle a COPY with undef source operand in LowerCopy()
Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy()
with an undef source operand. It is not possible for the target to handle
this, as this flag is not passed to TII->copyPhysReg().

This patch solves this by treating such a COPY as an identity COPY.

Review: Matthias Braun
https://reviews.llvm.org/D32892

llvm-svn: 302877
2017-05-12 06:32:03 +00:00
Mikael Holmen ce3ec4519b [IfConversion] Keep the CFG updated incrementally in IfConvertTriangle
Summary:
Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot
always be trusted) at the end to fixup the CFG we keep the CFG updated as
we go along and remove or add branches and merge blocks.

This way we won't have any problems if the involved MBBs contain
unanalyzable instructions.

This fixes PR32721.

In that case we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

Reviewers: kparzysz, iteratee, MatzeB

Reviewed By: iteratee

Subscribers: aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33037

llvm-svn: 302876
2017-05-12 06:28:58 +00:00
Chandler Carruth d869b18826 [PM/Unswitch] Teach the new simple loop unswitch to handle loop
invariant PHI inputs and to rewrite PHI nodes during the actual
unswitching.

The checking is quite easy, but rewriting the PHI nodes is somewhat
surprisingly challenging. This should handle both branches and switches.

I think this is now a full featured trivial unswitcher, and more full
featured than the trivial cases in the old pass while still being (IMO)
somewhat simpler in how it works.

Next up is to verify its correctness in more widespread testing, and
then to add non-trivial unswitching.

Thanks to Davide and Sanjoy for the excellent review. There is one
remaining question that I may address in a follow-up patch (see the
review thread for details) but it isn't related to the functionality
specifically.

Differential Revision: https://reviews.llvm.org/D32699

llvm-svn: 302867
2017-05-12 02:19:59 +00:00
Craig Topper a92fd0bebb [APInt] Add a utility method to change the bit width and storage size of an APInt.
Summary:
This adds a resize method to APInt that manages deleting/allocating storage for an APInt and changes its bit width. Use this to simplify code in copy assignment and divide.

The assignment code in particular was overly complicated. Treating every possible case as a separate implementation. I'm also pretty sure the clearUnusedBits code at the end was unnecessary. Since we always copying whole words from the source APInt. All unused bits should be clear in the source.

Reviewers: hans, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33073

llvm-svn: 302863
2017-05-12 01:46:01 +00:00
David Blaikie 488393f822 DWARF: Avoid cross-CU references under Fission
Turns out that the Fission/Split DWARF package format (DWP) is currently
insufficient to handle cross-CU (ref_addr) references. So for now,
duplicate any debug info needed in these situations:
* inlined_subroutine's abstract_origin
* inlined variable's abstract_origin
* types

Keep the ref_addr behavior in general, including in the split DWARF
inline debug info that can be emitted into the object files for online
symbolication.
Keep a flag to use the old (ref_addr) behavior for testing ways of
addressing this limitation in the DWP tool (& for those not using DWP
packaging).

llvm-svn: 302858
2017-05-12 01:13:45 +00:00
Dean Michael Berris a7bbe4481a [XRay][lib] Support and temporarily skip over CustomEvent records
Summary:
In D30630 we will start writing custom event records. To avoid breaking
the tools that read the FDR mode records, we skip over these records.
To support these custom event records more effectively, we will have to
expose them in the trace loading API. Those changes will be forthcoming.

Reviewers: kpw, pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33032

llvm-svn: 302856
2017-05-12 01:06:41 +00:00
Peter Collingbourne f3e9f12296 CallGraph: Remove almost-unused field 'Root'.
llvm-svn: 302852
2017-05-11 23:59:05 +00:00
Dehao Chen 8d1c983f45 Change sample profile writer to make it deterministic.
Summary: This patch changes the function profile output order to be deterministic. In order to make it easier to understand, hottest functions (with most total samples) is ordered first.

Reviewers: dnovillo, davidxl

Reviewed By: dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33111

llvm-svn: 302851
2017-05-11 23:43:44 +00:00
Teresa Johnson 2a6b7991d4 Restrict call metadata based hotness detection to Sample PGO mode
Summary:
Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).

This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling, and to add in profile summary metadata
so that getProfileCount can get the summary.

Reviewers: davidxl, danielcdh

Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32877

llvm-svn: 302844
2017-05-11 23:18:05 +00:00
Reid Kleckner 43bbeb4c9f Issue diagnostics when returning FP values on x86_64 without SSE1/2
Avoid using report_fatal_error, because it will ask the user to file a
bug. If the user attempts to disable SSE on x86_64 and them use floating
point, that's a bug in their code, not a bug in the compiler.

This is just a start. There are other ways to crash the backend in this
configuration, but they should be updated to follow this pattern.

Differential Revision: https://reviews.llvm.org/D27522

llvm-svn: 302835
2017-05-11 22:43:02 +00:00
Guozhi Wei 22e7da9597 [PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0
According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.

This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.

Differential Revision: https://reviews.llvm.org/D32880

llvm-svn: 302834
2017-05-11 22:17:35 +00:00
Aditya Nandakumar fd484c443f [GISel]: Remove unused lambda captures. NFC
https://reviews.llvm.org/D33085

llvm-svn: 302831
2017-05-11 21:56:51 +00:00
Easwaran Raman c103ef89ee Decrease inlinecold-threshold to 45
I ran the test-suite (including SPEC 2006) in PGO mode comparing cold
thresholds of 225 and 45. Here are some stats on the text size:

Out of 904 tests that ran, 197 see a change in text size. The average
text size reduction (of all the 904 binaries) is 1.07%. Of the 197
binaries, 19 see a text size increase, as high as 18%, but most of them
are small single source benchmarks. There are 3 multisource benchmarks
with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On
the other side of the spectrum, 31 benchmarks see >10% size reduction
and 6 of them are MultiSource.

I haven't run the test-suite with other values of inlinecold-threshold.
Since we have a cold callsite threshold of 45, I picked this value.

Differential revision: https://reviews.llvm.org/D33106

llvm-svn: 302829
2017-05-11 21:36:28 +00:00
Reid Kleckner 45a13e1b54 De-virtualize TerminatorInst successor accessors
Use the same switch technique to eliminate virtual successor accessors
from TerminatorInst. Extracted from D31261.

NFC

llvm-svn: 302827
2017-05-11 21:26:55 +00:00
Reid Kleckner e7c7854cb1 De-virtualize GlobalValue
The erase/remove from parent methods now use a switch table to remove
themselves from their appropriate parent ilist.

The copyAttributesFrom method is now completely non-virtual, since we
only ever copy attributes from a global of the appropriate type.

Pre-requisite to de-virtualizing Value to save a vptr
(https://reviews.llvm.org/D31261).

NFC

llvm-svn: 302823
2017-05-11 21:14:29 +00:00
Chad Rosier aeffffdb44 [AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD.
Differential Revision: http://reviews.llvm.org/D33101.

llvm-svn: 302822
2017-05-11 20:07:24 +00:00
Davide Italiano 0dcc015a81 [AMDGPU] Placate unused variable warning in release builds.
llvm-svn: 302821
2017-05-11 19:58:52 +00:00
Vadzim Dambrouski 38e30197c3 [MSP430] Generate EABI-compliant libcalls
Updates the MSP430 target to generate EABI-compatible libcall names.
As a byproduct, adjusts the hardware multiplier options available in
the MSP430 target, adds support for promotion of the ISD::MUL operation
for 8-bit integers, and correctly marks R11 as used by call instructions.

Patch by Andrew Wygle.

Differential Revision: https://reviews.llvm.org/D32676

llvm-svn: 302820
2017-05-11 19:56:14 +00:00
Davide Italiano 36acbc716d [LiveVariables] Switch Kill/Defs sets to be DenseSet(s).
The testcase in PR32984 shows a non linear compile time increase
after a change that made the LoopUnroll pass more aggressive
(increasing the threshold).

My profiling shows all the time of PHI elimination goes to
llvm::LiveVariables::addNewBlock. This is because we keep
Defs/Kills registers in a SmallSet and vfind(const T &V); is O(N).

Switching to a DenseSet reduces the time spent in the pass from
297 seconds to 97 seconds. Profiling still shows a lot of time is
spent iterating the data structure, so I guess there's room for
improvement.

Dan tells me GCC uses real set operations for live registers and
it takes no-time on this testcase. Matthias points out we might
want to switch all this to LiveIntervalAnalysis so it's not entirely
sure if a rewrite is worth it.

Differential Revision:  https://reviews.llvm.org/D33088

llvm-svn: 302819
2017-05-11 19:37:43 +00:00
Craig Topper dbd6219f81 [APInt] Remove an APInt copy from the return of APInt::multiplicativeInverse.
llvm-svn: 302816
2017-05-11 18:40:53 +00:00
Craig Topper 3fbecadab6 [APInt] Fix typo in comment. NFC
llvm-svn: 302815
2017-05-11 17:57:43 +00:00
Matt Arsenault 47ccafe787 AMDGPU: Remove tfe bit from flat instruction definitions
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.

Additionally actually using it requires changing the output register
class, which wasn't done anyway.

llvm-svn: 302814
2017-05-11 17:38:33 +00:00
Matt Arsenault bf5482e4bb AMDGPU: Pull fneg out of extract_vector_elt
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.

llvm-svn: 302813
2017-05-11 17:26:25 +00:00
Stanislav Mekhanoshin 33a97ec4ed [AMDGPU] Fix incorrect register pressure calculation
Earlier fix D32572 introduced a bug where live-ins were calculated
for basic block instead of scheduling region. This change fixes it.

Differential Revision: https://reviews.llvm.org/D33086

llvm-svn: 302812
2017-05-11 17:16:55 +00:00
Adam Nemet 0aca09fc6c [SLP] Emit optimization remarks
The approach I followed was to emit the remark after getTreeCost concludes
that SLP is profitable.  I initially tried emitting them after the
vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely
remember missing a few cases for example in HorizontalReduction::tryToReduce.

ORE is placed in BoUpSLP so that it's available from everywhere (notably
HorizontalReduction::tryToReduce).

We use the first instruction in the root bundle as the locator for the remark.
In order to get a sense how far the tree is spanning I've include the size of
the tree in the remark.  This is not perfect of course but it gives you at
least a rough idea about the tree.  Then you can follow up with -view-slp-tree
to really see the actual tree.

llvm-svn: 302811
2017-05-11 17:06:17 +00:00
Nemanja Ivanovic 96c3d626a2 [PowerPC] Eliminate integer compare instructions - vol. 1
This patch is the first in a series of patches to provide code gen for
doing compares in GPRs when the compare result is required in a GPR.

It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64
extensions. This first patch handles equality comparison on i32 operands with
the result sign or zero extended.

Differential Revision: https://reviews.llvm.org/D31847

llvm-svn: 302810
2017-05-11 16:54:23 +00:00
Simon Pilgrim 6faddcbd07 [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
llvm-svn: 302808
2017-05-11 16:40:44 +00:00
Hans Wennborg 905da7458b Fix -DLLVM_ENABLE_THREADS=OFF build after r302748
llvm-svn: 302806
2017-05-11 15:32:47 +00:00
Javed Absar f3d7904d20 [IR] Allow attributes with global variables
This patch extends llvm-ir to allow attributes to be set on global variables.
An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053100.html
A key part of that proposal was to extend LLVM-IR to carry attributes on global variables.
This generic feature could be useful for multiple purposes.
In our present context, it would be useful to carry user specified sections for bss/rodata/data.

Reviewed by: Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D32009

llvm-svn: 302794
2017-05-11 12:28:08 +00:00
Igor Breger a44fc83d9f [GlobalISel][X86] Remove hand-written G_FADD/F_SUB selection.
Now it handle by TableGen.

llvm-svn: 302793
2017-05-11 12:15:03 +00:00
Ayal Zaks 58b28d549a [LV] Refactor ILV.vectorize{Loop}() by introducing LVP.executePlan(); NFC
Introduce LoopVectorizationPlanner.executePlan(), replacing ILV.vectorize() and
refactoring ILV.vectorizeLoop(). Method collectDeadInstructions() is moved from
ILV to LVP. These changes facilitate building VPlans and using them to generate
code, following https://reviews.llvm.org/D28975 and its tentative breakdown.

Method ILV.createEmptyLoop() is renamed ILV.createVectorizedLoopSkeleton() to
improve clarity; it's contents remain intact.

Differential Revision: https://reviews.llvm.org/D32200

llvm-svn: 302790
2017-05-11 11:36:33 +00:00
Alexander Potapenko a658ae8fe2 [msan] Fix PR32842
It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless.
This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero).
Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0.

This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact.

For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code:

  orq     %rcx, %rax
  jne     .LBB0_6
, instead of:

  orl     %ecx, %eax
  testb   $1, %al
  jne     .LBB0_6

llvm-svn: 302787
2017-05-11 11:07:48 +00:00
Chandler Carruth 97500a9918 [x86] Fix a failure to select with AVX-512 when the type legalizer
manages to form a VSELECT with a non-i1 element type condition. Those
are technically allowed in SDAG (at least, the generic type legalization
logic will form them and I wouldn't want to try to audit everything te
preclude forming them) so we need to be able to lower them.

This isn't too hard to implement. We mark VSELECT as custom so we get
a chance in C++, add a fast path for i1 conditions to get directly
handled by the patterns, and a fallback when we need to manually force
the condition to be an i1 that uses the vptestm instruction to turn
a non-mask into a mask.

This, unsurprisingly, generates awful code. But it at least doesn't
crash. This was actually impacting open source packages built with LLVM
for AVX-512 in the wild, so quickly landing a patch that at least stops
the immediate bleeding.

I think I've found where to fix the codegen quality issue, but less
confident of that change so separating it out from the thing that
doesn't change the result of any existing test case but causes mine to
not crash.

llvm-svn: 302785
2017-05-11 10:52:16 +00:00
Simon Pilgrim a4a13a0da0 Strip trailing whitespace. NFCI.
llvm-svn: 302784
2017-05-11 10:03:05 +00:00
Diana Picus 9cfbc6d94f [ARM][GlobalISel] Legalize narrow scalar ops by widening
This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB
and MUL to 32 bits since we only have TableGen patterns for 32 bits.
See the commit message for r292827 for more details.

At this point we could just remove some of the tests for regbankselect
and instruction-select, since we're not going to see any narrow
operations at those levels anymore. Instead I decided to update them
with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences
generated by the legalizer.

llvm-svn: 302782
2017-05-11 09:45:57 +00:00
Serge Guelton f4dc59ba8e Remove spurious cast of nullptr. NFC.
Conversion rules allow automatic casting of nullptr to any pointer type.

llvm-svn: 302780
2017-05-11 08:53:00 +00:00
Serge Guelton 1b421c259f Remove now useless trailing nullptr in StructType::get
llvm-svn: 302779
2017-05-11 08:46:02 +00:00
Diana Picus 657bfd3302 [ARM][GlobalISel] Support for G_ANYEXT
G_ANYEXT can be introduced by the legalizer when widening scalars. Add
support for it in the register bank info (same mapping as everything
else) and in the instruction selector.

When selecting it, we treat it as a COPY, just like G_TRUNC. On this
occasion we get rid of some assertions in selectCopy so we can reuse it.
This shouldn't be a problem at the moment since we're not supporting any
complicated cases (e.g. FPR, different register banks). We might want to
separate the paths when we do.

llvm-svn: 302778
2017-05-11 08:28:31 +00:00
Igor Breger c7b5977bb1 [GlobalISel][X86] G_ICMP support.
Summary: support G_ICMP for scalar types i8/i16/i64.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, llvm-commits, krytarowski

Differential Revision: https://reviews.llvm.org/D32995

llvm-svn: 302774
2017-05-11 07:17:40 +00:00
Craig Topper c59ced36aa [APInt] Remove an unneeded extra temporary APInt from toString.
Turns out udivrem can write its output to the same location as one of its inputs so the extra temporary isn't needed.

llvm-svn: 302772
2017-05-11 07:10:43 +00:00
Craig Topper b3c1f56737 [APInt] Use negate() instead of copying an APInt to negate it and then writing back over the original value.
llvm-svn: 302770
2017-05-11 07:02:04 +00:00
Craig Topper e3e1a35f68 [SCEV] Reduce possible APInt allocations a bit.
llvm-svn: 302769
2017-05-11 06:48:54 +00:00
Craig Topper 6694a4e6d6 [SCEV] Remove unneeded 'using namespace APIntOps'.
llvm-svn: 302768
2017-05-11 06:48:51 +00:00
Igor Breger db75455990 [X86] Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp. NFC
Summary:
Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp so it can be used by GloabalIsel instruction selector.
This is a pre-commit for a patch I'm working on to support G_ICMP. NFC.

Reviewers: zvi, guyblank, delena

Reviewed By: guyblank, delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33038

llvm-svn: 302767
2017-05-11 06:36:37 +00:00
Zachary Turner 7a2d56813d Final (hopefully) fix for the build bots.
This time it actually occurred to me to change the #defines
to actually test the pre-processed out codepath.  Hopefully
this time it works.

llvm-svn: 302752
2017-05-11 00:22:18 +00:00
Zachary Turner 20c8e9192d Try again to fix the buildbots.
TaskGroup and Latch need to be in llvm::parallel::detail, not
in llvm::detail.

llvm-svn: 302751
2017-05-11 00:18:52 +00:00
Zachary Turner bfb8e189d2 Fix build errors with Parallel.
llvm-svn: 302749
2017-05-11 00:09:30 +00:00
Zachary Turner 3a57fbd6db [Support] Move Parallel algorithms from LLD to LLVM.
Differential Revision: https://reviews.llvm.org/D33024

llvm-svn: 302748
2017-05-11 00:03:52 +00:00
Kostya Serebryany ae0317e4a9 [libFuzzer] fix a compiler warning
llvm-svn: 302747
2017-05-10 23:59:03 +00:00
David L. Jones bbd97d273b Revert "[SDAG] Relax conditions under stores of loaded values can be merged"
This reverts r302712.

The change fails with ASAN enabled:

ERROR: AddressSanitizer: use-after-poison on address ... at ...
READ of size 2 at ... thread T0
  #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42
  #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3
  #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17
  #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33081

llvm-svn: 302746
2017-05-10 23:56:21 +00:00
Eugene Zelenko eba7e4ec55 [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 302744
2017-05-10 23:41:30 +00:00
Davide Italiano 1e8c59a8ac [PHIElimination] Use the same name for DEBUG_TYPE and pass name.
In an attempt to reduce the confusion.

llvm-svn: 302742
2017-05-10 23:13:26 +00:00
Sanjay Patel 40a87a909b [InstCombine] remove fold that swaps xor/or with constants; NFCI
// (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2)

This canonicalization was added at:
https://reviews.llvm.org/rL7264 

By moving xors out/down, we can more easily combine constants. I'm adding
tests that do not change with this patch, so we can verify that those kinds
of transforms are still happening.

This is no-functional-change-intended because there's a later fold:
// (X^C)|Y -> (X|Y)^C iff Y&C == 0
...and demanded-bits appears to guarantee that any fold that would have
hit the fold we're removing here would be caught by that 2nd fold.

Similar reasoning was used in:
https://reviews.llvm.org/rL299384

The larger motivation for removing this code is that it could interfere with 
the fix for PR32706:
https://bugs.llvm.org/show_bug.cgi?id=32706

Ie, we're not checking if the 'xor' is actually a 'not', so we could reverse
a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'. 

Differential Revision: https://reviews.llvm.org/D33050

llvm-svn: 302733
2017-05-10 21:33:55 +00:00
Matt Arsenault 3c5e4237c6 AMDGPU: Make some packed shuffles free
VOP3P instructions can encode access to either
half of the register.

llvm-svn: 302730
2017-05-10 21:29:33 +00:00
Matt Arsenault acdc7659cc AMDGPU: Add new subtarget features for gfx9 flat instructions
Flat instructions gain an immediate offset, and 2 new
sets of segment specific flat instructions are added.

llvm-svn: 302729
2017-05-10 21:19:05 +00:00
Craig Topper c51d05369a [ConstantRange] Fix the early out in ConstantRange::multiply for positive numbers to really do what the comment says
r271020 added an early out to skip the signed multiply portion of ConstantRange::multiply. The comment says we don't need to do signed multiply if the range is only positive numbers, but the implemented check only ensures that the start of the range is positive. It doesn't look at the end of the range.

This patch checks the end of the range instead. Because Upper is one more than the end we have to see if its positive or if its one past the last positive number.

llvm-svn: 302717
2017-05-10 20:01:48 +00:00
Craig Topper ef0114c4f0 [APInt] Add negate helper method to implement twos complement. Use it to shorten code.
llvm-svn: 302716
2017-05-10 20:01:38 +00:00
Davide Italiano dc435325a8 [NewGVN] Introduce a definesNoMemory() helper and use it.
This is nice as is, but it will be used in my next patch to
fix a bug. Suggested by Daniel Berlin.

llvm-svn: 302714
2017-05-10 19:57:43 +00:00
Nirav Dave a38c049fc5 [SDAG] Relax conditions under stores of loaded values can be merged
Summary:

Allow consecutive stores whose values come from consecutive loads to
merged in the presense of other uses of the loads. Previously this was
disallowed as in general the merged load cannot be shared with the
other uses. Merging N stores into 1 may cause as many as N redundant
loads. However in the context of caching this should have neglible
affect on memory pressure and reduce instruction count making it
almost always a win.

Fixes PR32086.

Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30471

llvm-svn: 302712
2017-05-10 19:53:41 +00:00
Teresa Johnson 94624aca2a Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder
This fixes a ubsan bot failure after r302597, which made getProfileCount
non-static, but ended up invoking it on a null ProfileSummaryInfo object
in some cases from buildModuleSummaryIndex.

Most testing passed because the non-static getProfileCount currently
doesn't access any member variables, but I found this when testing a
follow on patch (D32877) that adds a member variable access.

llvm-svn: 302705
2017-05-10 18:52:16 +00:00
Craig Topper ecb97da108 [APInt] Make toString use udivrem instead of calling the divide helper method directly. Do a better job of reusing allocations while looping. NFCI
This lets toString take advantage of the degenerate case checks in udivrem and is just generally cleaner.

One minor downside of this is that the divisor APInt now needs to be the same size as Tmp which requires an additional allocation. But we were doing a poor job of reusing allocations before so the new code should still be an improvement.

llvm-svn: 302704
2017-05-10 18:15:24 +00:00
Craig Topper 6271bc7146 [APInt] Use uint32_t instead of unsigned for the storage type throughout the divide code. Use Lo_32/Hi_32/Make_64 helpers instead of casts and shifts. NFCI
llvm-svn: 302703
2017-05-10 18:15:20 +00:00
Craig Topper f86b9d5063 [APInt] Use getRawData to slightly simplify some code.
llvm-svn: 302702
2017-05-10 18:15:17 +00:00
Craig Topper 93eabae4aa [APInt] Remove check for single word since single word was handled earlier in the function. NFC
llvm-svn: 302701
2017-05-10 18:15:14 +00:00
Amaury Sechet 197685c6d8 Small refactoring in DAGCombine. NFC
llvm-svn: 302699
2017-05-10 17:58:28 +00:00
Quentin Colombet 307e29124c [AArch64][RegisterBankInfo] Change the default mapping of fp stores.
For stores, check if the stored value is defined by a floating point
instruction and if yes, we return a default mapping with FPR instead
of GPR.

llvm-svn: 302679
2017-05-10 15:19:41 +00:00
Amara Emerson 816542ceb3 [AArch64] Enable use of reduction intrinsics.
The new experimental reduction intrinsics can now be used, so I'm enabling this
for AArch64. We will need this for SVE anyway, so it makes sense to do this for
NEON reductions as well.

The existing code to match shufflevector patterns are replaced with a direct
lowering of the reductions to AArch64-specific nodes. Tests updated with the
new, simpler, representation.

Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 302678
2017-05-10 15:15:38 +00:00
Ulrich Weigand 93b369ed11 [SystemZ] Add miscellaneous instructions
This adds a few missing instructions for the assembler and
disassembler.  Those should be the last missing general-
purpose (Chapter 7) instructions for the z10 ISA.

llvm-svn: 302667
2017-05-10 14:20:15 +00:00
Ulrich Weigand d3604dc72c [SystemZ] Add missing arithmetic instructions
This adds the remaining general arithmetic instructions
for assembler / disassembler use.  Most of these are not
useful for codegen; a few might be, and those are listed
in the README.txt for future improvements.

llvm-svn: 302665
2017-05-10 14:18:47 +00:00
Michael Zuckerman ff29879f2e chang type from 'int' to 'size_t'. This will fix revision number 302652
llvm-svn: 302660
2017-05-10 14:00:57 +00:00
Sanjay Patel 2e069f250a [InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1
This is another step towards favoring 'not' ops over random 'xor' in IR:
https://bugs.llvm.org/show_bug.cgi?id=32706

This transformation may have occurred in longer IR sequences using computeKnownBits,
but that could be much more expensive to calculate.

As the scalar result shows, we do not currently favor 'not' in all cases. The 'not'
created by the transform is transformed again (unnecessarily). Vectors don't have
this problem because vectors are (wrongly) excluded from several other combines.

llvm-svn: 302659
2017-05-10 13:56:52 +00:00
Serge Guelton 778ece82ae Use explicit false instead of casted nullptr. NFC.
llvm-svn: 302656
2017-05-10 13:24:17 +00:00
Michael Zuckerman 1f1a912c60 [LLVM][inline-asm] Altmacro string escape character '!'
This patch is the fourth patch in a series of reviews for the Altmacro feature. 
This patch introduces a new escape character '!' and it depends on D32701.

according to https://sourceware.org/binutils/docs/as/Altmacro.html:
"single-character string escape
To include any single character literally in a string (even if the character would otherwise have some special meaning), you can prefix the character with !' (an exclamation mark). For example, you can write <4.3 !> 5.4!!>' to get the literal text `4.3 > 5.4!'. "

Differential Revision: https://reviews.llvm.org/D32792

llvm-svn: 302652
2017-05-10 13:08:11 +00:00
Simon Pilgrim cd4d913336 [DAGCombiner] Dropped explicit (sra 0, x) -> 0 and (sra -1, x) -> 0 folds.
These are both handled (and tested) by the earlier ComputeNumSignBits == EltSizeInBits fold.

llvm-svn: 302651
2017-05-10 13:06:26 +00:00
Mikael Holmen 21c867c26e [IfConversion] Add missing check in IfConversion/canFallThroughTo
Summary:
When trying to figure out if MBB could fallthrough to ToMBB (possibly by
falling through a bunch of other MBBs) we didn't actually check if there
was fallthrough between the last two blocks in the chain.

Reviewers: kparzysz, iteratee, MatzeB

Reviewed By: kparzysz, iteratee

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D32996

llvm-svn: 302650
2017-05-10 13:06:13 +00:00
Jonas Paulsson 11d251c05c [SystemZ] Implement getRepRegClassFor()
This method must return a valid register class, or the list-ilp isel
scheduler will crash. For MVT::Untyped nullptr was previously returned, but
now ADDR128BitRegClass is returned instead. This is needed just as long as
list-ilp (and probably also list-hybrid) is still there.

Review: Ulrich Weigand, A Trick
https://reviews.llvm.org/D32802

llvm-svn: 302649
2017-05-10 13:03:25 +00:00
Dmitry Preobrazhensky da61a7f9ef [AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output
See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927

Reviewers: vpykhtin, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D32913

llvm-svn: 302648
2017-05-10 13:00:28 +00:00
Ulrich Weigand c7eb5a95b2 [SystemZ] Add decimal integer instructions
This adds the set of decimal integer (BCD) instructions for
assembler / disassembler use.

llvm-svn: 302646
2017-05-10 12:42:45 +00:00
Ulrich Weigand 33a441adf9 [SystemZ] Add crypto instructions
This adds the set of message-security assist instructions for
assembler / disassembler use.

llvm-svn: 302645
2017-05-10 12:42:00 +00:00
Ulrich Weigand 435cd1a3e4 [SystemZ] Add translate/convert instructions
This adds the set of character-set translate and convert instructions
for assembler / disassembler use.

llvm-svn: 302644
2017-05-10 12:41:12 +00:00
Ulrich Weigand eb17909536 [SystemZ] Add missing memory/string instructions
This adds a number of missing memory and string instructions
for assembler / disassembler use.

llvm-svn: 302643
2017-05-10 12:40:15 +00:00
Simon Pilgrim c29af824bf [DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0
llvm-svn: 302641
2017-05-10 12:34:27 +00:00
Chandler Carruth f3bd8ddedb Revert r301950: SpeculativeExecution: Stop using whitelist for costs
This pass doesn't correctly handle testing for when it is legal to hoist
arbitrary instructions. The whitelist happens to make it safe, so before
it is removed the pass's legality checks will need to be enhanced.

Details have been added to the code review thread for the patch.

llvm-svn: 302640
2017-05-10 12:30:07 +00:00
Martin Storsjo 605b0466ea [AArch64] Fix a comment to match the code. NFC.
For the ELF case, the default/preferred form is the generic one, not
the short one as used for Apple - fix the comment to say so. Currently
it is a copy-paste typo.

Make the comments on the darwin default a bit more verbose.

Use enum names instead of literal 0/1 to further increase readability
and reduce fragility.

Differential Revision: https://reviews.llvm.org/D32963

llvm-svn: 302634
2017-05-10 10:51:32 +00:00
Amara Emerson 836b0f48c1 Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular
intrinsic to the shuffevector sequence.

Differential Revision: https://reviews.llvm.org/D32245

llvm-svn: 302631
2017-05-10 09:42:49 +00:00
Craig Topper a584af5c8e [APInt] Fix indentation of tcDivide. Combine variable declaration and initialization.
llvm-svn: 302626
2017-05-10 07:50:17 +00:00
Craig Topper 62de039bd1 [APInt] Use getNumWords function in udiv/urem/udivrem instead of reimplementinging it.
llvm-svn: 302625
2017-05-10 07:50:15 +00:00
Igor Breger fda31e64e0 [GlobalISel][X86] G_ZEXT i1 to i32/i64 support.
Summary: Support G_ZEXT i1 to i32/i64 instruction selection.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D32965

llvm-svn: 302623
2017-05-10 06:52:58 +00:00
Mikael Holmen af14a21e50 [UnreachableBlockElim] Check return value of constrainRegClass().
Summary:
MachineRegisterInfo::constrainRegClass() can fail if two register classes
don't have a common subclass or if the register class doesn't contain
enough registers. Check the return value before trying to remove Phi nodes,
and if we can't constrain, we output a COPY instead of simply replacing
registers.

Reviewers: kparzysz, david2050, wmi

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32999

llvm-svn: 302622
2017-05-10 06:33:43 +00:00
Ahmed Bougacha a09ff59cc2 [CodeGen] Don't require AA in TwoAddress at -O0.
This is a follow-up to r302611, which moved an -O0 computation of DT
from SDAGISel to TwoAddress.

Don't use it here either, and avoid computing it completely.  The only
use was forwarding the analysis as an optional argument to utility
functions.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302612
2017-05-10 00:56:00 +00:00
Ahmed Bougacha 604526fe87 [CodeGen] Don't require AA in SDAGISel at -O0.
Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load).  But I think that's desirable no matter
what.

Note that this alone doesn't result in less DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG.  That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302611
2017-05-10 00:39:30 +00:00
Ahmed Bougacha 8c358e3016 [CodeGen] Compute DT/LI lazily in SafeStackLegacyPass. NFC.
We currently require SCEV, which requires DT/LI.  Those are expensive to
compute, but the pass only runs for functions that have the safestack
attribute.

Compute DT/LI to build SCEV lazily, only when the pass is actually going
to transform the function.

Differential Revision: https://reviews.llvm.org/D31302

llvm-svn: 302610
2017-05-10 00:39:25 +00:00
Ahmed Bougacha 00d6822278 [CodeGen] Split SafeStack into a LegacyPass and a utility. NFC.
This lets the pass focus on gathering the required analyzes, and the
utility class focus on the transformation.

Differential Revision: https://reviews.llvm.org/D31303

llvm-svn: 302609
2017-05-10 00:39:22 +00:00
Sam Clegg 41db519ba6 [WebAssembly] Fix build error in wasm YAML code
This warning didn't show up on my local build
but is causing the bots to fail.  Seems like a
bad idea to have types and variables with the
same name anyhow.

Differential Revision: https://reviews.llvm.org/D33022

llvm-svn: 302606
2017-05-10 00:14:04 +00:00
Sanjay Patel 4133d4a56e [InstCombine] add helper function for add X, C folds; NFCI
llvm-svn: 302605
2017-05-10 00:07:16 +00:00
Sam Clegg 2ffff5af85 [WebAssembly] Improve libObject support for wasm imports and exports
Previously we had only supported the importing and
exporting of functions and globals.

Also, add usefull overload of getWasmSymbol() and
getNumberOfSymbols() in support of lld port.

Differential Revision: https://reviews.llvm.org/D33011

llvm-svn: 302601
2017-05-09 23:48:41 +00:00
Easwaran Raman f5f9160072 [ProfileSummary] Make getProfileCount a non-static member function.
This change is required because the notion of count is different for
sample profiling and getProfileCount will need to determine the
underlying profile type.

Differential revision: https://reviews.llvm.org/D33012

llvm-svn: 302597
2017-05-09 23:21:10 +00:00
Peter Collingbourne c3d677f9d9 FunctionImport: Simplify function llvm::thinLTOInternalizeModule. NFCI.
llvm-svn: 302595
2017-05-09 22:43:31 +00:00
Lang Hames c936ac7f37 [ExecutionEngine] Make RuntimeDyld::MemoryManager responsible for tracking EH
frames.

RuntimeDyld was previously responsible for tracking allocated EH frames, but it
makes more sense to have the RuntimeDyld::MemoryManager track them (since the
frames are allocated through the memory manager, and written to memory owned by
the memory manager). This patch moves the frame tracking into
RTDyldMemoryManager, and changes the deregisterFrames method on
RuntimeDyld::MemoryManager from:

void deregisterEHFrames(uint8_t *Addr, uint64_t LoadAddr, size_t Size);

to:

void deregisterEHFrames();

Separating this responsibility will allow ORC to continue to throw the
RuntimeDyld instances away post-link (saving a few dozen bytes per lazy
function) while properly deregistering frames when modules are unloaded.

This patch also updates ORC to call deregisterEHFrames when modules are
unloaded. This fixes a bug where an exception that tears down the JIT can then
unwind through dangling EH frames that have been deallocated but not
deregistered, resulting in UB.

For people using SectionMemoryManager this should be pretty much a no-op. For
people with custom allocators that override registerEHFrames/deregisterEHFrames,
you will now be responsible for tracking allocated EH frames.

Reviewed in https://reviews.llvm.org/D32829

llvm-svn: 302589
2017-05-09 21:32:18 +00:00
Keno Fischer 06f962c1e8 [GVN] Fix a crash on encountering non-integral pointers
Summary:
This fixes the immediate crash caused by introducing an incorrect inttoptr
before attempting the conversion. There may still be a legality
check missing somewhere earlier for non-integral pointers, but this change
seems necessary in any case.

Reviewers: sanjoy, dberlin

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32623

llvm-svn: 302587
2017-05-09 21:07:20 +00:00
Stanislav Mekhanoshin 7e3794d5c3 [AMDGPU] Fixed typo in GCNRegPressure, NFC
VGRP -> VGPR, SGRP -> SGPR

llvm-svn: 302586
2017-05-09 20:50:04 +00:00
Zvi Rackover b483e28c77 DAGCombine: Combine shuffles of splat-shuffles
Summary: Reapply r299047, but this time handle correctly splat-masks with undef elements.

Reviewers: spatel, RKSimon, eli.friedman, andreadb

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31961

llvm-svn: 302583
2017-05-09 20:25:38 +00:00
Matthew Simpson 78fd46b230 [AArch64] Consider widening instructions in cost calculations
The AArch64 instruction set has a few "widening" instructions (e.g., uaddl,
saddl, uaddw, etc.) that take one or more doubleword operands and produce
quadword results. The operands are automatically sign- or zero-extended as
appropriate. However, in LLVM IR, these extends are explicit. This patch
updates TTI to consider these widening instructions as single operations whose
cost is attached to the arithmetic instruction. It marks extends that are part
of a widening operation "free" and applies a sub-target specified overhead
(zero by default) to the arithmetic instructions.

Differential Revision: https://reviews.llvm.org/D32706

llvm-svn: 302582
2017-05-09 20:18:12 +00:00
Sanjay Patel 7caaa79879 [InstCombine] clean up matchDeMorgansLaws(); NFCI
The motivation for getting rid of dyn_castNotVal is to allow fixing:
https://bugs.llvm.org/show_bug.cgi?id=32706

So this was supposed to be functional-change-intended for the case
of inverting constants and applying DeMorgan. However, I can't find 
any cases where that pattern will actually get to matchDeMorgansLaws()
because we have other folds in visitAnd/visitOr that do the same
thing. So this ends up just being a clean-up patch with slight efficiency
improvement, but no-functional-change-intended.

llvm-svn: 302581
2017-05-09 20:05:05 +00:00
Davide Italiano b7a6698ae9 [NewGVN] Simplify a DEBUG() statement. NFCI.
llvm-svn: 302579
2017-05-09 20:02:48 +00:00
Reid Kleckner b5fced7324 [codeview] Check for a DIExpression offset for local variables
Fixes inalloca parameters, which previously all pointed to the same
offset. Extend the test to use llvm-readobj so that we can test the
offset in a readable way.

llvm-svn: 302578
2017-05-09 19:59:29 +00:00
Adrian Prantl c10d0e5ccd Make it illegal for two Functions to point to the same DISubprogram
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>

Differential Revision: https://reviews.llvm.org/D32975

This reapplies r302469 with a fix for a bot failure (reparentDebugInfo
now checks for the case the orig and new function are identical).

llvm-svn: 302576
2017-05-09 19:47:37 +00:00
Piotr Padlewski d979c1f806 NFC: refactor replaceDominatedUsesWith
Summary:
Since I will post patch with some changes to
replaceDominatedUsesWith, it would be good to avoid
duplicating code again.

Reviewers: davide, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32798

llvm-svn: 302575
2017-05-09 19:39:44 +00:00
Wolfgang Pieb 15fa44698c [DWARF] Fix a parsing issue with type unit headers.
Reviewers: dblaikie

Differential Revision: https://reviews.llvm.org/D32987

llvm-svn: 302574
2017-05-09 19:38:38 +00:00
Eric Beckmann 674deed94e Fix the Endianness bug by adding the little endian UTF marker.
Summary: Quick fix

Reviewers: zturner, uweigand

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33014

llvm-svn: 302573
2017-05-09 19:35:45 +00:00
Serge Guelton e38003f839 Suppress all uses of LLVM_END_WITH_NULL. NFC.
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.

Differential Revision: https://reviews.llvm.org/D32541

llvm-svn: 302571
2017-05-09 19:31:13 +00:00
Jacques Pienaar 0dbcc34f6b [lanai] Add computeKnownBitsForTargetNode for Lanai.
Summary: computeKnownBitsForTargetNode was not defined for Lanai which resulted in additional AND's with 0x1 for the output of SETCC instructions.

Reviewers: eliben, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29605

llvm-svn: 302568
2017-05-09 18:35:26 +00:00
Davide Italiano 63998ec3c8 [NewGVN] Explain why sorting by pointer values doesn't introduce non-determinism.
Thanks to Eli for pointing out in a post-commit review comment.

llvm-svn: 302566
2017-05-09 18:29:37 +00:00
Ulrich Weigand d94e6d9ac1 [SystemZ] Support missing relocation types in RuntimeDyldELF
Handle some more relocation types in
RuntimeDyldELF::resolveSystemZRelocation

This fixes a number of failing LLDB test cases.

llvm-svn: 302565
2017-05-09 18:27:39 +00:00
Sam Clegg a0efcfe92b [WebAssembly] Fix validation of start function
The check for valid start function was inverted.  Added a new
test in test/Object to check this case and fixed the existing
tests in for ObjectYAML.

Differential Revision: https://reviews.llvm.org/D32986

llvm-svn: 302560
2017-05-09 17:51:38 +00:00
Krzysztof Parzyszek fd5a81594e [RegScavenger] Rangify a loop, NFC
llvm-svn: 302554
2017-05-09 17:16:52 +00:00
Davide Italiano d6bb8cab03 [NewGVN] Fix a consistent order for phi nodes operands.
The way we currently define congruency for two PHIExpression(s) is:

1) The operands to the phi functions are congruent
2) The PHIs are defined in the same BasicBlock.

NewGVN works under the assumption that phi operands are in predecessor
order, or at least in some consistent order. OTOH, is valid IR:

patatino:
  %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ]
  %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ]
  br label %end

and the in-memory representations of the two SSA registers have an
inconsistent order. This violation of NewGVN assumptions results into
two PHIs found congruent when they're not. While we think it's useful
to have always a consistent order enforced, let's fix this in NewGVN
sorting uses in predecessor order before creating a PHI expression.

Differential Revision:  https://reviews.llvm.org/D32990

llvm-svn: 302552
2017-05-09 16:58:28 +00:00
Craig Topper 0acb6654d3 [APInt] Remove return value from tcFullMultiply.
The description says it returns the number of words needed to represent the results. But the way it was coded it always returns (lhsWords + rhsWords) or (lhsWords + rhsWords - 1). But the result could be even smaller than that and it wouldn't tell you.

No one uses the result today so rather than try to fix it, just remove it.

llvm-svn: 302551
2017-05-09 16:47:33 +00:00
Daniel Berlin 6604a2ffbb NewGVN: Make all of symbolic evaluation logically const.
llvm-svn: 302550
2017-05-09 16:40:04 +00:00
Craig Topper f893d49f0c [X86] Add more patterns for BZHI isel
This patch adds more patterns that a reasonable person might write that can be compiled to BZHI.

This adds support for

(~0U >> (32 - b)) & a;

and

a << (32 - b) >> (32 - b);

This was inspired by the code in APInt::clearUnusedBits.

This can pass an index of 32 to the bzhi instruction which a quick test of Haswell hardware shows will not mask any bits. Though the description text in the Intel manual says the "index is saturated to OperandSize-1". The pseudocode in the same manual indicates no bits will be zeroed for this case.

I think this is still missing cases where the subtract portion is an 8-bit operation.

Differential Revision: https://reviews.llvm.org/D32616

llvm-svn: 302549
2017-05-09 16:32:11 +00:00
Sanjay Patel 6844e21f59 [InstCombineCasts] Fix checks in sext->lshr->trunc pattern.
The comment says to avoid the case where zero bits are shifted into the truncated value, 
but the code checks that the shift is smaller than the truncated value instead of the 
number of bits added by the sign extension. Fixing this allows a shift by more than the 
value size to be introduced, which is undefined behavior, so the shift is capped at the 
value size minus one, which has the expected behavior of filling the value with the sign 
bit.

Patch by Jacob Young!

Differential Revision: https://reviews.llvm.org/D32285

llvm-svn: 302548
2017-05-09 16:24:59 +00:00
Guy Blank 0c42d8c35b VX512] Only look at lower bit in constant scalar masks
for scalar masked instructions only the lower bit of the mask is relevant. so for constant masks we should either do an unmasked operation or no operation, depending on the value of the lower bit.
This patch handles cases where the lower bit is '1'.

Differential Revision: https://reviews.llvm.org/D32805

llvm-svn: 302546
2017-05-09 16:16:48 +00:00
Reid Kleckner 3a363fff7e Re-land "Use the frame index side table for byval and inalloca arguments"
This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544
2017-05-09 16:02:20 +00:00
Reid Kleckner 84075fddff Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This re-lands commit r302461. It was not the cause of PR32977.

llvm-svn: 302543
2017-05-09 16:01:47 +00:00
Tim Shen 04de70d3a7 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Aaron Ballman 3234647df6 Amend r302535; ifndef and ifdef are different, as it turns out.
llvm-svn: 302537
2017-05-09 15:12:03 +00:00
Aaron Ballman 06297e839a ARMRegisterBankInfo.h requires LLVM_BUILD_GLOBAL_ISEL to be defined. If it is not defined, then ARMGenRegisterBank.inc is not table generated and the inclusion of this header causes the build to fail.
llvm-svn: 302535
2017-05-09 14:59:48 +00:00
Hans Wennborg 66fb0d9768 Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram"
This caused PR32977.

Original commit message:

> Make it illegal for two Functions to point to the same DISubprogram
>
> As recently discussed on llvm-dev [1], this patch makes it illegal for
> two Functions to point to the same DISubprogram and updates
> FunctionCloner to also clone the debug info of a function to conform
> to the new requirement. To simplify the implementation it also factors
> out the creation of inlineAt locations from the Inliner into a
> general-purpose utility in DILocation.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
> <rdar://problem/31926379>
>
> Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302533
2017-05-09 14:44:15 +00:00
Anna Thomas 0691483435 [LV] Fix insertion point for shuffle vectors in first order recurrence
Summary:
In first order recurrence vectorization, when the previous value is a phi node, we need to
set the insertion point to the first non-phi node.
We can have the previous value being a phi node, due to the generation of new
IVs as part of trunc optimization [1].

[1] https://reviews.llvm.org/rL294967

Reviewers: mssimpso, mkuper

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32969

llvm-svn: 302532
2017-05-09 14:29:33 +00:00
Aaron Ballman f22f885b66 Removing a file that is not necessary (and was causing link diagnostics with MSVC 2015); NFC.
llvm-svn: 302531
2017-05-09 14:22:48 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Simon Dardis 659c43f11a Revert "[MIPS] Add support to match more patterns for DINS instruction"
This reverts commit rL302512. This broke the mips buildbots.

llvm-svn: 302526
2017-05-09 13:18:48 +00:00
Simon Pilgrim ca3a63a849 [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X)
Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign instead of using the PSRAD+PSHUFD.

Avoiding bitcasts this improves combines that utilize computeNumSignBits, permits memory folding and reduces pipe pressure. Although it does require a second register, given that this is a (cheap) zero register the impact is minimal.

Differential Revision: https://reviews.llvm.org/D32973

llvm-svn: 302525
2017-05-09 13:14:40 +00:00
Diana Picus e8da53f4e0 Revert "[Dwarf] Disable reference verification for now (PR32972)"
This reverts commit r302520 because it break the unit tests.

llvm-svn: 302524
2017-05-09 13:05:43 +00:00
Renato Golin 94d6c8fb36 [Dwarf] Disable reference verification for now (PR32972)
There is no other explanation about why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).

The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.

Commenting it out for now in hope the bots will go back green, but we
should keep looking for the real cause, and update bugzilla.

llvm-svn: 302520
2017-05-09 12:36:50 +00:00
Amara Emerson cf9daa33a7 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514
2017-05-09 10:43:25 +00:00
Nikolai Bozhenov b7bf386e80 [X86] Clang option -fuse-init-array has no effect when generating for MCU target
Reviewers: Eugene.Zelenko, dschuff, craig.topper

Reviewed By: craig.topper

Subscribers: ahatanak, aaboud, DavidKreitzer, llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D32543
Patch by AndreiGrischenko <andrei.l.grischenko@intel.com>

llvm-svn: 302513
2017-05-09 10:14:03 +00:00
Strahinja Petrovic 27ae4c3259 [MIPS] Add support to match more patterns for DINS instruction
This patch adds support for recognizing patterns to match
DINS instruction.

Differential Revision: https://reviews.llvm.org/D31465

llvm-svn: 302512
2017-05-09 10:02:00 +00:00
Diana Picus 95640a1c4d [ARM GlobalISel] Remove hand-written G_FADD selection
Remove the code selecting G_FADD - now that TableGen can handle more
opcodes, it's not needed anymore.

llvm-svn: 302511
2017-05-09 08:32:42 +00:00
Craig Topper ef02803bed [ConstantRange] Rewrite shl to avoid repeated calls to getUnsignedMax and avoid creating the min APInt until we're sure we need it. Use inplace shift operations.
llvm-svn: 302510
2017-05-09 07:04:04 +00:00
Craig Topper 79b7666f02 [ConstantRange] Combine the two adds max+1 in lshr into a single addition.
llvm-svn: 302509
2017-05-09 07:04:02 +00:00
Craig Topper 61729fd036 [ConstantRange] Use APInt::isNullValue in place of comparing with 0. The compiler should be able to generate slightly better code for the former. NFC
llvm-svn: 302508
2017-05-09 05:01:29 +00:00
Reid Kleckner 41bb94233b Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This reverts commit r302461.

It appears to be causing failures compiling gtest with debug info on the
Linux sanitizer bot. I was unable to reproduce the failure locally,
however.

llvm-svn: 302504
2017-05-09 01:57:44 +00:00
Teresa Johnson 720d9b4111 Fix code section prefix for proper layout
Summary:
r284533 added hot and cold section prefixes based on profile
information, to enable grouping of hot/cold functions at link time.
However, it used "cold" as the prefix for cold sections, but gold only
recognizes "unlikely" (which is used by gcc for cold sections).
Therefore, cold sections were not properly being grouped. Switch to
using "unlikely"

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32983

llvm-svn: 302502
2017-05-09 01:43:24 +00:00
Kostya Serebryany b068087bd8 [libFuzzer] update docs on -print_coverage/-dump_coverage
llvm-svn: 302498
2017-05-09 01:34:27 +00:00
Kostya Serebryany fe4ed9bd85 [libFuzzer] make sure the input data is not overwritten in the fuzz target (if it is -- report an error)
llvm-svn: 302494
2017-05-09 01:17:29 +00:00
Reid Kleckner 9f29914d40 Revert "Use the frame index side table for byval and inalloca arguments"
This reverts r302483 and it's follow up fix.

llvm-svn: 302493
2017-05-09 01:14:39 +00:00
Craig Topper 3369f8cc4a [APInt] Use default constructor instead of explicitly creating a 1-bit APInt in udiv and urem. NFC
The default constructor does the same thing.

llvm-svn: 302487
2017-05-08 23:49:54 +00:00
Craig Topper 24ae69515b [APInt] Remove 'else' after 'return' in udiv and urem. NFC
llvm-svn: 302486
2017-05-08 23:49:49 +00:00
Evgeniy Stepanov f7e8acf0fc Ignore !associated metadata with null argument.
Fixes PR32577 (comment 10).
Such metadata may legitimately appear in LTO.

llvm-svn: 302485
2017-05-08 23:46:20 +00:00
Reid Kleckner 45efcf0c96 Use the frame index side table for byval and inalloca arguments
Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483
2017-05-08 23:20:27 +00:00
Sanjoy Das 76bbdd1a16 [InstNamer] Use range-for
llvm-svn: 302481
2017-05-08 23:18:43 +00:00
Sanjoy Das 0ac3bf1cc8 [InstNamer] Don't check type of arguments (they're never void)
llvm-svn: 302480
2017-05-08 23:18:39 +00:00
Sanjoy Das 7be961b081 Delete trailing whitespace
llvm-svn: 302479
2017-05-08 23:18:36 +00:00
Greg Clayton 58a2e0d90b Add const to "DWARFDie &Die" in a few functions as they can't change the DWARFDie.
llvm-svn: 302471
2017-05-08 21:29:17 +00:00
Eugene Zemtsov 3b52dbd934 Fix typo
llvm-svn: 302470
2017-05-08 21:20:53 +00:00
Adrian Prantl 200a5ef526 Make it illegal for two Functions to point to the same DISubprogram
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>

Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302469
2017-05-08 21:17:08 +00:00
Greg Clayton 5404f114d3 Fix typo "veify" to "verify".
llvm-svn: 302466
2017-05-08 20:53:00 +00:00
Sanjay Patel a1c8814891 [InstCombine] add folds for not-of-shift-right
This is another step towards getting rid of dyn_castNotVal, 
so we can recommit:
https://reviews.llvm.org/rL300977

As the tests show, we were missing the lshr case for constants
and both ashr/lshr vector splat folds. The ashr case with constant
was being performed inefficiently in 2 steps. It's also possible
there was a latent bug in that case because we can't do that fold
if the constant is positive:
http://rise4fun.com/Alive/Bge

llvm-svn: 302465
2017-05-08 20:49:59 +00:00
Davide Italiano aa42a10051 [PartialInlining] Capture by reference rather than by value.
llvm-svn: 302464
2017-05-08 20:44:01 +00:00
Tim Northover c48c993b75 ARM: use divmod libcalls on embedded MachO platforms too.
The separated libcalls are implemented in terms of __divmodsi4 and __udivmodsi4
anyway, so we should always use them if possible.

llvm-svn: 302462
2017-05-08 20:00:14 +00:00
Reid Kleckner bf828eedb4 Don't add DBG_VALUE instructions for static allocas in dbg.declare
Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.

Effectively revert r113967

Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32920

llvm-svn: 302461
2017-05-08 19:58:15 +00:00
Zachary Turner 1dacb24222 [CodeView] Add support for random access type visitors.
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record.  This works fine for situations like dumping,
but not when you want to visit types in random order.  For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.

In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline.  This
patch adds such a mode.  In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.

Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.

Differential Revision: https://reviews.llvm.org/D32928

llvm-svn: 302454
2017-05-08 18:38:43 +00:00
Quentin Colombet 55a72b3b05 [AArch64][RegisterBankInfo] Change the default mapping of fp loads.
This fixes PR32550, in a way that does not imply running the greedy
mode at O0.

The fix consists in checking if a load is used by any floating point
instruction and if yes, we return a default mapping with FPR instead
of GPR.

llvm-svn: 302453
2017-05-08 18:16:31 +00:00
Quentin Colombet 0e41a41b87 [AArch64][RegisterBankInfo] Fix mapping cost for GPR.
In r292478, we changed the order of the enum that is referenced by
PMI_FirstXXX. This had the side effect of changing the cost of the
mapping of all the loads, instead of just the FPRs ones.

Reinstate the higher cost for all but GPR loads.
Note: This did not have any external visible effects:
- For Fast mode, the cost would have been higher, but we don't care
  because we don't try to use alternative mappings.
- For Greedy mode, the higher cost of the GPR loads, would have
  triggered the use of the supposedly alternative mapping, that
  would be in fact the same GPR mapping but with a lower cost.

llvm-svn: 302452
2017-05-08 18:16:23 +00:00
Craig Topper 8297e52285 [ARM] Use a Changed flag to avoid making a pass's return value dependent on a compare with a Statistic object.
Statistic compile to always be 0 in release build so this compare would always return false. And in the debug builds Statistic are global variables and remember their values across pass runs. So this compare returns true anytime the pass runs after the first time it modifies something.

This was found after reviewing all usages of comparison operators on a Statistic object. We had some internal code that did a compare with a statistic that caused a mismatch in output between debug and release builds. So we did an audit out of paranoia.

llvm-svn: 302450
2017-05-08 18:02:51 +00:00
Craig Topper ef869ecf0e [SCEV] Don't use std::move on both inputs to APInt::operator+ or operator-. It might be confusing to the reader. NFC
llvm-svn: 302448
2017-05-08 17:39:01 +00:00
Daniel Berlin 0f2af7f93b ConstantFold: Handle gep nonnull, undef as well
llvm-svn: 302447
2017-05-08 17:37:33 +00:00
Daniel Berlin 74ffa5c62f ConstantFold: Fold getelementptr (i32, i32* null, i64 undef) to null.
Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses.
Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was
trying to test.

llvm-svn: 302446
2017-05-08 17:37:29 +00:00
Craig Topper 868813ffbb [ValueTracking] Use KnownOnes to provide a better bound on known zeros for ctlz/cttz intrinics
This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound.

Differential Revision: https://reviews.llvm.org/D32521

llvm-svn: 302444
2017-05-08 17:22:34 +00:00
Sanjay Patel 6745447753 [InstSimplify] fix typo; NFC
llvm-svn: 302439
2017-05-08 16:35:02 +00:00
Sanjay Patel 2a06263036 [InstCombine] use local variable to reduce code duplication; NFCI
llvm-svn: 302438
2017-05-08 16:33:42 +00:00
Craig Topper 6e11a05e7e [ValueTracking] Introduce a version of computeKnownBits that returns a KnownBits struct. Begin using it to replace internal usages of ComputeSignBit
This introduces a new interface for computeKnownBits that returns the KnownBits object instead of requiring it to be pre-constructed and passed in by reference.

This is a much more convenient interface as it doesn't require the caller to figure out the BitWidth to pre-construct the object. It's so convenient that I believe we can use this interface to remove the special ComputeSignBit flavor of computeKnownBits.

As a step towards that idea, this patch replaces all of the internal usages of ComputeSignBit with this new interface. As you can see from the patch there were a couple places where we called ComputeSignBit which really called computeKnownBits, and then called computeKnownBits again directly. I've reduced those places to only making one call to computeKnownBits. I bet there are probably external users that do it too.

A future patch will update the external users and remove the ComputeSignBit interface. I'll also working on moving more locations to the KnownBits returning interface for computeKnownBits.

Differential Revision: https://reviews.llvm.org/D32848

llvm-svn: 302437
2017-05-08 16:22:48 +00:00
Sanjay Patel 2df38a80f1 [InstCombine/InstSimplify] add comments about code duplication; NFC
llvm-svn: 302436
2017-05-08 16:21:55 +00:00
Zvi Rackover 558f86b4bc InstructionSimplify: Refactor foldIdentityShuffles. NFC.
Summary:
Minor refactoring of foldIdentityShuffles() which allows the removal of a
ConstantDataVector::get() in SimplifyShuffleVectorInstruction.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32955

Conflicts:
	lib/Analysis/InstructionSimplify.cpp

llvm-svn: 302433
2017-05-08 15:46:58 +00:00
Simon Pilgrim df39b03f29 [X86][SSE] Improve combineLogicBlendIntoPBLENDV to use general masks.
Currently combineLogicBlendIntoPBLENDV can only match ASHR to detect sign splatting of a bit mask, this patch generalises this to use computeNumSignBits instead.

This is a first step in several things we can do to improve PBLENDV support:

 * Better matching of X86ISD::ANDNP patterns.
 * Handle floating point cases.
 * Better vector and bitcast support in computeNumSignBits.
 * Recognise that PBLENDV only uses the sign bit of the mask, we should be able strip away sign splats (ASHR, PCMPGT isNeg tests etc.).

Differential Revision: https://reviews.llvm.org/D32953

llvm-svn: 302424
2017-05-08 14:16:39 +00:00
Zvi Rackover dfbd3d7903 IR: Add a shufflevector mask commutation helper function. NFC.
Summary:
Following up on Sanjay's suggetion in D32955, move this functionality
into ShuffleVectornstruction.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32956

llvm-svn: 302420
2017-05-08 12:40:18 +00:00
Simon Pilgrim f5ca255d18 [ARM][NEON] Add support for ISD::ABS lowering
Update NEON int_arm_neon_vabs intrinsic to use the ISD::ABS opcode directly

Added constant folding tests.

Differential Revision: https://reviews.llvm.org/D32938

llvm-svn: 302417
2017-05-08 10:37:34 +00:00
Martin Storsjo fd4c158a84 [ARM] Clear the constant pool cache on explicit .ltorg directives
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.

This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.

This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).

Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 302416
2017-05-08 10:26:24 +00:00
Simon Pilgrim 7a28a3ac78 [AARCH64][NEON] Add support for ISD::ABS lowering
Update int_aarch64_neon_abs intrinsic to use the ISD::ABS opcode directly

Differential Revision: https://reviews.llvm.org/D32940

llvm-svn: 302415
2017-05-08 10:25:18 +00:00
Igor Breger 810c6257f1 [GlobalISel][X86] G_GEP selection support.
Summary: [GlobalISel][X86] G_GEP selection support.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D32396

llvm-svn: 302412
2017-05-08 09:40:43 +00:00
Igor Breger 605b965ae5 [GlobalISel][X86] G_MUL legalizer/selector support.
Summary:
G_MUL legalizer/selector/regbank support.
Use only Tablegen-erated instruction selection.
This patch dealing with legal operations only.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: krytarowski, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D32698

llvm-svn: 302410
2017-05-08 09:03:37 +00:00
Craig Topper c96a84d813 [APInt] Modify tcMultiplyPart's overflow detection to not depend on 'i' from the earlier loop. NFC
The value of 'i' is always the smaller of DstParts and SrcParts so we can just use that fact to write all the code in terms of SrcParts and DstParts.

llvm-svn: 302408
2017-05-08 06:34:41 +00:00
Craig Topper 0cbab7cc7a [APInt] Use std::min instead of writing the same thing with the ternary operator. NFC
llvm-svn: 302407
2017-05-08 06:34:39 +00:00
Craig Topper a6c142ab4d [APInt] Remove 'else' after 'return' in tcMultiply methods. NFC
llvm-svn: 302406
2017-05-08 06:34:36 +00:00
Dean Michael Berris 9bcaed867a [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405
2017-05-08 05:45:21 +00:00