Commit Graph

151300 Commits

Author SHA1 Message Date
Anna Thomas e5e5e59d8b [RuntimeUnrolling] Add logic for loops with multiple exit blocks
Summary:
Runtime unrolling is done for loops with a single exit block and a
single exiting block (and this exiting block should be the latch block).
This patch adds logic to support unrolling in the presence of multiple exit
blocks (which also means multiple exiting blocks).
Currently this is under an off-by-default option and is supported when
epilog code is generated. Support in presence of prolog code will be in
a future patch (we just need to add more tests, and update comments).

This patch is essentially an implementation patch. I have not added any
heuristic (in terms of branches added or code size) to decide when
this should be enabled.

Reviewers: mkuper, sanjoy, reames, evstupac

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33001

llvm-svn: 306846
2017-06-30 17:57:07 +00:00
Juergen Ributzka 5086cb55e8 [DWARF] Don't include TestingSupport in LLVM_LINK_COMPONENTS.
This fixes a cmake configuration issue when LLVM is configured with no targets.
Instead we need to add TestingSupport directly with target_link_libraries.

llvm-svn: 306842
2017-06-30 16:50:51 +00:00
Jakub Kuderski 069e5cfaf1 [Dominators] Do not perform expensive checks by default. Fix PR33656.
Summary:
Some transforms assume that DT.verifyDomInfo() is not expensive and call it even when ENABLE_EXPENSIVE_CHECKS is not set.
This patch disables expensive Dominator Tree verification (reachability, parent property, sibling property) to fix
[[ https://bugs.llvm.org/show_bug.cgi?id=33656 | PR33656 ]].

Note that this is only a temporary fix.

Reviewers: dberlin, chapuni, kparzysz, grosser

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34894

llvm-svn: 306839
2017-06-30 16:33:04 +00:00
Zachary Turner e9db96e6d9 Revert "[lit] Clean output directories before running tests."
This reverts commit da6318a92fba793e4f2447ec478b001392d57d43.

This is causing failures on some build bots due to what appears
to be some kind of lit ordering dependency.

llvm-svn: 306833
2017-06-30 16:05:03 +00:00
Zachary Turner 0955739b36 [lit] Clean output directories before running tests.
Presently lit leaks files in the tests' output directories.
Specifically, if a test creates output files, lit makes no
effort to remove them prior to the next test run.  This is
problematic because it leads to false positives whenever a
test passes because stale  files were present.  In general
it is a source of flakiness that should be removed.

This patch addresses this by building the list of all test
directories that are part of the current run set, and then
deleting those directories and recreating them anew.  This
gives each test a clean baseline to start from.

Differential Revision: https://reviews.llvm.org/D34732

llvm-svn: 306832
2017-06-30 16:01:30 +00:00
Simon Dardis da96c43682 [MIPS] Handle PIC load address macro instructions in N64.
In particular, use CALL16 (similar to O32) for address loads into T9 for certain
cases.  Otherwise use a %got_disp relocation to load the address of a symbol.
Small offsets (small enough to fit in a 16-bit signed immediate) can be used and
are added to the symbol address after it is loaded from the GOT.  Larger offsets
are currently unsupported and result in an error from the assembler.

Reviewers: sdardis

Reviewed By: sdardis

Patch by: John Baldwin

Subscribers: llvm-commits, seanbruno, arichardson, emaste, dim

Differential Revision: https://reviews.llvm.org/D33948

llvm-svn: 306831
2017-06-30 15:44:27 +00:00
Alexey Bataev 8843aab83a [SLP] A test for limiting vectorization of instructions, NFC.
llvm-svn: 306828
2017-06-30 14:37:32 +00:00
Teresa Johnson b247ffbaed [LTO] Remove values from non-prevailing comdats
Summary:
When linking a regular LTO module, if it has any non-prevailing values
(dropped to available_externally) in comdats, we need to do more than
just remove those values from their comdat. We also remove all values
from that comdat, so as to avoid leaving an incomplete comdat.

This is necessary in case we are compiling in mixed regular and ThinLTO
mode, since the resulting regularLTO native object is always linked into
the final binary first. We need to prevent the linker from selecting an
incomplete comdat that was not the prevailing copy.

Fixes PR32980.

Reviewers: pcc, rafael

Subscribers: mehdi_amini, david2050, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D34803

llvm-svn: 306826
2017-06-30 14:03:24 +00:00
Simon Pilgrim d1698af8c8 Remove unnecessary commented out argument. NFCI.
llvm-svn: 306824
2017-06-30 13:26:17 +00:00
Ulrich Weigand 9932f92882 [SystemZ] Add missing high-word facility instructions
There are a few instructions provided by the high-word facility (z196)
that we cannot easily exploit for code generation.  This patch at least
adds those missing instructions for the assembler and disassembler.

This means that now all nonprivileged instructions up to z13 are
supported by the LLVM assembler / disassembler.

llvm-svn: 306821
2017-06-30 12:56:29 +00:00
Nirav Dave a35938d827 Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"
This reverts commit r306819 which appears be exposing underlying
issues in a stage1 ppc64be build

llvm-svn: 306820
2017-06-30 12:56:02 +00:00
Nirav Dave c5a48c1ee8 [DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.

Tests of note:

  * test/CodeGen/X86/build-vector* - Improved.
  * test/CodeGen/BPF/undef.ll - Improved store alignment allows an
    additional store merge

  * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
    case we already do not handle well. Here, the DAG is improved, but
    scheduling causes a code size degradation.

Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D34472

llvm-svn: 306819
2017-06-30 12:23:41 +00:00
NAKAMURA Takumi f0e58816e4 CREDITS.TXT: Update myself.
llvm-svn: 306818
2017-06-30 11:59:53 +00:00
Simon Pilgrim e5e9232260 [X86] Updated 32-bit memcmp tests to run with/without SSE2
llvm-svn: 306816
2017-06-30 11:23:59 +00:00
Nikolai Bozhenov bde9b14c6f Revert of r306525: "Canonicalize clamp of float types to minmax"
llvm-svn: 306815
2017-06-30 10:39:09 +00:00
George Rimar 892c6c86ea [YAML] - Teach yaml2obj/obj2yaml to work with numeric relocation values.
That may be useful if we want to produce or parse object containing
broken relocation values using yaml2obj/obj2yaml.

Previously that was impossible because only enum values were parsed
correctly, this patch allows to put any numeric value as a
relocation type.

Differential revision: https://reviews.llvm.org/D34758

llvm-svn: 306814
2017-06-30 10:31:03 +00:00
George Rimar d8508b0af3 [DWARF] - Simplify HandleExpectedError implementation in DWARFDebugInfoTest
Current implementation looks a bit confusing. It looks like it should
report/print something on error, but it does not do that.
It silently drops a error message when creating triple, though
this behavior is fine generally.

For example if LLVM configured with -DLLVM_TARGETS_TO_BUILD=ARM and
our host is windows, it is expected that we will be unable to
create "i386-pc-windows-msvc" target.

Patch introduces isConfigurationSupported() function that checks
if current configuration is supported for each test and returns early if not.

llvm-svn: 306812
2017-06-30 10:09:01 +00:00
Ilya Biryukov 4709275205 Fixed misplaced table border in the docs.
llvm-svn: 306811
2017-06-30 09:47:17 +00:00
Ilya Biryukov af351dae02 Added Dockerfiles to build clang from sources.
Reviewers: klimek, chandlerc, mehdi_amini

Reviewed By: klimek, mehdi_amini

Subscribers: mehdi_amini, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34197

llvm-svn: 306810
2017-06-30 09:46:45 +00:00
Hiroshi Inoue a89d4b5f2f fix trivial typos, NFC
llvm-svn: 306808
2017-06-30 09:11:50 +00:00
Kristof Beyls b539ea5393 [GlobalISel] Make multi-step legalization work.
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.

This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.

Differential Revision: https://reviews.llvm.org/D32529

llvm-svn: 306806
2017-06-30 08:26:20 +00:00
Ayal Zaks 8d26f0a602 [LV] Optimize for size when vectorizing loops with tiny trip count
It may be detrimental to vectorize loops with very small trip count, as various
costs of the vectorized loop body as well as enclosing overheads including
runtime tests and scalar iterations may outweigh the gains of vectorizing. The
current cost model measures the cost of the vectorized loop body only, expecting
it will amortize other costs, and loops with known or expected very small trip
counts are not vectorized at all. This patch allows loops with very small trip
counts to be vectorized, but under OptForSize constraints, which ensure the cost
of the loop body is dominant, having no runtime guards nor scalar iterations.

Patch inspired by D32451.

Differential Revision: https://reviews.llvm.org/D34373

llvm-svn: 306803
2017-06-30 08:02:35 +00:00
Craig Topper 97cd0173b9 [InstCombine] Add test cases to demonstrate failure to fold (a | b) ^ (~a | ~b) --> ~(a ^ b) and its commuted variants.
llvm-svn: 306801
2017-06-30 07:37:42 +00:00
Craig Topper 880bf82685 [InstCombine] In foldXorToXor, move the commutable matcher from the LHS match to the RHS match. No meaningful change intended.
There are two conditions ORed here with similar checks and each contain two matches that must be true for the if to succeed. With the commutable match on the first half of the OR then both ifs basically have the same first part and only the second part distinguishs. With this change we move the commutable match to second half and make the first half unique.

This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice.

llvm-svn: 306800
2017-06-30 07:37:41 +00:00
Hiroshi Inoue c39696441c fix trivial typo; NFC
llvm-svn: 306798
2017-06-30 07:17:53 +00:00
Chandler Carruth 3545a9e1f9 Remove the BBVectorize pass.
It served us well, helped kick-start much of the vectorization efforts
in LLVM, etc. Its time has come and past. Back in 2014:
http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html

Time to actually let go and move forward. =]

I've updated the release notes both about the removal and the
deprecation of the corresponding C API.

llvm-svn: 306797
2017-06-30 07:09:08 +00:00
Martin Storsjo 43c854535e [llvm-readobj] Improve printouts for COFF ARM64 binaries
Differential Revision: https://reviews.llvm.org/D34835

llvm-svn: 306795
2017-06-30 07:02:13 +00:00
Martin Storsjo 8ae07ac837 [llvm-readobj] Include the PE magic value in printouts
This is useful for a testcase in lld.

Differential Revision: https://reviews.llvm.org/D34836

llvm-svn: 306794
2017-06-30 07:02:04 +00:00
Daniel Jasper 3b704ceba1 Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion"
Segfaults in non-optimized builds. I'll get a stack trace and a
reproducer to Teresa.

llvm-svn: 306793
2017-06-30 06:37:33 +00:00
Daniel Jasper 5ce1ce742e Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."
This still breaks PPC tests we have. I'll forward reproduction
instructions to dehao.

llvm-svn: 306792
2017-06-30 06:32:21 +00:00
Eric Christopher dbb92cad2e Rewrite demangle memory handling.
The return of itaniumDemangle is allocated with malloc rather than new[]
and so using unique_ptr isn't called for here. As a note for the future
we should rewrite it to do this.

llvm-svn: 306788
2017-06-30 05:38:56 +00:00
Max Kazantsev 8d0322e612 [SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on
invocations of getSign/ZeroExtExpr using local caching. This approach only
helps if we request the same SCEV multiple times throughout recursion. But
in the bug PR33431 we see a case where we request different values all the time,
so caching does not help and the size of the cache grows enormously.

In this patch we remove the local cache for this methods and add the recursion
depth limit instead, as we do for arithmetics. This gives us a guarantee that the
invocation sequence is limited and reasonably short.

Differential Revision: https://reviews.llvm.org/D34273

llvm-svn: 306785
2017-06-30 05:04:09 +00:00
Vedant Kumar 3fde2d3657 Try to appease a buildbot.
The failure is:
C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\llvm\unittests\ProfileData\CoverageMappingTest.cpp(244):
error C2668: 'llvm::make_unique': ambiguous call to overloaded function

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/3489/

llvm-svn: 306784
2017-06-30 04:04:44 +00:00
Eric Christopher a95aac3751 Reduce indenting and clean up comparisons around sign bit.
llvm-svn: 306781
2017-06-30 01:57:48 +00:00
Eric Christopher 5dcdc7a3d3 Change the type of Undecorated to unique_ptr<char[]> since we're looking at a null terminated string and not a single character.
Fixes an error in tcmalloc sized delete checking.

llvm-svn: 306780
2017-06-30 01:45:56 +00:00
Eric Christopher 710c1c8faa Reduce the complexity of the signbit/branch test functions.
llvm-svn: 306779
2017-06-30 01:35:31 +00:00
Jakub Kuderski 837755cf8b [Dominators] Don't compute DFS InOut numbers eagerly.
Summary:
DFS InOut numbers currently get eagerly computer upon DomTree construction. They are only needed to answer dome dominance queries and they get invalidated by updates and recalculations. Because of that, it is faster in practice to compute them lazily when they are actually needed.

Clang built without this patch takes 6m 45s to boostrap on my machine, and with the patch applied 6m 38s.

Reviewers: sanjoy, dberlin, chandlerc

Reviewed By: dberlin

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D34296

llvm-svn: 306778
2017-06-30 01:28:21 +00:00
Eric Christopher bc02ef17ca Add a C API section to the release notes.
llvm-svn: 306777
2017-06-30 01:17:45 +00:00
Vedant Kumar cc34e619ba [Coverage] Remove two overloads of CoverageMapping::load. NFC.
These overloads are essentially dead, and pose a maintenance cost
without adding any benefit. This is coming up now because I'd like to
experiment with changing the way we store coverage mapping data, and
would rather not have to fix up the old overloads while doing so.

Testing: check-{llvm,profile}, build clang.
llvm-svn: 306776
2017-06-30 00:45:26 +00:00
Heejin Ahn ac62b05d05 [WebAssembly] Add support for exception handling instructions
Summary:
This adds backend support for throw, rethrow, try, and try_end instructions.
This needs the corresponding clang builtin support:
https://reviews.llvm.org/D34783
This follows the Wasm exception handling proposal in
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md

Reviewers: sunfish, dschuff

Reviewed By: dschuff

Subscribers: jfb, sbc100, jgravelle-google

Differential Revision: https://reviews.llvm.org/D34826

llvm-svn: 306774
2017-06-30 00:43:15 +00:00
Wolfgang Pieb e60147c89f [DWARF] Move a couple of member functions to the DWARFUnit baseclass. NFC.
Reviewer: dblaikie

Differential revision: https://reviews.llvm.org/D34765

llvm-svn: 306771
2017-06-30 00:27:45 +00:00
Eric Christopher ee837a59f7 Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser.
Modeled Triple::GNU after front end code (aapcs abi) and  updated tests that expect apcs abi.

Based heavily on a patch by Ana Pazos!

llvm-svn: 306768
2017-06-30 00:03:54 +00:00
Aditya Nandakumar 20f6207013 [GISel]: New Opcode G_FLOG/G_FLOG2
https://reviews.llvm.org/D34837

llvm-svn: 306766
2017-06-29 23:43:44 +00:00
Dehao Chen 2f31d0d86e Hook the sample PGO machinery in the new PM
Summary: This patch hooks up SampleProfileLoaderPass with the new PM.

Reviewers: chandlerc, davidxl, davide, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: tejohnson, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34720

llvm-svn: 306763
2017-06-29 23:33:05 +00:00
Eric Christopher 56f481b78e To help readability of mightUseCTR pull out the inline asm handling support into a function.
llvm-svn: 306762
2017-06-29 23:28:47 +00:00
Eric Christopher b16eacf528 Make the PPCCTRLoops pass depend on being able to access the TargetMachine and clean up accordingly.
llvm-svn: 306761
2017-06-29 23:28:45 +00:00
Taewook Oh 0e35ea3b7c Remove redundant copy in recurrences
Summary:
If there is a chain of instructions formulating a recurrence, commuting operands can help removing a redundant copy. In the following example code,

```
BB#1: ; Loop Header
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6: ; Loop Latch
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
  %vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
```

Existing two-address generation pass generates following code:

```
BB#1:
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6:
    Predecessors according to CFG: BB#5 BB#4
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
  %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
  %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
  %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
  JMP_1 <BB#7>
```

This is suboptimal because the assembly code generated has a redundant copy at the end of #BB6 to feed %vreg13 to BB#1:

```
.LBB0_6:
  addl  %esi, %edi
  addl  %ebx, %edi
  cmpl  $10, %edi
  movl  %edi, %esi
  jl  .LBB0_1
```

This redundant copy can be elimiated by making instructions in the recurrence chain to compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, code after two-address generation becomes

```
BB#1:
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6: derived from LLVM BB %bb7
    Predecessors according to CFG: BB#5 BB#4
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
  %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
  %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
  %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
  JMP_1 <BB#7>
```

and the final assembly does not have redundant copy:

```
.LBB0_6:
  addl  %edi, %eax
  addl  %ebx, %eax
  cmpl  $10, %eax
  jl  .LBB0_1
```

Reviewers: qcolombet, MatzeB, wmi

Reviewed By: wmi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31821

llvm-svn: 306758
2017-06-29 23:11:24 +00:00
Tim Shen 664706916b [ThinkLTO] Invoke build(Thin)?LTOPreLinkDefaultPipeline.
Previously it doesn't actually invoke the designated new PM builder
functions.

This patch moves NameAnonGlobalPass out from PassBuilder, as Chandler
points out that PassBuilder is used for non-O0 builds, and for
optimizations only.

Differential Revision: https://reviews.llvm.org/D34728

llvm-svn: 306756
2017-06-29 23:08:38 +00:00
Davide Italiano f6b3d21198 [CFLAA] Remove unneded function declaration. NFCI.
llvm-svn: 306754
2017-06-29 22:57:37 +00:00
Dinar Temirbulatov f05c73c132 [SLPVectorizer] Moving Entry->NeedToGather check out of inner loop,
since it is invariant there. NFCI.

llvm-svn: 306749
2017-06-29 21:56:33 +00:00
Simon Dardis dede76f428 Revert "[mips] Fix multiprecision arithmetic."
This reverts commit r305389. This broke chromium builds, so reverting
while I investigate further.

llvm-svn: 306741
2017-06-29 20:59:47 +00:00
Chad Rosier 4c1bc656d0 [AArch64] Silence an unused variable warning in Release builds. NFC.
llvm-svn: 306738
2017-06-29 20:43:35 +00:00
Keno Fischer 05e4ac26a2 [CodeGenPrepare] Don't create inttoptr for ni ptrs
Summary:
Arguably non-integral pointers probably shouldn't show up here at all,
but since the backend doesn't complain and this takes valid (according
to the Verifier) IR and makes it invalid, make sure not to introduce
any inttoptr instructions if we're dealing with non-integral pointers.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D33110

llvm-svn: 306737
2017-06-29 20:28:59 +00:00
Reid Kleckner f03096b3c3 Attempt to fix Orc JIT test timeouts
I think there are some destruction ordering issues here. The
ShouldDelete map seems to be getting destroyed before the shared_ptr
deleter lambda accesses it. In any case, this avoids inserting elements
into the map during shutdown.

llvm-svn: 306736
2017-06-29 20:15:08 +00:00
Spyridoula Gravani 837c110cb1 [DWARF] Added verification checks for the .apple_names section.
This patch verifies the number of atoms, the validity of the form for each atom, as well as the validity of the
hashdata. For hashdata, we're verifying that the hashdata offset is correct and that the offset in the .debug_info for
each DIE in the hashdata is also valid.

llvm-svn: 306735
2017-06-29 20:13:05 +00:00
Sam Clegg 3d65030c45 Remove `inline` keyword from inline `classof` methods
The style guide states that the explicit `inline`
should not be used with inline methods.  classof is
very common inline method with a fair amount on
inconsistency:

$ git grep classof ./include | grep inline | wc -l
230
$ git grep classof ./include | grep -v inline | wc -l
257

I chose to target this method rather the larger change
since this method is easily cargo-culted (I did it at
least once).  I considered doing the larger change and
removing all occurrences but that would be a much larger
change.

Differential Revision: https://reviews.llvm.org/D33906

llvm-svn: 306731
2017-06-29 19:35:17 +00:00
Keno Fischer 99886f09a1 [AliasSetTracker] Don't drop AA MD so eagerly
Summary:
When we have patterns like
loop:
    %la = load %ptr, !tbaa
    %lba = load %ptr, !tbaa !noalias

AliasSetTracker would previously think that the two types of annotation for
the pointer conflict, dropping both for the purpose of determining alias sets.
That is clearly way too conservative, as the tbaa is still valid whether or
not one of the memory accesses has additional AA metadata. We could go
one step further and attempt to properly merge the AA metadata,
but it's not clear that that would be worth it since that may introduce
additional MD nodes, which may be undesirable since this is merely an
Analysis.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32139

llvm-svn: 306727
2017-06-29 19:13:11 +00:00
Brian Gesiak 5e0a9465c4 [opt-viewer] Add progress indicators (PR33522)
Summary:
Provide feedback to users of opt-diff.py, opt-stats.py, and opt-viewer.py,
on how many YAML files have finished being processed, and how many HTML
files have been generated. This feedback is particularly helpful for
opt-viewer.py, which may take a long time to complete when given many
large YAML files as input.

The progress indicators use simple output such as the following:

```
Reading YAML files...
    9 of 1197
```

Test plan:
Run `utils/opt-viewer/opt-*.py` on a CentOS and macOS machine, using
Python 3.4 and Python 2.7 respectively, and ensure the output is
formatted well on both.

Reviewers: anemet, davidxl

Reviewed By: anemet

Subscribers: simon.f.whittaker, llvm-commits

Differential Revision: https://reviews.llvm.org/D34735

llvm-svn: 306726
2017-06-29 18:56:25 +00:00
Alexandre Isoard 41044876fc Reverting r306695 while investigating failing test case.
Failing test case:
    Transforms/LoopVectorize.iv_outside_user.ll

llvm-svn: 306723
2017-06-29 18:48:56 +00:00
Brian Gesiak d3a0571301 [opt-viewer] Python 3 support in opt-viewer.py
Summary:
Minor changes that allow opt-stats.py to support both Python 2 and 3.
In addition to the same dictionary iterator changes that were necessary
in https://reviews.llvm.org/D34564, this diff also:

* Explcitly converts strings to bytes when reading from and writing to stdin
  and stdout.
* No longer uses dictionaries as a sort key for optimization remarks.
  Dictionary sort order in Python 2 is pretty esoteric anyway, so it's
  not clear that the additional sorting had a benefit for end users
  (for details, https://stackoverflow.com/a/3484456/679254 is a good
  resource on Python 2 dictionary sort order).

Reviewers: anemet, davidxl

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34647

llvm-svn: 306720
2017-06-29 18:47:31 +00:00
Sam Clegg 531da0081f llvm-nm: Add support for symbol demangling (-C/--demangle)
Differential Revision: https://reviews.llvm.org/D34668

llvm-svn: 306718
2017-06-29 18:29:05 +00:00
Xin Tong 66f6d69c85 [OrderedInst] Add const to constant parameter. NFCI
llvm-svn: 306717
2017-06-29 18:04:31 +00:00
Hiroshi Inoue ff8453db56 fix trivial typo, NFC
llvm-svn: 306716
2017-06-29 18:03:28 +00:00
Jakub Kuderski c4bbce724e [Dominators] Rearrange access specifiers in DominatorTreeBase
Summary:
This patch makes DominatorTreeBase more readable by putting most important members on top of the class.

Before, the class looked like that: private -> protected (including data members) -> public -> protected.
The patch changes it to: protected (data members only) -> public -> protected -> public.

Reviewers: dberlin, sanjoy, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34527

llvm-svn: 306714
2017-06-29 17:53:35 +00:00
Jakub Kuderski 35fca34132 [Dominators] Remove DominatorBase class
Summary:
DominatorBase class was only used by DominatorTreeBase. It didn't provide any useful abstractions, nor simplified anything, so I see no point keeping it.

This commit removes the DominatorBase class and moves its content into DominatorTreeBase.

This is the first patch in a series that tries to make all DomTrees have a single virtual root, which will allow to further simplify code (especially when it comes to incremental updates).

Reviewers: dberlin, sanjoy, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34493

llvm-svn: 306713
2017-06-29 17:50:19 +00:00
Xin Tong 02008c30b5 Remove useless header. NFC
llvm-svn: 306712
2017-06-29 17:48:12 +00:00
Jakub Kuderski f92233652e [Dominators] Add parent and sibling property verification (non-hacky)
Summary:
This patch adds an additional level of verification - it checks parent and sibling properties of a tree. By definition, every tree with these two properties is a dominator tree.

It is possible to run those check by running llvm with `-verify-dom-info=1`.

Bootstrapping clang and building the llvm test suite with this option enabled doesn't yield any errors.

Reviewers: dberlin, sanjoy, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34482

llvm-svn: 306711
2017-06-29 17:45:51 +00:00
Leo Li 20fbad9307 [ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type.
Summary:
Indices for GEPs that index into a struct type should always be
constants. This added more checks in `collectConstantCandidates:` which make
sure constants for GEP pointer type are not hoisted.

This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538

Reviewers: ributzka, rnk

Reviewed By: ributzka

Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama

Differential Revision: https://reviews.llvm.org/D34576

llvm-svn: 306704
2017-06-29 17:03:34 +00:00
Daniel Berlin b7df17ec59 PredicateInfo: Use OrderedInstructions instead of our homemade
version.

llvm-svn: 306703
2017-06-29 17:01:14 +00:00
Daniel Berlin b779db7ebc NewGVN: Remove useless test in addPhiOfOps.
llvm-svn: 306702
2017-06-29 17:01:10 +00:00
Daniel Berlin 7c757aee38 Remove unneeded else from OrderedInstructions::dominates.
llvm-svn: 306701
2017-06-29 17:01:03 +00:00
Paul Robinson 17536b935a [DWARF] NFC: DWARFDataExtractor combines relocs with DataExtractor.
Requires callers to directly associate relocations with a DataExtractor
used to read data from a DWARF section, which helps a callee not make
assumptions about which section it is reading.
This is the next step in reducing DWARFFormValue's dependence on DWARFUnit.

Differential Revision: https://reviews.llvm.org/D34704

llvm-svn: 306699
2017-06-29 16:52:08 +00:00
Alexandre Isoard aa29afc756 ScalarEvolution: Add URem support
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to:

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be
implemented with minimal effort this way.

Note: While SRem and SDiv are also related this way, SCEV does not
provides SDiv yet.

llvm-svn: 306695
2017-06-29 16:29:04 +00:00
Brian Gesiak 39b88f01a5 [opt-viewer] opt-viewer.py takes -o argument
Summary:
Change how the output directory is specified when invoking
opt-viewer.py, from `opt-viewer.py yaml_file_one yaml_file_two output_dir` to
`opt-viewer.py -o output_dir yaml_file_one yaml_file_two`.

This makes it easier to pipe the results of another command into
opt-viewer.py. For example:

```
find . -name "*.yaml" -print | xargs /path/to/opt-viewer.py -o html
```

Reviewers: anemet, davidxl

Reviewed By: anemet

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D34711

llvm-svn: 306694
2017-06-29 16:20:31 +00:00
Krzysztof Parzyszek 0089419417 [Hexagon] Keep all phi nodes when building DFG in addr-mode-opt
The dead phis are needed for finding correct would-be reaching defs
in register propagation.

llvm-svn: 306690
2017-06-29 15:55:59 +00:00
Nirav Dave 168c5a6a40 [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.
Relanding after restricting equalBaseIndex to not erroneuosly consider
a FrameIndices stemming from alloca from being comparable as its
offset is set post-selectionDAG.

Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.

llvm-svn: 306688
2017-06-29 15:48:11 +00:00
Eugene Leviant 6269d39f44 [llvm-objdump] Handle invalid instruction gracefully on ARM
Differential revision: https://reviews.llvm.org/D34813

llvm-svn: 306687
2017-06-29 15:38:47 +00:00
Yonghong Song 5fbe01b12d bpf: remove unnecessary truncate operation
For networking-type bpf program, it often needs to access
packet data. A context data structure is provided to the bpf
programs with two fields:
        u32 data;
        u32 data_end;
User can access these two fields with ctx->data and ctx->data_end.
During program verification process, the kernel verifier modifies
the bpf program with loading of actual pointer value from kernel
data structure.
    r = ctx->data      ===> r = actual data start ptr
    r = ctx->data_end  ===> r = actual data end ptr

A typical program accessing ctx->data like
    char *data_ptr = (char *)(long)ctx->data
will result in a 32-bit load followed by a zero extension.
Such an operation is combined into a single LDW in DAG combiner
as bpf LDW does zero extension automatically.

In cases like the below (which can be a result of global value numbering
and partial redundancy elimination before insn selection):
B1:
   u32 a = load-32-bit &ctx->data
   u64 pa = zext a
   ...
B2:
   u32 b = load-32-bit &ctx->data
   u64 pb = zext b
   ...
B3:
   u32 m = PHI(a, b)
   u64 pm = zext m

In B3, "pm = zext m" cannot be removed, which although is legal
from compiler perspective, will generate incorrect code after
kernel verification.

This patch recognizes this pattern and traces through PHI node
to see whether the operand of "zext m" is defined with LDWs or not.
If it is, the "zext m" itself can be removed.

The patch also recognizes the pattern where the load and use of
the load value not in the same basic block, where truncate operation
may be removed as well.

The patch handles 1-byte, 2-byte and 4-byte truncation.

Two test cases are added to verify the transformation happens properly
for the above code pattern.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 306685
2017-06-29 15:18:54 +00:00
Nikolai Bozhenov 1925594ea0 [NFC] Use stdin for some tests instead of positional argument.
Summary: Otherwise unexpected matches with the path to the tests might happen.

Reviewers: rengolin, spatel, efriedma, RKSimon

Reviewed By: spatel

Subscribers: n.bozhenov, javed.absar, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D32994

llvm-svn: 306684
2017-06-29 14:51:54 +00:00
Daniel Neilson aad1a6f0a4 Restore original intent of memset instcombine test
Summary:
The original intent of test/Transforms/InstCombine/memset.ll was to test for lowering of llvm.memset into stores when the size of the memset is 1, 2, 4, or 8. Sometime between then and now the test has stopped testing for that, but remained passing due to testing for the absence of llvm.memset calls rather than the presence of store instructions. Right now this test ends up with an empty function body because the alloca is eliminated as safe-to-remove, which results in the llvm.memset calls's being eliminated due to their pointer args being undef; so it is not testing for conversion of llvm.memset into store instructions at all.

This change alters the test to verify that store instructions are created, and moves the target of the memset to an arg of the proc to avoid it being eliminated as unused.

Reviewers: anna, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D34642

llvm-svn: 306681
2017-06-29 14:21:28 +00:00
Daniel Neilson 82b016672d Explicitly check for presence of correct results in instcombine memmove test
Summary:
Rather than testing for expected results, test/Transforms/InstCombine/memmove.ll is testing for the absence of calls to llvm.memmove.

In the case of test3, the test has stopped testing for materialization of loads/stores, but remained passing due to testing for the absence of llvm.memset calls rather than the presence of load/store instructions. Right now this test ends up with an empty function body because the alloca is eliminated as safe-to-remove, which results in the llvm.memmove calls being eliminated due to a pointer arg being undef; so it is not testing for conversion of llvm.memmove into load/store instructions at all.

Reviewers: eli.friedman, anna, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D34645

llvm-svn: 306679
2017-06-29 14:17:50 +00:00
Hiroshi Inoue 6989caa931 [PowerPC] fix potential verification error on __tls_get_addr
This patch fixes a verification error with -verify-machineinstrs while expanding __tls_get_addr by not creating ADJCALLSTACKUP and ADJCALLSTACKDOWN if there is another ADJCALLSTACKUP in this basic block since nesting ADJCALLSTACKUP/ADJCALLSTACKDOWN is not allowed.

Here, ADJCALLSTACKUP and ADJCALLSTACKDOWN are created as a fence for instruction scheduling to avoid _tls_get_addr is scheduled before mflr in the prologue (https://bugs.llvm.org//show_bug.cgi?id=25839). So if another ADJCALLSTACKUP exists before _tls_get_addr, we do not need to create a new ADJCALLSTACKUP.

Differential Revision: https://reviews.llvm.org/D34347

llvm-svn: 306678
2017-06-29 14:13:38 +00:00
George Rimar b25a09d0f5 [DWARF] - Fix message reporting about broken relocation.
Because of mistake introduced in r306517,
wrong variable ("name" instead of "Name") was used
in error message.
As a result it reported section name instead of
relocation name.

This file still needs cleanup to match LLVM coding style
and more tests I think.

llvm-svn: 306677
2017-06-29 14:05:18 +00:00
Daniel Jasper 559aa75382 Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue"
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio

If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).

llvm-svn: 306676
2017-06-29 13:58:24 +00:00
Florian Hahn 8a44b7be76 [TBAA] Remove metadata keyword from IR examples in comments (NFC).
The metadata keyword has been removed from the IR.

llvm-svn: 306675
2017-06-29 13:55:23 +00:00
Evgeny Astigeevich 70ed78e504 [TargetTransformInfo, API] Add a list of operands to TTI::getUserCost
The changes are a result of discussion of https://reviews.llvm.org/D33685.
It solves the following problem:

1. We can inform getGEPCost about simplified indices to help it with
   calculating the cost. But getGEPCost does not take into account the
   context which GEPs are used in.
2. We have getUserCost which can take the context into account but we cannot
   inform about simplified indices.

With the changes getUserCost will have access to additional information
as getGEPCost has.

The one parameter getUserCost is also provided.

Differential Revision: https://reviews.llvm.org/D34057

llvm-svn: 306674
2017-06-29 13:42:12 +00:00
Pavel Labath fe09f506b6 Recommit "[Support] Add RetryAfterSignal helper function"
The difference from the previous version is the use of decltype, as the
implementation of std::result_of in libc++ did not work correctly for
variadic function like open(2).

Original summary:
This function retries an operation if it was interrupted by a signal
(failed with EINTR). It's inspired by the TEMP_FAILURE_RETRY macro in
glibc, but I've turned that into a template function. I've also added a
fail-value argument, to enable the function to be used with e.g.
fopen(3), which is documented to fail for any reason that open(2) can
fail (which includes EINTR).

The main user of this function will be lldb, but there were also a
couple of uses within llvm that I could simplify using this function.

Reviewers: zturner, silvas, joerg

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33895

llvm-svn: 306671
2017-06-29 13:15:31 +00:00
Igor Breger 0cddd34876 [GlobalISel][X86] Support vector type G_MERGE_VALUES selection.
Summary:
Support vector type G_MERGE_VALUES selection. For now G_MERGE_VALUES marked as legal for any type, so nothing to do in legalizer.
Split from https://reviews.llvm.org/D33665

Reviewers: qcolombet, t.p.northover, zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D33958

llvm-svn: 306665
2017-06-29 12:08:28 +00:00
Simon Pilgrim 9a68e69c68 [X86][SSE] Dropped -mcpu from palignr tests
Use triple and attribute only for consistency 

Add AVX tests as well

llvm-svn: 306664
2017-06-29 11:13:39 +00:00
Simon Pilgrim e2eacbfc23 [X86][SSE] Regenerate shuffle test with update_llc_test_checks.py
llvm-svn: 306663
2017-06-29 11:11:37 +00:00
Simon Pilgrim 0afe97f480 [X86][SSE] Dropped -mcpu from vector shift tests
Use triple and attribute only for consistency 

llvm-svn: 306662
2017-06-29 11:09:53 +00:00
Simon Pilgrim 91539ce2d3 [X86][SSE] Dropped -mcpu from zero insertion tests
Use triple and attribute only for consistency 

llvm-svn: 306661
2017-06-29 11:08:11 +00:00
Michael Zuckerman 4bcb9c3349 [LLVM][X86][Goldmont] Adding new target-cpu: Goldmont
[LLVM SIDE]
Connecting the GoldMont processor to his feature.

Reviewers: 
1. igorb
2. zvi
3. delena
4. RKSimon
5. craig.topper        

Differential Revision: https://reviews.llvm.org/D34504

llvm-svn: 306658
2017-06-29 10:00:33 +00:00
NAKAMURA Takumi 6936506f50 Test commit
llvm-svn: 306657
2017-06-29 09:46:01 +00:00
Dinar Temirbulatov 7b96266a16 [SLPVectorizer] Introducing getTreeEntry() helper function [NFC]
Differential Revision: https://reviews.llvm.org/D34756

llvm-svn: 306655
2017-06-29 08:46:18 +00:00
Florian Hahn 08fdd040b5 [ARM] Add tGPRwithpc register class and use it for TBB/THH
Summary:
TBB and THH allow using a Thumb GPR or the PC as destination operand.
A few machine verifier failures where due to those instructions not
expecting PC as destination operand.

Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add
test coverage even if expensive checks are disabled.



Reviewers: MatzeB, t.p.northover, jmolloy

Reviewed By: MatzeB

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34610

llvm-svn: 306654
2017-06-29 08:45:31 +00:00
Martin Storsjo 3fa1213004 [BinaryFormat] Identify AArch64 COFF files
Differential Revision: https://reviews.llvm.org/D34742

llvm-svn: 306647
2017-06-29 06:30:56 +00:00
Zvi Rackover da3943d600 [X86] Adding shuffle tests demonstrating missed vcompress opportunities. NFC
llvm-svn: 306646
2017-06-29 06:22:01 +00:00
David L. Jones 0a466fc209 [lit] Re-apply: Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.
(Take 2: this patch re-applies r306625, which was reverted in r306629. This
patch includes only trivial fixes.)

In Python2 and Python3, the various (non-)?Unicode string types are sort of
spaghetti. Python2 has unicode support tacked on via the 'unicode' type, which
is distinct from 'str' (which are bytes). Python3 takes the "unicode-everywhere"
approach, with 'str' representing a Unicode string.

Both have a 'bytes' type. In Python3, it is the only way to represent raw bytes.
However, in Python2, 'bytes' is an alias for 'str'. This leads to interesting
problems when an interface requires a precise type, but has to run under both
Python2 and Python3.

The previous logic appeared to be correct in all cases, but went through more
layers of indirection than necessary. This change does the necessary conversions
in one shot, with documentation about which paths might be taken in Python2 or
Python3.

Changes from r306625: some tests just print binary outputs, so in those cases,
fall back to str() in Python3. For googletests, add one missing call to
to_string().

(Tested by verifying the visible breakage with Python3. Verified that everything
works in py2 and py3.)

llvm-svn: 306643
2017-06-29 04:37:35 +00:00
David Blaikie d16a61d2c8 llvm-profdata: Indirect infrequently used fields to reduce memory usage
Examining a large profile example, it seems relatively few records have
non-empty IndirectCall and MemOP data, so indirecting these through a
unique_ptr (non-null only when they are non-empty) Reduces memory usage
on this particular example from 14GB to 10GB according to valgrind's
massif.

I suspect it'd still be worth moving InstrProfWriter to its own data
structure that had Counts and the indirected IndirectCall+MemOP, and did
not include the Name, Hash, or Error fields. This would reduce the size
of this dominant data structure by half of this new, lower amount.
(Name(2), Hash(1), Error(1) ~= Counts(vector, 3), ValueProfData
(unique_ptr, 1))
-> From code review feedback, might actually refactor InstrProfRecord
itself to have a sub-struct with all the counts, and use that from
InstrProfWriter, rather than InstrProfWriter owning its own data
structure for this.

Reviewers: davidxl

Differential Revision: https://reviews.llvm.org/D34694

llvm-svn: 306631
2017-06-29 02:51:58 +00:00
David L. Jones eb615506b3 Revert "[lit] Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it."
This reverts r306625.

llvm-svn: 306629
2017-06-29 02:22:49 +00:00
David L. Jones 30251947ed Fix spelling: uncode -> unicode.
Remember kids: there is no 'I' in str or bytes, but there is ALWAYS an
'I' in unicode.

llvm-svn: 306626
2017-06-29 01:03:56 +00:00
David L. Jones d59c9cd539 [lit] Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.
Summary:
In Python2 and Python3, the various (non-)?Unicode string types are sort of
spaghetti. Python2 has unicode support tacked on via the 'unicode' type, which
is distinct from 'str' (which are bytes). Python3 takes the "unicode-everywhere"
approach, with 'str' representing a Unicode string.

Both have a 'bytes' type. In Python3, it is the only way to represent raw bytes.
However, in Python2, 'bytes' is an alias for 'str'. This leads to interesting
problems when an interface requires a precise type, but has to run under both
Python2 and Python3.

The previous logic appeared to be correct in all cases, but went through more
layers of indirection than necessary. This change does the necessary conversions
in one shot, with documentation about which paths might be taken in Python2 or
Python3.

Reviewers: zturner, modocache

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34793

llvm-svn: 306625
2017-06-29 01:03:55 +00:00
David L. Jones 34a18722fd [lit] Remove dead code not referenced in the LLVM SVN repo.
Summary:
This change removes the intermediate 'FileBasedTest' format from lit. This
format is only ever used by the ShTest format, so the logic can be moved into
ShTest directly.

In order to better clarify what the TestFormat subclasses do, I fleshed out the
TestFormat base class with Python's notion of abstract methods, using
@abc.abstractmethod. This gives a convenient way to document the expected
interface, without the risk of instantiating an abstract class (that's what
ABCMeta does -- it raises an exception if you try to instantiate a class which
has abstract methods, but not if you instantiate a subclass that implements
them).

Reviewers: zturner, modocache

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D34792

llvm-svn: 306623
2017-06-29 01:01:03 +00:00
Eric Beckmann d40dd64ff0 Revert "Replace trivial use of external rc.exe by writing our own .res file."
This reverts commit d4c7e9fc63c10dbab0c30186ef8575474a704496.

This is done in order to address the failure of CrWinClangLLD etc. bots.
These throw an error of "side-by-side configuration is incorrect" during
compilation, which sounds suspiciously related to these manifest
changes.

Revert "Switch external cvtres.exe for llvm's own resource library."

This reverts commit 71fe8ef283a9dab9a3f21432c98466cbc23990d1.

llvm-svn: 306618
2017-06-29 00:17:26 +00:00
Craig Topper 798a19ab8e [InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI
llvm-svn: 306617
2017-06-29 00:07:08 +00:00
Eugene Zelenko 8456b16ea9 [CodeView] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306616
2017-06-29 00:05:44 +00:00
Keno Fischer a236dae5d1 [InstCombine] Retain TBAA when narrowing memory accesses
Summary:
As discussed on the mailing list it is legal to propagate TBAA to loads/stores
from/to smaller regions of a larger load tagged with TBAA. Do so for
(load->extractvalue)=>(gep->load) and similar foldings.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D31954

llvm-svn: 306615
2017-06-28 23:36:40 +00:00
Mandeep Singh Grang d159c688aa [NFC] Remove multiple semicolons
Reviewers: bogner, whitequark, mgrang

Reviewed By: mgrang

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34785

llvm-svn: 306613
2017-06-28 23:15:16 +00:00
Adrian McCarthy bf0afc3246 Introduce symbol cache to PDB NativeSession
Instead of creating symbols directly in the findChildren methods of the native
symbol implementations, they will rely on the NativeSession to act as a factory
for these types.  This lets NativeSession cache the NativeRawSymbols in its
new symbol cache and makes that cache the source of unique IDs for the symbols.

Right now, this affects only NativeCompilandSymbols.  There's no external
change yet, so I think the existing tests are still sufficient.  Coming soon
are patches to extend this to built-in types and enums.

llvm-svn: 306610
2017-06-28 22:47:40 +00:00
Xin Tong 1548bbc7dd Revert "Make OrderedInstructions and OrderedBasicBlock use AssertingVH, to try and catch mistakes"
This reverts commit 50ec560f05dcb8a1be18be442660d0305bc7de25.

It catches some bug in NewGVN it seems. I am in middle of something and will not be able to investigate
Revert for now.

http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/6268

llvm-svn: 306608
2017-06-28 22:35:54 +00:00
Xin Tong 21ef353622 Make OrderedInstructions and OrderedBasicBlock use AssertingVH, to try and catch mistakes
Summary: Make OrderedInstructions and OrderedBasicBlock use AssertingVH to try and catch mistakes

Reviewers: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34780

llvm-svn: 306605
2017-06-28 22:12:22 +00:00
Matt Arsenault 7c525903ef AMDGPU: Remove SITypeRewriter
This was an old workaround for using v16i8 in some old intrinsics
for resource descriptors.

llvm-svn: 306603
2017-06-28 21:38:50 +00:00
David L. Jones b3c88339ad [lit] Remove dead code (not referenced anywhere), and clarify some function names.
Summary:
The dead code seems to be unreferenced, according to textual search across the
LLVM SVN repo.

The clarification part of this change alters the name of a module-level function
so that it is different from the name of the class-methods that call it.
Currently, there are no erroneous references, but stylistically (c.f. PEP-8),
internal "helper" functions should generally be named accordingly by prepending
an underscore. (I also chose to add '_impl', which isn't necessary, but helps me
at least to mentally disambiguate the interface and implementation functions.)

Reviewers: zturner, modocache

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D34775

llvm-svn: 306600
2017-06-28 21:14:13 +00:00
Eric Christopher 7ad02eee8a Fix a typo.
llvm-svn: 306599
2017-06-28 21:10:31 +00:00
Stanislav Mekhanoshin a45584bebe Fold fneg and fabs like multiplications
Given no NaNs and no signed zeroes it folds:

(fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X))
(fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X)

Differential Revision: https://reviews.llvm.org/D34579

llvm-svn: 306592
2017-06-28 20:25:50 +00:00
Sanjay Patel 57f57262c5 [InstCombine] add tests for icmp with bitreversed ops; NFC
This is similar enough to bswap that we might as well handle them together in one patch.

llvm-svn: 306591
2017-06-28 20:02:35 +00:00
Mandeep Singh Grang 6f61e237cc [AArch64] Make assert messages uniform and general [NFC]
Summary: Make assert messages related to Darwin, ELF and COFF uniform.

Reviewers: rnk, ruiu, compnerd, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34730

llvm-svn: 306589
2017-06-28 19:37:38 +00:00
Geoff Berry 0abd980680 [AArch64][Falkor] Attempt to fix Windows buildbots
llvm-svn: 306588
2017-06-28 19:36:10 +00:00
Rafael Espindola d926ea2ff7 Reuse existing variables. NFC.
llvm-svn: 306586
2017-06-28 19:26:37 +00:00
Krzysztof Parzyszek 9b54878b38 Break up long lines, NFC
llvm-svn: 306585
2017-06-28 18:59:18 +00:00
Geoff Berry 378374d457 [AArch64][Falkor] Try to avoid exhausting HW prefetcher resources when unrolling.
Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34533

llvm-svn: 306584
2017-06-28 18:53:09 +00:00
Rafael Espindola f530b322b3 Reuse existing variable. NFC.
llvm-svn: 306582
2017-06-28 18:24:02 +00:00
Jakub Kuderski f01cd723a8 [Dominators] Move helper functions into SemiNCAInfo
Summary: Helper functions (DFSPass, ReverseDFSPass, Eval) need SemiNCAInfo anyway, so it's simpler to have them there as member functions. This also makes them simpler by removing template boilerplate.

Reviewers: dberlin, sanjoy, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34427

llvm-svn: 306579
2017-06-28 18:15:45 +00:00
Simon Pilgrim 64d182d36f [BBVectorize][X86] Regenerate simple tests
llvm-svn: 306578
2017-06-28 18:08:40 +00:00
Craig Topper 65aeba70de [InstCombine] Remove 64-bit bit width restriction from m_ConstantInt(uint64_t*&)
I think we only need to make sure the value fits in 64-bits not that bit width is 64-bit.

This helps places that use this for shift amounts since the shift amount needs to be the same bitwidth as the LHS, but can't be larger than the bit width.

Differential Revision: https://reviews.llvm.org/D34737

llvm-svn: 306577
2017-06-28 18:07:29 +00:00
Jakub Kuderski be9931dc85 [Dominators] Move SemiNCAInfo and helper functions out of DominatorTreeBase
Summary:
This moves SemiNCAInfo from DeminatorTreeBase to GenericDomTreeConstruction. It also put helper functions used during tree constructions in the same file.

The point of this change is to further clean up DominatorTreeBase and make it easier to construct and verify (in future patches).

Reviewers: dberlin, sanjoy, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34420

llvm-svn: 306576
2017-06-28 18:00:36 +00:00
Ayal Zaks d9bc43ef2a [LV] Fix PR33613 - retain order of insertelement per part
r306381 caused PR33613, by reversing the order in which insertelements were
generated per unroll part. This patch fixes PR33613 by retraining this order,
placing each set of insertelements per part immediately after the last scalar
being packed for this part. Includes a test case derived from PR33613.

Reference: https://bugs.llvm.org/show_bug.cgi?id=33613
Differential Revision: https://reviews.llvm.org/D34760

llvm-svn: 306575
2017-06-28 17:59:33 +00:00
Jakub Kuderski 475489e500 [Dominators] Move IDoms out of DominatorTreeBase and put them in SNCAInfo
Summary: The temporary IDoms map was used only during DomTree calculation. We can move it to SNCAInfo so that it's no longer a DominatorTreeBase member.

Reviewers: sanjoy, dberlin, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34317

llvm-svn: 306574
2017-06-28 17:56:09 +00:00
Rafael Espindola 96367a3d1e Fix PR33625.
We were failing to convert this expression to pcrel.

llvm-svn: 306573
2017-06-28 17:56:07 +00:00
Jakub Kuderski f43d34a526 [Dominators] Move InfoRec outside of DominatorTreeBase
Summary:
The InfoRec struct is used only during tree construction, so there is no point having it as a DominatorTreeBase member.

This patch moves it into the Calculate function instead and makes it pass it to its helper functions.

Reviewers: sanjoy, dberlin, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34305

llvm-svn: 306572
2017-06-28 17:43:54 +00:00
Simon Pilgrim 54822d1432 [BBVectorize] Regenerate simple tests
llvm-svn: 306571
2017-06-28 17:40:20 +00:00
Rafael Espindola 1416c2d052 Don't repeat name in comment and format. NFC.
llvm-svn: 306568
2017-06-28 17:23:13 +00:00
Chih-Hung Hsieh 514dafdae3 Another test commit.
llvm-svn: 306567
2017-06-28 17:12:51 +00:00
Geoff Berry b0573547f6 [LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D34532

llvm-svn: 306564
2017-06-28 17:01:15 +00:00
Sanjay Patel 1a132d27c6 [InstCombine] add tests for icmp with bswapped operands; NFC
llvm-svn: 306563
2017-06-28 16:56:45 +00:00
Jakub Kuderski 22033efc29 [Dominators] Move number to node mapping out of DominatorTreeBase
Summary: Number to node mapping in DominatorTreeBase is used only during calculation, so there is no point keeping is as a member variable. This patch moves this mapping to Calculate function and passes it to helper functions. It also makes the name more descriptive.

Reviewers: sanjoy, dberlin, davide, chandlerc

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34295

llvm-svn: 306562
2017-06-28 16:54:34 +00:00
Sanjay Patel 4e96f19052 [InstCombine] use local variable to reduce code; NFCI
llvm-svn: 306560
2017-06-28 16:39:06 +00:00
Krzysztof Parzyszek 93cf232338 Rangify loops, formatting changes, use bool instead of unsigned, NFC
llvm-svn: 306557
2017-06-28 16:02:00 +00:00
Rafael Espindola 920fa14011 Don't repeat names and reformat. NFC.
llvm-svn: 306556
2017-06-28 16:00:16 +00:00
Geoff Berry 66d9bdbca8 [LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D34531

llvm-svn: 306554
2017-06-28 15:53:17 +00:00
Krzysztof Parzyszek 3008594cd4 Missed a check for UndefVI in r306466
llvm-svn: 306553
2017-06-28 15:46:16 +00:00
Daniel Sanders 320390bab8 [globalisel][tablegen] Post-commit review nits for r306388. NFC
One early exit and a missing assert string.

llvm-svn: 306552
2017-06-28 15:16:03 +00:00
Alexandros Lamprineas c0432d86aa [AArch64] AArch64CondBrTuningPass generates wrong branch instructions
Some conditional branch instructions generated by this pass are checking
the wrong condition code. The instructions TBZ and TBNZ are transformed
into B.GE and B.LT instead of B.PL and B.MI respectively. They should
only be checking the Negative bit.

Differential Revision: https://reviews.llvm.org/D34743

llvm-svn: 306550
2017-06-28 15:09:11 +00:00
Rafael Espindola 9a450d9b29 Don't repeat name in comments. 80 columns. NFC.
llvm-svn: 306548
2017-06-28 14:59:30 +00:00
John Brawn 75d76e5e95 [ARM] Improve if-conversion for M-class CPUs without branch predictors
The current heuristic in isProfitableToIfCvt assumes we have a branch predictor,
and so gives the wrong answer in some cases when we don't. This patch adds a
subtarget feature to indicate that a subtarget has no branch predictor, and
changes the heuristic in isProfitableToiIfCvt when it's present. This gives a
slight overall improvement in a set of embedded benchmarks on Cortex-M4 and
Cortex-M33.

Differential Revision: https://reviews.llvm.org/D34398

llvm-svn: 306547
2017-06-28 14:11:15 +00:00
Simon Pilgrim 48b30c3d55 [X86] Added BSWAP tests for illegal i64/i128/i256 'wide' scalar integers
llvm-svn: 306546
2017-06-28 14:07:50 +00:00
Simon Pilgrim 4f5fcb03ad [X86][SSE] Dropped -mcpu from vector bswap tests
Use triple and attribute only for consistency 

llvm-svn: 306545
2017-06-28 13:59:15 +00:00
Daniel Sanders 3229198ecd [globalisel][tablegen] Multiple 80-col corrections.
llvm-svn: 306544
2017-06-28 13:50:04 +00:00
Michael Zuckerman d0e663a697 [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess test.
Exapnding the test to include AVX target. 
Adding base tast (to trunk) for Store strid=4 vf=32. 

llvm-svn: 306543
2017-06-28 13:42:45 +00:00
Easwaran Raman 8249fac52d Create inliner params based on size and opt levels.
Differential revision: https://reviews.llvm.org/D34309

llvm-svn: 306542
2017-06-28 13:33:49 +00:00
Teresa Johnson 538b8d25f0 Add zero-length check to memcpy/memset load store loop expansion
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.

Reviewers: hfinkel

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D34707

llvm-svn: 306541
2017-06-28 13:07:37 +00:00
Igor Breger 86cf07a32e [GlobalISel][X86] Test G_CONSTANT i32 0 TableGen'erated selection.NFC.
llvm-svn: 306537
2017-06-28 12:43:21 +00:00
Nikolai Bozhenov 6710ba07c7 Revert r306528
llvm-svn: 306536
2017-06-28 12:15:13 +00:00
Igor Breger d5b59cf914 [GlobalISel][X86] Support bitwise operations : G_AND, G_OR, G_XOR
Summary: Support G_AND, G_OR, G_XOR for i8/i16/i32/i64. Selection done via TableGen'erated code.

Reviewers: zvi, guyblank, aymanmus, m_zuckerman

Reviewed By: aymanmus

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34605

llvm-svn: 306533
2017-06-28 11:39:04 +00:00
Michael Zuckerman f66840020c Reverting commit 306414 on behalf of @gadi.haber
llvm-svn: 306532
2017-06-28 11:23:31 +00:00
Simon Pilgrim b9fa16bc53 [X86][AVX2] Dropped -mcpu from avx2 arithmetic/intrinsics tests
Use triple and attribute only for consistency 

llvm-svn: 306531
2017-06-28 10:54:54 +00:00
Petar Jovanovic 7b3a38ec30 [X86] Correct dwarf unwind information in function epilogue
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.

Majority of the changes in this patch:

1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.

These changes are target independent and described below.

Changed CFI instructions so that they:

1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal

Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.

Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D18046

llvm-svn: 306529
2017-06-28 10:21:17 +00:00
Nikolai Bozhenov 77b5536e4e [ValueTracking] Enabling existing ValueTracking patch by default.
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.

Reviewers: reames

Differential Revision: https://reviews.llvm.org/D34101

Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 306528
2017-06-28 10:08:08 +00:00
Nikolai Bozhenov b01e6b5a52 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 306525
2017-06-28 09:26:20 +00:00
Nikolai Bozhenov 4ec1bb6f39 Add tests to document current InstCombine behavior for clamp pattern.
Summary:
This commit adds the tests for clamp pattern as a prerequisite of
D33186 to make the impact of that fix more clear and also to document
current behavior.

Reviewers: spatel, jmolloy

Reviewed By: spatel

Subscribers: n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D34350

llvm-svn: 306524
2017-06-28 09:22:58 +00:00
George Rimar 002655df17 [DebugInfo] - Removed trailing whitespaces. NFC.
llvm-svn: 306518
2017-06-28 08:26:57 +00:00
George Rimar 1af3cb2912 Recommit "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"

Original commit message:

Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306517
2017-06-28 08:21:19 +00:00
Kristof Beyls eecb353d0e [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.

As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
 instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
 instruction schedule and/or the estimated cost of a branch mispredict.

llvm-svn: 306514
2017-06-28 07:07:03 +00:00
George Rimar 7a82cffd68 Revert r306512 "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
It broke BB:

[13/106] 13 0.022 Generating VCSRevision.h
[25/106] 24 1.209 Building CXX object unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o
FAILED: unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o 
/home/bb/bin/g++  -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_GLOBAL_ISEL -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iunittests/DebugInfo/DWARF -I../llvm-project/llvm/unittests/DebugInfo/DWARF -Iinclude -I../llvm-project/llvm/include -I../llvm-project/llvm/utils/unittest/googletest/include -I../llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -m32 -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3    -UNDEBUG  -Wno-variadic-macros -fno-exceptions -fno-rtti -MD -MT unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -MF unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o.d -o unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -c ../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:18:37: fatal error: llvm/Codegen/AsmPrinter.h: No such file or directory
 #include "llvm/Codegen/AsmPrinter.h"
                                     ^
compilation terminated.

llvm-svn: 306513
2017-06-28 07:06:17 +00:00
George Rimar 397a70425b [ELF] - Add ability for DWARFContextInMemory to exit early when any error happen.
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306512
2017-06-28 06:57:20 +00:00
Craig Topper 8fe3603ff1 [InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC
llvm-svn: 306510
2017-06-28 06:45:36 +00:00
Craig Topper 7f124694c5 Revert r306508 "[InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC"
I accidentally had a extra change in there.

llvm-svn: 306509
2017-06-28 06:43:58 +00:00
Craig Topper 1d5b4b634b [InstCombine] Add test case demonstrating that we don't handle icmp eq (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC
llvm-svn: 306508
2017-06-28 06:42:48 +00:00
Hiroshi Inoue 4bcef51b0f Add missing library dependency to fix build break in llvm-lto2
error message
CMakeFiles/llvm-lto2.dir/llvm-lto2.cpp.o: In function `dumpSymtab(int, char**)':
llvm-lto2.cpp:(.text._ZL10dumpSymtabiPPc+0x238): undefined reference to `llvm::getBitcodeFileContents(llvm::MemoryBufferRef)'
collect2: error: ld returned 1 exit status

llvm-svn: 306507
2017-06-28 06:14:30 +00:00
Max Kazantsev 6c466a376e [IRCE][NFC] Better get SCEV for 1 in calculateSubRanges
A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive
invocations, and we don't create a ConstantInt if 'true' branch is taken.

Differential Revision: https://reviews.llvm.org/D34672

llvm-svn: 306503
2017-06-28 04:57:45 +00:00
Nirav Dave c4ce2293b0 Revert "[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI."
This reverts commit r306498 which appears to cause a compilrt-rt test failures

llvm-svn: 306501
2017-06-28 03:20:04 +00:00
Stanislav Mekhanoshin d445455643 [AMDGPU] Add pattern for v_alignbit_b32 with immediate
If immediate in shift is less than 32 we can use alignbit too.

Differential Revision: https://reviews.llvm.org/D34729

llvm-svn: 306500
2017-06-28 02:52:39 +00:00
Stanislav Mekhanoshin eb40733bf0 Allow to truncate left shift with non-constant shift amount
That is pretty common for clang to produce code like
(shl %x, (and %amt, 31)). In this situation we can still perform
trunc (shl) into shl (trunc) conversion given the known value
range of shift amount.

Differential Revision: https://reviews.llvm.org/D34723

llvm-svn: 306499
2017-06-28 02:37:11 +00:00
Nirav Dave 8ef03802f1 [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.

llvm-svn: 306498
2017-06-28 02:09:50 +00:00
Kyle Butt f73c8a06a9 Inlining: Don't re-map simplified cloned instructions.
When simplifying an instruction that has been re-mapped, it should never
simplify to an instruction in the original function. In the edge case
where we are inlining a function into itself, the existing code led to
incorrect behavior. Replace the incorrect code with an assert verifying
that we never expect simplification to produce an instruction in the old
function, unless the functions are the same.

Differential Revision: https://reviews.llvm.org/D33850

llvm-svn: 306495
2017-06-28 01:41:25 +00:00
Joel Jones 8037233b54 [TableGen] Improve Debug Output for --debug-only=subtarget-emitter NFCI
Add headers for each section of output, with white space and "+++" to
improve readability.

Differential Revision: https://reviews.llvm.org/D34713

llvm-svn: 306492
2017-06-28 00:06:40 +00:00
Peter Collingbourne 2d67e3789a Add missing library dependency.
llvm-svn: 306491
2017-06-28 00:05:27 +00:00
Mandeep Singh Grang 0c72172e32 [COFF, ARM64] Add support for Windows ARM64 COFF format
Summary:
This is the llvm part of the initial implementation to support Windows ARM64 COFF format.
I will gradually add more functionality in subsequent patches.

Reviewers: ruiu, rnk, t.p.northover, compnerd

Reviewed By: ruiu, compnerd

Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34705

llvm-svn: 306490
2017-06-27 23:58:19 +00:00
Peter Collingbourne 99b98c21f2 Object: Teach irsymtab::read() to try to use the irsymtab that we wrote to disk.
Fixes PR27551.

Differential Revision: https://reviews.llvm.org/D33974

llvm-svn: 306488
2017-06-27 23:50:24 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Peter Collingbourne 53ed867da8 Object: Add version and producer fields to the irsymtab header. NFCI.
These will be necessary in order to handle upgrades from old bitcode
files.

Differential Revision: https://reviews.llvm.org/D33972

llvm-svn: 306486
2017-06-27 23:49:58 +00:00
Sanjay Patel 4b23fa0abf [CGP] add specialization for memcmp expansion with only one basic block
llvm-svn: 306485
2017-06-27 23:15:01 +00:00
Easwaran Raman c5fa6358ba [NewPM/Inliner] Reduce threshold for cold callsites in the non-PGO case
Differential Revision: https://reviews.llvm.org/D34312

llvm-svn: 306484
2017-06-27 23:11:18 +00:00
Tim Northover c990236ff9 GlobalISel: add some more sanity-checking to MachineInstrBuilder. NFC.
llvm-svn: 306481
2017-06-27 22:45:35 +00:00
Florian Hahn 2665febb54 [AArch64] Inline callee if its target-features are a subset of the caller
Summary:
Similar to X86, it should be safe to inline callees if their target-features
are a subset of the caller. This change matches GCC's inlining behavior
with respect to attributes [1].

[1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes

Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover

Reviewed By: t.p.northover

Subscribers: aemerson, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D34698

llvm-svn: 306478
2017-06-27 22:27:32 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Eugene Zelenko 9dece19159 [Analysis] Revert r306472 changes in LoopInfo headers to fix broken builds.
llvm-svn: 306476
2017-06-27 22:20:38 +00:00
Aditya Nandakumar cca75d2406 [GISel]: Add G_FEXP, G_FEXP2 opcodes
Also add IRTranslator support.
https://reviews.llvm.org/D34710

llvm-svn: 306475
2017-06-27 22:19:32 +00:00
Rafael Espindola 650c96e0a7 clang-format a file.
It had a few inconsistent indentations that made a followup patch
hard to read.

llvm-svn: 306474
2017-06-27 22:14:20 +00:00
Dehao Chen 920d022519 re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306473
2017-06-27 22:05:58 +00:00
Eugene Zelenko 4f820d0e01 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306472
2017-06-27 21:52:05 +00:00
Sanjay Patel 70b36f193d [CGP] eliminate a sub instruction in memcmp expansion
As noted in D34071, there are some IR optimization opportunities that could be 
handled by normal IR passes if this expansion wasn't happening so late in CGP.

Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, 
so I'm proposing this change:
  %s = sub i32 %x, %y
  %r = icmp ne %s, 0
    =>
  %r = icmp ne %x, %y

Changing the predicate to 'eq' mimics what InstCombine would do, so that's just
an efficiency improvement if we decide this expansion should happen sooner.

The fact that the PowerPC backend doesn't eliminate the 'subf.' might be 
something for PPC folks to investigate separately.

Differential Revision: https://reviews.llvm.org/D34416

llvm-svn: 306471
2017-06-27 21:46:34 +00:00
Tim Northover 849fcca090 GlobalISel: verify that a COPY is trivial when created.
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.

At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.

llvm-svn: 306470
2017-06-27 21:41:40 +00:00
Xinliang David Li 241f020657 Clean up a test case
llvm-svn: 306468
2017-06-27 21:35:49 +00:00
Krzysztof Parzyszek 0b7688e6c0 Create a PHI value when merging with a known undef live-in
Differential Revision: https://reviews.llvm.org/D34640

llvm-svn: 306466
2017-06-27 21:30:46 +00:00
Sam Clegg 3431c3e9be [WebAssembly] Only run WebAssembly objdump tests if it is enabled as a target
Differential Revision: https://reviews.llvm.org/D34712

llvm-svn: 306464
2017-06-27 21:19:27 +00:00
Joel Jones aea1c356e6 [AArch64] Performance enhancements for Cavium ThunderX2 T99
This patch enables significant performance enhancements to the
Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6,
by adding more detailed scheduling information.

Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D31801

llvm-svn: 306462
2017-06-27 20:44:55 +00:00
Sam Clegg 4df5d76477 [WebAssembly] Add support for printing relocations with llvm-objdump
Differential Revision: https://reviews.llvm.org/D34658

llvm-svn: 306461
2017-06-27 20:40:53 +00:00
Sam Clegg 9e1ade93a8 [WebAssembly] Add data size and alignement to linking section
The overal size of the data section (including BSS)
is otherwise not included in the wasm binary.

Differential Revision: https://reviews.llvm.org/D34657

llvm-svn: 306459
2017-06-27 20:27:59 +00:00
Krzysztof Parzyszek 25173e4cba [Hexagon] Use proper predicate register state when expanding PS_vselect
llvm-svn: 306458
2017-06-27 19:59:46 +00:00
Craig Topper 5fe0197622 [InstCombine] Propagate nsw flag when turning mul by pow2 into shift when the constant is a vector splat or the scalar bit width is larger than 64-bits
The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less.

This patch changes it to use m_APInt to remove both these issues

Differential Revision: https://reviews.llvm.org/D34699

llvm-svn: 306457
2017-06-27 19:57:53 +00:00
Craig Topper ccbb810776 [Constants] Fix copy-pasto in llvm_unreachable message. NFC
llvm-svn: 306456
2017-06-27 19:57:51 +00:00
Sanjay Patel 352e60556e [CGP] simplify code to get bswap in memcmp expansion; NFCI
llvm-svn: 306452
2017-06-27 19:31:35 +00:00
Stanislav Mekhanoshin e8bf6c9629 [AMDGPU] Add 2 new alignbit patterns
Differential Revision: https://reviews.llvm.org/D34655

llvm-svn: 306449
2017-06-27 19:10:47 +00:00
Serge Guelton 7bc405aa4c [CodeExtractor] Prevent extraction of block involving blockaddress
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.

Differential Revision: https://reviews.llvm.org/D33839

llvm-svn: 306448
2017-06-27 18:57:53 +00:00
Stanislav Mekhanoshin c9bd53ab59 [AMDGPU] Simplify setcc (sext from i1 b), -1|0, cc
Depending on the compare code that can be either an argument of
sext or negate of it. This helps to avoid v_cndmask_b64 instruction
for sext. A reversed value can be further simplified and folded into
its parent comparison if possible.

Differential Revision: https://reviews.llvm.org/D34545

llvm-svn: 306446
2017-06-27 18:53:03 +00:00
Krzysztof Parzyszek 5ddd2e5899 [Hexagon] Update kills in hexagon-nvj even more properly than before
Account for the fact that both, the feeder and the compare can be moved
over instructions that kill registers.

llvm-svn: 306443
2017-06-27 18:37:16 +00:00
Matt Arsenault 836d786e86 RenameIndependentSubregs: Fix infinite loop
Apparently this replacement can really be substituting the
same as the original register. Avoid restarting the loop
when there's been no change in the register uses.

llvm-svn: 306441
2017-06-27 18:28:10 +00:00
Yaxun Liu 7c44f340de [SROA] Fix APInt size when alloca address space is not 0
SROA assumes alloca address space is 0, which causes assertion. This patch fixes that.

Differential Revision: https://reviews.llvm.org/D34104

llvm-svn: 306440
2017-06-27 18:26:06 +00:00
Stanislav Mekhanoshin 6851ddf942 [AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0
Also factored out function to check if a boolean is an already
deserialized value which does not require v_cndmask_b32 to be
loaded. Added binary logical operators to its check.

Differential Revision: https://reviews.llvm.org/D34500

llvm-svn: 306439
2017-06-27 18:25:26 +00:00
Sanjay Patel 9a4ce0cc1c [CGP] add an IR builder to memcmp expansion class instead of recreating it; NFCI
This was a clean-up suggestion from:
https://reviews.llvm.org/D34005

llvm-svn: 306438
2017-06-27 18:18:42 +00:00
Jakub Kuderski 59ee5735ba [Dominators] Use Semi-NCA instead of SLT to calculate dominators
Summary:
This patch makes GenericDomTreeConstruction use the Semi-NCA algorithm instead of Simple Lengauer-Tarjan.

As described in `RFC: Dynamic dominators`, Semi-NCA offers slightly better performance than SLT. What's more important, it can be extended to perform incremental updates on already constructed dominator trees.

The patch passes check-all, llvm test suite and is able to boostrap clang. I also wasn't able to observe any compilation time regressions.

Reviewers: sanjoy, dberlin, chandlerc, grosser

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34258

llvm-svn: 306437
2017-06-27 18:08:53 +00:00
Matthias Braun a6e77405d0 LiveRangeCalc: Slightly improve map usage; NFC
- DenseMap should be faster than std::map
- Use the `InsertRes = insert() if (!InsertRes.inserted)` pattern rather
  than the `if (!X.contains(...)) { X.insert(...); }` to save one map
  lookup.

llvm-svn: 306436
2017-06-27 18:05:26 +00:00
Sanjay Patel 7227276d41 [InstCombine] canonicalize icmp predicate feeding select
This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. 
We have this transform for icmp+br, so unless there's some reason that icmp+select should be 
treated differently, we should do the same thing here.

The benefit comes from increasing the chances of creating identical instructions. This is shown in
the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE 
can simplify the identical cmps, and then InstCombine can fold the selects together.

The possible regression for the tests in select.ll raises questions about poison/undef:
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html

...but that transform is just as likely to be triggered by this canonicalization as it is to be 
missed, so we're just pointing out a commutation deficiency in the pattern matching:
https://reviews.llvm.org/rL228409

Differential Revision: https://reviews.llvm.org/D34242

llvm-svn: 306435
2017-06-27 17:53:22 +00:00
Dehao Chen 66131665c4 Enable ICP for AutoFDO.
Summary: AutoFDO should have ICP enabled.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D34662

llvm-svn: 306429
2017-06-27 17:23:33 +00:00
Xinliang David Li d6f10a62c6 [ProfData] Make the method threadsafe
llvm-svn: 306428
2017-06-27 17:21:51 +00:00
Craig Topper 9512332bcb [InstCombine] Add test case demonstrating that we don't propagate nsw flag when converting mul by pow2 to shl when the type is larger than 64-bits. NFC
llvm-svn: 306427
2017-06-27 17:16:03 +00:00
Craig Topper d068fb8104 [InstCombine] Add test cases to show that we don't propagate 'nsw' flags when converting mul by pow2 constant to shl for splat vectors. NFC
llvm-svn: 306426
2017-06-27 17:16:01 +00:00
Coby Tayree 41a5b55f50 [X86][AsmParser][MS-compatability] Binary/Unary operators enhancements
Introducing MOD binary operator
https://msdn.microsoft.com/en-us/library/hha180wt.aspx

Enhancing unary operators NEG and NOT, to support more complex patterns

Differential Revision: https://reviews.llvm.org/D33876

llvm-svn: 306425
2017-06-27 16:58:27 +00:00
Javed Absar d9cc8fbcd3 Fix incorrect comment in machine-scheduler
The example code incorrectly invokes ScheduleDAGMI wherein from context
it is clear it intends to invoke ScheduleDAGMILive actually.

Reviewed by: Andrew Trick
Differential Revision: https://reviews.llvm.org/D34675

llvm-svn: 306424
2017-06-27 16:49:45 +00:00
Brian Gesiak 9473db43b6 [opt-viewer] Python 3 support in opt-diff.py
Summary:
The `file()` builtin is not available in Python 3; use `open()` instead.
https://docs.python.org/3.0/whatsnew/3.0.html#builtins

Reviewers: anemet, davidxl, davide

Reviewed By: davide

Subscribers: davide, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D34670

llvm-svn: 306423
2017-06-27 16:46:50 +00:00
David Green 22322fb638 Change sort function used in tblgen to be strict weak ordering
The windows debug is failing as the sort function is not strict
weak ordering, so switch a >= to a >.

llvm-svn: 306422
2017-06-27 16:28:44 +00:00
Chih-Hung Hsieh ff680f0386 Another test commit
llvm-svn: 306420
2017-06-27 16:18:41 +00:00
Paul Robinson d66ee0f9a7 [DWARF] NFC: Make string-offset handling more like address-table handling;
do the indirection and relocation all in the same method.

llvm-svn: 306418
2017-06-27 15:40:18 +00:00
Craig Topper 81cbb0c237 [PatternMatch] Remove 64-bit or less restriction from m_SpecificInt
Not sure why this restriction existed, but it seems like we should support any size Constant here.

The particular pattern in the tests is not the only use of this matcher in the tree. There's one in CodeGenPrepare and one in InstSimplify as well.

Differential Revision: https://reviews.llvm.org/D34666

llvm-svn: 306417
2017-06-27 15:39:40 +00:00
Craig Topper 0a60d85811 [JumpThreading] Add test case that was supposed to go with r306085.
Looks like I forgot to 'git add' when I submitted the commit. Thanks to Chandler for noticing.

llvm-svn: 306416
2017-06-27 15:26:47 +00:00
Gadi Haber 13759a7ed6 Updated and extended the information about each instruction in HSW and SNB to include the following data:
•static latency
•number of uOps from which the instructions consists
•all ports used by the instruction

Reviewers: 
 RKSimon 
 zvi  
aymanmus  
m_zuckerman 

Differential Revision: https://reviews.llvm.org/D33897
 

llvm-svn: 306414
2017-06-27 15:05:13 +00:00
Sam Kolton a179d25b99 [AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
Summary:
1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it.
2. There were several problems with support of VOPC instructions in SDWA peephole pass.

Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl

Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye

Differential Revision: https://reviews.llvm.org/D34626

llvm-svn: 306413
2017-06-27 15:02:23 +00:00
Matthew Simpson 0bd79f416a [AArch64] Update successor probabilities after ccmp-conversion
This patch modifies the conditional compares pass so that it keeps successor
probabilities up-to-date after the conversion. Previously, successor
probabilities were being normalized to a uniform distribution, even though they
may have been heavily biased prior to the conversion (e.g., if one of the edges
was the back edge of a loop). This loss of information affected passes later in
the pipeline.

Differential Revision: https://reviews.llvm.org/D34109

llvm-svn: 306412
2017-06-27 15:00:22 +00:00
Anna Thomas dc935a6eb6 [LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI
Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block
(which is proven to be the only exiting block in the loop to be unrolled).

llvm-svn: 306410
2017-06-27 14:14:35 +00:00
Simon Dardis 4155c8f1f3 [mips] Add instruction aliases for ds(r|l)l.
Add the instruction aliases for ds(r|l)l for the two operand alias
of ds(r|l)lv and the aliases ds(r|l)l with the three register operands.

llvm-svn: 306405
2017-06-27 13:35:17 +00:00
Hiroshi Inoue 84aafee4fb [SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix assetion failure
When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set dereferenceable flag for a created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, as well as it may miss optimization opportunities.

This patch sat dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable.

Differential Revision: https://reviews.llvm.org/D34679

llvm-svn: 306404
2017-06-27 12:43:08 +00:00
Ayman Musa 721d97f7b8 Recommitting rL305465 after fixing bug in TableGen in rL306251 & rL306371
[X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions).

AVX512 compare instructions return v*i1 types.
In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type.
Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes.
The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class.

When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction.

Differential Revision: https://reviews.llvm.org/D33188

llvm-svn: 306402
2017-06-27 12:08:37 +00:00
Vassil Vassilev 8da30a79cd Add missing include. Should fix modules libstdc++ builds.
llvm-svn: 306399
2017-06-27 11:45:26 +00:00
Hiroshi Inoue 6a391bbf40 fix trivial typos, NFC
succesor -> successor

llvm-svn: 306393
2017-06-27 10:35:37 +00:00
Diana Picus 0e74a134f8 [ARM] GlobalISel: Support G_SELECT for pointers
All we need to do is mark it as legal, otherwise it's just like s32.

llvm-svn: 306390
2017-06-27 10:29:50 +00:00
Simon Pilgrim 71d8b67bea [X86][AVX512] Regenerate avx512 arithmetic tests
llvm-svn: 306389
2017-06-27 10:13:56 +00:00
Daniel Sanders cc36dbf55d [globalisel][tablegen] Add support for EXTRACT_SUBREG.
Summary:
After this patch, we finally have test cases that require multiple
instruction emission.

Depends on D33590

Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls

Subscribers: javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D33596

llvm-svn: 306388
2017-06-27 10:11:39 +00:00
Simon Dardis 3e0d39e403 [mips] Refine the condition for when to use CALL16 vs a GOT displacement.
Borrow from the logic for 'jal' in MipsAsmParser::processInstruction
and add the extra condition of bypassing CALL16 if the destination symbol
is an ELF symbol with STB_LOCAL binding.

Patch by: John Baldwin

Reviewers: sdardis

Differential Revision: https://reviews.llvm.org/D33999

llvm-svn: 306387
2017-06-27 10:11:11 +00:00
Diana Picus 7145d22f81 [ARM] GlobalISel: Support G_SELECT for i32
* Mark as legal for (s32, i1, s32, s32)
* Map everything into GPRs
* Select to two instructions: a CMP of the condition against 0, to set
  the flags, and a MOVCCr to select between the two inputs based on the
  flags that we've just set

llvm-svn: 306382
2017-06-27 09:19:51 +00:00
Ayal Zaks fc1e210d44 Recommitting 306331.
Undoing revert 306338 after fixed bug: add metadata to the load instead of the
reverse shuffle added to it, retaining the original ValueMap implementation.

llvm-svn: 306381
2017-06-27 08:41:19 +00:00
Hiroshi Inoue 912eec378c [PowerPC] fix incorrect processor name for -mcpu in a test case
to surpress warnings. ppc970 should be 970 (or g5)

llvm-svn: 306380
2017-06-27 08:35:35 +00:00
Chandler Carruth 3f81d8024c [SROA] Fix PR32902 by more carefully propagating !nonnull metadata.
This is based heavily on the work done ni D34285. I mostly wanted to do
test cleanup for the author to save them some time, but I had a really
hard time understanding why it was so hard to write better test cases
for these issues.

The problem is that because SROA does a second rewrite of the loads and
because we *don't* propagate !nonnull for non-pointer loads, we first
introduced invalid !nonnull metadata and then stripped it back off just
in time to avoid most ways of this PR manifesting. Moving to the more
careful utility only fixes this by changing the predicate to look at the
new load's type rather than the target type. However, that *does* fix
the bug, and the utility is much nicer including adding range metadata
to model the nonnull property after a conversion to an integer.

However, we have bigger problems because we don't actually propagate
*range* metadata, and the utility to do this extracted from instcombine
isn't really in good shape to do this currently. It *only* handles the
case of copying range metadata from an integer load to a pointer load.
It doesn't even handle the trivial cases of propagating from one integer
load to another when they are the same width! This utility will need to
be beefed up prior to using in this location to get the metadata to
fully survive.

And even then, we need to go and teach things to turn the range metadata
into an assume the way we do with nonnull so that when we *promote* an
integer we don't lose the information.

All of this will require a new test case that looks kind-of like
`preserve-nonnull.ll` does here but focuses on range metadata. It will
also likely require more testing because it needs to correctly handle
changes to the integer width, especially as SROA actively tries to
change the integer width!

Last but not least, I'm a little worried about hooking the range
metadata up here because the instcombine logic for converting from
a range metadata *to* a nonnull metadata node seems broken in the face
of non-zero address spaces where null is not mapped to the integer `0`.
So that probably needs to get fixed with test cases both in SROA and in
instcombine to cover it.

But this *does* extract the core PR fix from D34285 of preventing the
!nonnull metadata from being propagated in a broken state just long
enough to feed into promotion and crash value tracking.

On D34285 there is some discussion of zero-extend handling because it
isn't necessary. First, the new load size covers all of the non-undef
(ie, possibly initialized) bits. This may even extend past the original
alloca if loading those bits could produce valid data. The only way its
valid for us to zero-extend an integer load in SROA is if the original
code had a zero extend or those bits were undef. And we get to assume
things like undef *never* satifies nonnull, so non undef bits can
participate here. No need to special case the zero-extend handling, it
just falls out correctly.

The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to
save a few rounds of trivial edits fixing style issues and test case
formulation.

Differental Revision: D34285

llvm-svn: 306379
2017-06-27 08:32:03 +00:00
Vassil Vassilev 2315d83333 Add missing forward declaraion.
llvm-svn: 306376
2017-06-27 08:10:28 +00:00
Nicolai Haehnle 43cc6c4e0f AMDGPU: M0 operands to spill/restore opcodes are dead
Summary:
With scalar stores, M0 is clobbered and therefore marked as implicitly
defined. However, it is also dead.

This fixes an assertion when the Greedy Register Allocator decides to
optimize a spill/restore pair away again (via tryHintsRecoloring).

Reviewers: arsenm

Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33319

llvm-svn: 306375
2017-06-27 08:04:13 +00:00
Ayman Musa 40e3f194ae [TableGen] Fix bug in TableGen CodeGenPatterns when adding variants of the patterns.
All patterns reside in a std::vector container, where new variants are added to it using the standard library's emplace_back function.
When calling this with a new element while there is no enough allocated space, a bigger space is allocated and all the old info in the small vector is copied to the newly allocated vector, then the old vector is freed.
The problem is that before doing this "copying", we take a reference of one of the elements in the old vector, and after the "copying" we add it to the new vector.
As the old vector is freed after the copying, the reference now does not point to a valid element.

Added new function to the API of CodeGenDAGPatterns class to return the same information as a copy in order to avoid this issue.

This was revealed in rL305465 that added many patterns and forced the reallocation of the vector which caused crashes in windows bots.

Differential Revision: https://reviews.llvm.org/D34341

llvm-svn: 306371
2017-06-27 07:10:20 +00:00
Igor Breger 925f088bae [GlobalISel][X86] Add fp32/62 legalizer, regbank-select, selection tests for G_FADD, G_FSUB, G_FMUL, G_FDIV. NFC.
llvm-svn: 306370
2017-06-27 07:01:54 +00:00
Galina Kistanova 06a0e0e6a9 Fixed the warning introduced by r306289 to make ubuntu-gcc7.1-werror bot green.
llvm-svn: 306369
2017-06-27 06:58:57 +00:00
Mikael Holmen 37b5120a9a [Reassociate] Make sure EraseInst sets MadeChange
Summary:
EraseInst didn't report that it made IR changes through MadeChange.

It is essential that changes to the IR are reported correctly,
since for example ReassociatePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().

Reviewers: craig.topper, rnk, davide

Reviewed By: rnk, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34616

llvm-svn: 306368
2017-06-27 05:32:13 +00:00
Hiroshi Inoue 5102028f63 [PowerPC] set optimization level in SelectionDAGISel
PowerPC backend does not pass the current optimization level to SelectionDAGISel and so SelectionDAGISel works with the default optimization level regardless of the current optimization level.
This patch makes the PowerPC backend set the optimization level correctly.

Differential Revision: https://reviews.llvm.org/D34615

llvm-svn: 306367
2017-06-27 04:52:17 +00:00
Mandeep Singh Grang 53b82b3df9 [COFF, ARM64] Fix typo in COFF ARM64 Relocation Type
Summary:
Fixed IMAGE_REL_ARM64_PAGEBASE_REL2 ==> IMAGE_REL_ARM64_PAGEBASE_REL21
Refer: http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx

Reviewers: zturner, rnk, ruiu

Reviewed By: ruiu

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34659

llvm-svn: 306366
2017-06-27 04:51:44 +00:00
Craig Topper f9319a78a5 [InstCombine] Add test cases demonstrating that we don't optmize select+cmp+cttz/ctlz when the bitwidth is larger than 64 bits.
llvm-svn: 306365
2017-06-27 04:50:47 +00:00
Leslie Zhai c9d9d7976a [AVR] Migrate to new MCAsmBackend applyFixup and processFixupValue
Reviewers: rafael, dylanmckay, jroelofs, meadori

Reviewed By: rafael, meadori

Subscribers: meadori, llvm-commits

Differential Revision: https://reviews.llvm.org/D34551

llvm-svn: 306359
2017-06-27 03:29:27 +00:00
Chandler Carruth c691623585 [SROA] Further test cleanup and add a test for the actual propagation of
the nonnull attribute distinct from rewriting it into an assume.

llvm-svn: 306358
2017-06-27 03:08:45 +00:00
Davide Italiano e096da7f2c [CFLAA] Move FunctionHandle to llvm::cflaa.
Also, while here, remove an unneeded `using namespace llvm`.
Thanks to George for the suggestion.

llvm-svn: 306355
2017-06-27 02:43:00 +00:00
Davide Italiano 31d4c1bbbc [CFLAA] Move a common function to the header to reduce duplication.
Differential Revision:  https://reviews.llvm.org/D34660

llvm-svn: 306354
2017-06-27 02:25:06 +00:00
Chandler Carruth 3b978394ba [SROA] Clean up a test case a bit prior to adding more testing for
nonnull as part of fixing PR32902.

llvm-svn: 306353
2017-06-27 02:23:15 +00:00
Matthias Braun e2ae001982 ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags.
Remove invalid shortcut in fixupKills(): A register needs to be marked
live even when we are not adding a kill flag. This is because a
partially live register must not get a kill flags, but it still needs to
be fully marked live when walking backwards.

llvm-svn: 306352
2017-06-27 00:58:48 +00:00
Davide Italiano 604c003f5f [CFLAA] Use raw pointers instead of Optional<Pointer>. NFC.
Using Optional<> here doesn't seem to be terribly valuable, but
this is not the main point of this change. The change enables
us to merge the (now) two identical copies of parentFunctionOfValue()
that Steensgaard's and Andersens' provide.

llvm-svn: 306351
2017-06-27 00:33:37 +00:00
Davide Italiano e34a806431 [CFLAA] Change FunctionHandle to be common to Steensgaard's and Andersens'
Differential Revision:  https://reviews.llvm.org/D34638

llvm-svn: 306348
2017-06-26 23:59:14 +00:00
Wolfgang Pieb 9f65858235 DAGCombine: Make sure we only eliminate trunc/extend when the scales of truncation and extension match.
This fixes PR33368.

Reviewer: rksimon

Differential Revision:  https://reviews.llvm.org/D34069

llvm-svn: 306345
2017-06-26 23:05:51 +00:00
Dehao Chen 8b7effb344 revert r306336 for breaking ppc test.
llvm-svn: 306344
2017-06-26 23:05:35 +00:00
Eugene Zelenko 76bf48d932 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306341
2017-06-26 22:44:03 +00:00
Vedant Kumar 71b3d721fd [Coverage] Improve readability by using a struct. NFC.
llvm-svn: 306340
2017-06-26 22:33:06 +00:00
Ayal Zaks 3923c0c46b reverting 306331.
Causes TBAA metadata to be generates on reverse shuffles, investigating.

llvm-svn: 306338
2017-06-26 22:26:54 +00:00
Sanjay Patel b859910eb2 [x86] add tests for missing sbb transforms; NFC
llvm-svn: 306337
2017-06-26 22:20:07 +00:00
Dehao Chen 79655792cc Enable vectorizer-maximize-bandwidth by default.
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.

I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.

Reviewers: hfinkel, mkuper, davidxl, chandlerc

Reviewed By: chandlerc

Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306336
2017-06-26 21:41:09 +00:00
Dehao Chen 38f1bc7834 Fix the bug when handling shufflevector for aarch64.
Summary: This Fixes https://bugs.llvm.org/show_bug.cgi?id=33600

Reviewers: mssimpso, davidxl, Carrot

Reviewed By: mssimpso

Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34641

llvm-svn: 306334
2017-06-26 21:33:51 +00:00
Matt Arsenault 53fae0772a RenameIndependentSubregs: Fix iterator problem
Fixes bug 33597.

Use of substituteRegister in the tied operand case messes
up the register use iterator, causing some uses to be left
unprocessed.

llvm-svn: 306333
2017-06-26 21:33:36 +00:00
Vassil Vassilev 304392621b Add missing forward declaration.
This should fix our modules builds.

llvm-svn: 306332
2017-06-26 21:11:29 +00:00
Ayal Zaks e7e15d186b [LV] Changing the interface of ValueMap, NFC.
Instead of providing access to the internal MapStorage holding all Values
associated with a given Key, used for setting or resetting them all together,
ValueMap keeps its MapStorage internal; its new interface allows getting,
setting or resetting a single Value, per part or per part-and-lane.
Follows the discussion in https://reviews.llvm.org/D32871.

Differential Revision: https://reviews.llvm.org/D34473

llvm-svn: 306331
2017-06-26 21:03:51 +00:00
Sam Clegg 933df2658d [WebAssembly] Add more support for weak symbols
Add weak symbol tests to MC
Add symbol flags to output of `llvm-readobj -t`.

Differential Revision: https://reviews.llvm.org/D34635

llvm-svn: 306330
2017-06-26 21:01:39 +00:00
Tim Northover c2d5e6d637 AArch64: legalize G_EXTRACT operations.
This is the dual problem to legalizing G_INSERTs so most of the code and
testing was cribbed from there.

llvm-svn: 306328
2017-06-26 20:34:13 +00:00
Paul Robinson 36e85a867b [DWARF] NFC: Give DwarfFormat a 1-byte base type.
In particular this reduces DWARFFormParams from 64 to 32 bits; pass it
around by value.

llvm-svn: 306324
2017-06-26 19:52:32 +00:00
Tim Northover 9ac3e42211 AArch64: remove all kill flags when extending register liveness.
When we forward a stored value to a load and eliminate it entirely we need to
make sure the liveness of the register is maintained all the way to its use.
Previously we only cleared liveness on the store doing the forwarding, but
there could be other killing uses in between.

We already do the right thing when the load has to be converted into something
else, it was just this one path that skipped it.

llvm-svn: 306318
2017-06-26 18:49:25 +00:00
Paul Robinson 75c068c50b [DWARF] NFC: Collect info used by DWARFFormValue into a helper.
Some forms have sizes that depend on the DWARF version, DWARF format
(32/64-bit), or the size of an address.  Collect these into a struct
to simplify passing them around.  Require callers to provide one when
they query a form's size.

Differential Revision: http://reviews.llvm.org/D34570

llvm-svn: 306315
2017-06-26 18:43:01 +00:00
Simon Pilgrim d58f051792 [X86][SSE] Check SSE2/SSE3 codegen tests on i686 and x86_64
llvm-svn: 306314
2017-06-26 18:20:46 +00:00
Wei Mi 71f06420e4 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit fixes three bugs: The first one is to use CurrentBlock instead of
PREInstr's Parent as param of performScalarPREInsertion because the Parent
of a clone instruction may be uninitialized. The second one is stop PRE when
CurrentBlock to its predecessor is a backedge and an operand of CurInst is
defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

llvm-svn: 306313
2017-06-26 18:16:10 +00:00
Matt Arsenault f28683cf51 AMDGPU: Setup SP/FP in callee function prolog/epilog
llvm-svn: 306312
2017-06-26 17:53:59 +00:00
Eric Beckmann 2a81089116 Replace trivial use of external rc.exe by writing our own .res file.
This patch removes the dependency on the external rc.exe tool by writing
a simple .res file using our own library. In this patch I also added an
explicit definition for the .res file magic.  Furthermore, I added a
unittest for embeded manifests and fixed a bug exposed by the test.

llvm-svn: 306311
2017-06-26 17:43:30 +00:00
Zachary Turner e79b07e41e [llvm-pdbutil] Add a mode to `bytes` for dumping split debug chunks.
llvm-svn: 306309
2017-06-26 17:22:36 +00:00
Brian Gesiak 9b4e8975a9 [opt-viewer] Python 3 support in opt-stats.py
Summary: Minor changes that allow opt-stats.py to support both Python 2 and 3.

Reviewers: anemet, davidxl

Reviewed By: anemet

Subscribers: llvm-commits, fhahn

Differential Revision: https://reviews.llvm.org/D34564

llvm-svn: 306306
2017-06-26 16:51:24 +00:00
Ulrich Weigand af98b748f6 [SystemZ] Fix missing emergency spill slot corner case
We sometimes need emergency spill slots for the register scavenger.
This may be the case when code needs to access a stack slot that
has an offset of 4096 or more relative to the stack pointer.

To make that determination, processFunctionBeforeFrameFinalized
currently simply checks the total stack frame size of the current
function.  But this is not enough, since code may need to access
stack slots in the caller's stack frame as well, in particular
incoming arguments stored on the stack.

This commit fixes the problem by taking argument slots into account.

llvm-svn: 306305
2017-06-26 16:50:32 +00:00
Simon Pilgrim f07663876a [X86][SSE] Add combine tests for PMULDQ/PMULUDQ
Found several missed optimizations while investigating replacing _mm_mul_epi32/_mm_mul_epu32 with generic implementations

llvm-svn: 306302
2017-06-26 16:22:52 +00:00
Marina Yatsina f58dcb85d2 [inline asm] dot operator while using imm generates wrong ir + asm - llvm part
Inline asm dot operator while using imm generates wrong ir and asm

This also fixes bugzilla 32987:
https://bugs.llvm.org//show_bug.cgi?id=32987

The clang part of the review that contains the test can be found here:
https://reviews.llvm.org/D33040

commit on behald of zizhar

Differential Revision:
https://reviews.llvm.org/D33039

llvm-svn: 306300
2017-06-26 16:03:42 +00:00
Ahmed Bougacha 58a197414e [X86][AVX-512] Don't raise inexact in ceil, floor, round, trunc.
The non-AVX-512 behavior was changed in r248266 to match N1778
(C bindings for IEEE-754 (2008)), which defined the four functions
to not raise the inexact exception ("rint" is still defined as raising
it).

Update the AVX-512 lowering of these functions to match that: it should
not be different.

llvm-svn: 306299
2017-06-26 16:00:24 +00:00
Tom Stellard eb8f1e27d9 AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D34589

llvm-svn: 306298
2017-06-26 15:56:52 +00:00
Simon Pilgrim 0ad0e5802b [X86] Add test case for PR15981
llvm-svn: 306296
2017-06-26 15:53:11 +00:00
Simon Pilgrim e77df9bc6c [llvm-stress] Add getRandom() helper that was going to be part of D34157. NFCI.
llvm-svn: 306294
2017-06-26 15:41:36 +00:00
Sanjay Patel 15748d239e [x86] transform vector inc/dec to use -1 constant (PR33483)
Convert vector increment or decrement to sub/add with an all-ones constant:

add X, <1, 1...> --> sub X, <-1, -1...>
sub X, <1, 1...> --> add X, <-1, -1...>

The all-ones vector constant can be materialized using a pcmpeq instruction that is 
commonly recognized as an idiom (has no register dependency), so that's better than 
loading a splat 1 constant.

AVX512 uses 'vpternlogd' for 512-bit vectors because there is apparently no better
way to produce 512 one-bits.

The general advantages of this lowering are:
1. pcmpeq has lower latency than a memop on every uarch I looked at in Agner's tables, 
   so in theory, this could be better for perf, but...

2. That seems unlikely to affect any OOO implementation, and I can't measure any real 
   perf difference from this transform on Haswell or Jaguar, but...

3. It doesn't look like it from the diffs, but this is an overall size win because we 
   eliminate 16 - 64 constant bytes in the case of a vector load. If we're broadcasting 
   a scalar load (which might itself be a bug), then we're replacing a scalar constant 
   load + broadcast with a single cheap op, so that should always be smaller/better too.

4. This makes the DAG/isel output more consistent - we use pcmpeq already for padd x, -1 
   and psub x, -1, so we should use that form for +1 too because we can. If there's some
   reason to favor a constant load on some CPU, let's make the reverse transform for all
   of these cases (either here in the DAG or in a later machine pass).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=33483

Differential Revision: https://reviews.llvm.org/D34336

llvm-svn: 306289
2017-06-26 14:19:26 +00:00
Krzysztof Parzyszek 918e6d70bd [Hexagon] Handle cases when the aligned stack pointer is missing
llvm-svn: 306288
2017-06-26 14:17:58 +00:00
Jonas Paulsson 8c33647ba1 [SystemZ] Add a check against zero before calling getTestUnderMaskCond()
Csmith discovered that this function can be called with a zero argument,
in which case an assert for this triggered.

This patch also adds a guard before the other call to this function since
it was missing, although the test only covers the case where it was
discovered.

Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll.

Review: Ulrich Weigand
llvm-svn: 306287
2017-06-26 13:38:27 +00:00
Michael Zuckerman ce7e187f84 [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess test.
Adding base tast (to trunk) for Store strid=4 vf=32. 

llvm-svn: 306286
2017-06-26 13:27:32 +00:00
Simon Pilgrim 3d116189c7 [llvm-stress] Remove Rand32 helper function
To try and help avoid repeats of PR32585, remove Rand32 which is only called by Rand64

llvm-svn: 306285
2017-06-26 13:17:36 +00:00
Simon Pilgrim 1158fe9715 [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585)
As noted on PR32585, the change in D29780/rL295325 resulted in calls to Rand32() (values 0 -> 0xFFFFFFFF) but the min()/max() operators indicated it would be (0 -> 0x7FFFF).

This patch changes the random operator to call Rand() instead which does respect the 0 -> 0x7FFFF range and asserts that the value is in range as well.

Differential Revision: https://reviews.llvm.org/D34089

llvm-svn: 306281
2017-06-26 10:16:34 +00:00
Mikael Holmen 45bd32f9ad [IfConversion] Hoist removeBranch calls out of if/else clauses [NFC]
Summary:
Also added a comment.

Pulled out of https://reviews.llvm.org/D34099.

Reviewers: iteratee

Reviewed By: iteratee

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34388

llvm-svn: 306279
2017-06-26 09:33:04 +00:00
Craig Topper 700892fd89 [IR] Rename BinaryOperator::init to AssertOK and remove argument. Replace default case in switch with llvm_unreachable since all valid opcodes are covered.
This method doesn't do any initializing. It just contains asserts. So renaming to AssertOK makes it consistent with similar instructions in other Instruction classes.

llvm-svn: 306277
2017-06-26 07:15:59 +00:00
Serguei Katkov 0e70206c8f This reverts commit r306272.
Revert "[MBP] do not rotate loop if it creates extra branch"

It breaks the sanitizer build bots. Need to fix this.

llvm-svn: 306276
2017-06-26 06:51:45 +00:00
Tobias Grosser 258245a6ef [bugpoint] Do not initialize disassembler passes
We added the initilization of disassembler passes in r306208 with the goal to
bring bugpoint in line with 'opt'. However, 'opt' does itself not initialize
dissassembler passes. As our goal was consistency, we drop the initialization
of dissassembler passes again from bugpoint.

Thanks to Chandler for pointing this out!

llvm-svn: 306275
2017-06-26 06:50:50 +00:00
Hiroshi Inoue 4484ff03df fix trivial typo in comment, NFC
llvm-svn: 306274
2017-06-26 06:32:04 +00:00
Serguei Katkov b01fff06ed [MBP] do not rotate loop if it creates extra branch
This is a last fix for the corner case of PR32214. Actually this is not really corner case in general.

We should not do a loop rotation if we create an additional branch due to it.
Consider the case where we have a loop chain H, M, B, C , where
H is header with viable fallthrough from pre-header and exit from the loop
M - some middle block
B - backedge to Header but with exit from the loop also.
C - some cold block of the loop.

Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch.
Let's compute the change in number of branches:
+1 branch from pre-header to header
-1 branch from header to exit
+1 branch from header to middle block if there is such
-1 branch from cold bock to header if there is one

So if C is not a predecessor of H then we introduce extra branch.

This change actually prohibits rotation of the loop if both true
1) Best Exit has next element in chain as successor.
2) Last element in chain is not a predecessor of first element of chain.

Reviewers: iteratee, xur
Reviewed By: iteratee
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34271

llvm-svn: 306272
2017-06-26 05:27:27 +00:00
Davide Italiano 9a02494230 [CFL-AA] Remove unneeded function declaration. NFCI.
llvm-svn: 306268
2017-06-26 03:55:41 +00:00
Chandler Carruth 2abb65ae11 [InstCombine] Factor the logic for propagating !nonnull and !range
metadata out of InstCombine and into helpers.

NFC, this just exposes the logic used by InstCombine when propagating
metadata from one load instruction to another. The plan is to use this
in SROA to address PR32902.

If anyone has better ideas about how to factor this or name variables,
I'm all ears, but this seemed like a pretty good start and lets us make
progress on the PR.

This is based on a patch by Ariel Ben-Yehuda (D34285).

llvm-svn: 306267
2017-06-26 03:31:31 +00:00
Matt Arsenault 8bcf2f20a7 AMDGPU: Whitespace fixes
llvm-svn: 306265
2017-06-26 03:01:36 +00:00
Matt Arsenault 10fc062b2b AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling
This should not be treated as a different version of
private_segment_buffer. These are distinct things with
different uses and register classes, and requires the
function argument info to have more context about the
function's type and environment.

Also add missing test coverage for the intrinsic, and
emit an error for HSA. This also encovers that the intrinsic
is broken unless there happen to be stack objects.

llvm-svn: 306264
2017-06-26 03:01:31 +00:00
Sylvestre Ledru e3fdbaea0d fix various typos
llvm-svn: 306262
2017-06-26 02:45:39 +00:00
Chandler Carruth 4a000883c7 [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.
This was reverted in r306252, but I already had the bug fixed and was
just trying to form a test case.

The original commit factored the logic for forming dedicated exits
inside of LoopSimplify into a helper that could be used elsewhere and
with an approach that required fewer intermediate data structures. See
that commit for full details including the change to the statistic, etc.

The code looked fine to me and my reviewers, but in fact didn't handle
indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.

If you have code that looks *just* right, you can end up leaking these
predecessors into a subsequent rewrite, and crash deep down when trying
to update PHI nodes for predecessors that don't exist.

I've added an assert that makes the bug much more obvious, and then
changed the code to reliably clear the vector so we don't get this bug
again in some other form as the code changes.

I've also added a test case that *does* manage to catch this while also
giving some nice positive coverage in the face of indirectbr.

The real code that found this came out of what I think is CPython's
interpreter loop, but any code with really "creative" interpreter loops
mixing indirectbr and other exit paths could manage to tickle the bug.
I was hard to reduce the original test case because in addition to
having a particular pattern of IR, the whole thing depends on the order
of the predecessors which is in turn depends on use list order. The test
case added here was designed so that in multiple different predecessor
orderings it should always end up going down the same path and tripping
the same bug. I hope. At least, it tripped it for me without
manipulating the use list order which is better than anything bugpoint
could do...

llvm-svn: 306257
2017-06-25 22:45:31 +00:00
Chandler Carruth 73367b6a09 [LoopSimplify] Improve a test for loop simplify minorly. NFC.
I did some basic testing while looking for a bug in my recent change to
loop simplify and even though it didn't find the bug it seems like
a useful improvement anyways.

llvm-svn: 306256
2017-06-25 22:24:02 +00:00
Davide Italiano f15fb368a3 [MemDep] Cleanup return after else & use `auto`. NFC.
llvm-svn: 306255
2017-06-25 22:12:59 +00:00
Anna Thomas e7cb633d29 [LoopDeletion] NFC: Move phi node value setting into prepass
Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator,
which caused loop deletion tests to hang due to infinite loop.
Had reverted it in rL306162.

rL306157 commit message:
Currently, the implementation of delete dead loops has a special case
when the loop being deleted is never executed. This special case
(updating of exit block's incoming values for phis) can be
run as a prepass for non-executable loops before performing
the actual deletion.

llvm-svn: 306254
2017-06-25 21:13:58 +00:00
Daniel Jasper 4c6cd4ccb7 Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."
This leads to a segfault. Chandler already has a test case and should be
able to recommit with a fix soon.

llvm-svn: 306252
2017-06-25 17:58:25 +00:00
Craig Topper 18e6b57cdf [TableGen] Remove some copies around PatternToMatch.
Summary:
This patch does a few things that should remove some copies around PatternsToMatch. These were noticed while reviewing code for D34341.

Change constructor to take Dstregs by value and move it into the class. Change one of the callers to add std::move to the argument so that it gets moved.

Make AddPatternToMatch take PatternToMatch by rvalue reference so we can move it into the PatternsToMatch vector. I believe we should have a implicit default move constructor available on PatternToMatch. I chose rvalue reference because both callers call it with temporaries already.

Reviewers: RKSimon, aymanmus, spatel

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34411

llvm-svn: 306251
2017-06-25 17:33:49 +00:00
Craig Topper d1fbb38475 [IR] Use isIntOrIntVectorTy instead of writing it out the long way. NFC
llvm-svn: 306250
2017-06-25 17:33:48 +00:00
Craig Topper e3bd1a8598 [IR] Move repeated asserts in FCmpInst constructor to a helper method like we do for ICmpInst and other classes. NFC
llvm-svn: 306249
2017-06-25 17:33:46 +00:00
Simon Pilgrim c338ba48fc [X86][SSE] Remove unused memopfsf32_128/memopfsf64_128 scalar memops
The 'scalar' simd bitops were dropped a while ago

llvm-svn: 306248
2017-06-25 17:04:58 +00:00
Simon Pilgrim bed1fa1ac1 Strip trailing whitespace. NFCI.
llvm-svn: 306247
2017-06-25 16:57:46 +00:00
Simon Pilgrim 9956364a1f [X86] Add test case for PR15705
llvm-svn: 306246
2017-06-25 16:12:45 +00:00
Sanjay Patel 2f3ead7adc [InstCombine] add (sext i1 X), 1 --> zext (not X)
http://rise4fun.com/Alive/i8Q

A narrow bitwise logic op is obviously better than math for value tracking, 
and zext is better than sext. Typically, the 'not' will be folded into an 
icmp predicate.

The IR difference would even survive through codegen for x86, so we would see 
worse code:

https://godbolt.org/g/C14HMF

one_or_zero(int, int):                      # @one_or_zero(int, int)
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setle   %al
        retq

one_or_zero_alt(int, int):                  # @one_or_zero_alt(int, int)
        xorl    %ecx, %ecx
        cmpl    %esi, %edi
        setg    %cl
        movl    $1, %eax
        subl    %ecx, %eax
        retq

llvm-svn: 306243
2017-06-25 14:15:28 +00:00
Elena Demikhovsky 72f991cded AVX-512: Fixed a crash during legalization of <3 x i8> type
The compiler fails with assertion during legalization of SETCC for <3 x i8> operands.
The result is extended to <4 x i8> and then truncated <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>.

Differential Revision: https://reviews.llvm.org/D34503

llvm-svn: 306242
2017-06-25 13:36:20 +00:00
Xin Tong 70f7512add [AST] Fix a bug in aliasesUnknownInst. Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
Summary:
Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
I believe this is clearly a bug (missed opportunity). I can also add some test cases if desired.

Reviewers: hfinkel, davide, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34597

llvm-svn: 306241
2017-06-25 12:55:11 +00:00
Igor Breger f5035d6ee5 [GlobalISel][X86] Support vector type G_EXTRACT selection.
Summary:
Support vector type G_EXTRACT selection. For now G_EXTRACT marked as legal for any type, so nothing to do in legalizer.
Split from https://reviews.llvm.org/D33665

Reviewers: qcolombet, t.p.northover, zvi, guyblank

Reviewed By: guyblank

Subscribers: guyblank, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33957

llvm-svn: 306240
2017-06-25 11:42:17 +00:00
Dorit Nuzman e0e0f1ddb0 [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2
The cost of an interleaved access was only implemented for AVX512. For other
X86 targets an overly conservative Base cost was returned, resulting in
avoiding vectorization where it is actually profitable to vectorize.
This patch starts to add costs for AVX2 for most prominent cases of
interleaved accesses (stride 3,4 chars, for now).

Note1: Improvements of up to ~4x were observed in some of EEMBC's rgb
workloads; There is also a known issue of 15-30% degradations on some of these
workloads, associated with an interleaved access followed by type
promotion/widening; the resulting shuffle sequence is currently inefficient and
will be improved by a series of patches that extend the X86InterleavedAccess pass
(such as D34601 and more to follow).

Note 2: The costs in this patch do not reflect port pressure penalties which can
be very dominant in the case of interleaved accesses since most of the shuffle
operations are restricted to a single port. Further tuning, that may incorporate
these considerations, will be done on top of the upcoming improved shuffle
sequences (that is, along with the abovementioned work to extend
X86InterleavedAccess pass).


Differential Revision: https://reviews.llvm.org/D34023

llvm-svn: 306238
2017-06-25 08:26:25 +00:00
Ed Schouten 3370e19725 Add support for Ananas platform
Ananas is a home-brew operating system, mainly for amd64 machines. After
using GCC for quite some time, it has switched to clang and never looked
back - yet, having to manually patch things is annoying, so it'd be much
nicer if this was in the official tree.

More information:

https://github.com/zhmu/ananas/
https://rink.nu/projects/ananas.html

Submitted by:	Rink Springer
Differential Revision:	https://reviews.llvm.org/D32937

llvm-svn: 306237
2017-06-25 08:19:37 +00:00
Craig Topper f2b890e806 [PatternMatch] Just check if value is a Constant before calling isAllOnesValue for not_match. We don't really need to check for a specific subclass of Constant. NFC
llvm-svn: 306236
2017-06-25 06:56:34 +00:00
Zachary Turner 1affd805fc [pdb] Fix reading of llvm-generated PDBs by cvdump.
If you dump a pdb to yaml, and then round-trip it back to a pdb,
and run cvdump -l <file> on the new pdb, cvdump will generate
output such as this.

*** LINES

** Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj"

Error: Line number corrupted: invalid file id 0
  <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3

        5 00000010      6 00000013      7 00000018

Note the error message about the corrupted line number.

It turns out that the problem is that cvdump cannot find the
/names stream (e.g. the global string table), and the reason it
can't find the /names stream is because it doesn't understand
the NameMap that we serialize which tells pdb consumers which
stream has the string table.

Some experimentation shows that if we add items to the hash
table in a specific order before serializing it, cvdump can read
it. This suggests that either we're using the wrong hash function,
or we're serializing something incorrectly, but it will take some
deeper investigation to figure out how / why.  For now, this at
least allows cvdump to read our line information (and incidentally,
produces an identical byte sequence to what Microsoft tools
produce when writing the named stream map).

Differential Revision: https://reviews.llvm.org/D34491

llvm-svn: 306233
2017-06-25 03:51:42 +00:00
Xinliang David Li b67530e9b9 [PGO] Implementate profile counter regiser promotion
Differential Revision: http://reviews.llvm.org/D34085

llvm-svn: 306231
2017-06-25 00:26:43 +00:00
Zachary Turner 91c8daf300 [Support] Don't use std::iterator, it's deprecated in C++17.
In converting this over to iterator_facade_base, some member
operators and methods are no longer needed since iterator_facade
implements them in the base class using CRTP.

Differential Revision: https://reviews.llvm.org/D34223

llvm-svn: 306230
2017-06-25 00:00:08 +00:00
Craig Topper 010203964d [SCEV] Avoid copying ConstantRange just to get the min/max value
Summary:
This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap caches. We then take advantage of that to add new helper methods that can return min/max value of a signed or unsigned ConstantRange using that reference without first copying the ConstantRange.

getRangeRef calls itself recursively and I believe the reference return is fine for those calls.

I've left getSignedRange and getUnsignedRange returning a ConstantRange object so they will make a copy now. This is to ensure safety since the reference will be invalidated if the DenseMap changes.

I'm sure there are still more places that can take advantage of the reference and I'll submit future patches as I find them.

Reviewers: sanjoy, davide

Reviewed By: sanjoy

Subscribers: zzheng, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D32978

llvm-svn: 306229
2017-06-24 23:34:50 +00:00
Craig Topper 914ad85c6a [PatternMatch] Use ConstantFP::isNan instead of getting the APFloat and calling isNaN on that. NFC
llvm-svn: 306227
2017-06-24 22:59:11 +00:00
Craig Topper bf7f7c26b5 [IR] Implement commutable matchers without using combineOr
Summary:
Turns out creating matchers with combineOr isn't very efficient as we have to build matcher objects for both sides of the OR. Those objects aren't free, the trees usually contain several objects that contain a reference to a Value *, ConstantInt *, APInt * or some such thing. The compiler isn't always willing to inline all the matcher code to get rid of these member variables. Thus we end up loads and stores of these variables.

Using combineOR ends up creating two complete copies of the tree and the associated stores. I believe we're also paying for the opcode check twice.

This patch adds a commutable mode to several of the matcher objects as a bool template parameter that can be used to enable  commutable support directly in the match functions of the corresponding objects. This avoids the duplicate object creation and the opcode checks.

This shows about an ~7-8k reduction in the opt binary size on my local build.

Reviewers: spatel, majnemer, davide

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34592

llvm-svn: 306226
2017-06-24 22:59:10 +00:00
Anton Korobeynikov dfa26f0ac1 Another test commit
llvm-svn: 306224
2017-06-24 21:04:32 +00:00
Tanya Lattner d3ae628100 Remove test commit change.
llvm-svn: 306223
2017-06-24 20:13:32 +00:00
Tanya Lattner 8c88112a79 test commit
llvm-svn: 306222
2017-06-24 20:08:28 +00:00
Anton Korobeynikov c0d3f00903 Still debugging
llvm-svn: 306216
2017-06-24 18:07:05 +00:00
Anton Korobeynikov 0b42fccb99 Still test commit
llvm-svn: 306215
2017-06-24 18:05:08 +00:00
Anton Korobeynikov 1a95495022 Another test commit
llvm-svn: 306214
2017-06-24 18:01:33 +00:00
Anton Korobeynikov 25b627c032 Another test commit
llvm-svn: 306213
2017-06-24 17:47:19 +00:00
Anton Korobeynikov 4bb4fd0c10 Test commit
llvm-svn: 306212
2017-06-24 17:35:28 +00:00
Hiroshi Inoue a85d24b73d fix trivial typos in comment, NFC
llvm-svn: 306211
2017-06-24 16:00:26 +00:00
Hiroshi Inoue b300824ee7 fix trivial typos in comment, NFC
dereferencable -> dereferenceable

llvm-svn: 306210
2017-06-24 15:43:33 +00:00
Hiroshi Inoue 95f24dca98 [SelectionDAG] set dereferenceable flag when expanding memcpy/memmove
When SelectionDAG expands memcpy (or memmove) call into a sequence of load and store instructions, it disregards dereferenceable flag even the source pointer is known to be dereferenceable.
This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer.
This patch makes SelectionDAG to set the dereferenceable flag for the load instructions properly to avoid the assertion failure.

Differential Revision: https://reviews.llvm.org/D34467

llvm-svn: 306209
2017-06-24 15:17:38 +00:00
Tobias Grosser 4bba4ba5ba Ensure backends available in 'opt' are also available in 'bugpoint'
This patch links LLVM back-ends into bugpoint the same way they are already
available in 'opt' and 'clang'. This resolves an inconsistency that allowed the
use of LLVM backends in loadable modules that run in 'opt', but that would
prevent the debugging of these modules with bugpoint due to unavailable /
unresolved symbols.

For e.g. In D31859, Polly requires the NVPTX back-end.

Reviewers: hfinkel, bogner, chandlerc, grosser, Meinersbur

Subscribers: bollu, mgorny, grosser, Meinersbur

Tags: #polly

Contributed by: Singapuram Sanjay

Differential Revision: https://reviews.llvm.org/D32003

llvm-svn: 306208
2017-06-24 08:09:33 +00:00
Craig Topper 05333b4d40 [IR] Remove BinOp2_match and replace its usage with the more capable BinOpPred_match.
llvm-svn: 306207
2017-06-24 07:02:52 +00:00
Craig Topper 8bec6a4e1c [IR][AssumptionCache] Add m_Shift and m_BitwiseLogic matchers to replace a couple m_CombineOr
Summary:
m_CombineOr isn't very efficient. The code using it is also quite verbose.

This patch adds m_Shift and m_BitwiseLogic matchers to make the using code more concise and improve the match efficiency.

Reviewers: spatel, davide

Reviewed By: davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D34593

llvm-svn: 306206
2017-06-24 06:27:14 +00:00
Craig Topper 7b66ffe875 [ValueTracking][InstCombine] Use m_Shr instead m_CombineOr(m_LShr, m_AShr). NFC
llvm-svn: 306205
2017-06-24 06:24:04 +00:00
Craig Topper 72ee6945af [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC
llvm-svn: 306204
2017-06-24 06:24:01 +00:00
Rafael Espindola 6418856127 Simplify the processFixupValue interface. NFC.
llvm-svn: 306202
2017-06-24 05:22:28 +00:00
Xin Tong 25d51d66df Add comments for OrderedInstruction. NFC
llvm-svn: 306201
2017-06-24 05:16:12 +00:00
Rafael Espindola daaee7151b Remove a processFixupValue hack.
The intention of processFixupValue is not to redefine the semantics of
MCExpr. It is odd enough that a expression lowers to a PCRel MCExpr or
not depending on what it looks like. At least it is a local hack now.

I left a fix for anyone trying to figure out what producers should be
producing a different expression.

llvm-svn: 306200
2017-06-24 05:12:29 +00:00