Commit Graph

98364 Commits

Author SHA1 Message Date
Matt Arsenault e482403e1c DAGCombiner: Add hasOneUse checks to fadd/fma combine
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.

llvm-svn: 291642
2017-01-11 02:02:12 +00:00
Eugene Zelenko c4ad1ce068 [Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291641
2017-01-11 01:45:03 +00:00
Hans Wennborg 6573976f57 Re-commit r289955: [X86] Fold (setcc (cmp (atomic_load_add x, -C) C), COND) to (setcc (LADD x, -C), COND) (PR31367)
This was reverted because it would miscompile code where the cmp had
multiple uses. That was due to a deficiency in the existing code, which
was fixed in r291630 (see the PR for details).

This re-commit includes an extra test for the kind of code that got
miscompiled: @test_sub_1_setcc_jcc.

llvm-svn: 291640
2017-01-11 01:36:57 +00:00
Matt Arsenault 8260666d11 InstSimplify: Refactor function to use more switches
llvm-svn: 291634
2017-01-11 00:57:54 +00:00
Hans Wennborg 12de693747 [X86] Dont run combineSetCCAtomicArith() when the cmp has multiple uses
We would miscompile the following:

  void g(int);
  int f(volatile long long *p) {
    bool b = __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST) < 0;
    g(b ? 12 : 34);
    return b ? 56 : 78;
  }

into

  pushq   %rax
  lock            incq    (%rdi)
  movl    $12, %eax
  movl    $34, %edi
  cmovlel %eax, %edi
  callq   g(int)
  testq   %rax, %rax   <---- Bad.
  movl    $56, %ecx
  movl    $78, %eax
  cmovsl  %ecx, %eax
  popq    %rcx
  retq

because the code failed to take into account that the cmp has multiple
uses, replaced one of them, and left the other one comparing garbage.

llvm-svn: 291630
2017-01-11 00:49:54 +00:00
Quentin Colombet 0b63b31b2e [RegBankSelect] Improve the output of the debug messages.
Add more information about mapping cost and chosen solution.

llvm-svn: 291629
2017-01-11 00:48:41 +00:00
Zachary Turner a9054ddd9c [CodeView/PDB] Rename a bunch of files.
We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage.  This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.

llvm-svn: 291627
2017-01-11 00:35:43 +00:00
Zachary Turner c640b76db5 [CodeView] Add TypeDatabase class.
This creates a centralized class in which to store type records.
It stores types as an array of entries, which matches the
notion of a type stream being a topologically sorted DAG.
Logic to build up such a database was already being used in
CVTypeDumper, so CVTypeDumper is now updated to to read from
a TypeDatabase which is filled out by an earlier visitor in
the pipeline.

Differential Revision: https://reviews.llvm.org/D28486

llvm-svn: 291626
2017-01-11 00:35:08 +00:00
Matt Arsenault 1e0edbf03c InstSimplify: Eliminate fabs on known positive
llvm-svn: 291624
2017-01-11 00:33:24 +00:00
Jan Vesely 0d6cb1caaf AMDGPU/EG,CM: Add fp16 conversion instructions
Differential Revision: https://reviews.llvm.org/D28164

llvm-svn: 291622
2017-01-11 00:12:39 +00:00
Rong Xu acd6360251 Revert "[PGO] Turn off comdat renaming in IR PGO by default"
This patch reverts r291588: [PGO] Turn off comdat renaming in IR PGO by default,
as we are seeing some hash mismatches in our internal tests.

llvm-svn: 291621
2017-01-10 23:54:31 +00:00
Sanjay Patel db0938fd9a [InstCombine] add a wrapper for a common pair of transforms; NFCI
Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.

llvm-svn: 291620
2017-01-10 23:49:07 +00:00
Florian Hahn 4f9d6d56c0 [loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime.
Summary:
This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with
EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the
correct sub-loops in LoopUnrollRuntime.cpp.


Reviewers: dexonsmith, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28482

llvm-svn: 291619
2017-01-10 23:43:35 +00:00
Justin Lebar 7d81813d76 [TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.
Summary:
Previously if you had

 * a function with the fast-math-enabled attr, followed by
 * a function without the fast-math attr,

the second function would inherit the first function's fast-math-ness.

This means that mixing fast-math and non-fast-math functions in a module
was completely broken unless you explicitly annotated every
non-fast-math function with "unsafe-fp-math"="false".  This appears to
have been broken since r176986 (March 2013), when the resetTargetOptions
function was introduced.

This patch tests the correct behavior as best we can.  I don't think I
can test FPDenormalMode and NoTrappingFPMath, because they aren't used
in any backends during function lowering.  Surprisingly, I also can't
find any uses at all of LessPreciseFPMAD affecting generated code.

The NVPTX/fast-math.ll test changes are an expected result of fixing
this bug.  When FMA is disabled, we emit add as "add.rn.f32", which
prevents fma combining.  Before this patch, fast-math was enabled in all
functions following the one which explicitly enabled it on itself, so we
were emitting plain "add.f32" where we should have generated
"add.rn.f32".

Reviewers: mkuper

Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28507

llvm-svn: 291618
2017-01-10 23:43:04 +00:00
Evandro Menezes 330e1b8945 [AArch64] Consider all vector types for FeatureSlowMisaligned128Store
The original code considered only v2i64 as slow for this feature. This patch
consider all 128-bit long vector types as slow candidates.

In internal tests, extending this feature to all 128-bit vector types
resulted in an overall improvement of 1% on Exynos M1.

Differential revision: https://reviews.llvm.org/D27998

llvm-svn: 291616
2017-01-10 23:42:21 +00:00
Matt Arsenault 51818c14b3 AMDGPU: Constant fold when immediate is materialized
In future commits these patterns will appear after moveToVALU changes.

llvm-svn: 291615
2017-01-10 23:32:04 +00:00
Florian Hahn fdea2e420c [loop-unroll] Factor out code to update LoopInfo (NFC).
Move the code to update LoopInfo for cloned basic blocks to
addClonedBlockToLoopInfo, as suggested in 
https://reviews.llvm.org/D28482.

llvm-svn: 291614
2017-01-10 23:24:54 +00:00
Reid Kleckner 443423e38a Move the section name from GlobalObject to the LLVMContext
Summary:
Convention wisdom says that bytes in Function are precious, and the
vast, vast majority of globals do not live in special sections. Even
when they do, they tend to live in the same section. Store the section
name on the LLVMContext in a StringSet, and maintain a map from
GlobalObject* to section name like we do for metadata, prefix data, etc.

The fact that we've survived this long wasting at least three pointers
of space in Function suggests that Function bytes are perhaps not as
precious as we once thought. Given that most functions have metadata
attachments when debug info is enabled, we might consider adding a
pointer here to make that access more efficient.

Reviewers: jlebar, dexonsmith, mehdi_amini

Subscribers: mehdi_amini, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D28150

llvm-svn: 291613
2017-01-10 23:23:58 +00:00
Matt Arsenault 3f509042b0 InstCombine: Set operands instead of creating new call
llvm-svn: 291612
2017-01-10 23:17:52 +00:00
Matt Arsenault fdb78f8bae InstCombine: fdiv -x, -y -> fdiv x, y
llvm-svn: 291611
2017-01-10 23:08:54 +00:00
Kyle Butt df27aa8c89 CodeGen: Allow small copyable blocks to "break" the CFG.
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.

Differential revision: https://reviews.llvm.org/D27742

llvm-svn: 291609
2017-01-10 23:04:30 +00:00
Matt Arsenault def496c04b Remove unused CONVERT_RNDSAT intrinsics
llvm-svn: 291607
2017-01-10 22:38:02 +00:00
Matt Arsenault 0b382a7cb8 DAG: Avoid OOB when legalizing vector indexing
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.

llvm-svn: 291604
2017-01-10 22:02:30 +00:00
Derek Schuff 7acb42a41a [WebAssembly] Only RAUW a constant once in FixFunctionBitcasts
When we collect 2 uses of a function in FindUses and then RAUW when we
visit the first, we end up visiting the wrapper (because the second was
RAUW'd).  We still want to use RAUW instead of just Use->set() because
it has special handling for Constants, so this patch just ensures that
only one use of each constant is added to the work list.

Differential Revision: https://reviews.llvm.org/D28504

llvm-svn: 291603
2017-01-10 21:59:53 +00:00
Victor Leschuk cbddae74f5 DebugInfo: support for DW_FORM_implicit_const
Support for DW_FORM_implicit_const DWARFv5 feature.
When this form is used attribute value goes to .debug_abbrev section (as SLEB).
As this form would break any debug tool which doesn't support DWARFv5
it is guarded by dwarf version check. Attempt to use this form with
dwarf version <= 4 is considered a fatal error.

Differential Revision: https://reviews.llvm.org/D28456

llvm-svn: 291599
2017-01-10 21:18:26 +00:00
Michael Kuperstein ee31cbe35f [LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation,
which can actually happen.

The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.

This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).

Differential Revision: https://reviews.llvm.org/D28490

llvm-svn: 291589
2017-01-10 19:32:30 +00:00
Rong Xu ef1adad938 [PGO] Turn off comdat renaming in IR PGO by default
Summary:
In IR PGO we append the function hash to comdat functions to avoid the
potential hash mismatch. This turns out not legal in some cases: if the comdat
function is address-taken and used in comparison. Renaming changes the semantic.

This patch turns off comdat renaming by default.

To alleviate the hash mismatch issue, we now rename the profile variable
for comdat functions. Profile allows co-existing multiple versions of profiles
with different hash value. The inlined copy will always has the correct profile
counter. The out-of-line copy might not have the correct count. But we will
not have the bogus mismatch warning.

Reviewers: davidxl

Subscribers: llvm-commits, xur

Differential Revision: https://reviews.llvm.org/D28416

llvm-svn: 291588
2017-01-10 19:30:20 +00:00
Chad Rosier d0114fc1dd [ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse.
Testing already covered by CodeGen/ARM/rbit.ll

llvm-svn: 291587
2017-01-10 19:23:51 +00:00
Matt Arsenault 8871683d60 AMDGPU: Add tests for HasMultipleConditionRegisters
This was enabled without many specific tests or the comment.

llvm-svn: 291586
2017-01-10 19:08:15 +00:00
Michael Zuckerman bcd03e7f3b [X86][AVX512]Improving shuffle lowering by using AVX-512 EXPAND* instructions
This patch fix PR31351: https://llvm.org/bugs/show_bug.cgi?id=31351

1.  This patch adds new type of shuffle lowering
2.  We can use the expand instruction, When the shuffle pattern is as following:
    { 0*a[0]0*a[1]...0*a[n] , n >=0 where a[] elements in a ascending order}.

Reviewers: 1. igorb  
           2. guyblank  
           3. craig.topper  
           4. RKSimon 

Differential Revision: https://reviews.llvm.org/D28352

llvm-svn: 291584
2017-01-10 18:57:17 +00:00
Davide Italiano f8711f093e [SimplifyLibCalls] Propagate fast math flags while optimizing pow().
llvm-svn: 291577
2017-01-10 18:02:05 +00:00
Chad Rosier 3daffbf6a8 [AArch64] Add support for lowering bitreverse to the rbit instruction.
Differential Revision: https://reviews.llvm.org/D28379

llvm-svn: 291575
2017-01-10 17:20:33 +00:00
Simon Dardis 548a53f5ee [mips] Fix Mips MSA instrinsics
The usage of some MIPS MSA instrinsics that took immediates could crash LLVM
during lowering. This patch addresses that behaviour. Crucially this patch
also makes the use of intrinsics with out of range immediates as producing an
internal error.

The ld,st instrinsics would trigger an assertion failure for MIPS64 as their
lowering would attempt to add an i32 offset to a i64 pointer.

Reviewers: vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D25438

llvm-svn: 291571
2017-01-10 16:40:57 +00:00
Simon Dardis 0e9e237310 [mips] Honour -mno-odd-spreg for vector splat (again)
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.

Logical follow-on from r230235.

This fixes PR/31369.

A previous commit was missing the test case and had another differential
in it.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D28373

llvm-svn: 291566
2017-01-10 15:53:10 +00:00
Simon Dardis f790ff3da0 Revert "[mips] Honour -mno-odd-spreg for vector splat"
This reverts commit r291556. It was a mixture of two differentials and
was missing a test.

llvm-svn: 291562
2017-01-10 13:57:44 +00:00
Eugene Leviant 8e32aebe80 RuntimeDyldELF: implement R_AARCH64_PREL64 reloc
Differential revision: https://reviews.llvm.org/D28122

llvm-svn: 291558
2017-01-10 11:05:30 +00:00
Simon Dardis f4041a2714 [mips] Honour -mno-odd-spreg for vector splat
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.

Logical follow-on from r230235.

This fixes PR/31369.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D28373

llvm-svn: 291556
2017-01-10 10:28:37 +00:00
Craig Topper 588c734b0f [DAGCombiner] Merge together duplicate checks for folding fold (select C, 1, X) -> (or C, X) and folding (select C, X, 0) -> (and C, X). Also be consistent about checking that both the condition and the result type are i1. NFC
I guess previously we just assumed if the result type was i1, then the condition type must also be i1?

llvm-svn: 291548
2017-01-10 07:42:57 +00:00
Chris Bieneman 1b7200d2cf [ObjectYAML] Support for DWARF line tables
One more try... relanding r291541 with a fix to properly gate MaxOpsPerInst on DWARF version.

Description from r291541:

This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.

Original description:

This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.

llvm-svn: 291546
2017-01-10 06:22:49 +00:00
Craig Topper d55b83128b AMD family 17h (znver1) enablement
Summary:
This patch enables the following
1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
2. ISAs that are enabled for "znver1" architecture.
3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
4. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
5. For the time being, it uses the btver2 scheduler model.
6. Test file is updated to check this flag.

This item is linked to clang review item https://reviews.llvm.org/D28018

Patch by Ganesh Gopalasubramanian

Reviewers: RKSimon, craig.topper

Subscribers: vprasad, RKSimon, ashutosh.nema, llvm-commits

Differential Revision: https://reviews.llvm.org/D28017

llvm-svn: 291543
2017-01-10 06:01:16 +00:00
Chris Bieneman e6663d376e Revert "[ObjectYAML] Support for DWARF line tables"
This reverts commit r291541.

Still failing on a bot:

http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/47224/steps/test_llvm/logs/stdio

llvm-svn: 291542
2017-01-10 05:31:23 +00:00
Chris Bieneman 07ab0aa5d6 [ObjectYAML] Support for DWARF line tables
This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.

Original description:

This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.

llvm-svn: 291541
2017-01-10 05:25:24 +00:00
Craig Topper 2ed461e5c4 [X86] When lowering uniform shifts, use X86ISD::VZEXT instead of using a ZERO_EXTEND_VECTOR_INREG. If we emit the ZERO_EXTEND_VECTOR_INREG too late it doesn't get lowered properly and makes it through to isel and fails.
Fixes PR31593.

llvm-svn: 291535
2017-01-10 04:12:24 +00:00
Craig Topper d915d6ba57 [DAGCombiner] Remove code for optimizing select (xor Cond, 0), X, Y -> select Cond, X, Y. Just let combine on the xor itself take care of it.
llvm-svn: 291534
2017-01-10 04:12:19 +00:00
Xin Tong 02b1397ac3 Fix a typo and also test a new machine for commit. NFC.
llvm-svn: 291532
2017-01-10 03:13:52 +00:00
Serge Pavlov 0668cd2c95 [StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info
remains unchanges, it causes crash when expensive checks are enabled.
To cope with this problem a new method was added to DominatorTreeBase
that allows adding new root nodes, it is called in StructurizeCfg to
put dominator tree in sync.

This change fixes PR27488.

Differential Revision: https://reviews.llvm.org/D28114

llvm-svn: 291530
2017-01-10 02:50:47 +00:00
Evandro Menezes 063259d4d1 [CodeGen] Implement the SUnit::print() method
This method seems to have had a troubled life.  This patch proposes that it
replaces the recently added helper function dumpSUIdentifier.  This way, the
method can be used in other files using the SUnit class.

Differential revision: https://reviews.llvm.org/D28488

llvm-svn: 291520
2017-01-10 01:08:01 +00:00
Mehdi Amini c92b612636 [ThinLTO] Hash more part of the config to build cache entries
This has been fixed in the "new" LTO API used by Gold/LLD, this is
fixing the same issue in the libLTO API used by ld64 (amongst other)

llvm-svn: 291518
2017-01-10 00:55:47 +00:00
Xin Tong 12c8cb3745 Add an assert for hasLoopInvariantOperands
Summary: Add an assert for hasLoopInvariantOperands

Reviewers: danielcdh, sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28501

llvm-svn: 291516
2017-01-10 00:39:49 +00:00
Dan Gohman 6055fbae62 [WebAssembly] Add return type annotations in fast isel.
llvm-svn: 291498
2017-01-09 23:09:38 +00:00
Rui Ueyama e9d17545bc TarWriter: Fix a bug in Ustar header.
If we split a filename into `Name` and `Prefix`, `Prefix` is at most
145 bytes. We had a bug that didn't split a path correctly. This bug
was pointed out by Rafael in the post commit review.

This patch adds a unit test for TarWriter to verify the fix.

llvm-svn: 291494
2017-01-09 22:55:00 +00:00
Eugene Zelenko c9f1f6b8ec [NVPTX] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291490
2017-01-09 22:16:51 +00:00
Easwaran Raman e08b139d7d Refactor inline threshold update code.
Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code.

NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info.

Differential revision: https://reviews.llvm.org/D28369

llvm-svn: 291487
2017-01-09 21:56:26 +00:00
Davide Italiano 472684eaf5 [SimplifyLibCalls] pow(x, -0.5) -> 1.0 / sqrt(x).
Differential Revision:  https://reviews.llvm.org/D28479

llvm-svn: 291486
2017-01-09 21:55:23 +00:00
Rafael Espindola d4b24eda73 Support outputting to /dev/null.
When writing to a non regular file we cannot rename to it. Since we
have to write, we may as well create a temporary file to avoid trying
to create an unique file in /dev when trying to write to /dev/null.

llvm-svn: 291485
2017-01-09 21:52:35 +00:00
Matthias Braun ba7d95d425 PeepholeOptimizer: Do not replace SubregToReg(bitcast like)
While we can usually replace bitcast like instructions
(MachineInstr::isBitcast()) with a COPY this is not legal if any of the
users uses SUBREG_TO_REG to assert the upper bits of the result are
zero.

Differential Revision: https://reviews.llvm.org/D28474

llvm-svn: 291483
2017-01-09 21:38:17 +00:00
Matthias Braun a37430844c MachineInstr: Print name for subreg index in SUBREG_TO_REG
SUBREG_TO_REG takes a subregister index as 3rd operand, print the name
instead of a number. We already do the same for INSERT_SUBREG and
REG_SEQUENCE.

llvm-svn: 291481
2017-01-09 21:38:10 +00:00
Rui Ueyama a84ab073d9 TarWriter: Set "00" to Ustar version field.
Most (maybe all?) tar commands can handle tar archives with blank
version fields, but POSIX requires "00" to be set to the field, so
doing it is good for compliance.

llvm-svn: 291479
2017-01-09 21:20:42 +00:00
Michael Kuperstein 1559e8863e Revert r291092 because it introduces a crash.
See PR31589 for details.

llvm-svn: 291478
2017-01-09 21:04:46 +00:00
Vyacheslav Klochkov d497d36083 X86-specific path: Implemented the fusing of MUL+ADDSUB to FMADDSUB.
Differential Revision: https://reviews.llvm.org/D28087

llvm-svn: 291473
2017-01-09 20:26:17 +00:00
Chris Bieneman e62e684fdd Revert "[ObjectYAML] Support for DWARF line tables"
This reverts commit r291470 due to failing bots:

http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/47209/steps/test_llvm/logs/stdio

llvm-svn: 291471
2017-01-09 20:04:55 +00:00
Chris Bieneman 0396f99184 [ObjectYAML] Support for DWARF line tables
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.

llvm-svn: 291470
2017-01-09 20:01:37 +00:00
Matthew Simpson cf796478e9 [LV] Fix-up external IV users after updating dominator tree
This patch delays the fix-up step for external induction variable users until
after the dominator tree has been properly updated. This should fix PR30742.
The SCEVExpander in InductionDescriptor::transform can generate code in the
wrong location if the dominator tree is not up-to-date. We should work towards
keeping the dominator tree up-to-date throughout the transformation.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30742
Differential Revision: https://reviews.llvm.org/D28168

llvm-svn: 291462
2017-01-09 19:05:29 +00:00
Matt Arsenault 6dca542b4a AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.

Fixes code quality regressions vs. SI in min.ll test.

llvm-svn: 291461
2017-01-09 18:52:39 +00:00
Matt Arsenault 5f45e7890a Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")
llvm-svn: 291460
2017-01-09 18:44:11 +00:00
Xin Tong c13a8e84d1 Intrinsic::Bitreverse is safe to speculate
Summary: Intrinsic::Bitreverse is safe to speculate

Reviewers: hfinkel, mkuper, arsenm, jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28471

llvm-svn: 291456
2017-01-09 17:57:08 +00:00
Sumanth Gundapaneni 0ac1ce70ba In the below scenario, we must be able to skip the a DBG_VALUE instruction and
remove the dead store.

%vreg0<def> = L2_loadri_io <fi#15>, 0; mem:LD4[%dataF](align=4)
DBG_VALUE %vreg0, %noreg, !"dataF", <!184>; IntRegs:%vreg0 
S2_storeri_io <fi#15>, 0, %vreg0; mem:ST4[%dataF]

In reality, this kind of stores are eliminated before Stack Slot Coloring pass,
possibly in instruction lowering

Differential Revision: https://reviews.llvm.org/D26616

llvm-svn: 291455
2017-01-09 17:45:02 +00:00
Simon Pilgrim 0f23b2ba1a [X86][AVX512] Enable v16i8/v32i8 vector shifts to use an extend+shift+truncate pattern.
Use the existing AVX2 v8i16 vector shift lowering for v16i8 (extending to v16i32) on AVX512 targets and v32i8 (extending to v32i16) on AVX512BW targets.

Cost model updates to follow.

llvm-svn: 291451
2017-01-09 17:20:03 +00:00
Sanjay Patel 940c06188e fix comment typos; NFC
llvm-svn: 291447
2017-01-09 16:27:56 +00:00
Simon Pilgrim d990cd371b [X86][AVX512DQ] Enable v16i16 vector shifts to use an extend+shift+truncate pattern.
Use the existing AVX2 v8i16 vector shift lowering for v16i16 on AVX512 targets (AVX512BW will have already have lowered with vpsravw).

Cost model updates to follow.

llvm-svn: 291445
2017-01-09 15:15:45 +00:00
Amaury Sechet 7d6285fb1c Some formatting in TargetMachineC. NFC
llvm-svn: 291442
2017-01-09 13:54:51 +00:00
Bjorn Pettersson b14afd452d [SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486.
Summary:
Originally

 i64 = umax t8, Constant:i64<4>

was expanded into

 i32,i32 = umax Constant:i32<0>, Constant:i32<0>
 i32,i32 = umax t7, Constant:i32<4>

Now instead the two produced umax:es return i32 instead of i32, i32.

Thanks to Jan Vesely for help with the test case.

Patch by mikael.holmen at ericsson.com

Reviewers: bogner, jvesely, tstellarAMD, arsenm

Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D28135

llvm-svn: 291441
2017-01-09 12:03:50 +00:00
Pavel Labath d338381077 Fix MSVC build failure introduced in r291431
MSVC does not like to reinterpret_cast to a uint64_t. Use a different cast
instead.

llvm-svn: 291435
2017-01-09 11:20:35 +00:00
Eugene Leviant be2d68f774 RuntimeDyldELF: don't create thunk if not needed
This patch doesn't create thunk for branch operation when following conditions are met:
- Architecture is AArch64
- Relocation target is in the same object file
- Relocation target is close enough to be encoded in immediate offset

In such case we branch directly to the target instead of branching to thunk

Differential revision: https://reviews.llvm.org/D28108

llvm-svn: 291431
2017-01-09 09:56:31 +00:00
Chandler Carruth 082c183f06 [PM] Teach SCEV to invalidate itself when its dependencies become
invalid.

This fixes use-after-free bugs that will arise with any interesting use
of SCEV.

I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.

llvm-svn: 291426
2017-01-09 07:44:34 +00:00
Dan Gohman 2aae5dc713 [WebAssembly] Fix the opcode values for i64.eq and i64.ne.
llvm-svn: 291424
2017-01-09 06:21:28 +00:00
Jonas Paulsson cf7543c44b Remove unused method in LoopVectorize.cpp.
computeInterleaveCount() is not defined/used and is therefore removed.

Review: Davide Italiano
llvm-svn: 291423
2017-01-09 06:13:21 +00:00
Daniel Berlin b755aea8eb NewGVN: Fix PR 31573, a failure to verify memory congruency due to
not excluding ourselves when checking if any equivalent stores
exist.

llvm-svn: 291421
2017-01-09 05:34:29 +00:00
Daniel Berlin 2f1fbcc718 NewGVN: Change a std::vector to SmallVector and cleanup naming.
llvm-svn: 291420
2017-01-09 05:34:19 +00:00
Craig Topper 96ab6fd2eb [AVX-512] Change another pattern that was using BLENDM to use masked moves. A future patch will conver it back to BLENDM if its beneficial to register allocation.
llvm-svn: 291419
2017-01-09 04:19:34 +00:00
Craig Topper 6393afce97 [AVX-512] Add patterns to use a zero masked VPTERNLOG instruction for vselects of all ones and all zeros.
Previously we emitted a VPTERNLOG and a separate masked move.

llvm-svn: 291415
2017-01-09 02:44:34 +00:00
Rui Ueyama 3e6490399e Define sys::path::convert_to_slash
This patch moves convertToUnixPathSeparator from LLD to LLVM.

Differential Revision: https://reviews.llvm.org/D28444

llvm-svn: 291414
2017-01-09 01:47:15 +00:00
Mehdi Amini e4c7f12274 CommandLine option: Relax the assertion introduced in r290467 to allows for empty string
This is used in LDC for custom boolean commandline options, setArgStr
is called with an empty string before using AddLiteralOption.

llvm-svn: 291406
2017-01-08 22:30:43 +00:00
Piotr Padlewski 09ad678bc4 [MemDep] NFC walk invariant.group graph only down
Summary:
By using stripPointerCasts we can get to the root
value and then walk down the bitcast graph

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28181

llvm-svn: 291405
2017-01-08 22:26:06 +00:00
Davide Italiano 362cc7b0fd [LCSSA] Fix some typos. NFCI.
llvm-svn: 291404
2017-01-08 22:22:09 +00:00
Craig Topper f51ba1e3da [AVX-512] If avx512dq is available use vpmovm2d/vpmovm2q instead of vselect of zeroes/ones when handling sign extends of i1 without VLX.
llvm-svn: 291402
2017-01-08 21:32:30 +00:00
Davide Italiano 1a12522e87 [SCCP] Unknown instructions are sent to overdefined anyway. NFCI.
llvm-svn: 291400
2017-01-08 21:19:05 +00:00
Saleem Abdulrasool 1d84d9ac48 llvm-objdump: speed up -objc-meta-data
Running a Debug build of objdump -objc-meta-data with a large Mach-O file is
currently unnecessarily slow.

With some local test input, this change reduces the run time from 75-85s down
to 15-20s.

The two changes are:
  Assert on pointer equality not array equality
  Replace vector<pair<address, symbol>> with DenseMap<address, symbol>

Additionally, use a std::unique_ptr rather than handling the memory manually.

Patch by Dave Lee!

llvm-svn: 291398
2017-01-08 19:14:15 +00:00
Simon Pilgrim e6d948b857 Strip trailing whitespace.
llvm-svn: 291395
2017-01-08 18:37:42 +00:00
Simon Pilgrim b2a80950fe Fix line endings and strip trailing whitespace.
llvm-svn: 291393
2017-01-08 16:45:39 +00:00
Sanjay Patel bf51c8a975 [x86] fix usage of stale operands when lowering select
I noticed this problem as part of the ongoing attempt to canonicalize min/max ops in IR.

The debug output shows nodes like this:

t4: i32 = xor t2, Constant:i32<-1>
    t21: i8 = setcc t4, Constant:i32<0>, setlt:ch
  t14: i32 = select t21, t4, Constant:i32<-1>

And because the select is holding onto the t4 (xor) node while EmitTest creates a new 
x86-specific xor node, the lowering results in:

  t4: i32 = xor t2, Constant:i32<-1>
  t25: i32,i32 = X86ISD::XOR t2, Constant:i32<-1>
t28: i32,glue = X86ISD::CMOV Constant:i32<-1>, t4, Constant:i8<15>, t25:1

Differential Revision: https://reviews.llvm.org/D28374

llvm-svn: 291392
2017-01-08 15:53:40 +00:00
Simon Pilgrim 9c58950eeb [CostModel][X86] Fixed vXi8 uniform shift costs.
The 'fast' costs should only work for shifts by uniform constants (uniform non-constant are lowered using the slow default implementation).

Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled.

Added missing AVX2/AVX512BW costs as well.

llvm-svn: 291391
2017-01-08 14:14:36 +00:00
Simon Pilgrim 1fa5487c05 [CostModel][X86] Moved legal uniform shift costs earlier.
XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts.

llvm-svn: 291390
2017-01-08 13:12:03 +00:00
Craig Topper 5c46c7526e [AVX-512] Remove redundant patterns that select unaligned moves with zero masking for patterns that already use the aligned form. NFC
llvm-svn: 291383
2017-01-08 05:46:21 +00:00
Mehdi Amini 7b0d145768 [ThinLTO] Fix lazy-loading of Metadata attachment, which left some Fwd ref behind
The change in r291362 was too agressive. We still need to flush at the
end of the block because function local metadata can introduce fwd
ref as well.
(Bootstrap with ThinLTO was broken)

llvm-svn: 291379
2017-01-08 00:44:45 +00:00
Mehdi Amini 83a807ebae [ThinLTO] Expected<> return values need to be handled to avoid an assertion
llvm-svn: 291377
2017-01-08 00:30:27 +00:00
Dylan McKay 8fa6d8db9c [AVR] Implement TargetLoweing::getRegisterByName
This allows the use of the 'read_register' intrinsics used by clang's
named register globals features.

llvm-svn: 291375
2017-01-07 23:39:47 +00:00
Simon Pilgrim 9681c407b4 [CostModel][X86] Update SSE41/AVX1 vXi32 SHL costs
SSE41 provides pmulld which allows the simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq.

llvm-svn: 291372
2017-01-07 22:27:43 +00:00
Craig Topper a74e3088df [AVX-512] Remove patterns from the other VBLENDM instructions. They are all redundant with masked move instructions.
We should probably teach the two address instruction pass to turn masked moves into BLENDM when its beneficial to the register allocator.

llvm-svn: 291371
2017-01-07 22:20:34 +00:00
Craig Topper 81f20aa336 [AVX-512] Remove patterns from masked broadcast versions of BLENDM instructions.
All but (v2f64 broadcast f64) are handled with VBROADCAST instructions. The v2f64 version can be handled with VMOVDDUP.

We may want to consider converting to BLENDM instructions in the two address instruction pass if its beneficial to register allocation.

llvm-svn: 291369
2017-01-07 22:20:26 +00:00