Commit Graph

61179 Commits

Author SHA1 Message Date
Simon Pilgrim 9f04d97cd7 [X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions
We already perform horizontal add/sub if we extract from elements 0 and 1, this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector).

Differential Revision: https://reviews.llvm.org/D61263

llvm-svn: 359707
2019-05-01 17:13:35 +00:00
Stanislav Mekhanoshin 3b7925f035 [AMDGPU] gfx1010 GCNRegBankReassign pass
Reassign registers to reduce register bank conflicts.

Differential Revision: https://reviews.llvm.org/D61344

llvm-svn: 359704
2019-05-01 16:49:31 +00:00
Stanislav Mekhanoshin c29d491596 [AMDGPU] gfx1010 GCNNSAReassign pass
Convert NSA into non-NSA images.

Differential Revision: https://reviews.llvm.org/D61341

llvm-svn: 359700
2019-05-01 16:40:49 +00:00
Stanislav Mekhanoshin 692560dc98 [AMDGPU] gfx1010 MIMG implementation
Differential Revision: https://reviews.llvm.org/D61339

llvm-svn: 359698
2019-05-01 16:32:58 +00:00
Teresa Johnson b3203ec078 [ThinLTO] Fix unreachable code when parsing summary entries.
Summary:
Early returns were causing some code to be skipped. This was missed
since the summary entries are typically at the end of the llvm assembly
file.

Fixes PR41663.

Reviewers: RKSimon, wristow

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61355

llvm-svn: 359697
2019-05-01 16:26:59 +00:00
Stanislav Mekhanoshin a224f68a10 [AMDGPU] gfx1010 DS implementation
Differential Revision: https://reviews.llvm.org/D61332

llvm-svn: 359696
2019-05-01 16:11:11 +00:00
Sanjay Patel 64d5751254 Revert "[DAGCombiner] try repeated fdiv divisor transform before building estimate"
This reverts commit fb9a5307a9 (rL359398)
because it can cause an infinite loop due to opposing combines.

llvm-svn: 359695
2019-05-01 16:06:21 +00:00
Hubert Tong 02d055a269 [tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples
Summary:
Triple components in `XFAIL` lines are tested against the target triple.
Various tests that are expected to fail on big-endian hosts are marked
as being `XFAIL` for big-endian targets. This patch corrects these tests
by having them test against a new `host-byteorder-big-endian` feature.

Reviewers: xingxue, sfertile, jasonliu

Reviewed By: xingxue

Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60551

llvm-svn: 359689
2019-05-01 15:36:18 +00:00
Fangrui Song 9caa6b5b64 [llvm-ar][llvm-nm][llvm-size] Change -long-option to --long-option in tests. NFC
llvm-svn: 359688
2019-05-01 15:31:15 +00:00
Simon Pilgrim 6711b9699a [X86][SSE] Add demanded elts support X86ISD::PMULDQ\PMULUDQ
Add to SimplifyDemandedVectorEltsForTargetNode and SimplifyDemandedBitsForTargetNode

llvm-svn: 359686
2019-05-01 14:50:50 +00:00
Simon Pilgrim 3d6899e369 [X86][SSE] Add SSE vector shift support to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359680
2019-05-01 13:51:09 +00:00
Simon Pilgrim ba372c6e62 [X86][SSE] Split 512-bit -> 128-bit vector directly in SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 359678
2019-05-01 12:48:42 +00:00
Simon Pilgrim 951a6b4579 [X86][SSE] Add 512-bit vector support to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359677
2019-05-01 12:37:41 +00:00
Simon Pilgrim 37c2419cc7 [X86][SSE] Add X86ISD::PACKSS\PACKUS to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359673
2019-05-01 11:29:36 +00:00
Simon Pilgrim 7244437050 [X86][SSE] Add scalar horizontal add/sub tests for element extractions from upper lanes
As suggested on D61263

llvm-svn: 359671
2019-05-01 11:17:11 +00:00
Simon Pilgrim 3353cee06c [X86][SSE] Add X86ISD::UNPCKL\UNPCK to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359670
2019-05-01 11:08:03 +00:00
Simon Pilgrim f7b978a71b [X86][SSE] Move extract_subvector(pshufb) fold to SimplifyDemandedVectorEltsForTargetNode
This lets us hit more cases than combineExtractSubvector and allows us reuse more code.

llvm-svn: 359669
2019-05-01 10:58:38 +00:00
Fangrui Song 5387c2cd17 [llvm-objdump] Print newlines before and after "Disassembly of section ...:"
This improves readability and the behavior is consistent with GNU objdump.

The new test test/tools/llvm-objdump/X86/disassemble-section-name.s
checks we print newlines before and after "Disassembly of section ...:"

Differential Revision: https://reviews.llvm.org/D61127

llvm-svn: 359668
2019-05-01 10:40:48 +00:00
Simon Pilgrim 99eefe94b5 [X86][SSE] Extract i1 elements from vXi1 bool vectors
This is an alternative to D59669 which more aggressively extracts i1 elements from vXi1 bool vectors using a MOVMSK.

Differential Revision: https://reviews.llvm.org/D61189

llvm-svn: 359666
2019-05-01 10:02:22 +00:00
George Rimar f5345a3f4c [yaml2obj] - Report when unknown section is referenced from program header declaration block.
Previously we did not report this.
Also this removes multiple lookups in the map
what cleanups the code.

Differential revision: https://reviews.llvm.org/D61322

llvm-svn: 359663
2019-05-01 09:45:55 +00:00
Fangrui Song 6afcdcf9ab [llvm-readobj] Change -t to --symbols in tests. NFC
-t is --symbols in llvm-readobj but --section-details (unimplemented) in readelf.
The confusing option should not be used since we aim for improving
compatibility.

Keep just one llvm-readobj -t use case in test/tools/llvm-readobj/symbols.test

llvm-svn: 359661
2019-05-01 09:28:24 +00:00
Fangrui Song 085bbe204c [gold] Fix two readelf tests after rL359649
llvm-svn: 359660
2019-05-01 09:01:10 +00:00
Fangrui Song 26676c82e8 Fix test/tools/llvm-readobj/mips-plt.test
llvm-svn: 359657
2019-05-01 06:46:34 +00:00
Fangrui Song 97c17e83f8 [llvm-readobj] llvm-readobj --elf-output-style=GNU => llvm-readelf. NFC
The latter is much more common.

A dedicated --elf-output-style=GNU test demonstrating it is the same as
llvm-readelf is sufficient.

llvm-svn: 359652
2019-05-01 05:55:22 +00:00
Fangrui Song e29e30b139 [llvm-readobj] Change -long-option to --long-option in tests. NFC
We use both -long-option and --long-option in tests. Switch to --long-option for consistency.

In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf.

While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf).

llvm-svn: 359649
2019-05-01 05:27:20 +00:00
David L. Jones fccb505f0f Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"
This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.

llvm-svn: 359648
2019-05-01 05:01:03 +00:00
Fangrui Song aa1f2c50a8 [llvm-objcopy] Simplify SHT_NOBITS -> SHT_PROGBITS promotion
GNU objcopy uses bfd_elf_get_default_section_type to decide the candidate section type,
which roughly translates to our [a] (I assume SEC_COMMON implies SHF_ALLOC):

  (!(Sec.Flags & ELF::SHF_ALLOC) || Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))

Then, it updates the section type in bfd/elf.c:elf_fake_sections if:

  if (this_hdr->sh_type == SHT_NULL)
    this_hdr->sh_type = sh_type; // common case
  else if (this_hdr->sh_type == SHT_NOBITS
           && sh_type == SHT_PROGBITS
           && (asect->flags & SEC_ALLOC) != 0)  // uncommon case
    ...
    this_hdr->sh_type = sh_type;

If the following condition is met the uncommon branch is executed:

  if (elf_section_type (osec) == SHT_NULL
      && (osec->flags == isec->flags
	  || (final_link
	      && ((osec->flags ^ isec->flags)
		  & ~(SEC_LINK_ONCE | SEC_LINK_DUPLICATES | SEC_RELOC)) == 0)))

I suggest we just ignore this clause and follow the common case
behavior, which is done in this patch. Rationales to do so:

If --set-section-flags is a no-op (osec->flags == isec->flags)
(corresponds to the "readonly" test in set-section-flags.test), GNU
objcopy will require (Sec.Flags & ELF::SHF_ALLOC). [a] is essentially:

  Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)

This special case is not really useful. Non-SHF_ALLOC SHT_NOBITS
sections do not make much sense and it doesn't matter if they are
SHT_NOBITS or SHT_PROGBITS.

For all other RUN lines in set-section-flags.test, the new behavior
matches GNU objcopy, i.e. this patch improves compatibility.

Differential Revision: https://reviews.llvm.org/D60189

llvm-svn: 359639
2019-05-01 00:39:31 +00:00
Philip Reames 84e54eb471 [InstCombine] Limit a vector demanded elts rule which was producing invalid IR.
The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design).  It turns out that the LangRef disallows such cases when indexing structs.  The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.

This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.

llvm-svn: 359633
2019-04-30 23:09:26 +00:00
Alina Sbirlea b468320313 [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: LLVM

Differential Revision: https://reviews.llvm.org/D61043

llvm-svn: 359627
2019-04-30 22:43:55 +00:00
Alina Sbirlea ba48a2c5e8 [AliasAnalysis/NewPassManager] Invalidate AAManager less often.
Summary:
This is a redo of D60914.

The objective is to not invalidate AAManager, which is stateless, unless
there is an explicit invalidate in one of the AAResults.

To achieve this, this patch adds an API to PAC, to check precisely this:
is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61284

llvm-svn: 359622
2019-04-30 22:15:47 +00:00
Stanislav Mekhanoshin a6322941ff [AMDGPU] gfx1010 VMEM and SMEM implementation
Differential Revision: https://reviews.llvm.org/D61330

llvm-svn: 359621
2019-04-30 22:08:23 +00:00
Alina Sbirlea 4e1ac95cf5 [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.
Summary:
Match NewPassManager behavior: add option for interleaved loops in the
old pass manager, and use that instead of the flag used to disable loop unroll.
No changes in the defaults.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61030

llvm-svn: 359615
2019-04-30 21:29:20 +00:00
Rong Xu 998b97f6f1 [llvm-profdata] Add overlap command to compute similarity b/w two profile files
Add overlap functionality to llvm-profdata tool to compute the similarity
between two profile files.

Differential Revision: https://reviews.llvm.org/D60977

llvm-svn: 359612
2019-04-30 21:19:12 +00:00
Simon Pilgrim 07ab4e7db8 [X86][SSE] Fold extract_subvector(extend(x)) -> extend_vector_inreg(x)
This adds any extend support - folding to zero_extend_vector_inreg (PMOVZX) for legality

Minor improvement for PR39709

llvm-svn: 359608
2019-04-30 20:31:07 +00:00
Dan Gohman 397ca2f22e [WebAssembly] Fix test after r359602
Update the expected output for this test now that the EXPLICIT_NAME
flag is being printed.

llvm-svn: 359605
2019-04-30 19:58:56 +00:00
Dan Gohman 3b5b9d0e72 [WebAssembly] Support EXPLICIT_NAME symbols in llvm-readobj
Teach llvm-readobj about WASM_SYMBOL_EXPLICIT_NAME.

Differential Revision: https://reviews.llvm.org/D61323

Reviewer: sbc100
llvm-svn: 359602
2019-04-30 19:30:24 +00:00
Dan Gohman 3a7532e645 [WebAssembly] Support f16 libcalls
Add support for f16 libcalls in WebAssembly. This entails adding signatures
for the remaining F16 libcalls, and renaming gnu_f2h_ieee/gnu_h2f_ieee to
truncsfhf2/extendhfsf2 for consistency between f32 and f64/f128 (compiler-rt
already supports this).

Differential Revision: https://reviews.llvm.org/D61287

Reviewer: dschuff
llvm-svn: 359600
2019-04-30 19:17:59 +00:00
Sanjay Patel 3ec1c51716 [AArch64] add more tests for constant folding failures; NFC
llvm-svn: 359592
2019-04-30 18:15:18 +00:00
Craig Topper 3958719dda [X86] If PreprocessISelDAG reorders a load before a call, make sure we remove dead nodes from the graph
The reordering can leave at least a dead TokenFactor in the graph. This cause the linearize scheduler to fail with something like the assert seen in PR22614. This is only one of many ways we can break the linearize scheduler today so I can't say for sure that any of the other failures in that bug were caused by this issue.

This takes the heavy hammer approach of just running RemoveDeadNodes unconditionally at the end of the PreprocessISelDAG. If this turns out to be a compile time hit, we can try to refine it.

Differential Revision: https://reviews.llvm.org/D61164

llvm-svn: 359582
2019-04-30 17:56:47 +00:00
Craig Topper 965d1306ae [X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation from other LEA optimizations.
This removes some of the class variables. Merge basic block processing into
runOnMachineFunction to keep the flags local.

Pass MachineBasicBlock around instead of an iterator. We can get the iterator in
the few places that need it. Allows a range-based outer for loop.

Separate the Atom optimization from the rest of the optimizations. This allows
fixupIncDec to create INC/DEC and still allow Atom to turn it back into LEA
when profitable by its heuristics.

I'd like to improve fixupIncDec to turn LEAs into ADD any time the base or index
register is equal to the destination register. This is profitable regardless of
the various slow flags. But again we would want Atom to be able to undo that.

Differential Revision: https://reviews.llvm.org/D60993

llvm-svn: 359581
2019-04-30 17:56:28 +00:00
Jordan Rupprecht 96bbb1dc2b [llvm-objcopy] Add RISC-V support for -B/-O
Reviewers: jorgbrown, espindola, alexshap, jhenderson

Subscribers: emaste, arichardson, fedor.sergeev, jakehehrlich, kito-cheng, shiva0217, MaskRay, rogfer01, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61272

llvm-svn: 359568
2019-04-30 15:21:36 +00:00
Sanjay Patel 0387bf5269 [SelectionDAG] remove div-by-zero constant folding restriction
We don't have this restriction in IR, so it should not be here
either simply out of consistency. Code that wants to handle FP
exceptions is expected to use the 'strict' variants of these
nodes.

We don't get the frem case because frem by 0.0 produces NaN (invalid),
and that's the remaining check here (so the removed check for frem
was dead code AFAIK).

This is the only place in SDAG that uses "HasFPExceptions", so I
think we should remove that entirely as a follow-up patch.

llvm-svn: 359566
2019-04-30 14:37:15 +00:00
Eugene Leviant fd0831d0f5 [llvm-nm] Add --special-syms no-op flag
Differential revision: https://reviews.llvm.org/D60502

llvm-svn: 359563
2019-04-30 13:51:48 +00:00
Sanjay Patel c16fd75e44 [AArch64] add tests for fdiv/frem constant folding (PR41668); NFC
llvm-svn: 359561
2019-04-30 13:43:17 +00:00
Simon Pilgrim f5e8f222d6 Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61043
........
This was causing windows build bot failures

llvm-svn: 359555
2019-04-30 12:34:21 +00:00
Simon Pilgrim 83098d28a1 [SLP] Lit test that cannot get vectorized due to lack of look-ahead operand reordering heuristic.
The code in this test is not vectorized by SLP because its operand reordering cannot look beyond the immediate predecessors.
This will get fixed in a follow-up patch that introduces the look-ahead operand reordering heuristic.

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D61283

llvm-svn: 359553
2019-04-30 11:03:09 +00:00
George Rimar 67f590e286 [llvm-objcopy] - Check dynamic relocation sections for broken references.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=41371.

Currently, it is possible to break the sh_link field of the dynamic relocation section
by removing the section it refers to. The patch fixes an issue and adds 2 test cases.

One of them shows that it does not seem possible to break the sh_info field.
I added an assert to verify this.

Differential revision: https://reviews.llvm.org/D60825

llvm-svn: 359552
2019-04-30 11:02:09 +00:00
Jeremy Morse 562f5f04f5 Update checks in an instcombine test, NFC
This reduces the delta in some incoming work that changes this test.

llvm-svn: 359549
2019-04-30 10:56:33 +00:00
Sjoerd Meijer ea31ddb36f [ARM] Implement TTI::getMemcpyCost
This implements TargetTransformInfo method getMemcpyCost, which estimates the
number of instructions to which a memcpy instruction expands to.

Differential Revision: https://reviews.llvm.org/D59787

llvm-svn: 359547
2019-04-30 10:28:50 +00:00
Simon Pilgrim 22641cc194 Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors
Current LLVM uses pxor+pinsrb on SSE4+ for INSERT_VECTOR_ELT(ZeroVec, 0, Elt) insead of much simpler movd.
INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is idiomatic construct which is used e.g. for _mm_cvtsi32_si128(Elt) and for lowest element initialization in _mm_set_epi32.
So such inefficient lowering leads to significant performance digradations in ceratin cases switching from SSSE3 to SSE4.
https://bugs.llvm.org/show_bug.cgi?id=41512

Here INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is simply converted to SCALAR_TO_VECTOR(Elt) when applicable since latter is closer match to desired behavior and always efficiently lowered to movd and alike.

Committed on behalf of @Serge_Preis (Serge Preis)

Differential Revision: https://reviews.llvm.org/D60852

llvm-svn: 359545
2019-04-30 10:18:25 +00:00
Diana Picus 59a4c0481a [ARM GlobalISel] Widen small shift operands
The legalizer was already widening the shift amount. Add tests for that
behaviour, and also support widening the shifted value.

llvm-svn: 359542
2019-04-30 09:24:43 +00:00
Diana Picus 1e88ac213b [ARM GlobalISel] Be more careful about bailing out
Bail out on function arguments/returns with types aggregating an
unsupported type. This fixes cases where we would happily and
incorrectly lower functions taking e.g. [1 x i64] parameters, when we
don't even support plain i64 yet.

llvm-svn: 359540
2019-04-30 09:05:25 +00:00
Alexander Potapenko 06d00afa61 MSan: handle llvm.lifetime.start intrinsic
Summary:
When a variable goes into scope several times within a single function
or when two variables from different scopes share a stack slot it may
be incorrect to poison such scoped locals at the beginning of the
function.
In the former case it may lead to false negatives (see
https://github.com/google/sanitizers/issues/590), in the latter - to
incorrect reports (because only one origin remains on the stack).

If Clang emits lifetime intrinsics for such scoped variables we insert
code poisoning them after each call to llvm.lifetime.start().
If for a certain intrinsic we fail to find a corresponding alloca, we
fall back to poisoning allocas for the whole function, as it's now
impossible to tell which alloca was missed.

The new instrumentation may slow down hot loops containing local
variables with lifetime intrinsics, so we allow disabling it with
-mllvm -msan-handle-lifetime-intrinsics=false.

Reviewers: eugenis, pcc

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60617

llvm-svn: 359536
2019-04-30 08:35:14 +00:00
Markus Lavin a475da36eb [DebugInfo] DW_OP_deref_size in PrologEpilogInserter.
The PrologEpilogInserter need to insert a DW_OP_deref_size before
prepending a memory location expression to an already implicit
expression to avoid having the existing expression act on the memory
address instead of the value behind it.

The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
big-endian targets need to read the right size as simply truncating a
larger read would yield the wrong result (LSB bytes are not at the lower
address).

This re-commit fixes issues reported in the first one. Namely deref was
inserted under wrong conditions and additionally the deref_size argument
was incorrectly encoded.

Differential Revision: https://reviews.llvm.org/D59687

llvm-svn: 359535
2019-04-30 07:58:57 +00:00
Kang Zhang d43b66b318 [NFC][PowerPC] Use -check-prefixes to simplify the check in code-align.ll
Summary:
When checking the same output, we can use the `-check-prefixes` to simplify the check.
For example, if we want to check below output.
```
; GENERIC-LABEL: .globl  foo
; BASIC-LABEL: .globl  foo
; PWR-LABEL: .globl  foo
; GENERIC: .p2align  2
; BASIC: .p2align  4
; PWR: .p2align  4
; GENERIC: @foo
; BASIC: @foo
; PWR: @foo

```
If we use `-check-prefixes`
```
... -check-prefixes=CHECK,GENERAL
... -check-prefixes=CHECK,BASIC
... -check-prefixes=CHECK,PWR
```
Above check can be simplify to:
```
; CHECK-LABEL: .globl  foo
; GENERIC: .p2align  2
; BASIC: .p2align  4
; PWR: .p2align  4
; CHECK: @foo
```

Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D61227

llvm-svn: 359533
2019-04-30 03:39:05 +00:00
Zi Xuan Wu 49d60fdc2e [DAGCombiner] Do not generate ISD::ADDE node if adde is not legal for the target when combine ISD::TRUNC node
Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry), 
if adde is not legal for the target. Even it's at type-legalize phase. 
Because adde is special and will not be legalized at operation-legalize phase later.

This fixes: PR40922
https://bugs.llvm.org/show_bug.cgi?id=40922

Differential Revision: https://reviews.llvm.org//D60854

llvm-svn: 359532
2019-04-30 03:01:14 +00:00
Alina Sbirlea 9a1edd14a2 [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61043

llvm-svn: 359519
2019-04-29 23:53:04 +00:00
Ahsan Saghir 3962d6da17 Add __builtin_dcbf support for PPC
Summary:
This patch adds support for __builtin_dcbf for PPC.

__builtin_dcbf copies the contents of a modified block from the data cache
to main memory and flushes the copy from the data cache.

Differential revision: https://reviews.llvm.org/D59843

llvm-svn: 359517
2019-04-29 23:25:33 +00:00
Steven Wu 6c9f6fd11b [ThinLTO] Adding architecture name into saved object filename
Summary:
For ThinLTOCodegenerator, it has an option to save the object file
outputs into a directory which is essential for debug info. Tools like lldb
and dsymutil will look for these object files for debug info.

On Darwin platform, you can link fat binaries with one single clang
driver invocation like:
 $ clang -arch x86_64 -arch i386 -Wl,-object_path_lto,$TMPDIR ...
Unfornately, the output object files for one architecture is going to
overwrite the previous ones and one architecture slice will end up with
no debug info. One example for this is to turn on ThinLTO for sanitizer
dylibs in compiler-rt project.

To fix the issue, add the name for the architecture into the name of the
output object file.

rdar://problem/35482935

Reviewers: tejohnson, bd1976llvm, dexonsmith, JDevlieghere

Reviewed By: dexonsmith

Subscribers: mehdi_amini, aprantl, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60924

llvm-svn: 359508
2019-04-29 21:39:54 +00:00
Dan Gohman 8306cb5702 [WebAssembly] Define the signature for __stack_chk_fail
The WebAssembly backend needs to know the signatures of all runtime
libcall functions. This adds the signature for __stack_chk_fail which was
previously missing.

Also, make the error message for a missing libcall include the name of
the function.

Differential Revision: https://reviews.llvm.org/D59521

Reviewed By: sbc100

llvm-svn: 359505
2019-04-29 21:09:44 +00:00
Roland Froese 728e139700 [PowerPC] Try harder to avoid load/move-to VSR for partial vector loads
Change the PPCISelLowering.cpp function that decides to avoid update form in
favor of partial vector loads to know about newer load types and to not be
confused by the chain operand.

Differential Revision: https://reviews.llvm.org/D60102

llvm-svn: 359504
2019-04-29 21:08:35 +00:00
Jessica Paquette 7f6fe7c02c [GlobalISel][AArch64] Select llvm.aarch64.crypto.sha1h
This was falling back and gives us a reason to create a selectIntrinsic function
which we would need eventually anyway. Update arm64-crypto.ll to show that we
correctly select it.

Also factor out the code for finding an intrinsic ID.

llvm-svn: 359501
2019-04-29 20:58:17 +00:00
Martin Storsjo c0d138d147 [X86] Run CFIInstrInserter on Windows if Dwarf is used
This is necessary since SVN r330706, as tail merging can include
CFI instructions since then.

This fixes PR40322 and PR40012.

Differential Revision: https://reviews.llvm.org/D61252

llvm-svn: 359496
2019-04-29 20:25:51 +00:00
Don Hinton 1817377f10 Fix one more case of passing options with too many dashes.
llvm-svn: 359495
2019-04-29 20:13:35 +00:00
Simon Pilgrim 028485d7b9 [X86][SSE] isHorizontalBinOp - add support for target shuffles
Add target shuffle decoding to isHorizontalBinOp as well as ISD::VECTOR_SHUFFLE support.

This does mean we can go through bitcasts so we need to bitcast the extracted args to ensure they are the correct type

Fixes PR39936 and should help with PR39920/PR39921

Differential Revision: https://reviews.llvm.org/D61245

llvm-svn: 359491
2019-04-29 19:52:59 +00:00
Don Hinton 54dbcfe5f0 Fix additional cases of more that two dashes for options in tests.
llvm-svn: 359484
2019-04-29 18:58:52 +00:00
Don Hinton 89e583b843 [CommandLine] Don't allow unlimitted dashes for options. Part 1 or 5
Summary:
Prior to this patch, the CommandLine parser would strip an
unlimitted number of dashes from options.  This patch limits it to
two.

Reviewers: rnk

Reviewed By: rnk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61229

llvm-svn: 359480
2019-04-29 18:34:18 +00:00
Simon Pilgrim 9d99372f73 [llvm-mca][x86] Fix MMX PMOVMSKB test
This is defined as part of SSE1, XMM PMOVMSKB doesn't appear until SSE2

llvm-svn: 359477
2019-04-29 18:24:30 +00:00
Bjorn Pettersson 820994572c [DAG] Refactor DAGCombiner::ReassociateOps
Summary:
Extract the logic for doing reassociations
from DAGCombiner::reassociateOps into a helper
function DAGCombiner::reassociateOpsCommutative,
and use that helper to trigger reassociation
on the original operand order, or the commuted
operand order.

Codegen is not identical since the operand order will
be different when doing the reassociations for the
commuted case. That causes some unfortunate churn in
some test cases. Apart from that this should be NFC.

Reviewers: spatel, craig.topper, tstellar

Reviewed By: spatel

Subscribers: dmgreen, dschuff, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61199

llvm-svn: 359476
2019-04-29 17:50:10 +00:00
Thomas Preud'homme 15cb1f1501 FileCheck [3/12]: Stricter parsing of @LINE expressions
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch gives earlier and better
diagnostics for the @LINE expressions.

Rather than detect parsing errors at matching time, this commit adds
enhance parsing to detect issues with @LINE expressions at parse time
and diagnose them more accurately.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60383

llvm-svn: 359475
2019-04-29 17:46:26 +00:00
Quentin Colombet 2d977935a2 [llvm-extract] Expose the group extraction feature of the BlockExtractor
This patch extends the `-bb` option to be able to use the group
extraction feature from the BlockExtractor.
In particular, `-bb=func:bb` is modified to support a list of basic
blocks per function: `-bb=func:bb1[;bb2...]` that will be extracted
together if at all possible (region must be single entry.)

Differential Revision: https://reviews.llvm.org/D60973

llvm-svn: 359464
2019-04-29 16:14:03 +00:00
Quentin Colombet ae2cbb3400 [BlockExtractor] Change the basic block separator from ',' to ';'
This change aims at making the file format be compatible with the
way LLVM handles command line options.

Differential Revision: https://reviews.llvm.org/D60970

llvm-svn: 359462
2019-04-29 16:14:00 +00:00
Kevin P. Neal a25c928302 Add AVX support to this test.
Requested by Craig Topper and Andrew Kaylor as part of D55897.

llvm-svn: 359461
2019-04-29 16:06:04 +00:00
Cullen Rhodes 2c0d5043a7 [AArch64][SVE] Asm: add aliases for unpredicated bitwise logical instructions
This patch adds aliases for element sizes .B/.H/.S to the
AND/ORR/EOR/BIC bitwise logical instructions. The assembler now accepts
these instructions with all element sizes up to 64-bit (.D). The
preferred disassembly is .D.

llvm-svn: 359457
2019-04-29 15:27:27 +00:00
Simon Pilgrim 9d4ed24f25 [X86][SSE] Add scalar horizontal add/sub tests for non-0/1 element extractions
llvm-svn: 359454
2019-04-29 14:26:27 +00:00
Thomas Preud'homme 5a33047022 FileCheck [2/12]: Stricter parsing of -D option
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch gives earlier and better
diagnostics for the -D option.

Prior to this change, parsing of -D option was very loose: it assumed
that there is an equal sign (which to be fair is now checked by the
FileCheck executable) and that the part on the left of the equal sign
was a valid variable name. This commit adds logic to ensure that this
is the case and gives diagnostic when it is not, making it clear that
the issue came from a command-line option error. This is achieved by
sharing the variable parsing code into a new function ParseVariable.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60382

llvm-svn: 359447
2019-04-29 13:32:36 +00:00
Simon Pilgrim c570b2a2e5 [X86][SSE] Moved haddps test from phaddsub.ll to haddsub.ll (D61245)
Also merged duplicate PR39921 + PR39936 tests

llvm-svn: 359437
2019-04-29 11:30:47 +00:00
Simon Pilgrim 46128cdf08 [InstCombine][X86] Add PACKSS tests for truncation of sign-extended comparisons
llvm-svn: 359435
2019-04-29 10:36:20 +00:00
Diogo N. Sampaio d95abb170b [ARM] Add bitcast/extract_subvec. of fp16 vectors
Summary:
This patch adds some basic operations for fp16
vectors, such as bitcast from fp16 to i16,
required to perform extract_subvector (also added
here) and extract_element.

Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60618

llvm-svn: 359433
2019-04-29 10:28:07 +00:00
Diogo N. Sampaio 2078eb745d [ARM] Add v4f16 and v8f16 types to the CallingConv
Summary:
The Procedure Call Standard for the Arm Architecture
states that float16x4_t and float16x8_t behave just
as uint16x4_t and uint16x8_t for argument passing.
This patch adds the fp16 vectors to the
ARMCallingConv.td file.

Reviewers: miyuki, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60720

llvm-svn: 359431
2019-04-29 10:10:37 +00:00
Jeremy Morse 055aee1d8a [DebugInfo] Terminate more location-list ranges at the end of blocks
This patch fixes PR40795, where constant-valued variable locations can
"leak" into blocks placed at higher addresses. The root of this is that
DbgEntityHistoryCalculator terminates all register variable locations at
the end of each block, but not constant-value variable locations.

Fixing this requires constant-valued DBG_VALUE instructions to be
broadcast into all blocks where the variable location remains valid, as
documented in the LiveDebugValues section of SourceLevelDebugging.rst,
and correct termination in DbgEntityHistoryCalculator.

Differential Revision: https://reviews.llvm.org/D59431

llvm-svn: 359426
2019-04-29 09:13:16 +00:00
Fangrui Song 97b8cd54ad [DWARF] Fix dump of local/foreign TU lists in .debug_names
Differential Revision: https://reviews.llvm.org/D61241

llvm-svn: 359425
2019-04-29 08:55:10 +00:00
Craig Topper 9202d5f8f1 [X86] Remove some intel syntax aliases on (v)cvtpd2(u)dq, (v)cvtpd2ps, (v)cvt(u)qq2ps. Add 'x' and'y' suffix aliases to masked version of the same in att syntax.
The 128/256 bit version of these instructions require an 'x' or 'y' suffix to
disambiguate the memory form in att syntax.

We were allowing the same suffix in intel syntax, but it appears gas does not
do that.

gas does allow the 'x' and 'y' suffix on register and broadcast forms even
though its not needed. We were allowing it on unmasked register form, but not on
masked versions or on masked or unmasked broadcast form.

While there fix some test coverage holes so they can be extended with the 'x'
and 'y' suffix tests.

llvm-svn: 359418
2019-04-29 06:13:41 +00:00
Fangrui Song 2d5e7de526 [llvm-nm] -print-size => --print-size
llvm-svn: 359417
2019-04-29 06:03:59 +00:00
Simon Pilgrim 65f12f66f6 [X86] Add PR39921 HADD pairwise reduction test and AVX2 test coverage
llvm-svn: 359409
2019-04-28 21:04:47 +00:00
Simon Pilgrim 85bacd0f95 [X86][AVX] Add fast-hops target for add/fadd reduction tests
llvm-svn: 359408
2019-04-28 20:04:08 +00:00
Simon Pilgrim e375257e95 [X86] Add PR39936 HADD Tests
llvm-svn: 359407
2019-04-28 20:03:11 +00:00
Simon Pilgrim d394195221 [X86][AVX] Enabled AVX512F tests and add PR40815 test case
llvm-svn: 359401
2019-04-28 15:04:30 +00:00
Simon Pilgrim 22d1476bfa [X86][AVX] Combine non-lane crossing binary shuffles using X86ISD::VPERMV3
Some of the combines might be further improved if we lower more shuffles with X86ISD::VPERMV3 directly, instead of waiting to combine the results.

llvm-svn: 359400
2019-04-28 14:31:01 +00:00
Sanjay Patel ce8cfe96f7 [SelectionDAG] include FP min/max variants as binary operators
The x86 test diffs don't look great because of extra move ops,
but FP min/max should clearly be included in the list.

llvm-svn: 359399
2019-04-28 13:19:29 +00:00
Sanjay Patel fb9a5307a9 [DAGCombiner] try repeated fdiv divisor transform before building estimate
This was originally part of D61028, but it's an independent diff.

If we try the repeated divisor reciprocal transform before producing an estimate sequence,
then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5
vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the
full-precision division is only 3 cycle throughput, so that's probably the better perf
default option and avoids problems from x86's inaccurate estimates.

The last 2 tests show that users still have the option to override the defaults by using
the function attributes for reciprocal estimates, but those patterns are potentially made
faster by converting the vector ops (including ymm ops) to scalar math.

Differential Revision: https://reviews.llvm.org/D61149

llvm-svn: 359398
2019-04-28 12:23:43 +00:00
Andrea Di Biagio 43003f0fec [MCA] Fix typo in AVX2 gather tests. NFC
llvm-svn: 359397
2019-04-28 10:54:45 +00:00
Simon Pilgrim 93ad48210c [X86][SSE] Optimize llvm.experimental.vector.reduce.xor.vXi1 parity reduction (PR38840)
An xor reduction of a bool vector can be optimized to a parity check of the MOVMSK/BITCAST'd integer - if the population count is odd return 1, else return 0.

Differential Revision: https://reviews.llvm.org/D61230

llvm-svn: 359396
2019-04-28 10:46:17 +00:00
Simon Pilgrim fed302ae37 [X86][AVX] Add AVX512DQ coverage for masked memory ops tests (PR34584)
llvm-svn: 359395
2019-04-28 10:02:34 +00:00
Craig Topper bd35a30940 [X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead
Summary:
The register form of these instructions are CodeGenOnly instructions that cover
GR32->FR32 and GR64->FR64 bitcasts. There is a similar set of instructions for
the opposite bitcast. Due to the patterns using bitcasts these instructions get
marked as "bitcast" machine instructions as well. The peephole pass is able to
look through these as well as other copies to try to avoid register bank copies.

Because FR32/FR64/VR128 are all coalescable to each other we can end up in a
situation where a GR32->FR32->VR128->FR64->GR64 sequence can be reduced to
GR32->GR64 which the copyPhysReg code can't handle.

To prevent this, this patch removes one set of the 'bitcast' instructions. So
now we can only go GR32->VR128->FR32 or GR64->VR128->FR64. The instruction that
converts from GR32/GR64->VR128 has no special significance to the peephole pass
and won't be looked through.

I guess the other option would be to add support to copyPhysReg to just promote
the GR32->GR64 to a GR64->GR64 copy. The upper bits were basically undefined
anyway. But removing the CodeGenOnly instruction in favor of one that won't be
optimized seemed safer.

I deleted the peephole test because it couldn't be made to work with the bitcast
instructions removed.

The load version of the instructions were unnecessary as the pattern that selects
them contains a bitcasted load which should never happen.

Fixes PR41619.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61223

llvm-svn: 359392
2019-04-28 06:25:33 +00:00
Simon Pilgrim 03c4e2663c Revert rL359389: [X86][SSE] Add support for <64 x i1> bool reduction
Minor generalization of the existing <32 x i1> pre-AVX2 split code.
........
Causing irregular buildbot failures.

llvm-svn: 359391
2019-04-27 20:44:08 +00:00
Simon Pilgrim 1a4a43250e [X86][AVX] Add additional SSE/AVX expandload and compressstore targets
llvm-svn: 359390
2019-04-27 20:20:02 +00:00
Simon Pilgrim 4118be3af6 [X86][SSE] Add support for <64 x i1> bool reduction
Minor generalization of the existing <32 x i1> pre-AVX2 split code.

llvm-svn: 359389
2019-04-27 20:04:44 +00:00
Simon Pilgrim 399746eaf6 [X86][AVX] Cleanup and add additional expandload and compressstore tests
sort order by types and add vXi32/vXi16/vXi8 test coverage

llvm-svn: 359388
2019-04-27 19:57:34 +00:00
Simon Pilgrim 2a2d422400 [X86][AVX512] Improve vector bool reductions
As predicate masks are legal on AVX512 targets, we avoid MOVMSK in these cases, but we can just bitcast the bool vector to the integer equivalent directly - avoiding expansion of the reduction to a shuffle pattern.

llvm-svn: 359386
2019-04-27 17:32:46 +00:00