Commit Graph

56580 Commits

Author SHA1 Message Date
Simon Pilgrim aabd99c27a [X86] PUSH/POP 'mem-mem' instructions are not RMW - these are 2 different addresses
This patch adds a 'WriteCopy' [WriteLoad, WriteStore] schedule sequence instead to better model the behaviour

Found by @andreadb during llvm-mca testing on btver2 which was crashing on "zero uop" WriteRMW only instructions

llvm-svn: 343708
2018-10-03 19:02:38 +00:00
Matthew Voss f8ab35a4f4 Emit template type and value parameter DIEs for template variables.
Summary:
Ensure the TemplateParam attribute of the DIGlobalVariable node is translated into the proper DIEs.

Resolves https://bugs.llvm.org/show_bug.cgi?id=22119

Reviewers: dblaikie, probinson, aprantl, JDevlieghere, clayborg, whitequark, deadalnix

Reviewed By: dblaikie

Subscribers: llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D52057

llvm-svn: 343706
2018-10-03 18:44:53 +00:00
Simon Pilgrim 0b451a2983 [X86][Btver2] Fix MMX PSHUFB schedule
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343701
2018-10-03 18:18:50 +00:00
Daniel Sanders fb9b99b26e [globalisel][combines] Don't sink G_TRUNC down to use if that use is a G_PHI
This fixes a problem where the register allocator fails to eliminate a PHI
because there's a non-PHI in the middle of the PHI instructions at the start
of a BB.

This G_TRUNC can be better placed but this at least fixes the correctness issue
quickly. I'll follow up with a patch to the verifier to catch this kind of bug
in future.

llvm-svn: 343693
2018-10-03 15:43:39 +00:00
Andrea Di Biagio 207e0217f9 [llvm-mca] Add support for move elimination in class RegisterFile.
This patch teaches class RegisterFile how to analyze register writes from
instructions that are move elimination candidates.
In particular, it teaches it how to check if a move can be effectively eliminated
by the underlying PRF, and (if necessary) how to perform move elimination.

The long term goal is to allow processor models to describe instructions that
are valid move elimination candidates.
The idea is to let register file definitions in tablegen declare if/when moves
can be eliminated.

This patch is a non functional change.
The logic that performs move elimination is currently disabled.  A future patch
will add support for move elimination in the processor models, and enable this
new code path.

llvm-svn: 343691
2018-10-03 15:02:44 +00:00
Nirav Dave 925b64be64 [X86] Correctly use SSE registers if no-x87 is selected.
Fix use of SSE1 registers for f32 ops in no-x87 mode.

Notably, allow use of SSE instructions for f32 operations in 64-bit
mode (but not 32-bit which is disallowed by calling convention).

Also avoid translating memset/memcpy/memmove into SSE registers
without X87 for 32-bit mode.

This fixes PR38738.

Reviewers: nickdesaulniers, craig.topper

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D52555

llvm-svn: 343689
2018-10-03 14:13:30 +00:00
James Henderson 99031b79a6 [ThinLTO] Expose cache entry expiration time option in llvm-lto and fix a test
Two cases in a ThinLTO test were passing for the wrong reasons, since
rL340374. The tests were supposed to be testing that files were being
pruned due to the cache size, but they were in fact being pruned because
they were older than the default expiration period of 1 week.

This change fixes the tests by explicitly setting the expiration time to
the maximum value. This required the option to be exposed in llvm-lto.

By assigning all files in the cache a similar time, it is possible to see
that the newest files are still being kept, and that we aren't passing
for the wrong reason again. If the expiration period were ever to apply
to these entries, the test would start failing, because these
files would be removed too.

Reviewed by: rnk, inglorion

Differential Revision: https://reviews.llvm.org/D51992

llvm-svn: 343687
2018-10-03 13:00:20 +00:00
Clement Courbet d5a39553ff [llvm-exegesis] Resolve variant classes in analysis.
Summary: See PR38884.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52825

llvm-svn: 343680
2018-10-03 11:50:25 +00:00
Alex Bradbury efceb59801 [RISCV] Remove RV64 test lines from umulo-128-legalisation-lowering.ll
The generated code is incorrect anyway, and this test adds noise to the 
upcoming set of patches that flesh out RV64 support.

llvm-svn: 343675
2018-10-03 10:59:42 +00:00
Tim Renouf a37679d67b [AMDGPU] Fix for negative offsets in buffer/tbuffer intrinsics
Summary:
The new buffer/tbuffer intrinsics handle an out-of-range immediate
offset by moving/adding offset&-4096 to a vgpr, leaving an in-range
immediate offset, with a chance of the move/add being CSEd for similar
loads/stores.

However it turns out that a negative offset in a vgpr is illegal, even
if adding the immediate offset makes it legal again.

Therefore, this commit disables the offset&-4096 thing if the offset is
negative.
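
As a rough illustration of the split described above, the following Python sketch separates a constant offset into a register part and an immediate part. The function name, the 0..4095 immediate range (inferred from the -4096 mask), and the fallback of routing a negative offset entirely through the register add are assumptions for illustration, not the actual AMDGPU lowering code.

```python
IMM_MAX = 4095  # assumed immediate range 0..4095, inferred from the -4096 mask


def split_offset(offset):
    """Split a constant offset into (register_part, immediate_part).

    The two parts always sum to the original offset.
    """
    if offset < 0:
        # Assumed fallback: skip the offset & -4096 split for negative
        # offsets and send the whole constant through the register add.
        return offset, 0
    if offset <= IMM_MAX:
        # Already in range: nothing needs to go into a register.
        return 0, offset
    # Move the multiple-of-4096 part to the register, keep the rest immediate.
    register_part = offset & -4096
    return register_part, offset - register_part
```

Keeping the register part a multiple of 4096 is what gives nearby loads and stores a chance to share the same move/add.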

Differential Revision: https://reviews.llvm.org/D52683

Change-Id: Ie02f0a74f240a138dc2a29d17cfbd9e350e4ed13
llvm-svn: 343672
2018-10-03 10:29:43 +00:00
Simon Pilgrim c68cc4efbe [X86][Btver2] Most RMW instructions don't require an additional uop
Remove uop on WriteRMW and move it into the few instructions that need it.

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343671
2018-10-03 10:28:43 +00:00
Simon Pilgrim d11015861c [X86] ALU/ADC RMW instructions should use the WriteRMW sequence class
I was expecting this to be NFC, but Silvermont seems to be set up a little differently:

// A folded store needs a cycle on MEC_RSV for the store data, but it does not need an extra port cycle to recompute the address.
def : WriteRes<WriteRMW, [SLM_MEC_RSV]>;

So moving from WriteStore to WriteRMW reduces predicted port pressure; @craig.topper confirmed that this is correct.

Differential Revision: https://reviews.llvm.org/D52740

llvm-svn: 343670
2018-10-03 10:01:13 +00:00
Aditya Kumar 9e20ade72a Add support for new pass manager
Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.

Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki

llvm-svn: 343662
2018-10-03 05:55:20 +00:00
Fangrui Song 3d76d36059 [AMDGPU] Rename pass "isel" to "amdgpu-isel"
Summary: The AMDGPU target specific pass "isel" is a misleading name.

Reviewers: tstellar, echristo, javed.absar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D52759

llvm-svn: 343659
2018-10-03 03:38:22 +00:00
Daniel Sanders bad3936109 [globalisel] Fix one more missing Verifier pass from gisel-commandline-option.ll
llvm-svn: 343658
2018-10-03 02:52:54 +00:00
Matt Arsenault 635d479322 AMDGPU: Always run AMDGPUAlwaysInline
Even if calls are enabled, it still needs to be run
for forcing inline of functions that use LDS.

llvm-svn: 343657
2018-10-03 02:47:25 +00:00
Matt Arsenault 0f83d66ae7 Add atomicrmw operation to error messages
llvm-svn: 343656
2018-10-03 02:37:15 +00:00
Daniel Sanders 34eac35a60 Add the missing new files from r343654
llvm-svn: 343655
2018-10-03 02:21:30 +00:00
Daniel Sanders c973ad1878 Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

The previous commit failed portions of the test-suite on GreenDragon due to
duplicate COPY instructions and iterator invalidation. Both issues have now
been fixed. To assist with this, a helper (cloneVirtualRegister) has been added
to MachineRegisterInfo that can be used to get another register that has the same
type and class/bank as an existing one.

llvm-svn: 343654
2018-10-03 02:12:17 +00:00
Thomas Lively 9075cd607d [WebAssembly] any_true and all_true intrinsics and instructions
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52755

llvm-svn: 343649
2018-10-03 00:19:39 +00:00
Sanjay Patel abcacf9753 [InstCombine] add icmp+logic tests with commuted ops; NFC
The transform in question is located in foldICmpAndConstConst(),
but as shown here, it doesn't work if operands are commuted.

llvm-svn: 343646
2018-10-02 22:53:37 +00:00
Reid Kleckner 9c0baa524c Relax dbg-declare-inalloca.ll test more
We don't need to match the precise type index number here. It's not
important. The type name is what matters to make this test useful.

llvm-svn: 343642
2018-10-02 22:28:10 +00:00
Sam Clegg b2486f118d [WebAssembly] Stop generating helper functions in WebAssemblyLowerEmscriptenEHSjLj
Previously we were creating weakly defined helper functions in
each translation unit:

-  setThrew
-  setTempRet0

Instead we now assume these will be provided at link time.  In
emscripten they are provided in compiler-rt:
 https://github.com/kripken/emscripten/pull/7203

Additionally we previously created three global variables which are
also now required to exist at link time instead.

- __THREW__
- _threwValue
- __tempRet0

Differential Revision: https://reviews.llvm.org/D49208

llvm-svn: 343640
2018-10-02 22:12:15 +00:00
Fangrui Song e5652fc682 [CodeView] Try fixing DebugInfo/X86/dbg-declare-inalloca.ll
llvm-svn: 343639
2018-10-02 22:03:31 +00:00
Daniel Sanders f430d941e9 [globalisel] Attempt to fix llvm-clang-x86_64-expensive-checks-win
The behaviour of this bot indicates that -verify-machineinstrs has been forced
on and is therefore inserting the verifier on builds that don't expect it.
Explicitly specify whether it's enabled or disabled for each test.

llvm-svn: 343633
2018-10-02 20:51:27 +00:00
Aaron Smith da0602c154 [CodeView] Only add the Scoped flag for an enum type when it has an immediate function scope to match MSVC
Reviewers: rnk, zturner, llvm-commits

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D52706

llvm-svn: 343627
2018-10-02 20:28:15 +00:00
Aaron Smith 802b033d78 [CodeView] Emit function options for subprogram and member functions
Summary:
Use the newly added DebugInfo (DI) Trivial flag, which indicates if a C++ record is trivial or not, to determine Codeview::FunctionOptions.

Clang and MSVC generate slightly different Codeview for C++ records. For example, here is the C++ code for a class with a defaulted ctor,

       class C {
       public:
         C() = default;
       };

Clang will produce a LF for the defaulted ctor while MSVC does not. For more details, refer to FIXMEs in the test cases in "function-options.ll" included with this set of changes.


Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov

Reviewed By: rnk

Subscribers: Hui, JDevlieghere

Differential Revision: https://reviews.llvm.org/D45123

llvm-svn: 343626
2018-10-02 20:21:05 +00:00
Matt Morehouse 4b1ec17fb0 Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions"
This reverts r343520 due to breakage of HWASan tests on Android.

llvm-svn: 343616
2018-10-02 18:35:44 +00:00
Craig Topper 49225d0915 [X86][Disassembler] Add bizarro versions of the MOVSXD instruction that sign extend from a GR32 to GR32 or GR16.
The 0x63 opcodes in 64-bit mode have a fixed source size of 32-bits, but the destination size is controlled by REX.W and the 0x66 opsize prefix. This instruction is normally used with a REX.W prefix which provides desired behavior. The other encodings are interpreted as valid by the processor, but aren't useful.

This patch makes us recognize them for the disassembler to match objdump.

llvm-svn: 343614
2018-10-02 18:16:19 +00:00
Reid Kleckner d5e4ec74e3 [codeview] Fix 32-bit x86 variable locations in realigned stack frames
Add the .cv_fpo_stackalign directive so that we can define $T0, or the
VFRAME virtual register, with it. This was overlooked in the initial
implementation because unlike MSVC, we push CSRs before allocating stack
space, so this value is only needed to describe local variable
locations. Variables that the compiler now addresses via ESP are instead
described as being stored at offsets from VFRAME, which for us is ESP
after alignment in the prologue.

This adds tests that show that we use the VFRAME register properly in
our S_DEFRANGE records, and that we emit the correct FPO data to define
it.

Fixes PR38857

llvm-svn: 343603
2018-10-02 16:43:52 +00:00
Simon Pilgrim 860cb5c071 [X86][Btver2] Fix BLENDV and AESDEC schedules
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343597
2018-10-02 15:13:18 +00:00
Krzysztof Parzyszek 528aff3372 [Hexagon] Fix extracting subvectors of non-HVX vNi1
Patch by Brendon Cahoon.

llvm-svn: 343596
2018-10-02 15:05:43 +00:00
Sanjay Patel e2cd6384b7 [InstCombine] add tests with undef elements; NFC
See discussion in D52747.

llvm-svn: 343595
2018-10-02 15:00:56 +00:00
Diogo N. Sampaio eb9ca5ab18 [ARM] Emit data symbol for constant pool data
The ARM ELF emitter would omit the data symbol when emitting
constant data. This patch overrides the emitFill method to
ensure that the symbol is correctly printed.

Differential revision: https://reviews.llvm.org/D52737

llvm-svn: 343594
2018-10-02 14:55:48 +00:00
Roman Lebedev ea2046bea9 [NFC][CodeGen][X86] fma.ll, lwp-intrinsics.ll: actually spell --check-prefixes correctly :/
llvm-svn: 343588
2018-10-02 13:34:50 +00:00
Sanjay Patel 6dbecb4162 [InstCombine] add more insert/extract vector tests with FP types; NFC
These are candidates for the same fold that was implemented in
D52439, but FP types require bitcasting (and that changes the
extra uses profitability calculation).

llvm-svn: 343587
2018-10-02 13:34:05 +00:00
Roman Lebedev 5412be4b7a [NFC][CodeGen][X86] lwp-intrinsics.ll: fix check prefixes
llvm-svn: 343585
2018-10-02 13:11:08 +00:00
Roman Lebedev 8b253f0b54 [NFC][CodeGen][X86] fma.ll: fix check prefixes for -mcpu=bdver2
llvm-svn: 343584
2018-10-02 13:10:55 +00:00
Oliver Stannard c41902807e [AArch64][v8.5A] Add Memory Tagging instructions
This adds new instructions to manipulate tagged pointers, and to load
and store the tags associated with memory.

Patch by Pablo Barrio, David Spickett and Oliver Stannard!

Differential revision: https://reviews.llvm.org/D52490

llvm-svn: 343572
2018-10-02 10:04:39 +00:00
Oliver Stannard 2a5fcba94b [AArch64][v8.5A] Add Memory Tagging system registers
This adds new system registers introduced by the Memory Tagging
extension.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52488

llvm-svn: 343571
2018-10-02 09:54:35 +00:00
Oliver Stannard 4493f421ac [AArch64][v8.5A] Add MTE system instructions
The Memory Tagging Extension adds system instructions for data cache
maintenance, implemented as new operands to the DC instruction.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52487

llvm-svn: 343570
2018-10-02 09:48:43 +00:00
David Green 1e44c3b62c [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
This is an attempt to get out of a local-minimum that instcombine currently
gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a
and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at
least neutral. This involves using IsFreeToInvert, which has been expanded a
little to include selects that can be easily inverted.
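
The two identities combined here can be checked exhaustively on a small bit width. The sketch below does so for unsigned 8-bit values in Python; the helper names and the choice of unsigned min/max at 8 bits are assumptions for illustration and say nothing about how InstCombine implements the fold.

```python
BITS = 8
MASK = (1 << BITS) - 1


def inv(x):
    """Bitwise NOT restricted to BITS bits."""
    return ~x & MASK


def fold_holds_for_all_values():
    """Check the two identities and the combined fold for every 8-bit pair."""
    for a in range(MASK + 1):
        for o in range(MASK + 1):
            # ~a - ~o == o - a (modulo 2**BITS)
            if (inv(a) - inv(o)) & MASK != (o - a) & MASK:
                return False
            # min(~a, ~o) == ~max(a, o)
            if min(inv(a), inv(o)) != inv(max(a, o)):
                return False
            # Combined: ~a - min(~a, o) == max(a, ~o) - a
            lhs = (inv(a) - min(inv(a), o)) & MASK
            rhs = (max(a, inv(o)) - a) & MASK
            if lhs != rhs:
                return False
    return True
```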

This is trying to fix PR35875, using the ideas from Sanjay. It is a large
improvement to one of our rgb to cmy kernels.

Differential Revision: https://reviews.llvm.org/D52177

llvm-svn: 343569
2018-10-02 09:48:34 +00:00
Simon Pilgrim ad23f270db [X86] Standardize floating point assembly comments
Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well.

Differential Revision: https://reviews.llvm.org/D52702

llvm-svn: 343562
2018-10-02 09:08:51 +00:00
David Green c066a92657 [InstCombine] Tests for ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A. NFC
llvm-svn: 343561
2018-10-02 09:06:49 +00:00
Matt Arsenault ab41193312 AMDGPU: Expand atomicrmw nand in IR
llvm-svn: 343559
2018-10-02 03:50:56 +00:00
Thomas Lively 6f77811a21 [WebAssembly] Restore slashes in SIMD conversion names
Summary: Depends on D52372 and D52442.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52512

llvm-svn: 343558
2018-10-02 01:52:21 +00:00
Fangrui Song 99d4f74d01 [AArch64][DAGCombiner]: change -stop-after=isel to instruction-select
"isel" is registered by AMDGPU. The test will break if the AMDGPU target
is not built.

llvm-svn: 343553
2018-10-02 00:22:51 +00:00
Daniel Sanders 33f42f97af Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
There's a strange assertion on two of the Green Dragon bots that goes away when
this is reverted. The assertion is in RegBankAlloc and if it is this commit then
-verify-machineinstrs should have caught it earlier in the pipeline.

llvm-svn: 343546
2018-10-01 22:32:08 +00:00
Reid Kleckner 9ea2c01264 [codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
Summary:
Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit.  Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Reviewers: hans, zturner, javed.absar

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D52217

llvm-svn: 343543
2018-10-01 21:59:45 +00:00
Craig Topper 42cd8cd862 Recommit r343499 "[X86] Enable load folding in the test shrinking code"
Original message:
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

llvm-svn: 343540
2018-10-01 21:35:28 +00:00
Craig Topper f06a57fc89 Recommit r343498 "[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated."
This includes a fix to prevent i16 compares with i32/i64 ands from being shrunk if bit 15 of the and is set and the sign bit is used.

Original commit message:
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused.

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

llvm-svn: 343539
2018-10-01 21:35:26 +00:00
Sanjay Patel de5e8b93f4 [InstCombine] add inverse test for vector trunc canonical form; NFC
llvm-svn: 343529
2018-10-01 20:25:49 +00:00
Sanjay Patel 746eb09127 [InstCombine] regenerate test checks; NFC
These files used an old version of the script.
We regex more now.

llvm-svn: 343527
2018-10-01 20:22:28 +00:00
Stefan Pintilie 5d32a86f44 [PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads.
Going from XForm Load to DSForm Load requires that the immediate be 4 byte
aligned.
If we are not aligned we must leave the load as LDX (XForm).
This bug is causing a compile-time failure in the benchmark h264ref.
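
The alignment rule can be sketched as a small predicate. This Python version is only an illustration: the function names are hypothetical, and the signed 16-bit displacement range is an assumption about the DS-form encoding rather than something stated in this commit.

```python
def can_use_dsform(displacement):
    """Return True if the displacement is usable by a DSForm load."""
    # Assumption: the displacement must fit in a signed 16-bit field.
    fits_16_bit = -32768 <= displacement <= 32767
    # From the commit message: the immediate must be 4-byte aligned.
    return fits_16_bit and displacement % 4 == 0


def select_load_form(displacement):
    """Pick DSForm when legal, otherwise keep the indexed XForm load."""
    return "DSForm" if can_use_dsform(displacement) else "XForm"
```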

Differential Revision: https://reviews.llvm.org/D51988

llvm-svn: 343525
2018-10-01 20:16:27 +00:00
Eric Christopher dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Daniel Sanders 9659bfda5a [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 343521
2018-10-01 18:56:47 +00:00
Matthias Braun 3e081703c3 X86, AArch64, ARM: Do not attach debug location to spill/reload instructions
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).

Differential Revision: https://reviews.llvm.org/D52125

llvm-svn: 343520
2018-10-01 18:56:39 +00:00
Craig Topper 1346b5b7cf [X86] Add more test shrinking with truncate and sign bit usage tests. NFC
llvm-svn: 343519
2018-10-01 18:52:19 +00:00
Craig Topper e072934d28 Revert r343499 and r343498. X86 test improvements
There's a subtle bug in the handling of truncate from i32/i64 to i32 without minsize.

I'll be adding more test cases and trying to find a fix.

llvm-svn: 343516
2018-10-01 18:40:44 +00:00
Krzysztof Parzyszek 6d569a2cc4 [Hexagon] Remove incorrect pattern for swiz
The pattern had a couple of problems:
- It was checking for loads of bytes in the reverse order to what it
  should have been looking for.
- It would replace loads of bytes with a load of a word without making
  sure that the alignment was correct.

Thanks to Eli Friedman for pointing it out.

llvm-svn: 343514
2018-10-01 18:24:40 +00:00
Zachary Turner a5e3e02602 [PDB] Add support for dumping Typedef records.
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types.  So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.

llvm-svn: 343507
2018-10-01 17:55:38 +00:00
Matthias Braun 7159daa68e MIRParser: Check that instructions only reference DILocation metadata
llvm-svn: 343505
2018-10-01 17:50:52 +00:00
Wouter van Oortmerssen 0c83c3ff38 [WebAssembly] Fixed AsmParser not allowing instructions with /
Summary:
The AsmParser Lexer regards these as a separate token.
Here we expand the instruction name with them if they are
adjacent (no whitespace).

Tested: the basic-assembly.s test case has one case with a / in it.
There are currently also instructions with : in them, which we intend
to rename rather than fix here.

Reviewers: tlively, dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52442

llvm-svn: 343501
2018-10-01 17:20:31 +00:00
Craig Topper aa84e1bba2 [X86] Enable load folding in the test shrinking code
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

Differential Revision: https://reviews.llvm.org/D52699

llvm-svn: 343499
2018-10-01 17:10:50 +00:00
Craig Topper 2b587ad071 [X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused.
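
The three conditions can be written as a single predicate. The Python sketch below only restates the rule from the paragraph above; the function and parameter names are hypothetical and it does not model the actual X86 instruction selection code.

```python
def can_shrink_test(mask, shrunk_bits, compare_bits, sign_flag_used):
    """Return True if shrinking the test to shrunk_bits is safe."""
    # Is the most significant bit of the mask set at the shrunken width?
    msb_set = ((mask >> (shrunk_bits - 1)) & 1) == 1
    if not msb_set:
        return True  # the sign flag cannot change
    if shrunk_bits == compare_bits:
        return True  # same width as the compare, so same sign bit
    return not sign_flag_used  # otherwise only safe if nobody reads SF
```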

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

Differential Revision: https://reviews.llvm.org/D52669

llvm-svn: 343498
2018-10-01 17:10:45 +00:00
Simon Pilgrim e0d2019052 [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343494
2018-10-01 16:31:30 +00:00
Matthias Braun 004fe6bf83 DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types
This fixes a case of bad index calculation when merging mismatching
vector types. This changes the existing code to just use the existing
extract_{subvector|element} and a bitcast (instead of bitcast first and
then newly created extract_xxx) so we don't need to adjust any indices
in the first place.

rdar://44584718

Differential Revision: https://reviews.llvm.org/D52681

llvm-svn: 343493
2018-10-01 16:25:50 +00:00
Sanjay Patel 5187efcfab [x86] add tests for 256- and 512-bit vector types for scalar-to-vector transform; NFC
llvm-svn: 343491
2018-10-01 16:17:18 +00:00
Jesper Antonsson c954b86391 [InstCombine] Handle vector compares in foldGEPIcmp(), take 2
Summary:
This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.)

This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type.

The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger.

Reviewers: lebedev.ri, majnemer, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52494

llvm-svn: 343486
2018-10-01 14:59:25 +00:00
Simon Atanasyan 1ea206be73 [mips] Generate tests expectations using update_llc_test_checks. NFC
Generate tests expectations using update_llc_test_checks and reduce
number of "check prefixes" used in the tests.

llvm-svn: 343485
2018-10-01 14:43:07 +00:00
Simon Pilgrim 6ddc4e821c [X86][Btver2] Fix BTmr schedule uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343484
2018-10-01 14:42:16 +00:00
Sanjay Patel 31b07198f1 [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.
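
The shift amount and its dependence on endianness can be illustrated outside of LLVM. The Python sketch below reinterprets a 64-bit scalar as four 16-bit lanes and compares extracting a lane with shifting and truncating; the function names and the fixed 64-bit/16-bit widths are assumptions chosen for the example, and the real transform works on IR vector types.

```python
import struct

LANES = 4       # a 64-bit scalar viewed as 4 x 16-bit elements
LANE_BITS = 16


def extract_via_bitcast(x, index, little_endian=True):
    """Reinterpret the scalar's bytes as 16-bit lanes and pick one."""
    order = "<" if little_endian else ">"
    raw = struct.pack(order + "Q", x)
    return struct.unpack(order + "4H", raw)[index]


def extract_via_trunc(x, index, little_endian=True):
    """Compute the same lane with a right shift followed by a truncate."""
    # Little-endian: lane 0 holds the LSBs. Big-endian: lane 0 holds the MSBs.
    lane_from_lsb = index if little_endian else LANES - 1 - index
    return (x >> (LANE_BITS * lane_from_lsb)) & 0xFFFF
```

When the selected lane holds the least significant bits, the shift amount is zero and only the truncate remains.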

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343482
2018-10-01 14:40:00 +00:00
Sanjay Patel 22ae8dabb5 [InstCombine] add more insert-extract tests for D52439; NFC
The first attempt at this transform:
rL343407
...was reverted:
rL343458
...because it did not handle the case where we bitcast to FP. 
The patch was already limited to avoid the case where we
bitcast from FP, but we might want to transform that too.

llvm-svn: 343480
2018-10-01 14:29:09 +00:00
Simon Pilgrim a982236e59 [X86][Btver2] Fix masked load schedule
JFPU01 resource usage should match JFPX

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343468
2018-10-01 13:12:05 +00:00
Hans Wennborg a60aa91374 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343458
2018-10-01 12:07:45 +00:00
Puyan Lotfi 06e65cae4a [NFC] Adding "REQUIRES: zlib" to a llvm-objcopy test for bots without zlib.
M    test/tools/llvm-objcopy/compress-and-decompress-debug-sections-error.test

llvm-svn: 343454
2018-10-01 10:50:23 +00:00
Andrea Di Biagio 24ea163007 [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.
This patch adds another variant class to identify zero-idiom VPERM2F128rr
instructions.

On Jaguar, a VPERM with bits 3 and 7 of the mask set is a zero-idiom.
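
To see why that mask pattern always produces zero, here is a small Python model of the VPERM2F128 lane selection. It is a sketch for illustration only: the function names are hypothetical, the 256-bit values are modeled as (low, high) tuples of 128-bit lanes, and it is unrelated to how the scheduling model detects the idiom.

```python
def vperm2f128(src1, src2, imm8):
    """Model VPERM2F128 on (low_lane, high_lane) tuples."""
    lanes = [src1[0], src1[1], src2[0], src2[1]]
    # imm8 bit 3 zeroes the low result lane, bit 7 zeroes the high one.
    low = 0 if imm8 & 0x08 else lanes[imm8 & 0x3]
    high = 0 if imm8 & 0x80 else lanes[(imm8 >> 4) & 0x3]
    return (low, high)


def is_zero_idiom(imm8):
    """Both zeroing bits set: the result is zero regardless of the inputs."""
    return (imm8 & 0x88) == 0x88
```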

Differential Revision: https://reviews.llvm.org/D52663

llvm-svn: 343452
2018-10-01 10:35:13 +00:00
Puyan Lotfi af048648d3 [llvm-objcopy] Adding support for decompressing zlib compressed dwarf sections.
Summary: I had added support for compressing dwarf sections in a prior commit,
         this one adds support for decompressing. Usage is:

         llvm-objcopy --decompress-debug-sections input.o output.o

Reviewers: jakehehrlich, jhenderson, alexshap	

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D51841

llvm-svn: 343451
2018-10-01 10:29:41 +00:00
Clement Courbet a933fb237e [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.
Summary:
    While looking at PR35606, I found out that the scheduling info is incorrect.

    One can check that it's really a P5+P6 and not a 2*P56 with:
    echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=-
    (vandps executes on P5 only)

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D52541

llvm-svn: 343447
2018-10-01 08:37:48 +00:00
Carlos Alberto Enciso 81d8ef2196 [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.
When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid, as the recorded register has been removed. This causes the debugger to display the wrong variable value.

Differential Revision: https://reviews.llvm.org/D52614

llvm-svn: 343445
2018-10-01 08:14:44 +00:00
Clement Courbet ce4caff0de [CodeGen][NFC] Add tests for heterogeneous types in MergeConsecutiveStores
Reviewers: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52643

llvm-svn: 343444
2018-10-01 07:16:22 +00:00
Craig Topper 67d9dbdbdd [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers.
We can only copy between a k-register and a GR32/GR64 register.

This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure.

This probably isn't the best fix, and we should probably figure out how to handle this correctly.

Fixes PR38803.

llvm-svn: 343443
2018-10-01 07:08:41 +00:00
Simon Pilgrim f21083870d [X86] Fix scheduler class for BTmi instructions
This wasn't treated as a folded load instruction

llvm-svn: 343424
2018-09-30 20:19:16 +00:00
Simon Pilgrim b1108399bd [LLVM-MCA][X86] Add missing VCMPESTR/VCMPESTR tests
llvm-svn: 343421
2018-09-30 18:19:00 +00:00
Bjorn Pettersson c2fc53ac90 [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF
Summary:
The lowering of PHI nodes used to detect if all inputs originated
from IMPLICIT_DEF's. If so the PHI node was replaced by an
IMPLICIT_DEF. Now we also consider undef uses when checking the
inputs. So if all inputs are implicitly defined or undef we
lower the PHI to an IMPLICIT_DEF. This makes
PHIElimination::LowerPHINode more consistent as it checks
both implicit and undef properties at later stages.
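
The rule described above can be sketched as a small predicate. This is only an illustrative model, not the code in PHIElimination::LowerPHINode: the PhiInput class and its two fields are hypothetical stand-ins for "the input register is defined by an IMPLICIT_DEF" and "the use operand is marked undef".

```python
from dataclasses import dataclass


@dataclass
class PhiInput:
    # Hypothetical model of one PHI operand.
    defined_by_implicit_def: bool  # source register comes from an IMPLICIT_DEF
    is_undef: bool                 # the use operand carries the undef flag


def lowers_to_implicit_def(inputs):
    """Return True when every input is implicitly defined or undef."""
    return all(i.defined_by_implicit_def or i.is_undef for i in inputs)
```

Under this model a PHI mixing IMPLICIT_DEF inputs with undef uses is lowered to an IMPLICIT_DEF, while a single ordinary input is enough to prevent it.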

Reviewers: MatzeB, tstellar

Reviewed By: MatzeB

Subscribers: jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52558

llvm-svn: 343417
2018-09-30 17:26:58 +00:00
Bjorn Pettersson 4af7f57bdf [PHIElimination] Update the regression test for PR16508
Summary:
When PR16508 was solved (in rL185363) a regression test was
added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll.
I discovered that the test case no longer reproduced the
scenario from PR16508. This problem could have been amended
by adding an extra RUN line with "-O1" (or possibly "-O0"),
but instead I added a mir-reproducer
  test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir
to get a reproducer that is less sensitive to changes in
earlier passes (including O-level).

While at it I also corrected a code comment in
PHIElimination::EliminatePHINodes that has been incorrect
since the related bugfix from rL185363.

Reviewers: MatzeB, hfinkel

Reviewed By: MatzeB

Subscribers: nemanjai, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52553

llvm-svn: 343416
2018-09-30 17:23:21 +00:00
Simon Pilgrim 20623f2343 [LLVM-MCA][X86] Add some AVX512 tests
These are going to be necessary to check I don't mess up when I start cleaning up all the remaining vector integer overrides

llvm-svn: 343414
2018-09-30 17:01:59 +00:00
Simon Pilgrim 4f5693ac8d [X86][Btver2] Fix PCmpIStrI/PCmpIStrM schedules
Missing JFPU0 pipe and double JFPU1 pipe (to match JVALU1) resources

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343413
2018-09-30 16:38:38 +00:00
Zachary Turner 518cb2d560 [PDB] Add native support for dumping array types.
llvm-svn: 343412
2018-09-30 16:19:18 +00:00
Sanjay Patel 1e0f1f645a [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.
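
As a rough illustration of the shift-then-truncate idea, the sketch below models a scalar bitcast to a vector as a reinterpretation of its bytes and compares element extraction with a shift and mask. The function names are made up for this example, and the model assumes element sizes that are whole bytes.

```python
def extract_via_bytes(scalar, scalar_bits, elt_bits, index, byteorder):
    """Model bitcast-to-vector + extractelement through the byte layout."""
    raw = scalar.to_bytes(scalar_bits // 8, byteorder)
    elt_bytes = elt_bits // 8
    chunk = raw[index * elt_bytes:(index + 1) * elt_bytes]
    return int.from_bytes(chunk, byteorder)


def extract_via_shift_trunc(scalar, scalar_bits, elt_bits, index, byteorder):
    """Model the replacement: optional shift right, then truncate."""
    num_elts = scalar_bits // elt_bits
    if byteorder == "little":
        shift = index * elt_bits                    # element 0 holds the LSBs
    else:
        shift = (num_elts - 1 - index) * elt_bits   # last element holds the LSBs
    return (scalar >> shift) & ((1 << elt_bits) - 1)
```

When the computed shift is zero, the extract reduces to a plain truncate, which is the simplest case mentioned above.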

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343407
2018-09-30 14:34:01 +00:00
Sanjay Patel 26c119a9c2 [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.

llvm-svn: 343406
2018-09-30 13:50:42 +00:00
Roman Lebedev 0496477c5d [NFC][CodeGen][X86][AArch64] Add 64-bit constant bit field extract pattern tests
llvm-svn: 343404
2018-09-30 12:42:08 +00:00
Simon Pilgrim 84e280ae42 [X86] Regenerate MMX coalescing test
Exposes another extractelement(bitcast(scalartovector())) pattern

llvm-svn: 343403
2018-09-30 09:42:04 +00:00
Zachary Turner 9be3b6a18b [PDB] Fix this test for real.
I was able to test this fix on an actual Windows machine
so this should get the bot green again.

llvm-svn: 343400
2018-09-30 03:57:49 +00:00
Craig Topper 1709829fed [X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're compiling for a CPU with single uop BEXTR
Summary:
This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction.

The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL it's one port 0/6 uop and one port 1/5 uop, despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution-wise than the shift+and, which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop.

For now I've limited this transform to AMD CPUs, which have a single uop BEXTR. It also might make sense if we can fold a load, or if the and immediate is larger than 32 bits and can't be encoded as a sign-extended 32-bit value, or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at, it doesn't look like load folding or large immediates were occurring, so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case.
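
To make the (X >> C1) & C2 pattern concrete, here is a small model of what a BEXTR computes, assuming the usual control encoding (start bit in bits 7:0, length in bits 15:8) and assuming the mask C2 must be a run of low one bits. The helper names are invented for this sketch and do not correspond to functions in the X86 backend.

```python
def bextr_control(shift, mask):
    """Encode (x >> shift) & mask as a control word, or None if not a low-bit mask."""
    if mask == 0 or mask & (mask + 1) != 0:
        return None                      # mask is not of the form 2**n - 1
    length = mask.bit_length()
    return shift | (length << 8)


def bextr(src, control, width=32):
    """Extract `length` bits of src starting at bit `start`."""
    start = control & 0xFF
    length = (control >> 8) & 0xFF
    field = (src & ((1 << width) - 1)) >> start
    return field & ((1 << length) - 1)
```

The control word is the immediate that has to be materialized into a register for BMI BEXTR, which is the extra uop discussed above.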

Reviewers: RKSimon, spatel, lebedev.ri, andreadb

Reviewed By: RKSimon

Subscribers: llvm-commits, andreadb, RKSimon

Differential Revision: https://reviews.llvm.org/D52570

llvm-svn: 343399
2018-09-30 03:01:46 +00:00
Zachary Turner 6e6d545d24 Only dump the types we need in the test.
We added support for dumping pointers but pointers to arrays
won't correctly dump until we add support for dumping arrays.
Instead of trying to dump everything, which this test isn't
even interested in, just dump enums and typedefs.

llvm-svn: 343398
2018-09-30 00:51:54 +00:00
Zachary Turner a1e79e326a Fix some tests on Windows.
I don't actually have a Windows machine at the present moment,
so hopefully this fixes it.

llvm-svn: 343397
2018-09-30 00:22:21 +00:00
Lang Hames 98440293fb [ORC] Add partitioning support to CompileOnDemandLayer2.
CompileOnDemandLayer2 now supports user-supplied partition functions (the
original CompileOnDemandLayer already supported these).

Partition functions are called with the list of requested global values
(i.e. global values that currently have queries waiting on them) and have an
opportunity to select extra global values to materialize at the same time.
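
A partition function in this sense takes the requested globals and returns a possibly larger set to materialize together. The sketch below is a language-neutral illustration only: the call_graph dictionary and the function name are hypothetical, and the real CompileOnDemandLayer2 interface is a C++ API that may look quite different.

```python
def partition_with_callees(requested, call_graph):
    """Return the requested globals plus everything they call directly.

    `call_graph` is a hypothetical mapping from a global's name to the
    names of the globals it references.
    """
    selected = set(requested)
    for name in requested:
        selected.update(call_graph.get(name, ()))
    return selected
```

A partition that returns only `requested` would compile one function at a time, while one that returns every global would behave like whole-module compilation.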

Also adds testing infrastructure for the new feature to lli.

llvm-svn: 343396
2018-09-29 23:49:57 +00:00
Zachary Turner 6ca6a03c51 [PDB] Better native API support for pointers.
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info.  This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.

llvm-svn: 343393
2018-09-29 23:28:19 +00:00
David Bolvansky 09fd8172df [DAGCombiner][NFC] Tests for X div/rem Y single bit fold
llvm-svn: 343392
2018-09-29 21:00:37 +00:00
Simon Pilgrim c4e7c347cd [X86][AVX2] Cleanup shuffle combining tests - add common prefixes
llvm-svn: 343391
2018-09-29 20:34:16 +00:00
Simon Pilgrim a2efe82b81 [X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target shuffles before simplifying inputs
By removing demanded target shuffles that simplify to zero/undef/identity before simplifying their inputs, we improve the chances of further simplification, as only the immediate parent user of the combined node is added back to the work list. This still doesn't help us if it's passed through other ops though (bitcasts, ...).

llvm-svn: 343390
2018-09-29 18:15:26 +00:00
Craig Topper 845789e823 [X86] Add fast-isel test cases for unaligned load/store intrinsics recently added to clang
This adds tests for:
_mm_loadu_si64
_mm_loadu_si32
_mm_loadu_si16
_mm_storeu_si64
_mm_storeu_si32
_mm_storeu_si16

llvm-svn: 343389
2018-09-29 18:03:52 +00:00
Simon Pilgrim d633e290c8 [X86] getTargetConstantBitsFromNode - add support for rearranging constant bits via shuffles
Exposed an issue that recursive calls to getTargetConstantBitsFromNode don't handle changes to EltSizeInBits yet.

llvm-svn: 343384
2018-09-29 17:01:55 +00:00
Sanjay Patel 20c64510cb [InstCombine] add test for vector widening of insertelements; NFC
The test shows a potential overreach with the fix from D52548.

llvm-svn: 343378
2018-09-29 15:01:45 +00:00
Simon Pilgrim 43e4e648ef [X86] Regenerate fma comments.
llvm-svn: 343376
2018-09-29 14:31:00 +00:00
Simon Pilgrim 22d51014af [X86] getTargetConstantBitsFromNode - add support for peeking through ISD::EXTRACT_SUBVECTOR
llvm-svn: 343375
2018-09-29 14:17:32 +00:00
Simon Pilgrim aa77033a6b [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets
The shift amount might have peeked through an extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats.

llvm-svn: 343373
2018-09-29 13:25:22 +00:00
Eli Friedman 5ab09a684f [ARM] Fix correctness checks in promoteToConstantPool.
Correctly check for relocations in the constant to promote. And don't
allow promoting a constant multiple times.

This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ;
it's not a complete fix because we also need to prevent
ARMConstantIslands from cloning the constant.

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51472

llvm-svn: 343361
2018-09-28 20:27:31 +00:00
Eli Friedman bb993be56b [ARM] Use preferred alignment for constants in promoteToConstantPool.
This mostly affects IR generated by non-clang frontends because clang
generally sets the alignment of globals explicitly.

Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 .

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51469

llvm-svn: 343359
2018-09-28 20:21:51 +00:00
Craig Topper 98aa643420 [X86] Add test cases for failures to use narrow test with immediate instructions when a truncate is between the CMP and the AND and the sign flag is used.
The code in X86ISelDAGToDAG only looks through truncates if the sign flag isn't used, but that is overly restrictive. A future patch will improve this.

llvm-svn: 343355
2018-09-28 19:06:28 +00:00
Evandro Menezes fc1852ff1c [AArch64] Split zero cycle feature more granularly
Split the `zcz` feature into specific ones for GP and FP registers, `zcz-gp`
and `zcz-fp`, respectively, while retaining the original feature option to
mean both.

Differential revision: https://reviews.llvm.org/D52621

llvm-svn: 343354
2018-09-28 19:05:09 +00:00
Andrea Di Biagio 6e218d0a57 [llvm-mca] Add a test for zero-idiom VPERM2F128rr. NFC
We don't correctly model the latency and resource usage information for
zero-idiom VPERM2F128rr on Jaguar.

This is demonstrated by the incorrect numbers in the resource pressure view, and
the timeline view.
A follow up patch will fix this problem.

llvm-svn: 343346
2018-09-28 17:47:09 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
Greg Bedwell becbbe0383 [utils] Stricter checking from update_mca_test_checks.py
If any prefixes have been specified on the RUN lines that do not end up
ever actually getting printed, raise an Error. This is either an
indication that the run lines just need cleaning up, or that something
is more fundamentally wrong with the test.

Also raise an Error if there are any blocks which cannot be checked
because they are not uniquely covered by a prefix.
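
The first check can be pictured as a set difference between the prefixes declared on the RUN lines and the prefixes that were actually emitted. This is a sketch of the idea only, with invented names and an invented error message, not the code in update_mca_test_checks.py.

```python
def check_prefix_usage(declared_prefixes, used_prefixes):
    """Raise if any prefix named on a RUN line never produced a check line."""
    unused = sorted(set(declared_prefixes) - set(used_prefixes))
    if unused:
        raise ValueError("unused prefixes: " + ", ".join(unused))
```

Calling it with declared prefixes ALL and BTVER2 when only ALL was printed would raise, which would point at either a stale RUN line or a deeper problem in the test.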

Fixed up a couple of tests where the extra checking flagged up issues.

Differential Revision: https://reviews.llvm.org/D48276

llvm-svn: 343332
2018-09-28 15:39:09 +00:00
Greg Bedwell 2f528f8c1e [utils] Allow better identification of matching blocks in update_mca_test_checks.py
Insert empty blocks to cause the positions of matching blocks to match
across lists where possible so that later stages of the algorithm can
actually identify them as being identical.

Regenerated all tests with this change.

Differential Revision: https://reviews.llvm.org/D52560

llvm-svn: 343331
2018-09-28 15:38:56 +00:00
Robert Widmann 9cba4eced8 [LLVM-C] Add more debug information accessors to GlobalObject and Instruction
Summary: Adds missing debug information accessors to GlobalObject.  This puts the finishing touches on cloning debug info in the echo tests.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D51522

llvm-svn: 343330
2018-09-28 15:35:18 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Sanjay Patel 699ee504f6 [InstCombine] adjust shuffle undef propagation tests; NFC
These are the updated baseline tests for D52548 -
I'm putting the tests next to the tests where the transform 
functions as expected, so we can see the intended/unintended
consequences.

Patch by: @sheredom (Neil Henning)

llvm-svn: 343328
2018-09-28 15:20:06 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Whether to assert on any extend of vectors should be up to the target's
legalizer or target-specific code, not CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Simon Pilgrim 428c1196d8 [X86][Btver2] PSUBS/PSUBUS instructions are zero-idioms
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour as PSUB

llvm-svn: 343321
2018-09-28 14:20:42 +00:00
Simon Pilgrim 3216fd3602 [X86][Btver2] Add zero-idiom tests for PSUBS/PSUBUS instructions
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour as PSUB

llvm-svn: 343319
2018-09-28 13:53:11 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Petar Jovanovic ff1bc621a0 [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger than 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409

llvm-svn: 343315
2018-09-28 13:28:47 +00:00
Simon Pilgrim 66da1ed29d [X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe
We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe)

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343314
2018-09-28 13:19:22 +00:00
Jonas Devlieghere f1c414cd0d Split invocations in CodeGen/X86/cpus.ll among multiple tests. (NFC)
On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan
and UBSan enabled. With the same configuration on my machine, the test
passes but takes more than 3 minutes to do so. I could increase the
timeout, but I believe it makes more sense to split up the test because
it allows for more parallelism.

Differential revision: https://reviews.llvm.org/D52603

llvm-svn: 343313
2018-09-28 12:08:51 +00:00
Simon Pilgrim 17e5981ebf [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343311
2018-09-28 10:26:48 +00:00
David Spickett ea605913be [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551

llvm-svn: 343302
2018-09-28 08:55:19 +00:00
Oliver Stannard 5f34e9e265 [ARM][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52484

llvm-svn: 343300
2018-09-28 08:27:56 +00:00
Simon Pilgrim 280af1c7f0 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

llvm-svn: 343299
2018-09-28 08:21:39 +00:00
Hiroshi Inoue 69bfa40200 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.
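
The 200.0% figure follows directly from the two raw values quoted above. The snippet below just redoes that arithmetic; the constant and function names are chosen for this illustration.

```python
# Raw values quoted in the commit message for an unknown BranchProbability.
UNKNOWN_NUMERATOR = 0xFFFFFFFF
DENOMINATOR = 0x80000000


def as_percent(numerator, denominator=DENOMINATOR):
    """Format numerator/denominator as a percentage with one decimal place."""
    return "%.1f%%" % (100.0 * numerator / denominator)


print(as_percent(UNKNOWN_NUMERATOR))  # prints 200.0%
print(as_percent(0x40000000))         # prints 50.0%
```

The unknown marker is just under twice the denominator, so it rounds to 200.0% when treated as an ordinary numerator.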

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Craig Topper bb50c38635 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.
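
A short sketch of why a plain minimum is not enough for such types, assuming MinAlign returns the largest power of two that divides both of its arguments. The Python function below is a stand-in written for this example.

```python
def min_align(a, b):
    """Largest power of two dividing both a and b (both must be positive)."""
    combined = a | b
    return combined & -combined   # isolate the lowest set bit


# A 3-byte element (e.g. i24) inside a 16-byte aligned vector:
print(min(16, 3))        # prints 3, which is not a valid alignment
print(min_align(16, 3))  # prints 1
print(min_align(16, 4))  # prints 4
```

For power-of-two element sizes the two computations agree, so only the odd-sized byte types change behaviour.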

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Craig Topper fdf4c76ca0 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper 92b992164d [ScalarizeMaskedMemIntrin] Add test cases for masked store expansion. Increase alignment of one of the masked load test cases.
The masked store alignment is being miscalculated, but masked load is correct.

llvm-svn: 343283
2018-09-28 01:06:09 +00:00
Craig Topper 1b29615330 [X86] Add the test case from PR38986.
The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch.

llvm-svn: 343281
2018-09-27 23:25:10 +00:00
Craig Topper 6911bfe263 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper 45ad631b4c [ScalarizeMaskedMemIntrin] Add some IR only test cases for masked gather expansion.
llvm-svn: 343272
2018-09-27 21:28:55 +00:00
Craig Topper 7d234d6628 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.
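
In scalar terms the new expansion behaves roughly like the loop below: the result starts as a copy of the passthru value and each enabled lane is overwritten with its loaded element. This models the semantics only; the names are illustrative and the real pass emits conditional blocks in IR.

```python
def expand_masked_load(memory, mask, passthru):
    """Return the loaded vector, lane by lane, starting from passthru."""
    result = list(passthru)
    for lane, enabled in enumerate(mask):
        if enabled:
            result[lane] = memory[lane]   # conditional load for this lane
    return result
```

Because disabled lanes already hold the passthru elements, no final select is needed to merge them back in.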

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper dfc0f289fa [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper a6478ac5d4 [ScalarizeMaskedMemIntrin] Add dedicated IR only tests for masked load expansion so I can begin making modifications.
llvm-svn: 343269
2018-09-27 21:28:43 +00:00
Stanislav Mekhanoshin b080adfc0c [AMDGPU] Fold copy (copy vgpr)
This allows reducing the number of VGPRs used in some cases.

Differential Revision: https://reviews.llvm.org/D52577

llvm-svn: 343249
2018-09-27 18:55:20 +00:00
Craig Topper 0423681d4a [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Simon Pilgrim 86c7b07ecd [X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
2018-09-27 17:13:57 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Simon Pilgrim dd744f158a [X86][Btver2] BTC/BTR/BTS instructions take 2uops not 1
llvm-svn: 343234
2018-09-27 16:39:52 +00:00
Oliver Stannard a4f68bf4ad [AArch64][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52483

llvm-svn: 343229
2018-09-27 16:09:05 +00:00
Sanjay Patel c3f50ff92e [InstCombine] Without infinities, fold (C / X) < 0.0 --> (X < 0)
When C is not zero and infinities are not allowed, (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.

E.g.:
  foo(double X) {
    if ((-2.0 / X) <= 0) ...
  }
 =>
  foo(double X) {
    if (X >= 0) ...
  }
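
The same fold can be checked numerically. The sketch below is a model written for this note, not InstCombine code: it assumes C is nonzero, X is nonzero and finite, and the quotient neither overflows nor underflows, which is what ruling out infinities is meant to capture here.

```python
import operator

# Each predicate mapped to the one obtained by swapping its operands.
SWAPPED = {operator.lt: operator.gt, operator.le: operator.ge,
           operator.gt: operator.lt, operator.ge: operator.le}


def original(pred, c, x):
    """Evaluate (c / x) pred 0.0 directly."""
    return pred(c / x, 0.0)


def folded(pred, c, x):
    """Evaluate the same comparison as a sign test on x alone."""
    if c < 0:
        pred = SWAPPED[pred]   # a negative constant flips the predicate
    return pred(x, 0.0)
```

For instance, with C = -2.0 and the predicate <=, the folded form tests X >= 0, matching the example above.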

Patch by: @marels (Martin Elshuber)

Differential Revision: https://reviews.llvm.org/D51942

llvm-svn: 343228
2018-09-27 15:59:24 +00:00
Simon Pilgrim c2a88ea64e [X86][Btver2] BLSI/BLSMSK/BLSR instructions take 2uops not 1 (same as TZCNT)
llvm-svn: 343227
2018-09-27 14:57:57 +00:00
Teresa Johnson f24136f17a [WPD] Fix incorrect devirtualization after indirect call promotion
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.

Before this patch the code was incorrectly devirtualizing the fallback
indirect call.

See the new test and the example described there for more details.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52514

llvm-svn: 343226
2018-09-27 14:55:32 +00:00
Oliver Stannard a9a5eee169 [AArch64][v8.5A] Add Branch Target Identification instructions
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52485

llvm-svn: 343225
2018-09-27 14:54:33 +00:00
Sanjay Patel 95a816b34a [InstCombine] add tests for FP sign-bit cmp optimization with fdiv; NFC
These are baseline tests for D51942.
Patch by: @marels (Martin Elshuber)

llvm-svn: 343222
2018-09-27 14:24:29 +00:00
Oliver Stannard 8459d34e82 [AArch64][v8.5A] Add speculation restriction system registers
This adds some new system registers which can be used to restrict
certain types of speculative execution.

Patch by Pablo Barrio and David Spickett!

Differential revision: https://reviews.llvm.org/D52482

llvm-svn: 343218
2018-09-27 14:05:46 +00:00
Oliver Stannard dc837e3f1f [AArch64][v8.5A] Add Armv8.5-A random number instructions
This adds two new system registers, used to generate random numbers.

This is an optional extension to v8.5-A, and will be controlled by the
"+rng" modifier of the -march= and -mcpu= options.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52481

llvm-svn: 343217
2018-09-27 14:01:40 +00:00
Oliver Stannard 6930b12d53 [AArch64][v8.5A] Add Armv8.5-A "DC CVADP" instruction
This adds a new variant of the DC system instruction for persistent
memory.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52480

llvm-svn: 343216
2018-09-27 13:53:35 +00:00
Oliver Stannard 224428c06a [AArch64][v8.5A] Add prediction invalidation instructions to AArch64
This adds new system instructions which act as barriers to speculative
execution based on earlier execution within a particular execution
context.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52479

llvm-svn: 343214
2018-09-27 13:47:40 +00:00
Oliver Stannard 382c935c42 [ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction sets
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52477

llvm-svn: 343213
2018-09-27 13:41:14 +00:00
Oliver Stannard e481f1d95a [AArch64][v8.5A] Add speculation barrier to AArch64 instruction set
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52476

llvm-svn: 343211
2018-09-27 13:39:06 +00:00
Daniel Cederman 0c05bdea2b [Sparc] Remove the support for builtin setjmp/longjmp
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
need to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.

Reviewers: jyknight, joerg, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51487

llvm-svn: 343210
2018-09-27 13:32:54 +00:00
Oliver Stannard ddb7d46aa5 [AArch64][v8.5A] Add FRINT[32,64][Z,X] instructions
These are some new variants of the "Floating-point Round to Integral"
family of instructions, which round to the nearest floating-point value
which fits in a 32- or 64-bit integer.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52475

llvm-svn: 343209
2018-09-27 13:32:06 +00:00
Clement Courbet 30183093ab [llvm-exegesis] Fix PR39096.
Summary: The key is now the resource name, not the resource id.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52607

llvm-svn: 343208
2018-09-27 13:26:37 +00:00
Daniel Cederman b35d3a2733 [Sparc] Add unimp alias
Summary: Use 0 as the default immediate for the UNIMP instruction.
This matches the behavior in gas.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51526

llvm-svn: 343203
2018-09-27 12:34:53 +00:00
Daniel Cederman c1968ba5d3 [Sparc] Add support for the partial write PSR instruction
Summary:
Partial write %PSR (WRPSR) is a SPARC V8e option that allows WRPSR
instructions to only affect the %PSR.ET field. It is supported by
the GR740 and GR716.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48644

llvm-svn: 343202
2018-09-27 12:34:48 +00:00
Simon Pilgrim 98f503a326 [X86][Btver2] TZCNT instructions take 2uops not 1
llvm-svn: 343200
2018-09-27 12:28:47 +00:00
Luke Cheeseman f6844b307a Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
2018-09-27 10:39:20 +00:00
Nicola Zaghen 436c012702 [InstCombine] Add new tests in preparation for a combine of icmp (mul nsw/nuw X, C2), C
Proof for the future optimisations are here:
- eq/neq: https://rise4fun.com/Alive/9PBA
- sgt/ugt: https://rise4fun.com/Alive/58yr
- slt/ult: https://rise4fun.com/Alive/VCQ

Differential Revision: https://reviews.llvm.org/D51625

llvm-svn: 343190
2018-09-27 10:08:38 +00:00
Oliver Stannard 31af178f4a [AArch64][v8.5A] Add PSTATE manipulation instructions XAFlag and AXFlag
These new instructions manipulate the NZCV bits, to convert between the
regular Arm floating-point compare format and an alternative format.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52473

llvm-svn: 343187
2018-09-27 09:11:27 +00:00
Sanjay Patel 150afce75a [InstCombine] add tests that show undef propagation failures from D52548; NFC
Differential Revision: https://reviews.llvm.org/D52556

llvm-svn: 343140
2018-09-26 20:30:47 +00:00
Florian Hahn 6feb637124 [LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.

As an alternative to manually moving the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52154

llvm-svn: 343132
2018-09-26 19:34:25 +00:00
Sanjay Patel d938d0d4f5 [InstCombine] add tests for vector insert/extract; NFC
Preliminary step for D52439.

llvm-svn: 343128
2018-09-26 17:57:38 +00:00
Craig Topper e4c96f4a48 [X86] Update tzcnt fast-isel tests to match clang r343126.
We now generate cttz with the zero_undef flag set to false. This allows -O0 to avoid the zero check.

llvm-svn: 343127
2018-09-26 17:19:28 +00:00
Lang Hames f0a3fd885d Reapply r343058 with a fix for -DLLVM_ENABLE_THREADS=OFF.
Modifies lit to add a 'thread_support' feature that can be used in lit test
REQUIRES clauses. The thread_support flag is set if -DLLVM_ENABLE_THREADS=ON
and unset if -DLLVM_ENABLE_THREADS=OFF. The lit flag is used to disable the
multiple-compile-threads-basic.ll testcase when threading is disabled.

llvm-svn: 343122
2018-09-26 16:26:59 +00:00
Luke Cheeseman 77aaa22081 Revert r343112 as CallFrameString API change has broken lldb builds
llvm-svn: 343114
2018-09-26 14:48:03 +00:00
Luke Cheeseman 03ad8812f5 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll

llvm-svn: 343112
2018-09-26 14:30:29 +00:00
Clement Courbet a5720c4e62 [llvm-exegesis][NFC] Do not pollute buildbots with messages when
the exegesis lit tests cannot run.

llvm-svn: 343110
2018-09-26 13:58:26 +00:00
Clement Courbet 28d4f85824 [llvm-exegesis] Get rid of debug_string.
Summary:
This is a backwards-compatible change (existing files will work as
expected).

See PR39082.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52546

llvm-svn: 343108
2018-09-26 13:35:10 +00:00
Francis Visoiu Mistrih 6acaa18afc [CodeGen] Always print register ties in MI::dump()
It was the case when calling MO::dump(), but MI::dump() was still
depending on hasComplexRegisterTies().

The MIR output is not affected.

llvm-svn: 343107
2018-09-26 13:33:09 +00:00
Hans Wennborg 00b88bbcaf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state that is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343103
2018-09-26 12:57:45 +00:00
Hiroshi Inoue 20982f0995 [PowerPC] optimize conditional branch on CRSET/CRUNSET
This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET.
Other optimizers, such as block placement, may generate such code and hence
I do this at the very end of the optimization in pre-emit peephole pass.

A conditional branch based on a constant is eliminated or converted into an unconditional branch.
Also, CRSET/CRUNSET is eliminated if the condition code register is not used
by any instruction other than the branch to be optimized.
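
The four cases this peephole distinguishes can be tabulated in a small Python sketch. It assumes that BC branches when the condition bit is set and BCn branches when it is clear; the function name `fold_branch` and the string results are illustrative only and do not correspond to names in the PowerPC backend.

```python
def fold_branch(branch_opcode, cr_def_opcode):
    """Decide what happens to a conditional branch whose CR bit is a known constant.

    branch_opcode: "BC" (taken when the bit is 1) or "BCn" (taken when the bit is 0).
    cr_def_opcode: "CRSET" (bit = 1) or "CRUNSET" (bit = 0).
    Returns "unconditional" if the branch is always taken, "eliminated" if never.
    """
    bit = {"CRSET": 1, "CRUNSET": 0}[cr_def_opcode]
    taken_when = {"BC": 1, "BCn": 0}[branch_opcode]
    return "unconditional" if bit == taken_when else "eliminated"
```

Whether the CRSET/CRUNSET itself can also be dropped depends on the other uses of the condition register, which this sketch does not model.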

Differential Revision: https://reviews.llvm.org/D52345

llvm-svn: 343100
2018-09-26 12:32:45 +00:00
Hans Wennborg 20b5abe23b Revert r343058 "[ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT."
This doesn't work well in builds configured with LLVM_ENABLE_THREADS=OFF,
causing the following assert when running
ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll:

  lib/ExecutionEngine/Orc/Core.cpp:1748: Expected<llvm::JITEvaluatedSymbol>
  llvm::orc::lookup(const llvm::orc::JITDylibList &, llvm::orc::SymbolStringPtr):
  Assertion `ResultMap->size() == 1 && "Unexpected number of results"' failed.

> LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
> arguments. If this is non-zero then a thread-pool will be created with the
> given number of threads, and compile tasks will be dispatched to the thread
> pool.
>
> To enable testing of this feature, two new flags are added to lli:
>
> (1) -compile-threads=N (N = 0 by default) controls the number of compile threads
> to use.
>
> (2) -thread-entry can be used to execute code on additional threads. For each
> -thread-entry argument supplied (multiple are allowed) a new thread will be
> created and the given symbol called. These additional thread entry points are
> called after static constructors are run, but before main.

llvm-svn: 343099
2018-09-26 12:15:23 +00:00
Simon Pilgrim 26223bccde [X86][SSE] Refresh PR34947 test code to handle D52504
The previously reduced version used urem <9 x i32> zeroinitializer, %tmp which D52504 will simplify.

llvm-svn: 343097
2018-09-26 11:53:51 +00:00
Simon Pilgrim 5beaac433d [X86][SSE] Use ISD::MULHS for constant vXi16 ISD::SRA lowering (PR38151)
Similar to the existing ISD::SRL constant vector shifts from D49562, this patch adds ISD::SRA support with ISD::MULHS.

As we're dealing with signed values, we have to handle the shift-by-zero and shift-by-one special cases, so XOP+AVX2/AVX512 splitting/extension is still a better solution. Really we should still use ISD::MULHS if one of the special cases is used, but for now I've just left a TODO and filtered by isKnownNeverZero.
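
The arithmetic identity behind this lowering can be checked with a short Python sketch. It models 16-bit lanes with plain integers; the helper names (`to_signed16`, `mulhs16`, `sra16`, `sra_via_mulhs`) are made up for illustration and are not LLVM APIs.

```python
def to_signed16(v):
    """Interpret the low 16 bits of v as a signed 16-bit integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def mulhs16(a, b):
    """High 16 bits of the signed 16x16 -> 32-bit product."""
    return to_signed16((a * b) >> 16)

def sra16(x, k):
    """Arithmetic shift right of a signed 16-bit value by k bits."""
    return x >> k

def sra_via_mulhs(x, k):
    """Arithmetic shift by k expressed as a multiply-high by 2**(16 - k)."""
    if not 2 <= k <= 15:
        # 2**16 (k = 0) and 2**15 (k = 1) are not positive i16 constants.
        raise ValueError("shift amount needs separate handling")
    return mulhs16(x, 1 << (16 - k))
```

In this model the multiplier for a shift by one, 2**15, wraps to -32768 as a signed 16-bit constant, and a shift by zero would need 2**16, which is one way to see why those two amounts are the special cases.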

Differential Revision: https://reviews.llvm.org/D52171

llvm-svn: 343093
2018-09-26 10:57:05 +00:00
Sam Parker 75aca94093 [ARM] Fix for PR39060
When calculating whether a value can safely overflow for use by an
icmp, we weren't checking that the value couldn't wrap around. To do
this we need the icmp to be using a constant, as well as the incoming
add or sub.

bugzilla report: https://bugs.llvm.org/show_bug.cgi?id=39060

Differential Revision: https://reviews.llvm.org/D52463

llvm-svn: 343092
2018-09-26 10:56:00 +00:00
David Green 353cb3d4e5 [CodeGen] Enable tail calls for functions with NonNull attributes.
Adding NonNull as attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no effect on the call sequence.

Differential Revision: https://reviews.llvm.org/D52238

llvm-svn: 343091
2018-09-26 10:46:18 +00:00
Yury Gribov 67572004df Fixes removal of dead elements from PressureDiff (PR37252).
Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51495

llvm-svn: 343090
2018-09-26 10:42:41 +00:00
Luke Cheeseman f755e687fc [AArch64] - Return address signing dwarf support
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
  state that is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
  i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
  (0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
  https://patchwork.ozlabs.org/patch/800271/

Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343089
2018-09-26 10:14:15 +00:00
Hans Wennborg 4b2e7daa7e Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP"
This broke Chromium's Android build (https://crbug.com/889390) and the
polly-aosp buildbot
(http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable).

> Originally committed in rL342210 but was reverted in rL342260 because
> it was causing issues in vectorized code, because I had forgotten to
> ensure that we're operating on scalar values.
>
> Original commit message:
>
> On failing to find sequences that can be converted into dual macs,
> try to find sequential 16-bit loads that are used by muls which we
> can then use smultb, smulbt, smultt with a wide load.
>
> Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 343082
2018-09-26 08:41:50 +00:00
Lang Hames d8048675f4 [ORC] Update CompileOnDemandLayer2 to use the new lazyReexports mechanism
for lazy compilation, rather than a callback manager.

The new mechanism does not block compile threads, and does not require
function bodies to be renamed.

Future modifications should allow laziness on a per-module basis to work
without any modification of the input module.

llvm-svn: 343065
2018-09-26 05:08:29 +00:00
Hsiangkai Wang 55321d82bd [DebugInfo] Do not generate address info for removed debug labels.
In some scenarios, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In that case, these debug labels will have
address zero as their address. Zero is not a legal address for a debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Fix the build failure on the clang-stage1-cmake-RA-incremental buildbot on macOS.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 343062
2018-09-26 04:19:23 +00:00
Lang Hames 225a32af72 [ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT.
LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
argument. If this is non-zero then a thread-pool will be created with the
given number of threads, and compile tasks will be dispatched to the thread
pool.

To enable testing of this feature, two new flags are added to lli:

(1) -compile-threads=N (N = 0 by default) controls the number of compile threads
to use.

(2) -thread-entry can be used to execute code on additional threads. For each
-thread-entry argument supplied (multiple are allowed) a new thread will be
created and the given symbol called. These additional thread entry points are
called after static constructors are run, but before main.

llvm-svn: 343058
2018-09-26 02:39:42 +00:00
Vyacheslav Zakharin e06831a3b2 Remove LoopID metadata from the branch instruction
that follows the peeled iterations.

Differential Revision: https://reviews.llvm.org/D52176

llvm-svn: 343054
2018-09-26 01:03:21 +00:00
Zhaoshi Zheng 95710337b4 Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant""
This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346.

Reland r342994: disabled the optimization and explicitly enable it in test.

-mllvm -consthoist-min-num-to-rebase<unsigned>=0

[ConstHoist] Do not rebase single (or few) dependent constant

If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 343053
2018-09-26 00:59:09 +00:00
Thomas Lively c949857a7f [WebAssembly] SIMD conversions
Summary:
Lowers (s|u)itofp and fpto(s|u)i instructions for vectors. The fp to
int conversions produce poison values if their arguments are out of
the convertible range, so a future CL will have to add an LLVM
intrinsic to make the saturating behavior of this conversion usable.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52372

llvm-svn: 343052
2018-09-26 00:34:36 +00:00
Stanislav Mekhanoshin 8dfcd83371 [AMDGPU] Fix ds combine with subregs
Differential Revision: https://reviews.llvm.org/D52522

llvm-svn: 343047
2018-09-25 23:33:18 +00:00
Craig Topper 12c18840fa [X86] Allow movmskpd/ps ISD nodes to be created and selected with integer input types.
This removes an int->fp bitcast between the surrounding code and the movmsk. I had already added a hack to combineMOVMSK to try to look through this bitcast to improve the SimplifyDemandedBits there.

But I found an additional issue where the bitcast was preventing combineMOVMSK from being called again after earlier nodes in the DAG are optimized. The bitcast gets revisited, but not the user of the bitcast. By using integer types throughout, the bitcast doesn't get in the way.

llvm-svn: 343046
2018-09-25 23:28:27 +00:00
Craig Topper d8c68840c8 [X86] Add some more movmsk test cases. NFC
These IR patterns represent the exact behavior of a movmsk instruction using (zext (bitcast (icmp slt X, 0))).

For the v4i32/v8i32/v2i64/v4i64 we currently emit a PCMPGT for the icmp slt which is unnecessary since we only care about the sign bit of the result. This is because of the int->fp bitcast we put on the input to the movmsk nodes for these cases. I'll be fixing this in a future patch.
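
As a rough illustration of why only the sign bit matters here, the following Python sketch computes a movmsk-style mask two ways. Lanes are passed as unsigned bit patterns, lane 0 is assumed to map to bit 0 of the result, and both function names are hypothetical.

```python
def movmsk(lanes, lane_bits=32):
    """Collect the top (sign) bit of each lane into bit i of an integer mask."""
    mask = 0
    for i, lane in enumerate(lanes):
        sign = (lane >> (lane_bits - 1)) & 1
        mask |= sign << i
    return mask

def movmsk_via_icmp(lanes, lane_bits=32):
    """Same value built as zext(bitcast(icmp slt X, 0)): one boolean per lane."""
    def as_signed(v):
        # Reinterpret an unsigned bit pattern as a two's complement value.
        return v - (1 << lane_bits) if v >> (lane_bits - 1) else v

    is_negative = [as_signed(v) < 0 for v in lanes]
    return sum(int(b) << i for i, b in enumerate(is_negative))
```

Both functions agree for every input, since a signed compare against zero only inspects the sign bit.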

llvm-svn: 343045
2018-09-25 23:28:24 +00:00
Sanjay Patel f23727d972 [InstCombine] add fneg variation of shuffle-binop fold; NFC
If the fsub in this pattern was replaced by an actual fneg
instruction, we would need to add a fold to recognize that
because fneg would not be a binop.

llvm-svn: 343041
2018-09-25 22:48:58 +00:00
Changpeng Fang 6f4922ccc9 AMDGPU: Add Selection patterns to support add of one bit.
Summary:
  We generate s_xor to lower add of i1s in general cases, and s_not to
lower add with a one-bit imm of -1 (true).

Reviewers:
  rampitec

Differential Revision:
  https://reviews.llvm.org/D52518

llvm-svn: 343030
2018-09-25 21:21:18 +00:00
Anna Thomas b1e3d45318 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Teresa Johnson 7fb39dfa7c [ThinLTO] Efficiency fix for writing type id records in per-module indexes
Summary:
In D49565/r337503, the type id record writing was fixed so that only
referenced type ids were emitted into each per-module index for ThinLTO
distributed builds. However, this still left an efficiency issue: each
per-module index checked all type ids for membership in the referenced
set, yielding O(M*N) performance (M indexes and N type ids).

Change the TypeIdMap in the summary to be indexed by GUID, to facilitate
correlating with type identifier GUIDs referenced in the function
summary TypeIdInfo structures. This allowed simplifying other
places where a map from type id GUID to type id map entry was previously
being used to aid this correlation.

Also fix AsmWriter code to handle the rare case of type id GUID
collision.
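
The difference between scanning every type id and looking entries up by GUID can be sketched in Python as follows. This is only a model of the idea: the tuple layout and the function names `build_type_id_map` and `records_for_module` are assumptions made for the example, not the actual summary index data structures.

```python
from collections import defaultdict

def build_type_id_map(type_ids):
    """Index (guid, name, summary) tuples by GUID.

    Each key holds a list so that two names hashing to the same GUID
    (a rare collision) are both kept.
    """
    by_guid = defaultdict(list)
    for guid, name, summary in type_ids:
        by_guid[guid].append((name, summary))
    return by_guid

def records_for_module(by_guid, referenced_guids):
    """Visit only the GUIDs a module references instead of all N type ids."""
    records = []
    for guid in sorted(referenced_guids):
        records.extend(by_guid.get(guid, []))
    return records
```

With such a map, the work per module index is proportional to the number of referenced GUIDs rather than to the total number of type ids.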

For a large internal application, this reduced the thin link time by
almost 15%.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51330

llvm-svn: 343021
2018-09-25 20:14:40 +00:00
Sanjay Patel 10c11b867a [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449)
This is the final (I hope!) problem pattern mentioned in PR37749:
https://bugs.llvm.org/show_bug.cgi?id=37749

We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. 
We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like 
extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches 
that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op.

The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, 
we have this vector-type-legalized sequence:

        t29: v8i32 = concat_vectors t27, t28
      t30: v4i64 = bitcast t29
        t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ...
      t31: v4i64 = bitcast t18
    t32: v4i64 = xor t30, t31
      t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ...
    t34: v4i64 = bitcast t9
  t35: v4i64 = and t32, t34
t36: v8i32 = bitcast t35
      t37: v4i32 = extract_subvector t36, Constant:i64<0>
      t38: v4i32 = extract_subvector t36, Constant:i64<4>

Differential Revision: https://reviews.llvm.org/D52318

llvm-svn: 343008
2018-09-25 19:09:34 +00:00
Yury Delendik 7c18d6083a [WebAssembly] Move/clone DBG_VALUE during WebAssemblyRegStackify pass
Summary:
The MoveForSingleUse or MoveAndTeeForMultiUse functions move wasm instructions,
however the DBG_VALUE instructions stay unchanged. This patch moves or clones them as well.

Reviewers: dschuff

Reviewed By: dschuff

Subscribers: mattd, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits, aardappel

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49034

llvm-svn: 343007
2018-09-25 18:59:34 +00:00
Jessica Paquette e02de05b32 Revert "[ConstHoist] Do not rebase single (or few) dependent constant"
This caused a couple test failures on a bot:

CodeGen/X86/constant-hoisting-bfi.ll
Transforms/ConstantHoisting/X86/ehpad.ll

Example:

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/

llvm-svn: 343005
2018-09-25 18:41:40 +00:00
Daniil Fukalov 349b5943b4 [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled
For the AMDGPU target, if an MBB contains an exec mask restore preamble, SplitEditor may get into a state where it cannot insert a spill instruction.

E.g. for a MIR

bb.100:
    %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec
and if the regalloc tries to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert a spill instruction before the S_OR_SAVEEXEC_B64 instruction.
But this is not possible, since it can generate incorrect code in terms of the exec mask.

The change makes regalloc ignore such physreg candidates.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D52052

llvm-svn: 343004
2018-09-25 18:37:38 +00:00
Daniel Sanders 06f4ff1952 [globalisel][tblgen] Table optimization should consider the C++ code in C++ predicates
This fixes PR39045

llvm-svn: 342997
2018-09-25 17:59:02 +00:00
Zhaoshi Zheng 2c1a09188f [ConstHoist] Do not rebase single (or few) dependent constant
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 342994
2018-09-25 17:45:37 +00:00
Justin Bogner ef2ae740c6 Revert "[DebugInfo] Do not generate address info for removed debug labels."
The added test is failing on macOS:

  http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/

This reverts r342943.

llvm-svn: 342993
2018-09-25 17:29:30 +00:00
Craig Topper 6fb1358a98 [X86] Add AVX512 support to combineVectorSizedSetCCEquality.
Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52424

llvm-svn: 342989
2018-09-25 16:27:12 +00:00
Sanjay Patel 69ed4710b8 [InstCombine] narrow binops on concatenated vectors (PR33026)
The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=33026
...has no shuffles now. This kind of pattern may occur during
vectorization when targets have lumpy ISAs like SSE/AVX.

llvm-svn: 342988
2018-09-25 15:57:37 +00:00
Guillaume Chatelet 345fae5d56 [llvm-exegesis] Serializes registers initial values.
Summary: Adds the registers initial values to the YAML output of llvm-exegesis.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52460

llvm-svn: 342982
2018-09-25 15:15:54 +00:00
Guillaume Chatelet 6078f82241 [llvm-exegesis] Fix missing document separator in YAML output.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52496

llvm-svn: 342981
2018-09-25 14:48:24 +00:00
Clement Courbet 86baebc5fd [llvm-exegesis] Add lit tests (v2).
Summary: This revisits rL342953 by adding detection of host support.

Reviewers: gchatelet, lebedev.ri, alexshap

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52464

llvm-svn: 342975
2018-09-25 13:59:35 +00:00
Simon Pilgrim b56be79e0c Revert rL342916: [X86] Remove shift/rotate by CL memory (RMW) overrides
As suggested by Craig Topper - I'm going to look at cleaning up the RMW sequences instead.

The uops are slightly different to the register variant, so requires a +1uop tweak

llvm-svn: 342969
2018-09-25 13:01:26 +00:00
David Green 9108c2b921 [LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder
In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder,
similar to how it is already done in the llvm::UnrollLoop.

The compiler would crash if this function is called with a malformed loop.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D51486

llvm-svn: 342958
2018-09-25 10:08:47 +00:00
Sameer Sahasrabuddhe b4f2d1cb68 [AMDGPU] restore r342722 which was reverted with r342743
[AMDGPU] lower-switch in preISel as a workaround for legacy DA

Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

llvm-svn: 342956
2018-09-25 09:39:21 +00:00
Clement Courbet 6d92c198ac Revert rL342953 "[llvm-exegesis] Add lit tests."
We also need to make sure that we're on the right subtarget.

llvm-svn: 342955
2018-09-25 09:36:44 +00:00
Clement Courbet 7f1322dc4d [llvm-exegesis] Add lit tests.
Summary:
Right now we only have unit tests. This will allow testing the whole
tool. Even though we can't really check actual values, this will avoid
regressions such as PR39055.

Reviewers: gchatelet, alexshap

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52407

llvm-svn: 342953
2018-09-25 09:27:43 +00:00
Hsiangkai Wang 9c2463622d [DebugInfo] Do not generate address info for removed debug labels.
In some scenarios, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In that case, these debug labels will have
address zero as their address. Zero is not a legal address for a debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 342943
2018-09-25 06:09:50 +00:00
Thomas Lively 12da0f9c3d [WebAssembly] SIMD sqrt
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52387

llvm-svn: 342937
2018-09-25 03:39:28 +00:00
Stanislav Mekhanoshin 14fefe7f8e [AMDGPU] Remove useless check from test. NFC.
The check for assignment of zero is practically useless
while the assignment moves around with different scheduling.

llvm-svn: 342935
2018-09-25 01:24:54 +00:00
Craig Topper 9ce5da7b62 [X86] Don't create FILD ISD nodes when X87 is disabled.
The included test case previously asserted because the type legalizer tried to soften the FILD ISD node.

Fixes PR38819.

llvm-svn: 342934
2018-09-25 00:16:57 +00:00
Thomas Lively 586153652c [WebAssembly][NFC] Fix hardcoded stack indices in tests
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52388

llvm-svn: 342928
2018-09-24 23:42:07 +00:00
Evgeniy Stepanov 090f0f9504 [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342923
2018-09-24 23:03:34 +00:00
Evgeniy Stepanov 20c4999e8b Revert "[hwasan] Record and display stack history in stack-based reports."
This reverts commit r342921: test failures on clang-cmake-arm* bots.

llvm-svn: 342922
2018-09-24 22:50:32 +00:00
Evgeniy Stepanov 9043e17edd [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342921
2018-09-24 21:38:42 +00:00
Christy Lee e94374809e Re-submitting changes in D51550 because it failed to patch.
Reviewers: javed.absar, trentxintong, courbet

Reviewed By: trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52433

llvm-svn: 342919
2018-09-24 20:47:12 +00:00
Simon Pilgrim 0b4ad7596f [X86] Remove shift/rotate by CL memory (RMW) overrides
The uops are slightly different to the register variant, so requires a +1uop tweak

llvm-svn: 342916
2018-09-24 20:11:50 +00:00
Stefan Pintilie b5305771fb [Power9] [LLVM] Add __float128 exponent GET and SET builtins
Added

__builtin_vsx_scalar_extract_expq
__builtin_vsx_scalar_insert_exp_qp

Builtins should behave the same way as in GCC.

Differential Revision: https://reviews.llvm.org/D48185

llvm-svn: 342910
2018-09-24 18:14:13 +00:00
Simon Pilgrim 51cbd838d0 [X86][AVX] Add truncation as shuffle test for PR31451
llvm-svn: 342908
2018-09-24 17:26:31 +00:00
Christy Lee bf112ea25b Reland r342494 after fixing LIT checks.
llvm-svn: 342907
2018-09-24 17:26:30 +00:00
Sanjay Patel 7b86bc22de [InstCombine] add/move tests for extractelement; NFC
llvm-svn: 342905
2018-09-24 17:17:16 +00:00
Zhaoshi Zheng 05b46dc300 [Thumb1] Any imm8 should have cost of 1
A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.

Differential Revision: https://reviews.llvm.org/D52257

llvm-svn: 342898
2018-09-24 16:15:23 +00:00
Fedor Sergeev 662e5686fe [New PM][PassInstrumentation] IR printing support for New Pass Manager
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.

- PrintIR routines implement printing callbacks.

- StandardInstrumentations class provides a central place to manage all
  the "standard" in-tree pass instrumentations. Currently it registers
  PrintIR callbacks.

Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923

llvm-svn: 342896
2018-09-24 16:08:15 +00:00
Simon Pilgrim 00865a48d1 [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)
Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.

This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.

llvm-svn: 342892
2018-09-24 15:21:57 +00:00
Luke Cheeseman ab7f9b170d [Arm][AsmParser] Restrict register list size for VSTM/VLDM
- The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified
- The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
- This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52082

llvm-svn: 342891
2018-09-24 15:13:48 +00:00
Sanjay Patel 2c901742ca [DAGCombiner] use UADDO to optimize saturated unsigned add
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for 
use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively 
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in 
CGP after D8889 without checking target lowering to see if the op is supported. 
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with 
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned 
saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title
(unlike x86 which sees improvements for all sizes because all sizes are 'custom'). 
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. 
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, 
so this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom 
check altogether because we're assuming that a UADDO sequence is canonical/optimal 
before we ever reach here. But that seems like a bug to me. If the target doesn't 
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining 
using a UADDO node. This is similar justification for why we don't canonicalize IR to 
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first 
place.
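
The saturated unsigned add discussed above can be modelled in Python to show why the overflow flag of the add makes the separate compare redundant. This is a sketch at the level of integer arithmetic only; `uaddo`, `uadd_sat_cmp` and `uadd_sat_uaddo` are illustrative names, and inputs are assumed to already fit in `bits` bits.

```python
def uaddo(a, b, bits=32):
    """Model of an add that also reports unsigned overflow (carry out)."""
    mask = (1 << bits) - 1
    total = a + b
    return total & mask, total > mask

def uadd_sat_cmp(a, b, bits=32):
    """Saturating add written with an explicit compare on the wrapped sum."""
    mask = (1 << bits) - 1
    wrapped = (a + b) & mask
    return mask if wrapped < a else wrapped

def uadd_sat_uaddo(a, b, bits=32):
    """Same result, selecting on the overflow flag produced by the add itself."""
    wrapped, overflow = uaddo(a, b, bits)
    return ((1 << bits) - 1) if overflow else wrapped
```

The two saturating variants agree for all in-range inputs because the wrapped sum is smaller than an operand exactly when the add carried out.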

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886
2018-09-24 14:47:15 +00:00
Petar Jovanovic f9808c5f09 [Mips][FastISel] Fix selectBranch on icmp i1
r337288 tried to fix the result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.

Patch by Dragan Mladjenovic.

Differential Revision: https://reviews.llvm.org/D52045

llvm-svn: 342884
2018-09-24 14:14:19 +00:00
Zaara Syeda edefda48d2 [PowerPC] Support operand modifier 'x' in inline asm
gcc uses operand modifier 'x' in inline asm for VSX registers.
Without this modifier, instructions which use VSX numbering for their
operands are printed as VMX registers. This patch adds support for the
operand modifier 'x'.

Differential Revision: https://reviews.llvm.org/D52244

llvm-svn: 342882
2018-09-24 14:01:16 +00:00
Jonas Devlieghere 8a7cfc6c86 [dsymutil] Set LSan blacklist whenever sanitizers are enabled.
LSan can be enabled by itself or as part of the address sanitizer.
Rather than checking the enabled sanitizers for both, just set the LSan
env options whenever a sanitizer is enabled.

llvm-svn: 342881
2018-09-24 13:56:36 +00:00
Roman Lebedev fb697d0f1b [NFC][CodeGen][X86][AArch64] More tests for 'bit field extract' w/ constants
It would be best to introduce ISD::BitFieldExtract,
because clearly more than one backend faces the same problem.
But for now let's solve this in the x86-specific DAG combine.

https://bugs.llvm.org/show_bug.cgi?id=38938

llvm-svn: 342880
2018-09-24 13:24:20 +00:00
Matt Arsenault f432011d33 AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses
If the alignment is at least 4, this should report true.

Something still seems off with how < 4-byte types are
handled here though.

Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.

llvm-svn: 342879
2018-09-24 13:18:15 +00:00
Matt Arsenault b53feca372 Fix some missing opcodes in bcanalyzer
llvm-svn: 342878
2018-09-24 12:47:17 +00:00
Sjoerd Meijer d986ede313 [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33
A sequence of VMUL and VADD instructions always gives the same or better
performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33.
Executing the VMUL and VADD back-to-back requires the same cycles, but
having separate instructions allows scheduling to avoid the hazard between
these 2 instructions.

Differential Revision: https://reviews.llvm.org/D52289

llvm-svn: 342874
2018-09-24 12:02:50 +00:00
Luke Cheeseman bda54bca39 [ARM][ARMLoadStoreOptimizer]
- The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
- This is an UNPREDICTABLE instruction and shouldn't be done
- It looks like the limit on how many registers are included in a merge got dropped at some point, so I am reintroducing it in this patch
- This fixes https://bugs.llvm.org/show_bug.cgi?id=38389
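
The effect of such a limit can be pictured with a small Python sketch that splits a run of consecutive doubleword registers into groups of at most 16. It only illustrates the cap; the real optimizer's merging decisions involve more conditions, and `split_register_run` is a hypothetical helper.

```python
MAX_DREGS_PER_MULTIPLE = 16  # cap on doubleword registers in one VLDM/VSTM

def split_register_run(first_reg, count, limit=MAX_DREGS_PER_MULTIPLE):
    """Split a run of consecutive D registers into (start, length) groups."""
    groups = []
    while count > 0:
        n = min(count, limit)
        groups.append((first_reg, n))
        first_reg += n
        count -= n
    return groups
```

For example, a run of 20 registers starting at d0 would be handled as one group of 16 followed by one group of 4.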

Differential Revision: https://reviews.llvm.org/D52085

llvm-svn: 342872
2018-09-24 10:42:22 +00:00
Petar Jovanovic c451c9ef50 [deadargelim] Update dbg.value of 'unused' parameters
The DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to them. As a consequence, the debug info
metadata in the final executable was wrong.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D51968

llvm-svn: 342871
2018-09-24 10:01:24 +00:00
Sam Parker a7b2405b06 [ARM] bottom-top mul support ARMParallelDSP
Originally committed in rL342210 but reverted in rL342260 because
it was causing issues in vectorized code: I had forgotten to
ensure that we're operating on scalar values.

Original commit message:

On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls, which we
can then convert to smultb, smulbt, smultt with a wide load.
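
As a rough illustration of what the bottom/top multiplies compute, the sketch below models each one on a 32-bit word holding two signed 16-bit halves. The function names mirror the instruction mnemonics, but the helpers `bottomHalf` and `topHalf` are hypothetical and this is not the ARMParallelDSP implementation.

```cpp
#include <cstdint>

// Sign-extend a 16-bit value held in the low bits of `v`.
static std::int32_t signExtend16(std::uint32_t v) {
  const std::int32_t h = static_cast<std::int32_t>(v & 0xFFFFu);
  return h >= 0x8000 ? h - 0x10000 : h;
}

// Hypothetical helpers selecting the signed bottom or top half of a word.
static std::int32_t bottomHalf(std::uint32_t w) { return signExtend16(w); }
static std::int32_t topHalf(std::uint32_t w) { return signExtend16(w >> 16); }

// The first letter selects the half of `a`, the second the half of `b`.
std::int32_t smulbb(std::uint32_t a, std::uint32_t b) { return bottomHalf(a) * bottomHalf(b); }
std::int32_t smulbt(std::uint32_t a, std::uint32_t b) { return bottomHalf(a) * topHalf(b); }
std::int32_t smultb(std::uint32_t a, std::uint32_t b) { return topHalf(a) * bottomHalf(b); }
std::int32_t smultt(std::uint32_t a, std::uint32_t b) { return topHalf(a) * topHalf(b); }
```

With one wide 32-bit load supplying both halves, the two separate 16-bit loads feeding the multiplies are no longer needed.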

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 342870
2018-09-24 09:34:06 +00:00
Craig Topper 2b8107614c [X86] Add 512-bit test cases to setcc-wide-types.ll. NFC
llvm-svn: 342860
2018-09-24 05:46:01 +00:00
Matt Arsenault 9a71e80645 Fix asserts when linking wrong address space declarations
llvm-svn: 342858
2018-09-24 04:42:14 +00:00
Matt Arsenault ce5f203415 llvm-diff: Fix crash on anonymous functions
Not sure what the correct behavior is for this.
Skip them and report how many there were.

llvm-svn: 342857
2018-09-24 04:42:13 +00:00
Simon Pilgrim 9202c9fb47 [X86] ROR*mCL instruction models should match ROL*mCL etc.
Confirmed with Craig Topper - fix a typo that was missing a Port4 uop for ROR*mCL instructions on some Intel models.

Yet another step on the scheduler model cleanup marathon.

llvm-svn: 342846
2018-09-23 19:16:01 +00:00
Sanjay Patel 0027946915 [DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation
This is an alternative to https://reviews.llvm.org/D37896. We can't decompose 
multiplies generically without a target hook to tell us when it's profitable.

ARM and AArch64 may be able to remove some existing code that overlaps with
this transform.

This extends D52195 and may resolve PR34474: 
https://bugs.llvm.org/show_bug.cgi?id=34474
(still an open question about transforming legal vector multiplies, but we
could open another bug report for those)
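
The general idea of decomposing a multiply by a constant into shift/add with negation can be sketched as follows. This is an illustration under stated assumptions, not the DAGCombiner code: `mulByShiftAdd` and `log2Exact` are hypothetical names, and the sketch handles only constants whose magnitude is 2^N + 1 or 2^N - 1. Unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cstdint>

// Hypothetical helper: returns N if v == 2^N, otherwise -1.
static int log2Exact(std::uint64_t v) {
  if (v == 0 || (v & (v - 1)) != 0) return -1;
  int n = 0;
  while ((v >> n) != 1) ++n;
  return n;
}

// Try to compute x * c (mod 2^32) with one shift, one add or sub, and an
// optional negate. Returns false when |c| is neither 2^N + 1 nor 2^N - 1.
bool mulByShiftAdd(std::uint32_t x, std::int32_t c, std::uint32_t &out) {
  const bool negate = c < 0;
  const std::int64_t wide = c;
  const std::uint64_t absC = static_cast<std::uint64_t>(negate ? -wide : wide);

  std::uint32_t r;
  int n = log2Exact(absC - 1);
  if (n >= 0) {
    r = (x << n) + x;          // |c| == 2^N + 1
  } else if ((n = log2Exact(absC + 1)) >= 0) {
    r = (x << n) - x;          // |c| == 2^N - 1
  } else {
    return false;              // not decomposable by this sketch
  }
  out = negate ? 0u - r : r;   // negation covers the negative constants
  return true;
}
```

For example, multiplying by -7 becomes a shift by 3, a subtract, and a negate. Whether such a sequence is actually cheaper than the multiply depends on the target, which is why the commit message refers to a target hook.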

llvm-svn: 342844
2018-09-23 18:41:38 +00:00