Commit Graph

9813 Commits

Author SHA1 Message Date
Wei Mi 20526b2725 [ConstHoisting] choose to hoist when frequency is the same.
The patch is to adjust the strategy of frequency based consthoisting:
Previously when the candidate block has the same frequency with the existing
blocks containing a const, it will not hoist the const to the candidate block.
For that case, now we change the strategy to hoist the const if only existing
blocks have more than one block member. This is helpful for reducing code size.

Differential Revision: https://reviews.llvm.org/D35084

llvm-svn: 307328
2017-07-06 22:32:27 +00:00
Simon Pilgrim a80cb1d7a7 [X86][SSE] Tests for bitcasting iX integers to vXi1 boolean vectors
Including sign/zero extension to legal types

llvm-svn: 307301
2017-07-06 19:33:10 +00:00
Simon Pilgrim 0fee3372c9 [X86][SSE] Dropped -mcpu from bitcast+setcc tests
Use triple and attribute only for consistency

Added SSE2/AVX tests on 256-bit vectors to test PACKSS behaviour

llvm-svn: 307289
2017-07-06 18:27:34 +00:00
Wei Mi 90707394e3 [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.
When the formulae search space is huge, LSR uses a series of heuristic to keep
pruning the search space until the number of possible solutions are within
certain limit.

The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs,
which picks the register which is used by the most LSRUses and deletes the other
formulae which don't use the register. This is a effective way to prune the search
space, but quite often not a good way to keep the best solution. We saw cases before
that the heuristic pruned the best formula candidate out of search space.

To relieve the problem, we introduce a new heuristic called
NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to
reduce the search space while keeping the best formula, we want to keep as many
formulae with different Scale and ScaledReg as possible. That is because the central
idea of LSR is to choose a group of loop induction variables and use those induction
variables to represent LSRUses. An induction variable candidate is often represented
by the Scale and ScaledReg in a formula. If we have more formulae with different
ScaledReg and Scale to choose, we have better opportunity to find the best solution.
That is why we believe pruning search space by only keeping the best formula with the
same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use
two criteria to choose the best formula with the same Scale and ScaledReg. The first
criteria is to select the formula using less non shared registers, and the second
criteria is to select the formula with less cost got from RateFormula. The patch
implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the
last resort.

Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly
testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help
on the two improved internal cases we saw.

Differential Revision: https://reviews.llvm.org/D34583

llvm-svn: 307269
2017-07-06 15:52:14 +00:00
Simon Pilgrim 713600747e [X86][SSE4A] Add support for shuffle combining to INSERTQI.
llvm-svn: 307268
2017-07-06 15:34:17 +00:00
Simon Pilgrim 03641df383 [X86][SSE4A] Add test showing missed opportunities to combine INSERTQI shuffle
llvm-svn: 307265
2017-07-06 14:52:24 +00:00
Sanjay Patel 2a341620e7 [x86] fix over-specified triple and auto-generate checks; NFC
llvm-svn: 307262
2017-07-06 14:15:15 +00:00
Simon Pilgrim cc0f785dca [X86][SSE4A] Add support for shuffle combining to EXTRQ.
llvm-svn: 307254
2017-07-06 12:22:58 +00:00
Simon Pilgrim 40c0ae200f [X86][SSE4A] Add scheduling tests for SSE4A instructions
llvm-svn: 307251
2017-07-06 11:26:43 +00:00
Simon Pilgrim ac78daf517 {DAGCombiner] Fold (rot x, 0) -> x
llvm-svn: 307184
2017-07-05 18:27:11 +00:00
Simon Pilgrim 49123d4bb0 [X86] Test bitfield loadstore tests on i686 as well
llvm-svn: 307182
2017-07-05 18:09:30 +00:00
Andrew Zhogin 45d192823e [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions into one with combined shift operand.
For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize.

Differential revision: https://reviews.llvm.org/D12833

llvm-svn: 307179
2017-07-05 17:55:42 +00:00
Simon Pilgrim 55006b407b [X86][SSE] Dropped -mcpu from bitcast+setcc mask tests
Use triple and attribute only for consistency

llvm-svn: 307176
2017-07-05 17:30:30 +00:00
Igor Breger 55e2f5963a [GlobalIsel] allow x86_fp80 values to be dumped.
Summary:
Otherwise the fallback path fails with an assertion on x86_64 targets,
when "x86_fp80" is encountered.

Reviewers: t.p.northover, zvi, guyblank

Reviewed By: zvi

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34975

llvm-svn: 307140
2017-07-05 11:11:10 +00:00
Nirav Dave b320ef9fab Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
Relanding after rewriting undef.ll test to avoid host-dependant
endianness.

As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.

Tests of note:

  * test/CodeGen/X86/build-vector* - Improved.
  * test/CodeGen/BPF/undef.ll - Improved store alignment allows an
    additional store merge

  * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
    case we already do not handle well. Here, the DAG is improved, but
    scheduling causes a code size degradation.

Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D34472

llvm-svn: 307114
2017-07-05 01:21:23 +00:00
Gadi Haber 689426e3cb NFC.
Made some updates to the half.ll test under CodeGen to make it friendly to the update_llc_test_checks .py tool as follows:
1.Removing the llc flag -asm-verbose=false
2.Grouping the multiple check-prefix directives
3.Apply update_llc_test_checks.py tool on the test

This change is needed to easily update scheduling changes in an upcoming patch.

Reviewers: zvi, RKSimon, craig.topper 

Differential Revision: https://reviews.llvm.org/D34934

llvm-svn: 307108
2017-07-04 21:51:05 +00:00
Simon Pilgrim ac3e7f3f57 [X86][SSE4A] Add support for combining from non-v16i8 EXTRQI/INSERTQI shuffles
With the improved shuffle decoding we can now combine EXTRQI/INSERTQI shuffles from non-v16i8 vector types

llvm-svn: 307099
2017-07-04 18:11:02 +00:00
Anna Thomas 505941e7d6 [FastISel] Move gc intrinsic test to X86 directory
Move from generic to X86 directory since gc intrinsics only supposed in
X86 64 bit.
Add target triple as well.
Fixes build failure in i686-linux-RA  caused by rL307084.

llvm-svn: 307086
2017-07-04 15:24:08 +00:00
Simon Pilgrim d128222f0c [X86] Add combine tests for vector rotates
Reference tests for D12833

llvm-svn: 307073
2017-07-04 12:33:53 +00:00
Gadi Haber 4980790e81 NFC commit.
Converting the Codegen test "extractelement-legalization-store-ordering.ll" to be "update_llc_test_checks" friendly.

The changes to the test are needed for an upcoming scheduling patch.

Reviewers: zvi, RKSimon

Differential Revision: https://reviews.llvm.org/D34935

llvm-svn: 307066
2017-07-04 07:18:03 +00:00
Craig Topper ad140cfb68 [X86] Add comment string for broadcast loads from the constant pool.
Summary:
When broadcasting from the constant pool its useful to print out the final vector similar to what we do for normal moves from the constant pool.

I changed only a couple tests that were broadcast focused. One of them had been previously hand tweaked after running the script so that it could check the constant pool declaration. But I think this patch makes that unnecessary now since we can check the comment instead.

Reviewers: spatel, RKSimon, zvi

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34923

llvm-svn: 307062
2017-07-04 05:46:11 +00:00
Anton Yartsev 66d32c5e06 [legalize-types] Clean up softening machinery.
The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265.

Differential Revision: https://reviews.llvm.org/D31946

llvm-svn: 307053
2017-07-04 01:08:55 +00:00
Simon Pilgrim fa6e675267 [X86][SSE4A] Add support for combining from EXTRQI/INSERTQI shuffles
llvm-svn: 307048
2017-07-03 20:58:16 +00:00
Simon Pilgrim bdfb3b1d5f [X86][SSE4A] Add SSE4A shuffle tests on pre-SSSE3 hardware
llvm-svn: 307042
2017-07-03 16:53:11 +00:00
Simon Pilgrim b5c68a6717 [X86][SSE4A] Test SSE4A shuffle combining on SSE42 capable target as well
llvm-svn: 307038
2017-07-03 15:55:54 +00:00
Zvi Rackover d7a1c334ce DAGCombine: Combine BUILD_VECTOR to TRUNCATE
Summary:
Add a combine for creating a truncate to replace a build_vector composed of extracts with
indices that form a stride-2^N series.

Example:
v8i32 V = ...

v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
-->
v4i32 truncate (bitcast V to v4i64)

Related discussion in llvm-dev about canonicalizing shuffles to
truncates in LLVM IR:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.

Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena

Reviewed By: delena

Subscribers: guyblank, delena, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34077

llvm-svn: 307036
2017-07-03 15:47:40 +00:00
Sanjay Patel e9b1d16a8c [x86] auto-generate complete checks for tests; NFC
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.
There were also over-specifications in the RUN params such as CPU model.

llvm-svn: 307033
2017-07-03 15:27:19 +00:00
Sanjay Patel d3173740fd [x86] auto-generate complete checks for tests; NFC
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.
There were also several over-specifications in the RUN params such as CPU model or OS requirement

llvm-svn: 307028
2017-07-03 15:04:05 +00:00
Simon Pilgrim decfaca033 [X86][SSE4A] Add tests showing missed opportunities to combine EXTRQI/INSERTQI shuffles
llvm-svn: 307027
2017-07-03 15:01:07 +00:00
Sanjay Patel dab798a25f [x86] auto-generate complete checks for tests; NFC
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.

llvm-svn: 307024
2017-07-03 14:29:45 +00:00
Igor Breger 5c787ab346 [GlobalISel][X86] fix %ptr(p0) = G_CONSTANT selection.
llvm-svn: 307019
2017-07-03 11:06:54 +00:00
Simon Pilgrim f05c5ef441 [X86][AVX512] Test AVX512VPOPCNTDQ CTPOP with/without AVX512BW
llvm-svn: 306991
2017-07-02 19:52:20 +00:00
Simon Pilgrim a9655ffb42 [X86][AVX512VPOPCNTDQ] Improve support for v16i8/v8i16/v16i16/ CTPOP
Zero extend to v16i32/v8i64, use VPOPCNTDQ instructions and truncate back.

llvm-svn: 306990
2017-07-02 19:32:37 +00:00
Simon Pilgrim 3f5ed96f92 [X86][AVX512] Cleanup tzcnt tests triples and attributes
Avoid use of specific -mcpu

llvm-svn: 306989
2017-07-02 18:51:48 +00:00
Simon Pilgrim df55dd09d6 [X86][AVX512] Cleanup popcnt tests triples and attributes
Avoid use of specific -mcpu

llvm-svn: 306988
2017-07-02 18:35:22 +00:00
Sanjay Patel 7d263c1a27 [x86] auto-generate complete checks for tests; NFC
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.

llvm-svn: 306984
2017-07-02 15:24:08 +00:00
Sanjay Patel dd076f0178 [x86] remove unnecessary RUN for test after auto-generating checks; NFC
llvm-svn: 306983
2017-07-02 15:16:17 +00:00
Sanjay Patel c22223e6cd [x86] update test to use FileCheck and auto-generate checks; NFC
llvm-svn: 306982
2017-07-02 15:15:18 +00:00
Sanjay Patel 27cccc96c2 [x86] auto-generate complete checks for tests; NFC
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.

llvm-svn: 306981
2017-07-02 14:50:35 +00:00
Simon Pilgrim 8971b2904e [X86][SSE] Attempt to combine 64-bit and 32-bit shuffles to unary shuffles before bit shifts
We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive

llvm-svn: 306978
2017-07-02 14:16:25 +00:00
Simon Pilgrim 4cb5613c38 [X86][SSE] Attempt to combine 64-bit and 16-bit shuffles to unary shuffles before bit shifts
We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive

The 32-bit shuffles are a bit tricky and will be dealt with in a later patch

llvm-svn: 306977
2017-07-02 13:19:10 +00:00
Simon Pilgrim 638af5f1c4 [X86][SSE] Add test showing missed opportunity to combine to pshuflw
We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive

llvm-svn: 306976
2017-07-02 12:56:10 +00:00
Gadi Haber dc25c2b08b [X86] Rerun "update_llc_test_checks" tool on CodeGen tests. NFC.
This is NFC after rerunning the "update_llc_test_checks.py" tool on the CodeGen X86 tests in order to submit a patch.
Minor differences due to added "End of Function" lines.

Reviewers: zvi
Differential Revision: https://reviews.llvm.org/D34933

llvm-svn: 306973
2017-07-02 12:01:33 +00:00
Igor Breger 717bd36c83 [GlobalISel][X86] Support G_GLOBAL_VALUE operation.
Summary: Support G_GLOBAL_VALUE operation. For now most of the PIC configurations not implemented yet.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34738

Conflicts:
	test/CodeGen/X86/GlobalISel/regbankselect-X86_64.mir

llvm-svn: 306972
2017-07-02 08:58:29 +00:00
Igor Breger b186a69aa5 [GlobalISel][X86] Support vector type G_UNMERGE_VALUES selection.
Summary:
Support vector type G_UNMERGE_VALUES selection.
For now G_UNMERGE_VALUES marked as legal for any type, so nothing to do in legalizer.

Reviewers: t.p.northover, qcolombet, zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D33665

llvm-svn: 306971
2017-07-02 08:15:49 +00:00
Hiroshi Inoue bb703e8960 fix trivial typos; NFC
suport -> support

llvm-svn: 306968
2017-07-02 03:24:54 +00:00
Simon Pilgrim 3bad6f3167 [X86][RDSEED] Split off i64 intrinsic tests and test i16/i32 on 32-bit target as well.
llvm-svn: 306961
2017-07-01 16:42:16 +00:00
Simon Pilgrim 2d320161e5 [X86][RDRAND] Split off i64 intrinsic tests and test i16/i32 on 32-bit target as well.
llvm-svn: 306960
2017-07-01 16:41:12 +00:00
Simon Pilgrim 2b679e1812 [X86] Removed reference to update_test_checks.py
llvm-svn: 306959
2017-07-01 16:34:29 +00:00
Simon Pilgrim ad7f0844ea [X86][AVX] Remove duplicate autogeneration note
llvm-svn: 306958
2017-07-01 16:32:02 +00:00