Commit Graph

6348 Commits

Author SHA1 Message Date
Reid Kleckner 2f07c2e9d9 Standardize on MSVC behavior for triples with no environment
Summary:
This makes it so that IR files using triples without an environment work
out of the box, without normalizing them.

Typically, the MSVC behavior is more desirable. For example, it tends to
enable things like constant merging, use of associative comdats, etc.

Addresses PR42491

Reviewers: compnerd

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64109

llvm-svn: 365387
2019-07-08 21:05:20 +00:00
Simon Pilgrim e1a9b49d6b [X86] ISD::INSERT_SUBVECTOR - use uint64_t index. NFCI.
Keep the uint64_t type from getConstantOperandVal to stop truncation/extension overflow warnings in MSVC in subvector index math.

llvm-svn: 365328
2019-07-08 14:52:56 +00:00
Simon Pilgrim a7145c45a7 [X86] SimplifyDemandedVectorEltsForTargetNode - fix shadow variable warning. NFCI.
Fixes cppcheck warning.

llvm-svn: 365271
2019-07-06 18:46:09 +00:00
Simon Pilgrim 01f1bad618 [X86] LowerBuildVectorv16i8 - pull out repeated getOperand() call. NFCI.
llvm-svn: 365270
2019-07-06 18:33:29 +00:00
Simon Pilgrim 8b25d9bf01 [X86][SSE] LowerINSERT_VECTOR_ELT - early out for out of range indices
Fixes OSS-Fuzz #15662

llvm-svn: 365180
2019-07-05 10:34:53 +00:00
Simon Pilgrim fde766de4b [X86][AVX1] Combine concat_vectors(pshufd(x,c),pshufd(y,c)) -> vpermilps(concat_vectors(x,y),c)
Bitcast v4i32 to v8f32 and back again - it might be worth adding isel patterns for X86PShufd v8i32 on AVX1 targets like we did for X86Blendi to avoid the bitcasts?

llvm-svn: 365125
2019-07-04 10:17:10 +00:00
Craig Topper 163b8bb3f5 [X86] Use pointer sized indices instead of i32 for EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT in a couple places.
Most places already did this.

llvm-svn: 365109
2019-07-04 06:21:54 +00:00
Simon Pilgrim 26812c7675 [X86] ComputeNumSignBitsForTargetNode - add target shuffle support.
llvm-svn: 365057
2019-07-03 17:06:59 +00:00
Simon Pilgrim 783dbe402f [X86][AVX] combineX86ShufflesRecursively - peek through extract_subvector
If we have more then 2 shuffle ops to combine, try to use combineX86ShuffleChainWithExtract to see if some are from the same super vector.

llvm-svn: 365050
2019-07-03 15:46:08 +00:00
Simon Pilgrim 868d0b7fd9 [X86][AVX] Combine vpermi(bitcast(x)) -> bitcast(vpermi(x))
iff the number of elements doesn't change.

This gets around an issue with combineX86ShuffleChain not being able to hint which domain is preferred for shuffles that can be done with either.

Fixes regression introduced in rL365041

llvm-svn: 365044
2019-07-03 14:34:16 +00:00
Simon Pilgrim 0c230209fe [X86][AVX] combineX86ShuffleChainWithExtract - add number of non-zero extract_subvectors to the combine depth
This better accounts for the cost/benefit of removing extract_subvectors from the shuffle and will be more useful in future patches.

The vpermq predicate regression will be fixed shortly.

llvm-svn: 365041
2019-07-03 14:17:21 +00:00
Simon Pilgrim 8c099cbe7c [X86][SSE] lowerUINT_TO_FP_v2i32 - explicitly cast half word to double
Fixes MSVC analyzer extension->double warning.

llvm-svn: 365027
2019-07-03 11:23:27 +00:00
Simon Pilgrim 8df90b843d [X86][SSE] LowerINSERT_VECTOR_ELT - ensure insertion index correctness. NFCI.
Assert that the insertion index is in range and use uint64_t for the index to fix MSVC/cppcheck truncation warning.

llvm-svn: 365025
2019-07-03 10:59:52 +00:00
Simon Pilgrim 8853bd9592 [X86][SSE] LowerScalarImmediateShift - ensure shift amount correctness. NFCI.
Assert that the shift amount is in range and create vXi8 shift masks in a way that doesn't cause MSVC/cppcheck shift result is truncated then extended warnings.

llvm-svn: 365024
2019-07-03 10:47:33 +00:00
Simon Pilgrim 64e3a51534 Fix uninitialized variable warnings. NFCI.
Both MSVC and cppcheck don't like the fact that the variables are initialized via references.

llvm-svn: 365018
2019-07-03 10:22:08 +00:00
Simon Pilgrim 7b7b9b78a2 [X86] LowerFunnelShift - use modulo constant shift amount.
This avoids the use of getZExtValue and uses the modulo shift amount which is whats expected for funnel shifts anyhow. 

llvm-svn: 365016
2019-07-03 10:04:16 +00:00
Craig Topper b770d2c9d4 [X86] Add a DAG combine for turning *_extend_vector_inreg+load into an appropriate extload if the load isn't volatile.
Remove the corresponding isel patterns that did the same thing without checking for volatile.

This fixes another variation of PR42079

llvm-svn: 364977
2019-07-02 23:20:03 +00:00
Simon Pilgrim 5613874947 [X86] getTargetConstantBitsFromNode - remove unnecessary getZExtValue() (PR42486)
Don't use APInt::getZExtValue() if you can avoid it - eventually someone will call it with i128 or something that doesn't fit into 64-bits.

In this case it was completely superfluous as we'd moved the rest of the code to always use APInt.

Fixes the <1 x i128> addition bug in PR42486

llvm-svn: 364953
2019-07-02 18:20:38 +00:00
Simon Pilgrim 9304168103 [X86][AVX] combineX86ShuffleChain - pull out CombineShuffleWithExtract lambda. NFCI.
Pull out CombineShuffleWithExtract lambda to new combineX86ShuffleChainWithExtract wrapper and refactored it to handle more than 2 shuffle inputs - this will allow combineX86ShufflesRecursively to call this in a future patch.

llvm-svn: 364924
2019-07-02 13:30:04 +00:00
Simon Pilgrim d609ebb779 [X86] resolveTargetShuffleInputsAndMask - add repeated input handling.
We were relying on combineX86ShufflesRecursively to handle this - this patch gets it done earlier which should make it easier for other code to use resolveTargetShuffleInputsAndMask.

llvm-svn: 364906
2019-07-02 10:53:17 +00:00
Craig Topper 5e7815b695 [X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patterns
These instructions only read 64-bits of memory so we shouldn't
allow a full vector width load to be pattern matched in case it
is marked volatile.

Instead allow vzload or scalar_to_vector+load.

Also add a DAG combine to turn full vector loads into vzload when
used by one of these instructions if the load isn't volatile.

This fixes another case for PR42079

llvm-svn: 364838
2019-07-01 19:01:37 +00:00
Simon Pilgrim e3e38cce4a [X86] Add widenSubVector to size in bits helper. NFCI.
We can already widenSubVector to a specific type (of the same scalar type) - this variant just specifies the target vector size.

This will be useful when CombineShuffleWithExtract relaxes the need to have the same scalar type for all shuffle operand subvector sources.

llvm-svn: 364803
2019-07-01 16:20:47 +00:00
Simon Pilgrim 172fe5dd19 [X86] CombineShuffleWithExtract - updated description comments. NFCI.
CombineShuffleWithExtract no longer requires that both shuffle ops are extract_subvectors, from the same type or from the same size.

llvm-svn: 364745
2019-07-01 11:33:45 +00:00
Craig Topper 4ca81a9b99 [X86] Add a DAG combine to replace vector loads feeding a v4i32->v2f64 CVTSI2FP/CVTUI2FP node with a vzload.
But only when the load isn't volatile.

This improves load folding during isel where we only have vzload
and scalar_to_vector+load patterns. We can't have full vector load
isel patterns for the same volatile load issue.

Also add some missing masked cvtsi2fp/cvtui2fp with vzload patterns.

llvm-svn: 364728
2019-07-01 07:09:31 +00:00
Craig Topper 725a8a5dc4 [X86] Custom lower AVX masked loads to masked load and vselect instead of selecting a maskmov+vblend during isel.
AVX masked loads only support 0 as the value for masked off elements.
So we need an extra blend to support other values. Previously we
expanded the masked load to two instructions with isel patterns.
With this patch we now insert the vselect during lowering and it
will be separately selected as a blend.

llvm-svn: 364718
2019-06-30 06:46:37 +00:00
Simon Pilgrim 978a08c885 [X86] CombineShuffleWithExtract - recurse through EXTRACT_SUBVECTOR chain
llvm-svn: 364667
2019-06-28 17:57:32 +00:00
Simon Pilgrim a54e1a0f01 [X86] CombineShuffleWithExtract - only require 1 source to be EXTRACT_SUBVECTOR
We were requiring that both shuffle operands were EXTRACT_SUBVECTORs, but we can relax this to only require one of them to be.

Also, we shouldn't bother attempting this if both operands are from the lowest subvector (or not EXTRACT_SUBVECTOR at all).

llvm-svn: 364644
2019-06-28 12:24:49 +00:00
Craig Topper cbb88a5169 [X86] Connect the output chain properly when combining vzext_movl+load into vzext_load.
llvm-svn: 364625
2019-06-28 06:58:50 +00:00
Sanjay Patel a95ca2b5ff [x86] prevent crashing from select narrowing with AVX512
llvm-svn: 364585
2019-06-27 20:16:58 +00:00
Simon Pilgrim 1fd1c60979 [X86] combineX86ShufflesRecursively - merge shuffles with more than 2 inputs
We already had the infrastructure for this, but were waiting for the fix for a number of regressions which were handled by the recent shuffle(extract_subvector(),extract_subvector()) -> extract_subvector(shuffle()) shuffle combines

llvm-svn: 364569
2019-06-27 17:30:51 +00:00
Simon Pilgrim e9a2f4fe2c Use getConstantOperandAPInt instead of getConstantOperandVal for comparisons.
getConstantOperandAPInt avoids any large integer issues - these are unlikely but the fuzzers do like to mess around.....

llvm-svn: 364564
2019-06-27 16:46:00 +00:00
Simon Pilgrim 74343eba37 [X86] getTargetVShiftByConstNode - reduce variable scope. NFCI.
Fixes cppcheck warning.

llvm-svn: 364561
2019-06-27 16:33:44 +00:00
Simon Pilgrim c5cff5d3d1 [X86] getFauxShuffle - add DemandedElts as a filter
This is currently benign but will be used in the future based on the elements referenced by the parent shuffle(s).

llvm-svn: 364530
2019-06-27 12:35:52 +00:00
Simon Pilgrim 90e121fbe6 [X86][AVX] SimplifyDemandedVectorElts - combine PERMPD(x) -> EXTRACTF128(X)
If we only use the bottom lane, see if we can simplify this to extract_subvector - which is always at least as quick as PERMPD/PERMQ.

llvm-svn: 364518
2019-06-27 11:16:03 +00:00
Djordje Todorovic 7eeeb5947e [ISEL][X86] Tracking of registers that forward call arguments
While lowering calls, collect info about registers that forward arguments
into following function frame. We store such info into the MachineFunction
of the call. This is used very late when dumping DWARF info about
call site parameters.

([9/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60715

llvm-svn: 364516
2019-06-27 10:51:15 +00:00
Mikael Holmen 7b81b61368 Silence gcc warning after r364458
Without the fix gcc 7.4.0 complains with

../lib/Target/X86/X86ISelLowering.cpp: In function 'bool getFauxShuffleMask(llvm::SDValue, llvm::SmallVectorImpl<int>&, llvm::SmallVectorImpl<llvm::SDValue>&, llvm::SelectionDAG&)':
../lib/Target/X86/X86ISelLowering.cpp:6690:36: error: enumeral and non-enumeral type in conditional expression [-Werror=extra]
             int Idx = (ZeroMask[j] ? SM_SentinelZero : (i + j + Ofs));
                        ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors

llvm-svn: 364507
2019-06-27 08:16:18 +00:00
Craig Topper 3d12971e1c [X86] Rework the logic in LowerBuildVectorv16i8 to make better use of any_extend and break false dependencies. Other improvements
This patch rewrites the loop iteration to only visit every other element starting with element 0. And we work on the "even" element and "next" element at the same time. The "First" logic has been moved to the bottom of the loop and doesn't run on every element. I believe it could create dangling nodes previously since we didn't check if we were going to use SCALAR_TO_VECTOR for the first insertion. I got rid of the "First" variable and just do a null check on V which should be equivalent. We also no longer use undef as the starting V for vectors with no zeroes to avoid false dependencies. This matches v8i16.

I've changed all the extends and OR operations to use MVT::i32 since that's what they'll be promoted to anyway. I've tried to use zero_extend only when necessary and use any_extend otherwise. This resulted in some improvements in tests where we are now able to promote aligned (i32 (extload i8)) to a 32-bit load.

Differential Revision: https://reviews.llvm.org/D63702

llvm-svn: 364469
2019-06-26 20:16:19 +00:00
Craig Topper afa58b6ba1 [X86] Remove isTypePromotionOfi1ZeroUpBits and its helpers.
This was trying to optimize concat_vectors with zero of setcc or
kand instructions. But I think it produced the same code we
produce for a concat_vectors with 0 even it it doesn't come from
one of those operations.

llvm-svn: 364463
2019-06-26 19:45:48 +00:00
Simon Pilgrim dfe079ffbf [X86][SSE] getFauxShuffleMask - handle OR(x,y) where x and y have no overlapping bits
Create a per-byte shuffle mask based on the computeKnownBits from each operand - if for each byte we have a known zero (or both) then it can be safely blended.

Fixes PR41545

llvm-svn: 364458
2019-06-26 18:21:26 +00:00
Simon Pilgrim 435ee9fb1f [X86][SSE] X86TargetLowering::isCommutativeBinOp - add PMULDQ
Allows narrowInsertExtractVectorBinOp to reduce vector size instead of the more restricted SimplifyDemandedVectorEltsForTargetNode

llvm-svn: 364434
2019-06-26 14:58:11 +00:00
Simon Pilgrim 6b687bf681 [X86][SSE] X86TargetLowering::isCommutativeBinOp - add PCMPEQ
Allows narrowInsertExtractVectorBinOp to reduce vector size

llvm-svn: 364432
2019-06-26 14:40:49 +00:00
Simon Pilgrim b13c6f1a9d [X86][SSE] X86TargetLowering::isBinOp - add PCMPGT
Allows narrowInsertExtractVectorBinOp to reduce vector size

llvm-svn: 364431
2019-06-26 14:34:41 +00:00
Simon Pilgrim 24f96a0eee [X86] shouldScalarizeBinop - never scalarize target opcodes.
We have (almost) no target opcodes that have scalar/vector equivalents - for now assume we can't scalarize them (we can add exceptions if we need to).

llvm-svn: 364429
2019-06-26 14:21:29 +00:00
Hans Wennborg 6876de90e8 Fix the build after r364401
It was failing with:

/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66:
error: call of overloaded 'makeArrayRef(<brace-enclosed initializer list>)' is ambiguous
     scaleShuffleMask<int>(Scale, makeArrayRef<int>({ 0, 2, 1, 3 }), Mask);
                                                                  ^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66: note: candidates are:
In file included from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:20:0,
                 from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/CallingConvLower.h:19,
                 from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.h:17,
                 from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:14:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:480:15:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const std::vector<_RealType>&) [with T = int]
   ArrayRef<T> makeArrayRef(const std::vector<T> &Vec) {
               ^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:485:37:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const llvm::ArrayRef<T>&) [with T = int]
   template <typename T> ArrayRef<T> makeArrayRef(const ArrayRef<T> &Vec) {
                                     ^

llvm-svn: 364414
2019-06-26 11:56:38 +00:00
Simon Pilgrim c0711af7f9 [X86][AVX] combineExtractSubvector - 'little to big' extract_subvector(bitcast()) support
Ideally this needs to be a generic combine in DAGCombiner::visitEXTRACT_SUBVECTOR but there's some nasty regressions in aarch64 due to neon shuffles not handling bitcasts at all.....

llvm-svn: 364407
2019-06-26 11:21:09 +00:00
Simon Pilgrim 3845a4f849 [X86][AVX] truncateVectorWithPACK - avoid bitcasted shuffles
truncateVectorWithPACK is often used in conjunction with ComputeNumSignBits which struggles when peeking through bitcasts.

This fix tries to avoid bitcast(shuffle(bitcast())) patterns in the 256-bit 64-bit sublane shuffles so we can still see through at least until lowering when the shuffles will need to be bitcasted to widen the shuffle type.

llvm-svn: 364401
2019-06-26 09:50:11 +00:00
Craig Topper 14ea14ae85 [X86] Add a DAG combine to turn vzmovl+load into vzload if the load isn't volatile. Remove isel patterns for vzmovl+load
We currently have some isel patterns for treating vzmovl+load the same as vzload, but that shrinks the load which we shouldn't do if the load is volatile.

Rather than adding isel checks for volatile. This patch removes the patterns and teachs DAG combine to merge them into vzload when its legal to do so.

Differential Revision: https://reviews.llvm.org/D63665

llvm-svn: 364333
2019-06-25 17:08:26 +00:00
Simon Pilgrim aae4b68703 [X86] lowerShuffleAsSpecificZeroOrAnyExtend - add ANY_EXTEND TODO.
lowerShuffleAsSpecificZeroOrAnyExtend should be able to lower to ANY_EXTEND_VECTOR_INREG as well as ZER_EXTEND_VECTOR_INREG.

llvm-svn: 364313
2019-06-25 13:36:53 +00:00
Craig Topper 7fccb2ac5e [X86] Don't a vzext_movl in LowerBuildVectorv16i8/LowerBuildVectorv8i16 if there are no zeroes in the vector we're building.
In LowerBuildVectorv16i8 we took care to use an any_extend if the first pair is in the lower 16-bits of the vector and no elements are 0. So bits [31:16] will be undefined. But we still emitted a vzext_movl to ensure that bits [127:32] are 0. If we don't need any zeroes we should be consistent and make all of 127:16 undefined.

In LowerBuildVectorv8i16 we can just delete the vzext_movl code because we only use the scalar_to_vector when there are no zeroes. So the vzext_movl is always unnecessary.

Found while investigating whether (vzext_movl (scalar_to_vector (loadi32)) patterns are necessary. At least one of the cases where they were necessary was where the loadi32 matched 32-bit aligned 16-bit extload. Seemed weird that we required vzext_movl for that case.

Differential Revision: https://reviews.llvm.org/D63700

llvm-svn: 364207
2019-06-24 17:28:41 +00:00
Craig Topper 033774e144 [X86] Cleanups and safety checks around the isFNEG
This patch does a few things to start cleaning up the isFNEG function.

-Remove the Op0/Op1 peekThroughBitcast calls that seem unnecessary. getTargetConstantBitsFromNode has its own peekThroughBitcast inside. And we have a separate peekThroughBitcast on the return value.
-Add a check of the scalar size after the first peekThroughBitcast to ensure we haven't changed the element size and just did something like f32->i32 or f64->i64.
-Remove an unnecessary check that Op1's type is floating point after the peekThroughBitcast. We're just going to look for a bit pattern from a constant. We don't care about its type.
-Add VT checks on several places that consume the return value of isFNEG. Due to the peekThroughBitcasts inside, the type of the return value isn't guaranteed. So its not safe to use it to build other nodes without ensuring the type matches the type being used to build the node. We might be able to replace these checks with bitcasts instead, but I don't have a test case so a bail out check seemed better for now.

Differential Revision: https://reviews.llvm.org/D63683

llvm-svn: 364206
2019-06-24 17:28:26 +00:00