Commit Graph

8887 Commits

Author SHA1 Message Date
Simon Pilgrim 5d005a359e Fix signed/unsigned comparison warning. NFCI.
llvm-svn: 325359
2018-02-16 16:52:50 +00:00
Simon Pilgrim ff53a4a234 [SelectionDAG] Enable SimplifyDemandedVectorElts support for simplifying shuffle masks
Based off the DemandedElts mask the and UNDEF elements returned from the SimplifyDemandedVectorElts calls to the shuffle operands, we can attempt to simplify the shuffle mask.

I had to be very conservative here as accepting post-legalized shuffle masks could cause problems for targets that legalize UNDEF mask elements back to inrange values (PowerPC), similarly combining to identity shuffle masks could cause too much UNDEF information to disappear for later combines.

llvm-svn: 325354
2018-02-16 16:22:14 +00:00
Simon Pilgrim 0ffde50f9c [SelectionDAG] Add initial SimplifyDemandedVectorElts support for simplifying VSELECT operands
This just adds a basic pass through - we can add constant selection mask handling in a future patch to fully match InstCombine.

llvm-svn: 325338
2018-02-16 12:21:08 +00:00
Mikhail Maltsev 0a7e107e77 [LegalizeDAG] Fix legalization of SETCC
Summary:
Currently when expanding a SETCC node into a SELECT_CC, LLVM uses
an incorrect type for determining BooleanContent of the result. This
patch fixes the issue.

Fixes PR36079.

Reviewers: rogfer01, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43282

llvm-svn: 325325
2018-02-16 09:35:16 +00:00
Craig Topper dac3c1f5c8 [DAGCombiner] Call ExtendUsesToFormExtLoad in (zext (and (load)))->(and (zextload)) even when the and does not have multiple uses
Same for the sign extend case.

Currently we check for multiple uses on the binop. Then we call ExtendUsesToFormExtLoad to capture SetCCs that use the load. So we only end up finding any setccs when the and has additional uses and the load is used by a setcc. I don't think the and having multiple uses is relevant here. I think we should only be checking for the load having multiple uses.

This changes an NVPTX test because we now find that the load has a second use by a truncate, but ExtendUsesToFormExtLoad only looks at setccs it can extend. All other operations just check isTruncateFree. Maybe we should allow widening of an existing truncate even if its not free?

Differential Revision: https://reviews.llvm.org/D43063

llvm-svn: 325289
2018-02-15 20:20:32 +00:00
Simon Pilgrim 1eb5c455c9 [SelectionDAG] Pull out repeated Op.getOpcode(). NFCI.
llvm-svn: 325253
2018-02-15 15:31:00 +00:00
Simon Pilgrim 80663ee986 [SelectionDAG] Add initial implementation of TargetLowering::SimplifyDemandedVectorElts
This is mainly a move of simplifyShuffleOperands from DAGCombiner::visitVECTOR_SHUFFLE to create a more general purpose TargetLowering::SimplifyDemandedVectorElts implementation.

Further features can be moved/added in future patches.

Differential Revision: https://reviews.llvm.org/D42896

llvm-svn: 325232
2018-02-15 12:14:15 +00:00
Alexander Ivchenko 7e5d525bd5 [SelectionDAG][X86] Fix incorrect offset generated for VMASKMOV
When creating high MachineMemOperand for MSTORE/MLOAD we supply
it with the original PointerInfo, while the pointer itself had been incremented.
The patch adds the proper offset to the PointerInfo.

llvm-svn: 325135
2018-02-14 15:55:24 +00:00
Elena Demikhovsky 945b7e5aa6 Adding a width of the GEP index to the Data Layout.
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout.
p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits.
The index size parameter is optional, if not specified, it is equal to the pointer size.

Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width.
It works fine if you can convert pointer to integer for address calculation and all registered targets do this.
But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html

I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account.
This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.

Differential Revision: https://reviews.llvm.org/D42123

llvm-svn: 325102
2018-02-14 06:58:08 +00:00
Craig Topper 5ecea9fff5 [SelectionDAG] Remove duplicate code from TargetLowering::SimplifySetCC.
This exact code already exists a little further up.

llvm-svn: 325101
2018-02-14 06:51:57 +00:00
Craig Topper f73ff612ca [DAGCombiner] Add one use check to fold (not (and x, y)) -> (or (not x), (not y))
Summary:
If the and has an additional use we shouldn't invert it. That creates an additional instruction.

While there add a one use check to the transform above that looked similar.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43225

llvm-svn: 325019
2018-02-13 16:25:27 +00:00
Sanjay Patel 907b58530f [DAG] fix type of undef returned by getNode()
The bug has been lying dormant, but apparently was never exposed, until
after rL324941 because we didn't return the correct result 
for shifts with undef operands.

llvm-svn: 325010
2018-02-13 14:55:07 +00:00
Sanjay Patel 014c000f6a [DAG] make binops with undef operands consistent with IR
This started by noticing that scalar and vector types were producing different results with div ops in PR36305:
https://bugs.llvm.org/show_bug.cgi?id=36305

...but the problem is bigger. I couldn't keep it straight without a table, so I'm attaching that as a PDF to 
the review. The x86 tests in undef-ops.ll correspond to that table.

Green means that instsimplify and the DAG agree on the result for all types.
Red means the DAG was returning undef when IR was not.
Yellow means the DAG was returning a non-undef result when IR returned undef.

This patch assumes that we're currently doing the right thing in IR.

Note: I couldn't find any problems with lowering vector constants as the code comments were warning, 
but those comments were written long ago in rL36413 .

Differential Revision: https://reviews.llvm.org/D43141

llvm-svn: 324941
2018-02-12 21:37:27 +00:00
Oliver Stannard 02f08c9d1f [AArch64] Improve v8.1-A code-gen for atomic load-and
Armv8.1-A added an atomic load-clear instruction (which performs bitwise
and with the complement of it's operand), but not a load-and
instruction. Our current code-generation for atomic load-and always
inserts an MVN instruction to invert its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-and
operation into an xor with -1 and a load-clear, allowing the normal DAG
optimisations to work on it.

To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't
see any easy way to do this with an AArch64-specific ISD node, because
the code-generation for atomic operations assumes the SDNodes are of
type AtomicSDNode.

I've left the old tablegen patterns in because they are still needed for
global isel.

Differential revision: https://reviews.llvm.org/D42478

llvm-svn: 324908
2018-02-12 17:03:11 +00:00
Sanjay Patel eb8c408e50 [TargetLowering] try to create -1 constant operand for math ops via demanded bits
This reverses instcombine's demanded bits' transform which always tries to clear bits in constants.

As noted in PR35792 and shown in the test diffs:
https://bugs.llvm.org/show_bug.cgi?id=35792
...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity. 

I did investigate changing instcombine's behavior, but it would be more work to change 
canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, 
so we'd have to figure out how to not regress those cases.

Differential Revision: https://reviews.llvm.org/D42986

llvm-svn: 324839
2018-02-11 14:38:23 +00:00
Simon Pilgrim 0be5567a89 [X86][SSE] Enable SMIN/SMAX/UMIN/UMAX custom lowering for all legal types
This allows us to recognise more saturation patterns and also simplify some MINMAX codegen that was failing to combine CMPGE comparisons to a legal CMPGT.

Differential Revision: https://reviews.llvm.org/D43014

llvm-svn: 324837
2018-02-11 10:52:37 +00:00
Craig Topper 36f913ee80 [SelectionDAG] Remove TargetLowering::getConstTrueVal. Use SelectionDAG::getBoolConstant in the one place it was used.
SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better as it will use the source VT to make sure it can properly detect floating point if it is configured differently.

llvm-svn: 324832
2018-02-11 04:58:58 +00:00
Nirav Dave c8c9d4fe35 [DAG] Make early exit hasPredecessorHelper return true. NFCI.
All uses conservatively assume in early exit case that it will be a
predecessor. Changing default removes checking code in all uses.

llvm-svn: 324797
2018-02-10 02:41:22 +00:00
Stefan Maksimovic dc66ae78c6 [SelectionDAG] Provide adequate register class for RegisterSDNode
When adding operands to machine instructions in case of
RegisterSDNodes, generate a COPY node in case the register class
does not match the one in the instruction definition.

Differental Revision: https://reviews.llvm.org/D35561

llvm-svn: 324733
2018-02-09 13:55:25 +00:00
Vedant Kumar 7fd9a58d8c Revert "WIP: [DAGCombiner] Assert that debug info is preserved"
This reverts commit r324648. It was committed accidentally.

llvm-svn: 324650
2018-02-08 20:27:35 +00:00
Vedant Kumar 28323ff5a3 WIP: [DAGCombiner] Assert that debug info is preserved
llvm-svn: 324648
2018-02-08 20:27:09 +00:00
Craig Topper 9b611e436f [SelectionDAG] Add a helper function for creating a boolean constant based on the target's boolean content
Many in SimplifySetCC and FoldSetCC try to create true or false constants. Some of them query getBooleanContents to figure out whether to use all ones or just 1 for true. But many places do not check and just use 1 without ensuring the VT has an i1 scalar type. Note sure if those places only trigger before type legalization so they only see an i1
type?

To cleanup the inconsistency and reduce some duplicated code, this patch adds a getBoolConstant method to SelectionDAG that takes are of querying getBooleanContents and doing the right thing.

Differential Revision: https://reviews.llvm.org/D43037

llvm-svn: 324634
2018-02-08 18:55:14 +00:00
Craig Topper c19aed963e [DAGCombiner] Fix a couple mistakes from r324311 by really passing the original load to ExtendSetCCUses.
We're passing the binary op that uses the load instead of the load.

Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out.

llvm-svn: 324568
2018-02-08 06:27:18 +00:00
Craig Topper 9b9d527427 [DAGCombiner] Don't create truncate nodes in (aext (zextload x)) -> (zextload x) and similar folds. NFCI
The truncate is being used to replace other users of of the load, but we checked that the load only has one use so there are no other uses to replace.

llvm-svn: 324567
2018-02-08 06:04:18 +00:00
Craig Topper cbfe41ac2f [DAGCombiner] Avoid creating truncate nodes in (zext (and (load)))->(and (zextload)) fold until we know for sure we're going to need it. NFCI
The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses so was created early, but that's no longer the case.

llvm-svn: 324562
2018-02-08 04:38:04 +00:00
Craig Topper bf4ed42606 [DAGCombiner] Rename variable to be slightly better. NFC
We were calling a load LN0 but it came from N0.getOperand(0) so its really more like LN00 if we follow the name used in other places.

llvm-svn: 324561
2018-02-08 04:38:02 +00:00
Nirav Dave efed656873 [SelectionDAG] More Aggressibly prune nodes in AddChains. NFCI.
Travel all chains paths to first non-tokenfactor node can be
exponential work. Add simple redundency check to avoid this.
Fixes PR36264.

llvm-svn: 324491
2018-02-07 17:12:34 +00:00
Eugene Leviant 25347ea895 [LegalizeDAG] Truncate condition operand of ISD::SELECT
Differential revision: https://reviews.llvm.org/D42737

llvm-svn: 324447
2018-02-07 05:38:29 +00:00
Craig Topper 58ecffd857 [DAGCombiner][AMDGPU][X86] Turn cttz/ctlz into cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero
X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together.

For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa

Differential Revision: https://reviews.llvm.org/D42985

llvm-svn: 324427
2018-02-06 23:54:37 +00:00
Andrew Kaylor c41499865b Add SelectionDAGDumper support for strict FP nodes
Patch by Kevin P. Neal

llvm-svn: 324416
2018-02-06 22:28:15 +00:00
Sanjay Patel 87ce2fd82d [TargetLowering] use local variable to reduce duplication; NFCI
llvm-svn: 324401
2018-02-06 21:09:42 +00:00
Sanjay Patel e96a9014ab [TargetLowering] use local variables to reduce duplication; NFCI
llvm-svn: 324397
2018-02-06 20:49:28 +00:00
Nirav Dave 27721e8617 [DAG, X86] Improve Dependency analysis when doing multi-node
Instruction Selection

Cleanup cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies pruning the search when topological
property of NodeId allows.

As part of this propogate the NodeId-based cutoffs to narrow
hasPreprocessorHelper searches.

Reviewers: craig.topper, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41293

llvm-svn: 324359
2018-02-06 16:14:29 +00:00
Craig Topper ee1f34eb9a [DAGCombiner] Pass the original load to ExtendSetCCUses not the turncate.
Summary:
This method is trying to use the truncate node to find which SETCC operand should be replaced directly with the extended load.

This used to work correctly because all uses of the original load were replaced by the truncate before this function was called. So this was used to effectively bypass the truncate and find the load under it.

All but one of the callers now call this before the truncate has replaced the laod so the setcc doesn't yet use the truncate. To account for this we should pass the original load instead.

I changed the order of that one caller to make this work there too.

I don't have a test case because this is probably hidden by later DAG combines causing the extend and truncate to cancel out. I assume this way is a little more efficient and matches what was originally intended.

Reviewers: RKSimon, spatel, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42878

llvm-svn: 324311
2018-02-06 03:23:27 +00:00
Krzysztof Parzyszek fee3f419ae [SDAG] Legalize all CondCodes by inverting them and/or swapping operands
Differential Revision: https://reviews.llvm.org/D42788

llvm-svn: 324274
2018-02-05 21:27:16 +00:00
Craig Topper fc5bd023dd [DAGCombiner] When folding fold (sext/zext (and/or/xor (sextload/zextload x), cst)) -> (and/or/xor (sextload/zextload x), (sext/zext cst)) make sure we check the legality of the full extended load.
Summary:
If the load is already an extended load we should be using the memory VT for the legality check, not just the VT of the current extension.

I don't have a test case, just noticed it while investigating some load extension improvements.

Reviewers: RKSimon, spatel, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42783

llvm-svn: 324181
2018-02-03 23:00:31 +00:00
Simon Pilgrim 4df6499f10 [SelectionDAG] Don't use simple VT in generic shuffle code
Better to assume that any value type may be commuted, not just MVTs.

No test case right now, but discovered while investigating possible shuffle combines.

llvm-svn: 324179
2018-02-03 21:34:42 +00:00
Jonas Paulsson 422dfbf7cc [SelectionDAG] Consider endianness in scalarizeVectorStore().
When handling vectors with non byte-sized elements, reverse the order of the
elements in the built integer if the target is Big-Endian.

SystemZ tests updated.

Review: Eli Friedman, Ulrich Weigand.
https://reviews.llvm.org/D42786

llvm-svn: 324063
2018-02-02 08:48:02 +00:00
Jonas Paulsson ad089fe46e [SelectionDAG] Add an assert in getNode() for EXTRACT_VECTOR_ELT.
When getNode() is called to create an EXTRACT_VECTOR_ELT, assert that
the result VT is at least as wide as the vector element type.

Review: Eli Friedman
llvm-svn: 324061
2018-02-02 08:21:53 +00:00
Craig Topper a5944aade1 [DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output
We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.

Fixes PR36199

Differential Revision: https://reviews.llvm.org/D42809

llvm-svn: 324002
2018-02-01 20:48:50 +00:00
Sanjay Patel 657e5d8d41 [DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994)
As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when 
using a reciprocal square root estimate instruction.

Here, I've conditionalized the filtering out of denorms based on the function 
having "denormal-fp-math"="ieee" in its attributes. The other options for this 
attribute are 'preserve-sign' and 'positive-zero'.

So we don't generate this extra code by default with just '-ffast-math' (because 
then there's no denormal attribute string at all), but it works if you specify 
'-ffast-math -fdenormal-fp-math=ieee' from clang. 

As noted in the review, there may be other problems in clang that affect the 
results depending on platform (Linux x86 at least), but this should allow 
creating the desired codegen.

Differential Revision: https://reviews.llvm.org/D42323

llvm-svn: 323981
2018-02-01 16:57:18 +00:00
Nirav Dave 18f7f60e17 [SelectionDAG] Fix UpdateChains handling of TokenFactors
Summary:
In Instruction Selection UpdateChains replaces all matched Nodes'
chain references including interior token factors and deletes them.
This may allow nodes which depend on these interior nodes but are not
part of the set of matched nodes to be left with a dangling dependence.
Avoid this by doing the replacement for matched non-TokenFactor nodes.

Fixes PR36164.

Reviewers: jonpa, RKSimon, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42754

llvm-svn: 323977
2018-02-01 16:11:59 +00:00
Dean Michael Berris cdca0730be [XRay][compiler-rt+llvm] Update XRay register stashing semantics
Summary:
This change expands the amount of registers stashed by the entry and
`__xray_CustomEvent` trampolines.

We've found that since the `__xray_CustomEvent` trampoline calls can show up in
situations where the scratch registers are being used, and since we don't
typically want to affect the code-gen around the disabled
`__xray_customevent(...)` intrinsic calls, that we need to save and restore the
state of even the scratch registers in the handling of these custom events.

Reviewers: pcc, pelikan, dblaikie, eizan, kpw, echristo, chandlerc

Reviewed By: echristo

Subscribers: chandlerc, echristo, hiraditya, davide, dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D40894

llvm-svn: 323940
2018-02-01 02:21:54 +00:00
Matt Arsenault df0f25070c DAG: Fix not truncating when promoting bswap/bitreverse
These need to convert back to the original type, like any
other promotion.

llvm-svn: 323932
2018-01-31 23:54:16 +00:00
Nirav Dave c3a1e16db1 [DAG] Prevent NodeId pruning of TokenFactors in Instruction Selection.
Summary:
Instruction Selection preserves relative orders of all nodes save
TokenFactors which we treat specially. As a result Node Ids for
TokenFactors may violate the topological ordering and should not be
considered as valid pruning candidates in predecessor search.

Fixes PR35316.

Reviewers: RKSimon, hfinkel

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D42701

llvm-svn: 323880
2018-01-31 15:23:17 +00:00
Roger Ferrer Ibanez aea4208720 [ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR.
In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do
copies CPSR ↔ GPR but not all Thumb1 targets implement them.

The schedule can attempt, before attempting a copy, to clone the instructions
but it does not currently do that for nodes with input glue. In this patch we
introduce a target-hook to let the hook decide if a glued machinenode is still
eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS .

As a follow-up of this change we should actually implement the copies for the
Thumb1 targets that do implement them and restrict the hook to the targets that
can't really do such copy as these clones are not ideal.

This change fixes PR35836.

Differential Revision: https://reviews.llvm.org/D42051

llvm-svn: 323857
2018-01-31 09:23:43 +00:00
Simon Dardis daaeaba665 [mips] Fix incorrect sign extension for fpowi libcall
PR36061 showed that during the expansion of ISD::FPOWI, that there
was an incorrect zero extension of the integer argument which for
MIPS64 would then give incorrect results. Address this with the
existing mechanism for correcting sign extensions.

This resolves PR36061.

Thanks to James Cowgill for reporting the issue!

Reviewers: atanasyan, hfinkel

Differential Revision: https://reviews.llvm.org/D42537

llvm-svn: 323781
2018-01-30 16:24:10 +00:00
Dan Gohman 832092ca12 [SelectionDAG]: Ignore "returned" in the presence of an implicit sret.
When a function return value can't be directly lowered, such as
returning an i128 on WebAssembly, as indicated by the CanLowerReturn
target hook, SelectionDAGBuilder can translate it to return the
value through a hidden sret-like argument.

If such a function has an argument with the "returned" attribute,
the attribute can't be automatically lowered, because the function
no longer has a normal return value. For now, just discard the
"returned" attribute.

This fixes PR36128.

llvm-svn: 323715
2018-01-30 00:14:40 +00:00
Craig Topper 62b62356fa [X86] Make foldLogicOfSetCCs work better for vectors pre legal types/operations
Summary:
There's a check in the code to only check getSetCCResultType after LegalOperations or if the type is MVT::i1. But the i1 check is only allowing scalar types through. I think it should check that the scalar type is MVT::i1 so that it will work for vectors.

The changed test already does this combine with AVX512VL where getSetCCResultType returns vXi1. But with avx512f and no VLX getSetCCResultType returns a type matching the width of the input type.

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42619

llvm-svn: 323631
2018-01-29 07:52:55 +00:00
Hiroshi Inoue c8e9245816 [NFC] fix trivial typos in comments and documents
"to to" -> "to"

llvm-svn: 323628
2018-01-29 05:17:03 +00:00
Craig Topper 2c570eaa00 [TargetLowering] Teach TargetLowering::SimplifySetCC to simplify setcc of vXi1 vectors into logic ops.
This transform was already being done for setcc of scalar i1. This extends it to vectors.

llvm-svn: 323585
2018-01-27 09:10:58 +00:00
Craig Topper c80f0ced84 [SelectionDAG] Make DAGTypeLegalizer::PromoteSetCCOperands handle SETEQ/SETNE correctly for vector types.
The code was using getValueSizeInBits and combining with the result of a call to DAG.ComputeNumSignBits. But for vector types getValueSizeInBits returns the width of the full vector while ComputeNumSignBits is going to give a number no larger than the width of a single element. So we should be using getScalarValueSizeInBits to get the element width.

llvm-svn: 323583
2018-01-27 08:41:03 +00:00
Craig Topper 8f324bb1a4 [SelectionDAGISel] Add a debug print before call to Select. Adjust where blank lines are printed during isel process to make things more sensibly grouped.
Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table.

It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search.

There were also some oddities in blank line behavior. Usually due to a \n after a call to SelectionDAGNode::dump which already inserted a new line.

llvm-svn: 323551
2018-01-26 19:34:20 +00:00
Nirav Dave 9896238dc9 [DAG] Teach findBaseOffset to interpret indexes of indexed memory operations
Indexed outputs are addition / subtractions and can be interpreted as such.

llvm-svn: 323539
2018-01-26 16:51:27 +00:00
Simon Pilgrim f531cf8964 [DAGCombine] reduceBuildVecToShuffle - ensure EXTRACT_VECTOR_ELT index is in range
From OSS Fuzz Test Case #5688

llvm-svn: 323535
2018-01-26 15:50:20 +00:00
Hiroshi Inoue 0909ca132f [NFC] fix trivial typos in comments and documents
"in in" -> "in", "on on" -> "on" etc.

llvm-svn: 323508
2018-01-26 08:15:29 +00:00
Craig Topper b5c45e0509 [SelectionDAG] Replace a std::vector<SDValue> with a SmallVector.
It likely the number of elements in the type we're legalizing here is reasonably small.

llvm-svn: 323505
2018-01-26 07:15:22 +00:00
Amara Emerson f386e2b081 [GlobalISel] Don't fall back to FastISel.
Apparently checking the pass structure isn't enough to ensure that we don't fall
back to FastISel, as it's set up as part of the SelectionDAGISel.

llvm-svn: 323369
2018-01-24 19:59:29 +00:00
Sven van Haastregt e8404780c3 [DAGCombiner] Bail out if vector size is not a multiple
For the included test case, the DAG transformation
  concat_vectors(scalar, undef) -> scalar_to_vector(sclr)
would attempt to create a v2i32 vector for a v9i8
concat_vector.  Bail out to avoid creating a bitcast with
mismatching sizes later on.

Differential Revision: https://reviews.llvm.org/D42379

llvm-svn: 323312
2018-01-24 09:53:47 +00:00
Jonas Paulsson 7ad28863fb [SelectionDAG] Fix codegen of vector stores with non byte-sized elements.
This was completely broken, but hopefully fixed by this patch.

In cases where it is needed, a vector with non byte-sized elements is stored
by extracting, zero-extending, shift:ing and or:ing the elements into an
integer of the same width as the vector, which is then stored.

Review: Eli Friedman, Ulrich Weigand
https://reviews.llvm.org/D42100#inline-369520
https://bugs.llvm.org/show_bug.cgi?id=35520

llvm-svn: 323042
2018-01-20 16:05:10 +00:00
Ulrich Weigand 5605be9e50 [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value
The second return value of ATOMIC_CMP_SWAP_WITH_SUCCESS is known to be a
boolean, and should therefore be treated by computeKnownBits just like
the second return values of SMULO / UMULO.

Differential Revision: https://reviews.llvm.org/D42067

llvm-svn: 322985
2018-01-19 20:47:14 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Sanjay Patel 0e690b05d1 [TargetLowering] add punctuation for readability; NFC
llvm-svn: 322855
2018-01-18 15:25:32 +00:00
Sam Parker 1f8c035d6b [SelectionDAG] Convert assert to condtion
Follow-up to r322120 which can cause assertions for AArch64 because
v1f64 and v1i64 are legal types.

Differential Revision: https://reviews.llvm.org/D42097

llvm-svn: 322823
2018-01-18 09:22:24 +00:00
Craig Topper 7f0d85ec1e [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector
For example, a build_vector of i64 bitcasted from v2i32 can be turned into a concat_vectors of the v2i32 vectors with a bitcast to a vXi64 type

Differential Revision: https://reviews.llvm.org/D42090

llvm-svn: 322811
2018-01-18 04:17:06 +00:00
Eli Friedman c60a23a6af [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization.
The code wasn't zero-extending correctly, so the comparison could
spuriously fail.

Adds some AArch64 tests to cover this case.

Inspired by D41791.

Differential Revision: https://reviews.llvm.org/D41798

llvm-svn: 322767
2018-01-17 22:04:36 +00:00
Benjamin Kramer 736a343e97 Revert "[DAG] Elide overlapping stores"
This reverts commit r322085. Internal PPC testing is still showing the
same symptoms as when this patch landed the last time.

llvm-svn: 322474
2018-01-15 10:57:24 +00:00
Daniel Neilson 2409d24201 [NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant
Summary:
 In preparation for https://reviews.llvm.org/D41675 this NFC changes this
prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead
of a Constant.

llvm-svn: 322403
2018-01-12 21:33:37 +00:00
Adrian Prantl 29102b3002 dag-combine: Transfer debug information when folding (zext (truncate x))
-> (zext (truncate x))

This patch adds debug info support to the dagcombine rule (zext
(truncate x)) -> (zext (truncate x)).

Differential Revision: https://reviews.llvm.org/D41924

llvm-svn: 322304
2018-01-11 18:35:12 +00:00
Zvi Rackover 999e6c2967 DAGCombine: Let truncates negate extension through extract-subvector
Summary:
Fold cases such as:
(v8i8 truncate (v8i32 extract_subvector (v16i32 sext (v16i8 V), Idx)))
->
(v8i8 extract_subvector (v16i8 V), Idx)

This can be generalized to cases where the truncate and extend do not
fully cancel each other out, but it may require querying the target
about profitability.

Reviewers: RKSimon, craig.topper, spatel, efriedma

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41927

llvm-svn: 322300
2018-01-11 18:02:33 +00:00
Jonas Paulsson 9b395a12ed [VectorLegalizer] Remove broken code in ExpandStore.
The code that is supposed to "Round odd types to the next pow of two" seems
broken and as well completely unused (untested). It also seems that
ExpandStore really shouldn't ever change the memory VT, which this in fact
does.

As a first step in fixing the broken handling of vector stores (of irregular
types, e.g. an i1 vector), this code is removed. For discussion, see
https://bugs.llvm.org/show_bug.cgi?id=35520.

Review: Eli Friedman
llvm-svn: 322275
2018-01-11 13:03:21 +00:00
Craig Topper af4eb17223 [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.

Most of this patch is just making sure we copy the scale around everywhere.

Differential Revision: https://reviews.llvm.org/D40055

llvm-svn: 322210
2018-01-10 19:16:05 +00:00
Jonas Paulsson 9222b91e24 [SelectionDAGBuilder] Chain prefetches less aggressively.
Prefetches used to always be chained between any previous and following
memory accesses. The problem with this was that later optimizations, such as
folding of a load into the user instruction, got disrupted.

This patch relaxes the chaining of prefetches in order to remedy this.

Reveiw: Hal Finkel
https://reviews.llvm.org/D38886

llvm-svn: 322163
2018-01-10 09:33:00 +00:00
Tim Renouf d68fa1be57 [SelectionDAG] Fixed f16-from-vector promotion problem
Summary:
In the case of an fp_extend of v1f16 to v1f32 where the v1f16 is the
result of a bitcast from i16, avoid creating an illegal fp16_to_fp where
the input is not a vector and the result is a v1f32.

V2: The fix is now to avoid vector scalarization creating a v1->scalar
bitcast.

Reviewers: srhines, t.p.northover

Subscribers: nhaehnle, llvm-commits, dstuttard, t-tye, yaxunl, wdng, kzhuravl, arsenm

Differential Revision: https://reviews.llvm.org/D41126

llvm-svn: 322120
2018-01-09 21:36:25 +00:00
Sanjay Patel 37e28e40cb [SelectionDAG] lower math intrinsics to finite version of libcalls when possible (PR35672)
Ingredients in this patch:
1. Add HANDLE_LIBCALL defs for finite mathlib functions that correspond to LLVM intrinsics.
2. Plumbing to send TargetLibraryInfo down to SelectionDAGLegalize.
3. Relaxed math and library checking in SelectionDAGLegalize::ConvertNodeToLibcall() to choose finite libcalls.

There was a bug about determining the availability of the finite calls that should be fixed with:
rL322010

Not in this patch:
This doesn't resolve the question/bug of clang creating the intrinsic IR in the first place.
There's likely follow-up work needed to support the long double variants better.
There's room for improvement to reduce the code duplication.
Create finite calls that don't originate from a corresponding intrinsic or DAG node?

Differential Revision: https://reviews.llvm.org/D41338

llvm-svn: 322087
2018-01-09 15:41:00 +00:00
Nirav Dave 30304a3bd7 [DAG] Elide overlapping stores
Relanding after fixing handling of pre-indexed memory operations in
BaseIndexOffset analysis (r322003).

Extend overlapping store elision to handle overwrites of stores by
larger stores.

Reviewers: craig.topper, rnk, t.p.northover

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40969

llvm-svn: 322085
2018-01-09 15:23:12 +00:00
Nirav Dave 6e2d03d410 [DAG] Teach BaseIndexOffset to correctly handle with indexed operations
BaseIndexOffset address analysis incorrectly ignores offsets folded
into indexed memory operations causing potential errors in alias
analysis of pre-indexed operations.

Reviewers: efriedma, RKSimon, hfinkel, jyknight

Subscribers: hiraditya, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D41701

llvm-svn: 322003
2018-01-08 16:21:35 +00:00
Sam Parker 3800f0f11d [DAGCombine] Fix for PR35761
I had falsely assumed that constant operands would be operand(1) of
the bin ops that may need their constant operand to be masked.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35761

Differential Revision: https://reviews.llvm.org/D41667

llvm-svn: 321991
2018-01-08 13:21:24 +00:00
Simon Pilgrim 998180dad3 [DAG] Fix for Bug PR34620 - Allow SimplifyDemandedBits to look through bitcasts
Allow SimplifyDemandedBits to use TargetLoweringOpt::computeKnownBits to look through bitcasts. This can help simplifying in some cases where bitcasts of constants generated during or after legalization can't be folded away, and thus didn't get picked up by SimplifyDemandedBits. This fixes PR34620, where a redundant pand created during legalization from lowering and lshr <16xi8> wasn't being simplified due to the presence of a bitcasted build_vector as an operand.

Committed on the behalf of @sameconrad (Sam Conrad)

Differential Revision: https://reviews.llvm.org/D41643

llvm-svn: 321969
2018-01-07 19:09:40 +00:00
Craig Topper d58c165545 [X86] Make v2i1 and v4i1 legal types without VLX
Summary:
There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type.

It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway.

This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.

We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.

I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all.

There's definitely room for improvement with some follow up patches.

Reviewers: RKSimon, zvi, guyblank

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41560

llvm-svn: 321967
2018-01-07 18:20:37 +00:00
Craig Topper d461aefe5f [PowerPC] Add an ISD::TRUNCATE to the legalization for ppc_is_decremented_ctr_nonzero
Summary:
I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work.

There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed.

With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: nemanjai, llvm-commits, kbarton

Differential Revision: https://reviews.llvm.org/D41654

llvm-svn: 321959
2018-01-07 07:51:36 +00:00
Sam Parker 1ad085b808 [DAGCombine] Fix for PR37563
While searching for loads to be narrowed, equal sized loads were not
added to the list, resulting in anyext loads not being converted to
zext loads.

https://bugs.llvm.org/show_bug.cgi?id=35763

Differential Revision: https://reviews.llvm.org/D41628

llvm-svn: 321862
2018-01-05 08:47:23 +00:00
Amara Emerson 8a100dd2a6 [DAGCombine] Ensure SDNode use iterator is incremented properly.
Fixes an ASAN bug found by oss-fuzz.

llvm-svn: 321813
2018-01-04 18:38:45 +00:00
Simon Pilgrim ec0a2fb703 [DAGCombine] Handle out of range EXTRACT_VECTOR_ELT indices
Handle this in DAGCombiner::visitEXTRACT_VECTOR_ELT the same as we already do in SelectionDAG::getNode and use APInt instead of getZExtValue.

This should also fix oss-fuzz #4910

llvm-svn: 321767
2018-01-03 22:42:33 +00:00
Daniel Jasper cc4903e2ba Revert r321089: "[DAG] Elide overlapping store" (and subsequent fix in r321204)
Our internal testing has revealed has discovered bugs in PPC builds.
I have forward reproduction instructions to the original author (Nirav).

llvm-svn: 321649
2018-01-02 14:38:52 +00:00
Sam Parker 3570c554b5 [DAGCombine] Fix for PR35765
Remove the acceptance of ANY_EXTEND nodes while trying to move and
nodes back to loads.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35765

Differential Revision: https://reviews.llvm.org/D41625

llvm-svn: 321641
2018-01-02 10:19:01 +00:00
Craig Topper e3b6bd337a [SelectionDAG] Teach WidenVecOp_Convert to widen the operation if a widened result type would still be legal.
llvm-svn: 321638
2018-01-02 07:30:53 +00:00
Craig Topper 3e528c7518 [SelectionDAG] Remove ifs on getTypeAction being TypeWidenVector from some of the WideVecOp handlers.
We should only be in the handler if the tyep action is TypeWidenVector. There's no reason to try to do anything else.

llvm-svn: 321635
2018-01-02 01:55:07 +00:00
Craig Topper a4f9997675 [SelectionDAG][X86][AArch64] Require targets to specify the promotion type when using setOperationAction Promote for INT_TO_FP and FP_TO_INT
Currently the promotion for these ignores the normal getTypeToPromoteTo and instead just tries to double the element width. This is because the default behavior of getTypeToPromote to just adds 1 to the SimpleVT, which has the affect of increasing the element count while keeping the scalar size the same.

If multiple steps are required to get to a legal operation type, int_to_fp will be promoted multiple times. And fp_to_int will keep trying wider types in a loop until it finds one that works.

getTypeToPromoteTo does have the ability to query a promotion map to get the type and not do the increasing behavior. It seems better to just let the target specify the promotion type in the map explicitly instead of letting the legalizer iterate via widening.

FWIW, it's worth I think for any other vector operations that need to be promoted, we have to specify the type explicitly because the default behavior of getTypeToPromote isn't useful for vectors. The other types of promotion already require either the element count is constant or the total vector width is constant, but neither happens by incrementing the SimpleVT enum.

Differential Revision: https://reviews.llvm.org/D40664

llvm-svn: 321629
2018-01-01 19:21:35 +00:00
Benjamin Kramer c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Dimitry Andric 2fb6134305 Avoid modifying DbgInfo while looping in salvageDebuginfo
Summary:
I have been getting rather difficult to reproduce SIGBUS crashes when
compiling certain FreeBSD sources, and their stack traces pointed
squarely at `SelectionDAG::salvageDebugInfo()`:

```
Core was generated by `/usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/usr/bin/cc -cc1 -'.
Program terminated with signal SIGBUS, Bus error.
#0  isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
115       bool isInvalidated() const { return Invalid; }
(gdb) bt
#0  isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
#1  salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
#2  0x00000000033b2516 in operator() () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3595
#3  __invoke<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/type_traits:4323
#4  __call<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/__functional_base:349
#5  operator() () at /usr/include/c++/v1/functional:1562
#6  0x00000000033b0817 in operator() () at /usr/include/c++/v1/functional:1916
#7  NodeDeleted () at /share/dim/src/freebsd/clang600-import/contrib/llvm/include/llvm/CodeGen/SelectionDAG.h:293
#8  0x0000000003529dde in RemoveDeadNodes () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:610
#9  0x00000000035556df in MorphNodeTo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6794
#10 0x00000000033a9acc in MorphNode () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:2594
#11 0x00000000033ac80b in SelectCodeCommon () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3601
#12 0x00000000023d464b in SelectCode () at /usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/X86GenDAGISel.inc:282902
#13 Select () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:3072
#14 0x00000000033a5afa in DoInstructionSelection () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:988
#15 0x00000000033a4e1a in CodeGenAndEmitDAG () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:868
#16 0x00000000033a2643 in SelectAllBasicBlocks () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#17 0x000000000339f158 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#18 0x00000000023d03c4 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:175
#19 0x00000000035cc8c2 in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
#20 0x00000000030dca9a in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1520
#21 0x00000000030dccf3 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1541
#22 0x00000000030dd228 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1597
#23 run () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1700
#24 0x00000000014db578 in EmitAssembly () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:815
#25 EmitBackendOutput () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1181
#26 0x00000000014d5b26 in HandleTranslationUnit () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292
#27 0x0000000001c4c332 in ParseAST () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Parse/ParseAST.cpp:159
#28 0x00000000015d546c in Execute () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:897
#29 0x0000000001cec311 in ExecuteAction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991
#30 0x00000000014b4f81 in ExecuteCompilerInvocation () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252
#31 0x00000000014aa73f in cc1_main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/cc1_main.cpp:221
#32 0x00000000014b2928 in ExecuteCC1Tool () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:309
#33 main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:388
(gdb) frame 1
#1  salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
7116        if (DV->isInvalidated())
(gdb) disassemble
Dump of assembler code for function salvageDebugInfo():
[...]
   0x0000000003557348 <+744>:   nopl   0x0(%rax,%rax,1)
   0x0000000003557350 <+752>:   mov    (%r12),%r13
=> 0x0000000003557354 <+756>:   cmpb   $0x0,0x31(%r13)
   0x0000000003557359 <+761>:   jne    0x35573b0 <salvageDebugInfo()+848>
(gdb) info registers
[...]
r13            0x5a5a5a5a5a5a5a5a       6510615555426900570
```

The `0x5a5a5a5a5a5a5a5a` value in `r13` indicates the memory was either
uninitialized, or already freed.

Unfortunately I do not have a simple self-contained test case for this.
However, it seems pretty clear that the call to `AddDbgValue()` in
`salvageDebugInfo()` causes the problems, since it modifies
`SelectionDag::DbgInfo` while looping through one of its DenseMaps:

```
void SelectionDAG::salvageDebugInfo(SDNode &N) {
[...]
  for (auto DV : GetDbgValues(&N)) {
    if (DV->isInvalidated())
      continue;
[...]
        AddDbgValue(Clone, N0.getNode(), false);
[...]
  }
}
```

At least, if I comment out the `AddDbgValue()` call, the crashes go
away.  I propose to change this function slightly, similar to the
`SelectionDAG::transferDbgValues()` function just above it, to save the
cloned SDDbgValues in a separate SmallVector, and only call
AddDbgValue() on them after the for loop is done.

Reviewers: aprantl, bogner, bkramer, davide

Reviewed By: davide

Subscribers: davide, krytarowski, JDevlieghere, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D41589

llvm-svn: 321545
2017-12-28 23:42:44 +00:00
Craig Topper c077ae68c3 [SelectionDAG] Add creating new node debug messages for load, store, gather, and scatter.
llvm-svn: 321540
2017-12-28 19:46:16 +00:00
Craig Topper d5fed997db [SelectionDAG] Add some debug print messages to LegalizeVectorOps.
llvm-svn: 321535
2017-12-28 19:46:01 +00:00
Simon Pilgrim 6fad3cbc66 [DAGCombine] foldBinOpIntoSelect can fail to constant fold in some cases.
For example, float operations may fail to constant fold under certain circumstances (inf/nan/denormal creation etc.)

Reduced from oss-fuzz #4802 test case

llvm-svn: 321488
2017-12-27 11:36:18 +00:00
Simon Pilgrim b17c204cc0 [DAGCombine] visitANDLike - ensure APInt is is in range for getSExtValue/getZExtValue
Reduced from oss-fuzz #4782 test case

llvm-svn: 321464
2017-12-26 23:27:44 +00:00
Simon Pilgrim 628f63e5fd [DAGCombine] Don't combine (and (setne X, 0), (setne X, -1)) --> (setuge (add X, 1), 2) for i1
Reduced from oss-fuzz #4773 test case

llvm-svn: 321455
2017-12-26 14:48:28 +00:00
Craig Topper f65a6e4ed8 [DAGCombiners] Don't turn ANDs to shuffles with zero so early. Give some other combines a chance to run.
This moves the combine for turning ANDs into shuffle with zero out of SimplifyVBinOps and places it only in visitAND below the reassociate handling. This fixes the specific case I noticed where we failed to combine two ands with constants.

llvm-svn: 321417
2017-12-24 02:05:18 +00:00
Craig Topper 1f2f265fc1 [SelectionDAG] Teach SelectionDAG::getNode to constant fold zext/aext/sext of constant build vectors.
llvm-svn: 321414
2017-12-23 20:21:29 +00:00
Craig Topper d6a8f2e67d [SelectionDAG][X86] Don't use ->getValueType(0) after a call to getOperand to get the type of the operand.
getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number if 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number and we shouldn't assume it was 0.

I don't have a test case. Just noticed while cleaning up some other code and saw that it occurred in other places.

llvm-svn: 321397
2017-12-23 02:54:50 +00:00
Nirav Dave afeae77058 [DAG] Add missing case check from findbaseoffset merge from r321389.
llvm-svn: 321391
2017-12-22 22:06:56 +00:00
Nirav Dave d8e3633db4 Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

Relanding after correcting base address matching check.

llvm-svn: 321389
2017-12-22 21:20:55 +00:00
Nirav Dave 9c26e92aab Revert "[DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI."
which was causing miscompilations in for some test-suite components.

This reverts commit 3e9de9ff0f3162920a2a3cba51c7dc14b54b4d16.

llvm-svn: 321380
2017-12-22 19:33:56 +00:00
Craig Topper b2368fbdf4 [SelectionDAG] Reverse the order of operands in the ISD::ADD created by TargetLowering::getVectorElementPointer so that the FrameIndex is on the left.
This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint.

I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.

llvm-svn: 321370
2017-12-22 17:18:13 +00:00
Nirav Dave 4ce902657a [DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

llvm-svn: 321364
2017-12-22 16:59:09 +00:00
Sam Parker cf426fccd4 [DAGCombine] Revert r321259
Improve ReduceLoadWidth for SRL Patch is causing an issue on the
PPC64 BE santizer.

llvm-svn: 321349
2017-12-22 08:36:25 +00:00
Simon Pilgrim 695c7da3d5 [DAGCombiner] Remove (xor (xor x, c1), c2) -> (xor x, (xor c1, c2)) fold. NFCI.
More general cases are already handled by constant canonicalization and then the ReassociateOps call at line 5327

llvm-svn: 321280
2017-12-21 16:54:03 +00:00
Simon Pilgrim 6b915d3353 [DAGCombiner] Generalize (or (and X, c1), c2) -> (and (or X, c2), c1|c2) combine to work on non-splat vectors
The knownbits_mask_or_shuffle_uitofp change is interesting - shuffle combines manage to kick in, removing the AND constant mask load. For targets with fast-variable-shuffle this should reduce further to VPOR+VPSHUFB+VCVTDQ2PS.

llvm-svn: 321279
2017-12-21 16:34:46 +00:00
Simon Pilgrim 4dd03ed7e3 [DAGCombiner] Generalize (and (or x, C), D) -> D iff (C & D) == D combine to work on non-splat vectors
llvm-svn: 321275
2017-12-21 15:17:29 +00:00
Sam Parker 59efb8cb5b [DAGCombine] Improve ReduceLoadWidth for SRL
If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.

Differential Revision: https://reviews.llvm.org/D41350

llvm-svn: 321259
2017-12-21 12:55:04 +00:00
Matt Arsenault d60951f469 DAG: Tolerate non-MemSDNodes for OPC_RecordMemRef
When intrinsics are allowed to have mem operands, there
are two ways this can happen. First is an intrinsic
that is marked has having a mem operand, but is not handled
by getTgtMemIntrinsic.

The second way can occur even for intrinsics which do not
have a mem operand. It seems the selector table does
some kind of sorting based on the opcode, and the
mem ref recording can happen in the same scope for
intrinsics that both do and do not have mem refs.
I haven't been able to figure out exactly why this happens
(although it happens even with the matcher optimizations disabled).
I'm not sure if it's worth trying to avoid hitting this for
these nodes since I think it's still reasonable to handle
this in case getTgtMemIntrinic is not implemented.

llvm-svn: 321208
2017-12-20 19:11:59 +00:00
Nirav Dave a869856c60 [DAG] Fix condition on overlapping store check.
Prevent overlapping store elision when overlapping store is
pre-inc/dec as analysis is wrong in these cases.

llvm-svn: 321204
2017-12-20 19:06:47 +00:00
Krzysztof Parzyszek 3257e44c66 Add optional SelectionDAG* parameter to SValue::dump and SDValue::dumpr
These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".

llvm-svn: 321180
2017-12-20 15:15:04 +00:00
Adrian Prantl 0e6694d111 Silence a bunch of implicit fallthrough warnings
llvm-svn: 321114
2017-12-19 22:05:25 +00:00
Nirav Dave 51425fa5ba [DAG] Elide overlapping store
Summary:
Extend overlapping store elision to handle overwrites of stores by
larger stores.

Nontemporal tests have been modified to add memory dependencies to
prevent store elision.

Reviewers: craig.topper, rnk, t.p.northover

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40969

llvm-svn: 321089
2017-12-19 17:10:56 +00:00
Sam Parker 00804efd72 [DAGCombine] Move AND nodes to multiple load leaves
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.

Differential Revision: https://reviews.llvm.org/D41177

llvm-svn: 320962
2017-12-18 10:04:27 +00:00
Matthias Braun 042fed54fb Fix unused variable in non-assert builds
llvm-svn: 320885
2017-12-15 22:53:33 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Craig Topper 3fb8386685 [SelectionDAG][X86] Fix insert_vector_elt lowering for v32i1/v64i1 with non-constant index
Summary:
Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.

We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable as required by the legalizing through memory accesses path requires.

For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.

For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.

Reviewers: RKSimon, delena, spatel, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40942

llvm-svn: 320849
2017-12-15 19:35:22 +00:00
Craig Topper 23951ec2cd [SelectionDAG] Make getNode calls that take an ArrayRef of SDValue for operands call NewSDValueDbgMsg.
This makes it work better with some build_vector and concat_vectors creations.

Adjust the NewSDValueDbgMsg in getConstant to avoid duplicating the print when it calls getSplatBuildVector since getSplatBuildVector didn't trigger a print before.

llvm-svn: 320783
2017-12-15 01:03:45 +00:00
Adrian Prantl c133d8a5be EmitFuncArgumentDbgValue: Prefer stack slots over registers for stack arguments
While investigating LLVM PR22316 (http://llvm.org/bugs/show_bug.cgi?id=22316)
I started wondering if it were not always preferable to emit the
initial DBG_VALUEs for stack arguments as FI locations instead of
describing the first register they get copied into. The advantage of
doing this is that the arguments will be available as soon as the
stack is setup. As illustrated by the testcase in the PR, the first
copy of the FI into a register may be sunk by MachineSink.cpp into a
later basic block. By describing the argument on the stack, we nicely
circumvent this problem.

<rdar://problem/19583723>

Differential Revision: https://reviews.llvm.org/D41135

llvm-svn: 320758
2017-12-14 22:55:06 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Zachary Turner 260fe3eca6 Fix many -Wsign-compare and -Wtautological-constant-compare warnings.
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.

Differential Revision: https://reviews.llvm.org/D41256

llvm-svn: 320750
2017-12-14 22:07:03 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Benjamin Kramer a85822cb1e Revert "[DAGCombine] Move AND nodes to multiple load leaves"
This reverts commit r320679. Causes miscompiles.

llvm-svn: 320698
2017-12-14 14:03:07 +00:00
Sam Parker ef12b41ef7 [DAGCombine] Move AND nodes to multiple load leaves
Recommitting rL319773, which was reverted due to a recursive issue
causing timeouts. This happened because I failed to check whether
the discovered loads could be narrowed further. In the case of a tree
with one or more narrow loads, that could not be further narrowed, as
well as a node that would need masking, an AND could be introduced
which could then be visited and recombined again with the same load.
This could again create the masking load, with would be combined
again... We now check that the load can be narrowed so that this
process stops.

Original commit message:
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.

Differential Revision: https://reviews.llvm.org/D41177

llvm-svn: 320679
2017-12-14 09:31:01 +00:00
Craig Topper eab2d4665f [SelectionDAG][X86] Improve legalization of v32i1 CONCAT_VECTORS of v16i1 for AVX512F.
A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized.

This patch checks to see if v16i8 is legal and inserts a any_extend to that so that we can concat v16i8 to v32i8 and avoid creating the extracts.

llvm-svn: 320674
2017-12-14 08:25:58 +00:00
Craig Topper cf77203ff6 [SelectionDAG] When legalizing the result type of CONCAT_VECTORS, take into account whether the input type also needs to be promoted.
If so go ahead and get the promoted input vector to extract from. Previously, we would create a bunch of any_extends of extract_vector_elts with illegal input type that needs to be promoted. The legalization of those extract_vector_elts would then potentially introduce a truncate. So now we have a bunch of any_extends of truncates. By legalizing both parts together we avoid creating these extra nodes.

The test changes seem to be because we were previously combining the build_vector with the any_extend before the any_extend got combined with the truncate.

llvm-svn: 320669
2017-12-14 06:49:07 +00:00
Michael Zolotukhin c468b648fd Remove redundant includes from lib/CodeGen.
llvm-svn: 320619
2017-12-13 21:30:47 +00:00
Roger Ferrer Ibanez e8d4e88bab [DAG] Promote ADDCARRY / SUBCARRY
Add missing case that was not implemented yet.

Differential Revision: https://reviews.llvm.org/D38942

llvm-svn: 320567
2017-12-13 10:45:21 +00:00
Sanjay Patel f3436d7dab [DAGCombiner] protect against an infinite loop between shl <--> mul (PR35579)
At first, I tried to thread the x86 needle and use a target hook (isVectorShiftByScalarCheap())
to disable the transform only for non-splat pow-of-2 constants, but not AVX2, but only some
element types, but...it's difficult.

Here we just avoid the loop with the x86 vector transform that conflicts with the general DAG
combine and preserve all of the existing behavior AFAICT otherwise.

Some tests that will probably fail if someone does try to restrict this in a more targeted way
for x86-only may be found in:

test/CodeGen/X86/combine-mul.ll
test/CodeGen/X86/vector-mul.ll
test/CodeGen/X86/widen_arith-5.ll

This should prevent the infinite looping seen with:
https://bugs.llvm.org/show_bug.cgi?id=35579

Differential Revision: https://reviews.llvm.org/D41040

llvm-svn: 320374
2017-12-11 15:19:31 +00:00
Nemanja Ivanovic 25d9af0cb5 [DAGCombiner] Add combined indexed load to the work list
This commit is the first part of https://reviews.llvm.org/D40348.
In order to allow target combines to be performed on newly combined
indexed loads, add them back to the worklist. The remainder of the
above patch will be committed in subsequent revisions and will use
this. Test cases will be included with those follow-up commits.

llvm-svn: 320365
2017-12-11 14:16:02 +00:00
Roger Ferrer Ibanez 5ea0f2501f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564
 - fixes PR35103

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 320355
2017-12-11 12:13:45 +00:00
Alex Bradbury 660bcceccf [RISCV] Support lowering FrameIndex
Introduces the AddrFI "addressing mode", which is necessary simply because 
it's not possible to write a pattern that directly matches a frameindex.

Ensure callee-saved registers are accessed relative to the stackpointer. This
is necessary as callee-saved register spills are performed before the frame
pointer is set.

Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can 
make use of it in the RISC-V backend.

Differential Revision: https://reviews.llvm.org/D39848

llvm-svn: 320353
2017-12-11 11:53:54 +00:00
Craig Topper ad45bf5895 [DAGCombiner] Support folding (mulhs/u X, 0)->0 for vectors.
We should probably also fold (mulhs/u X, 1) for vectors, but that's harder.

llvm-svn: 320344
2017-12-11 08:33:20 +00:00
Craig Topper 65ed4d4492 [DAGCombiner] Reuse existing SDLoc variable instead of creating a new one. NFC
llvm-svn: 320343
2017-12-11 08:33:19 +00:00
Dylan McKay 80463fe64d Relax unaligned access assertion when type is byte aligned
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).

In these architectures, all types are aligned to 8-bits.

After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.

This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html

Reviewers: bogner, nemanjai, joerg, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cactus, llvm-commits

Differential Revision: https://reviews.llvm.org/D39946

llvm-svn: 320243
2017-12-09 06:45:36 +00:00
Adrian Prantl d13170174c Generalize llvm::replaceDbgDeclare and actually support the use-case that
is mentioned in the documentation (inserting a deref before the plus_uconst).

llvm-svn: 320203
2017-12-08 21:58:18 +00:00
Sanjay Patel 9012391af1 [DAGCombiner] eliminate shuffle of insert element
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either 
repeating a scalar insertion at the same position in a vector or translated to a different 
element index.

Like the earlier patch, this could be an instcombine too, but since we opted to make this 
a DAG transform earlier, I've made this one a DAG patch too.

We do not need any legality checking because the new insert is identical to the existing 
insert except that it may have a different constant insertion operand.

The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the 
motivation for D38756.

Differential Revision: https://reviews.llvm.org/D40209

llvm-svn: 320050
2017-12-07 15:17:58 +00:00
Craig Topper dfecd45f37 [SelectionDAG] In SplitVecOp_EXTRACT_VECTOR_ELT, simplify the code that makes the type byte addressable.
We can just extend the original vector to vXi1 and trust that the legalization process will revisit it.

llvm-svn: 320013
2017-12-07 08:04:34 +00:00
Craig Topper 26ed8d1263 [SelectionDAG] Use TLI.getVectorIdxTy to determine type for an EXTRACT_VECTOR_ELT index instead of hardcoding MVT::i8.
llvm-svn: 320012
2017-12-07 08:04:33 +00:00
Nirav Dave 7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Vlad Tsyrklevich 0b40f21134 Revert "[DAGCombine] Move AND nodes to multiple load leaves"
This reverts commit r319773. It was causing some buildbots to hang, e.g.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589

llvm-svn: 319867
2017-12-06 01:16:08 +00:00
Craig Topper 0328d0083e [SelectionDAG] Don't promote the condition operand of VSELECT when promoting the result.
The condition operand should be promoted during operand promotion.

llvm-svn: 319853
2017-12-05 23:08:32 +00:00
Craig Topper dfd90802af [SelectionDAG] Don't promote mask operand when widening mstore and mscatter.
If the mask needs to be promoted that should occur by the legalizer detecting the mask operand needs to be promoted not as a side effect of another action.

llvm-svn: 319852
2017-12-05 23:08:30 +00:00
Craig Topper ddc6bba0ee [SelectionDAG] Don't promote mask when splitting mstore.
If the mask needs to be promoted it should be handled by operand promotion after the result is legalized.

llvm-svn: 319851
2017-12-05 23:08:28 +00:00
Craig Topper 2e684593b7 [SelectionDAG] Don't promote mask operands of MGATHER and MLOAD to setcc result type while widening the result. Just widen the mask.
The mask will be promoted if necessary when operands are promoted. It's possible the mask type is legal, but the setcc result type is a different. We shouldn't promote to the setcc result type unless the mask needs to be promoted.

llvm-svn: 319850
2017-12-05 23:08:27 +00:00
Craig Topper 57440a6f65 [SelectionDAG] Don't call GetWidenedVector for mask operands of MLOAD/MSTORE.
GetWidenedVector does't guarantee the widened elements are zero which would break the intended behavior of the operation.

llvm-svn: 319849
2017-12-05 23:08:25 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Craig Topper 8adcbe8c9f [SelectionDAG] Remove the code that handles SETCC with a scalar result type from vector widening.
There's no such thing as a setcc with vector operands and scalar result. And if we're trying to widen the result we would have to already be looking at a vector result type.

So this patch renames the VSETCC function as the SETCC function and delete the original SETCC function.

llvm-svn: 319799
2017-12-05 17:37:19 +00:00
Craig Topper 558cc48b44 [SelectionDAG] Remove unused method declaration.
The method implementation was removed in r318982.

llvm-svn: 319798
2017-12-05 17:37:17 +00:00