Commit Graph

10250 Commits

Author SHA1 Message Date
Clement Courbet cb15ba84fe Reland "[DAGCombiner] Allow zextended load combines."
Check that the generated type is simple.
2019-11-22 14:47:18 +01:00
Roman Lebedev 96cf5c8d47
[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` (PR35479)
Summary:
The current lowering is:
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n3, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/2xC
https://rise4fun.com/Alive/jpb5

However, we can support non-tautological cases `C1 u> C2` too.
Said handling consists of two parts:
* `C2 u<= (-1 %u C1)`. It just works. We only have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`
```
Name: (X % C1) == C2 -> (X - C2) * C3 <= C4   iff C2 u<= (-1 %u C1)
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u<= (-1 %u C1)
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = (-1 /u C1)
%n0 = sub i8 %x, C2
%n1 = mul i8 %n0, C3
%n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right
%n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n4 = or i8 %n2, %n3 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n4, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/m4P
https://rise4fun.com/Alive/SKrx
* `C2 u> (-1 %u C1)`. We also have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`,
  and we have to decrement C4:
```
Name: (X % C1) == C2 -> (X - C2) * C3 <= C4   iff C2 u> (-1 %u C1)
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u> (-1 %u C1)
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = (-1 /u C1)-1
%n0 = sub i8 %x, C2
%n1 = mul i8 %n0, C3
%n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right
%n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n4 = or i8 %n2, %n3 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n4, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/d40
https://rise4fun.com/Alive/8cF

I believe this concludes `x u% C1 ==/!= C2` lowering.
In fact, clang is may now be better in this regard than gcc:
as it can be seen from `@t32_6_4` test, we do lower `x % 6 == 4`
via this pattern, while gcc does not: https://godbolt.org/z/XNU2z9
And all the general alive proofs say this is legal.
And manual checking agrees: https://rise4fun.com/Alive/WA2

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=35479 | PR35479 ]].

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: nick, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70053
2019-11-22 15:22:42 +03:00
Roman Lebedev 3f46022e33
[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` with tautological C1 u<= C2 (PR35479)
Summary:
This is a preparatory cleanup before i add more
of this fold to deal with comparisons with non-zero.

In essence, the current lowering is:
```
Name: (X % C1) == 0 -> X * C3 <= C4
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, 0
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%r = icmp ule i8 %n3, %C4
```
https://rise4fun.com/Alive/oqd

It kinda just works, really no weird edge-cases.
But it isn't all that great for when comparing with non-zero.
In particular, given `(X % C1) == C2`, there will be problems
in the always-false tautological case where `C2 u>= C1`:
https://rise4fun.com/Alive/pH3

That case is tautological, always-false:
```
Name: (X % Y) u>= Y
%o0 = urem i8 %x, %y
%r = icmp uge i8 %o0, %y
  =>
%r = false
```
https://rise4fun.com/Alive/ofu

While we can't/shouldn't get such tautological case normally,
we do deal with non-splat vectors, so unless we want to give up
in this case, we need to fixup/short-circuit such lanes.

There are two lowering variants:
1. We can blend between whatever computed result and the correct tautological result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%res = icmp ule i8 %n3, %C4
%r = select i1 %is_tautologically_false, i1 0, i1 %res
```
https://rise4fun.com/Alive/PjT5
https://rise4fun.com/Alive/1KV

2. We can invert the comparison result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n3, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/2xC
https://rise4fun.com/Alive/jpb5

3. We can expand into `and`/`or`:
https://rise4fun.com/Alive/WGn
https://rise4fun.com/Alive/lcb5

Blend-one is likely better since we avoid having to load the
replacement from constant pool. `xor` is second best since
it's still pretty general. I'm not adding `and`/`or` variants.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: nick, hiraditya, xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70051
2019-11-22 15:16:03 +03:00
Clement Courbet 88e205525c Revert "[DAGCombiner] Allow zextended load combines."
Breaks some bots.
2019-11-22 09:01:08 +01:00
Clement Courbet 036790f988 [DAGCombiner] Allow zextended load combines.
Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base)

Reviewers: apilipenko, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70487
2019-11-22 08:40:19 +01:00
Pengfei Wang 22a0edd070 [FPEnv] Add an option to disable strict float node mutating to an normal
float node

This patch add an option 'disable-strictnode-mutation' to prevent strict
node mutating to an normal node.
So we can make sure that the patch which sets strict-node as legal works
correctly.

Patch by Chen Liu(LiuChen3)

Differential Revision: https://reviews.llvm.org/D70226
2019-11-21 18:07:11 -08:00
Craig Topper 7696b99258 [LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into libcalls. Use it for fp128 on x86-64.
This requires a minor hack for f32/f64 strict fadd/fsub to avoid
turning those into libcalls.
2019-11-21 16:19:25 -08:00
Hiroshi Yamauchi 52e377497d [PGO][PGSO] DAG.shouldOptForSize part.
Summary:
(Split of off D67120)

SelectionDAG::shouldOptForSize changes for profile guided size optimization.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70095
2019-11-21 14:16:00 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Clement Courbet 252567377c [DAGCombine][NFC] Use ArrayRef and correctly size SmallVectors.
In preparation for D70487.
2019-11-21 08:53:37 +01:00
Craig Topper c9e8e808cf [SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal.
This allows operations that are marked Custom, but have some type
combinations that are legal to get past this code.

Add custom mutation code to X86's Select function for the nodes
that don't have isel patterns yet.
2019-11-20 10:36:02 -08:00
David Zarzycki 257acbf6ae
[SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRY
Summary:
Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR.

Work remaining: match ADD, etc.

Reviewers: craig.topper, RKSimon, spatel, niravd, jonpa, uweigand, deadalnix, nikic, lebedev.ri, dmgreen, chfast

Reviewed By: lebedev.ri

Subscribers: chfast, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70079
2019-11-20 16:25:42 +02:00
Serge Pavlov ea8678d1c7 Move floating point related entities to namespace level
This is recommit of commit e6584b2b7b, which was reverted in
30e7ee3c4b together with af57dbf12e.
Original message is below.

Enumerations that describe rounding mode and exception behavior were
defined inside ConstrainedFPIntrinsic. It makes sense to use the same
definitions to represent the same properties in other cases, not only
in constrained intrinsics. It was however inconvenient as required to
include constrained intrinsics definitions even if they were not needed.
Also using long scope prefix reduced readability.

This change moves these definitioins to the namespace llvm::fp.
No functional changes.

Differential Revision: https://reviews.llvm.org/D69552
2019-11-20 19:05:46 +07:00
Serge Pavlov 0c50c0b055 [FEnv] File with properties of constrained intrinsics
Summary
In several places we need to enumerate all constrained intrinsics or IR
nodes that should be represented by them. It is easy to miss some of
the cases. To make working with these intrinsics more convenient and
robust, this change introduces file containing definitions of all
constrained intrinsics and some of their properties. This file can be
included to generate constrained intrinsics processing code.

Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69887
2019-11-20 13:30:07 +07:00
Craig Topper c4b41e8d1d [LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted
Differential Revision: https://reviews.llvm.org/D70220
2019-11-19 16:14:37 -08:00
Matt Arsenault 7fe9435dc8 Work on cleaning up denormal mode handling
Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place.

This is in preparation for introducing FP type specific variants of
the denormal-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.

Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.

The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.

Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by
default in the one current user. Clang must be taught to start
emitting this attribute by default to avoid regressions when this is
switched to assume ieee behavior if the attribute isn't present.
2019-11-19 22:01:14 +05:30
Matt Arsenault b696b9dba7 DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.

AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
2019-11-19 19:25:26 +05:30
Craig Topper dc02eb1909 [SelectionDAG] Merge the two identical ExpandChainLibCall methods from LegalizeTypes and LegalizeDAG to one version in TaretLowering.
Reviewers: RKSimon, efriedma, spatel

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70354
2019-11-18 20:22:33 -08:00
Craig Topper 6e20d70a69 [LegalizeDAG] Convert strict fp nodes to libcalls without losing the chain.
Previously we mutated the node and then converted it to a libcall. But this loses the chain information.

This patch keeps the chain, but unfortunately breaks tail call optimization as the functions involved in deciding if a node is in tail call position can't handle the chain. But correct ordering seems more important to be right.

Somehow the SystemZ tests improved. I looked at one of them and it seemed that we're handling the split vector elements in a different order and that made the copies work better.

Differential Revision: https://reviews.llvm.org/D70334
2019-11-18 11:24:08 -08:00
Eric Christopher 30e7ee3c4b Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.

This reverts commits af57dbf12e and e6584b2b7b
2019-11-18 10:46:48 -08:00
Graham Hunter 3f08ad611a [SVE][CodeGen] Scalable vector MVT size queries
* Implements scalable size queries for MVTs, split out from D53137.

* Contains a fix for FindMemType to avoid using scalable vector type
  to contain non-scalable types.

* Explicit casts for several places where implicit integer sign
  changes or promotion from 32 to 64 bits caused problems.

* CodeGenDAGPatterns will treat scalable and non-scalable vector types
  as different.

Reviewers: greened, cameron.mcinally, sdesmalen, rovka

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D66871
2019-11-18 12:30:59 +00:00
Craig Topper bfbbf0aba8 [LegalizeTypes] Remove SoftenFloat handling from ExpandIntRes_LLROUND_LLRINT and remove assert from the strict fp path.
These were both recently added. While the call to GetSoftenedFloat
is a little more optimal, we don't do it in the expand for
FP_TO_SINT/UINT so there's no real reason to do it here. This
avoids a FIXME for strict fp.
2019-11-17 23:48:31 -08:00
Craig Topper 5a56d2aa33 [LegalizeTypes] Remove unnecessary conversion from EVT to MVT to MVT::SimpleValueType just to assign back to EVT. NFC 2019-11-17 23:48:31 -08:00
Craig Topper af435286e5 [LegalizeTypes][X86] Add support for expanding the result type of STRICT_LLROUND and STRICT_LLRINT.
This doesn't handle softening the input type, but we don't handle
softening any of the strict nodes yet. Skipping that made it easy
to reuse an existing function for creating a libcall from a node
with a chain.
2019-11-17 20:03:05 -08:00
Craig Topper 1b0efe2b17 [LegalizeTypes] When expanding the integer result of LLROUND/LLRINT, also call GetSoftenedFloat if the floating point input needs to be softened.
Before this we were emitting a bitcast to integer from the lowering
code that itself will need to be legalized. By calling
GetSoftenedFloat we get the integer conversion in one step without
needing to relegalize a bitcast.
2019-11-17 13:31:30 -08:00
Craig Topper 9b515b6dd9 [LegalizeTypes] Remove PromoteFloat support form ExpandIntRes_LLROUND_LLRINT.
This code isn't exercised, and was in the wrong place. If we need
this, we would need to promote the type before figuring out which
libcall to use.

I'm choosing to remove it rather than fixing since we don't
support PromoteFloat for LRINT/LROUND/LLRINT/LLROUND when the
result type is legal so I don't see much reason to support it
for the case where the result type isn't legal.
2019-11-17 13:31:30 -08:00
Craig Topper d4ba11ae32 [LegalizeTypes] Merge ExpandIntRes_LLROUND and ExpandIntRes_LLRINT into one function that handles both. NFC
These too functions are were the same except for which libcall gets
emitted. Just merge them into one.

This is prep work for some other work including strict fp support.
2019-11-17 13:31:30 -08:00
Serge Pavlov e6584b2b7b Move floating point related entities to namespace level
Enumerations that describe rounding mode and exception behavior were
defined inside ConstrainedFPIntrinsic. It makes sense to use the same
definitions to represent the same properties in other cases, not only
in constrained intrinsics. It was however inconvenient as required to
include constrained intrinsics definitions even if they were not needed.
Also using long scope prefix reduced readability.

This change moves these definitioins to the namespace llvm::fp.
No functional changes.

Differential Revision: https://reviews.llvm.org/D69552
2019-11-15 19:56:33 +07:00
Reid Kleckner 5fe3f00ae2 Replace wrongly deleted header banner, fix formatting
I reviewed the diff hunks of 05da2fe521 that don't contain
'#include' lines, and found two unintended changes. I deleted a header
banner inadvertently while inserting a header, and changed the
indentation of a constructor in an odd way. Add back the banner, and
reformat the constructor.
2019-11-14 10:21:42 -08:00
Paweł Bylica 1c247dd028
[DAGCombiner] Drop redundant DAG method param. NFC 2019-11-14 14:02:53 +01:00
Paweł Bylica 9b89bda517
[DAGCombiner] Use TLI field already available. NFC 2019-11-14 14:02:52 +01:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Sander de Smalen 9a1c243aa5 [AArch64][SVE] Allocate locals that are scalable vectors.
This patch adds a target interface to set the StackID for a given type,
which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a
'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70080
2019-11-13 09:45:24 +00:00
joanlluch d384ad6b63 [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4)
Summary:
Replaces
```
unsigned getShiftAmountThreshold(EVT VT)
```
by

```
bool shouldAvoidTransformToShift(EVT VT, unsigned amount)
```
thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not.

Updates the MSP430 target with a custom implementation.

This continues  D69116, D69120, D69326 and updates them, so all of them must be committed before this.

Existing tests apply, a few more have been added.

Reviewers: asl, spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70042
2019-11-13 09:23:08 +01:00
aqjune e87d71668e [IR] Redefine Freeze instruction
Summary:
This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction.

ConstantExpr freeze is removed, as discussed in the previous review.
FreezeOperator is not added because there's no ConstantExpr freeze.
`freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed.
InstVisitor has visitFreeze now because freeze is not unaryop anymore.

Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri

Reviewed By: craig.topper, lebedev.ri

Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69932
2019-11-12 10:49:00 +09:00
joanlluch e0012c5d6a [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (3)
Summary:
Additional filtering of undesired shifts for targets that do not support them efficiently.

Related with  D69116 and  D69120

Applies the TLI.getShiftAmountThreshold hook to prevent undesired generation of shifts for the following IR code:

```
define i16 @testShiftBits(i16 %a) {
entry:
  %and = and i16 %a, -64
  %cmp = icmp eq i16 %and, 64
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}

define i16 @testShiftBits_11(i16 %a) {
entry:
  %cmp = icmp ugt i16 %a, 63
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}

define i16 @testShiftBits_12(i16 %a) {
entry:
  %cmp = icmp ult i16 %a, 64
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}
```
The attached diff file shows the piece code in TargetLowering that is responsible for the generation of shifts in relation to the IR above.

Before applying this patch, shifts will be generated to replace non-legal icmp immediates. However, shifts may be undesired if they are even more expensive for the target.

For all my previous patches in this series (cited above) I added test cases for the MSP430 target. However, in this case, the target is not suitable for showing improvements related with this patch, because the MSP430 does not implement "isLegalICmpImmediate". The default implementation returns always true, therefore the patched code in TargetLowering is never reached for that target. Targets implementing both "isLegalICmpImmediate" and "getShiftAmountThreshold" will benefit from this.

The differential effect of this patch can only be shown for the MSP430 by temporarily implementing "isLegalICmpImmediate" to return false for large immediates. This is simulated with the implementation of a command line flag that was incorporated in D69975

This patch belongs to a initiative to "relax" the generation of shifts by LLVM for targets requiring it

Reviewers: spatel, lebedev.ri, asl

Reviewed By: spatel

Subscribers: lenary, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69326
2019-11-11 10:18:25 +01:00
Eli Friedman 5df3a87224 [AArch64][X86] Don't assume __powidf2 is available on Windows.
We had some code for this for 32-bit ARM, but this doesn't really need
to be in target-specific code; generalize it.

(I think this started showing up recently because we added an
optimization that converts pow to powi.)

Differential Revision: https://reviews.llvm.org/D69013
2019-11-08 12:43:21 -08:00
Sanjay Patel 777d1d1d98 [SDAG] reduce code duplication; NFC 2019-11-07 10:28:45 -05:00
Sanjay Patel 2fdd58c506 [SDAG] reduce code duplication; NFC 2019-11-07 10:15:17 -05:00
Philip Reames db036ee0a4 [X86/Atomics] Correct a few transforms for new atomic lowering
This is a partial fix for the issues described in commit message of 027aa27 (the revert of G24609).  Unfortunately, I can't provide test coverage for it on it's own as the only (known) wrong example is still wrong, but due to a separate issue.

These fixes are cases where when performing unrelated DAG combines, we were dropping the atomicity flags entirely.
2019-11-05 13:20:08 -08:00
Thomas Preud'homme 646896a442 Fix PR40644: miscompile indexed FP constant store
Summary:
Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both
replace store of float by a store of an integer unconditionally. However
this generates wrong code when the store that is replaced is an indexed
or truncating store. This commit solves this issue by adding an early
return in these functions when the store being considered is not a
normal store.

Bug was only observed on out of tree targets, hence the lack of testcase
in this commit.

Reviewers: efriedma

Subscribers: hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68420
2019-11-05 11:07:52 +00:00
aqjune 58acbce3de [IR] Add Freeze instruction
Summary:
- Define Instruction::Freeze, let it be UnaryOperator
- Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter
  The format is `%x = freeze <ty> %v`
- Add support for freeze instruction to llvm-c interface.
- Add m_Freeze in PatternMatch.
- Erase freeze when lowering IR to SelDag.

Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark

Reviewed By: lebedev.ri, jdoerfert

Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits

Differential Revision: https://reviews.llvm.org/D29011
2019-11-05 15:54:56 +09:00
Sanjay Patel 113181e9bd [DAGCombine][MSP430] use shift amount threshold in DAGCombine (2/2)
Continuation of:
D69116

Contributes to a fix for PR43559:
https://bugs.llvm.org/show_bug.cgi?id=43559

See also D69099 and D69116

Use the TLI hook in DAGCombine.cpp to guard against creating
shift nodes that are not optimal for a target.

Patch by: @joanlluch (Joan LLuch)

Differential Revision: https://reviews.llvm.org/D69120
2019-11-04 13:41:41 -05:00
Ulrich Weigand 664f84e246 [FPEnv][SelectionDAG] Refactor strict FP node construction
Small refactoring in visitConstrainedFPIntrinsic that should make
it easier to create DAG nodes requiring extra arguments.  That is
the case currently only for STRICT_FP_ROUND, but may be the case
for additional nodes (in particular compares) in the future.

Extracted from the patch for D69281.

NFC.
2019-11-04 17:45:54 +01:00
Dávid Bolvanský a18a8db0d4 [SelectionDAG] Fixed null check after dereferencing warning. NFCI. 2019-11-03 19:34:03 +01:00
Simon Pilgrim 095d2a4ced FastISel - fix uninitialized variable warnings in constructor. NFCI. 2019-11-02 18:03:22 +00:00
Simon Pilgrim 97725707f4 Fix uninitialized variable warning. NFCI. 2019-11-02 14:42:38 +00:00
Craig Topper 96bb076621 [TargetLowering] Move the setBooleanContents check on (xor (setcc), (setcc)) == / != 1 -> (setcc) != / == (setcc) to the right place
We need to be checking the value types for the inner setccs not
the outer setcc. We need to ensure those setccs produce a 0/1
value or that the xor is on the i1 type. I think at the time
this code was originally written, getBooleanContents didn't
take any arguments so this was probably correct. But now we can
have a different boolean contents for integer and floating point.

Not sure why the other combines below the xor were also checking
the boolean contents. None of them involve any setccs other than
the outer one and they only produce a new setcc.

Differential Revision: https://reviews.llvm.org/D69480
2019-11-01 14:43:17 -07:00
Matt Arsenault 6221767055 DAG: Add DAG argument to isFPExtFoldable
For AMDGPU this is dependent on the FP mode, which should eventually
not be a property of the subtarget.
2019-10-31 22:32:45 -07:00
Hiroshi Yamauchi 0d987e411a [PGO][PGSO] TargetLowering/TargetTransformationInfo/SwitchLoweringUtils part.
Summary:
(Split of off D67120)

TargetLowering/TargetTransformationInfo/SwitchLoweringUtils changes for profile
guided size optimization.

Reviewers: davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69580
2019-10-31 13:22:56 -07:00