11722 Commits

Author SHA1 Message Date
Mircea Trofin b2b460b0a0 [mlgo] Fix tests
Missed a few tests in D119507
2022-08-24 17:31:40 -07:00
Mircea Trofin 5ce4c9aa04 [mlgo] Use TFLite for 'development' mode.
TFLite is a lightweight, statically linkable[1], model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.

We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.

Using TFLite solves the following problems compared to using the full TF
C API:

- a compiler-friendly implementation for runtime-loadable (as opposed
  to AOT-embedded) models: it's statically linked; it can be built via
  cmake;
- solves an issue we had when building the compiler with both AOT and
  full TF C API support, whereby, due to a packaging issue on the TF
  side, we needed to have the pip package and the TF C API library at
  the same version. We have no such constraints now.

The main liability is that it supports only a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).

Details:

This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.

The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.

The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.

[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
2022-08-24 16:07:24 -07:00
Jakub Kuderski 6fa87ec10f [ADT] Deprecate is_splat and replace all uses with all_equal
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132335
2022-08-23 11:36:27 -04:00
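Note on the is_splat/all_equal entry above: the linked thread is about how such a predicate should treat empty ranges. The following is a minimal standalone sketch of an "all elements equal" check, not LLVM's implementation; `allEqualSketch` is a hypothetical name, and treating an empty range as vacuously all-equal is an assumption made here for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Standalone sketch of an "all elements equal" range predicate.
// Assumption: an empty range is treated as vacuously all-equal.
template <typename Range>
bool allEqualSketch(const Range &R) {
  auto Begin = std::begin(R);
  auto End = std::end(R);
  // Compare each element with its predecessor; empty and single-element
  // ranges have nothing to compare and yield true.
  return Begin == End || std::equal(std::next(Begin), End, Begin);
}
```

Under that assumption, `allEqualSketch(std::vector<int>{})` is true, whereas a "splat" check that requires at least one element would answer false for the same input.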
Philip Reames c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Aditya Kumar 0af3ab02fd [NFC] LoopAccess: Move expressions close to usage
Avoids useless evaluation of these expressions.

Reviewed By: michaelmaitland, fhahn

Differential Revision: https://reviews.llvm.org/D132337
2022-08-23 07:08:42 -07:00
liqinweng 9181ab9223 [NFC] Use llvm::all_of instead of std::all_of
Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D131886
2022-08-23 12:21:53 +08:00
Philip Reames 104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.
2022-08-22 15:16:39 -07:00
Philip Reames 27d3321c4f [TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc]
This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allows use of the same properties in store costing as arithmetic costing.
2022-08-22 11:26:31 -07:00
Philip Reames 274f86e7a6 [TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc]
This completes the client side transition to the OperandValueInfo version of this routine.  Backend TTI implementations still use the prior versions for now.
2022-08-22 11:06:32 -07:00
Philip Reames c42a5f1cc2 [TTI] Migrate getOperandInfo to OperandValueInfo [nfc]
This is part of merging OperandValueKind and OperandValueProperties.
2022-08-22 10:19:02 -07:00
Philip Reames 5cd427106d [TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc]
OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling.  We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so.

This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works.  Target TTI implementations still use the split flags.  I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.
2022-08-22 09:48:15 -07:00
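Note on the OperandValueKind/OperandValueProperties merge above: a rough sketch of what "a single container class which contains both" can look like. The enums here are simplified stand-ins and `mulCostSketch` is a hypothetical toy cost function; none of this is the actual TTI code.

```cpp
// Simplified stand-ins for the two TTI enums named in the message.
enum OperandValueKind { OK_AnyValue, OK_UniformValue, OK_UniformConstantValue };
enum OperandValueProperties { OP_None, OP_PowerOf2 };

// One container carrying both facts about an operand.
struct OperandValueInfo {
  OperandValueKind Kind = OK_AnyValue;
  OperandValueProperties Properties = OP_None;
  constexpr bool isConstant() const { return Kind == OK_UniformConstantValue; }
  constexpr bool isPowerOf2() const { return Properties == OP_PowerOf2; }
};

// Toy cost function (an assumption for illustration, not LLVM's model):
// a multiply by a constant power of two is modelled as a cheaper shift.
constexpr int mulCostSketch(OperandValueInfo Op2Info) {
  return (Op2Info.isConstant() && Op2Info.isPowerOf2()) ? 1 : 3;
}
```

Because the parameter type changes from two loose enums to one struct, a call site that still passes the old arguments no longer compiles, which is the kind of compiler assistance the message describes leaning on.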
Max Kazantsev e587199a50 [SCEV] Prove condition invariance via context, try 2
The initial implementation had too weak requirements on positive/negative
range crossings. Not crossing zero with nuw is not enough for two reasons:

- If ArLHS has negative step, it may turn from positive to negative
  without crossing 0 boundary from left to right (and crossing right to
  left doesn't count for unsigned);
- If ArLHS crosses SINT_MAX boundary, it still turns from positive to
  negative;

In fact we require that ArLHS always stays non-negative or negative,
which can be enforced by the following set of preconditions:

- both nuw and nsw;
- positive step (looks liftable);

Because of positive step, boundary crossing is only possible from left
part to the right part. And because of no-wrap flags, it is guaranteed
to never happen.
2022-08-22 14:31:19 +07:00
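Note on the "try 2" entry above: the second reason in its list (crossing the SINT_MAX boundary) can be reproduced with 8-bit arithmetic. This is a standalone sketch; `walkRecurrence` and `WalkResult` are hypothetical names and have nothing to do with the SCEV API.

```cpp
#include <cstdint>

// Outcome of walking an 8-bit recurrence {Start,+,Step}.
struct WalkResult {
  bool UnsignedWrap; // the unsigned value overflowed past 255
  bool SignChanged;  // the signed interpretation changed sign
};

// Advance the recurrence N times, tracking unsigned wrap and sign flips.
WalkResult walkRecurrence(uint8_t Start, uint8_t Step, int N) {
  WalkResult R{false, false};
  uint8_t Cur = Start;
  for (int I = 0; I < N; ++I) {
    unsigned Wide = unsigned(Cur) + unsigned(Step);
    if (Wide > 0xFF)
      R.UnsignedWrap = true;
    uint8_t Next = uint8_t(Wide);
    // Bit 7 is the sign bit of the 8-bit two's-complement interpretation.
    if ((Cur & 0x80) != (Next & 0x80))
      R.SignChanged = true;
    Cur = Next;
  }
  return R;
}
```

Starting at 120 with step 1 for 16 iterations never wraps unsigned, yet the value moves from 127 to 128, which is -128 when read as signed. Requiring no signed wrap as well rules this case out.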
Ting Wang d2d77e050b [PowerPC][Coroutines] Add tail-call check with call information for coroutines
Fixes #56679.

Reviewed By: ChuanqiXu, shchenz

Differential Revision: https://reviews.llvm.org/D131953
2022-08-21 22:20:40 -04:00
Simon Pilgrim 5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or were just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Kazu Hirata 258531b7ac Remove redundant initialization of Optional (NFC)
2022-08-20 21:18:28 -07:00
Sanjay Patel 2981a94902 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0), part 2
Follow-up to 7f1262a322. That patch avoided removing the
call, but it still allowed the constant-folded result. This
makes the behavior consistent with 1-arg libm folding: if the
call potentially raises an exception, then we just bail out.

It seems likely that there are other corner-cases like this,
but the tests are incomplete, so we have lived with these
discrepancies for a long time. This was untested before
the constant folding was expanded in D127964.
2022-08-20 10:16:06 -04:00
Philip Reames b0a2c48e9f [tti] Consolidate getOperandInfo without OperandValueProperties copies [nfc]
2022-08-19 16:22:22 -07:00
Sanjay Patel 7f1262a322 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0)
These may raise an error (set errno) as discussed in the post-commit
comments for D127964, so we can't fold away the call and potentially
alter that behavior.
2022-08-19 12:27:29 -04:00
Alexey Bataev d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Max Kazantsev f798c042f4 Revert "[SCEV] Prove condition invariance via context"
This reverts commit a3d1fb3b59.

Reverting until investigation of https://github.com/llvm/llvm-project/issues/57247
has concluded.
2022-08-19 21:02:06 +07:00
Michael Maitland f29401fcdf [LoopVectorize][LoopAccessAnalysis] add newline to debug message
A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132172
2022-08-18 13:44:05 -07:00
Florian Hahn b8709a9d03 [LV] Support fixed order recurrences.
If the incoming previous value of a fixed-order recurrence is a phi in
the header, go through incoming values from the latch until we find a
non-phi value. Use this as the new Previous; all uses in the header
will be dominated by the original phi, but need to be moved after
the non-phi previous value.

At the moment, fixed-order recurrences are modeled as a chain of
first-order recurrences.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D119661
2022-08-18 19:15:52 +01:00
Simon Pilgrim fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency; it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Simon Pilgrim b994f87184 [Analysis] CostModel.cpp - merge isa<IntrinsicInst> and dyn_cast<IntrinsicInst> checks
Pulled out of D79483
2022-08-18 10:43:29 +01:00
Simon Pilgrim 1d522a39f7 [TTI] Remove getInstructionThroughput cost helper.
Pulled out of D79483 - we can just as easily use getUserCost directly
2022-08-17 11:41:47 +01:00
Zain Jaffal f61f99a105 [instcombine] Optimise for zero initialisation of product given fast flags are enabled
Currently, clang ignores the 0 initialisation in finite math.
For example:

```
double f_prod = 0;
double arr[1000];
for (size_t i = 0; i < 1000; i++) {
  f_prod *= arr[i];
}
```
Clang will ignore that `f_prod` is set to zero and it will generate assembly to iterate over the loop.

Reviewed By: fhahn, spatel

Differential Revision: https://reviews.llvm.org/D131672
2022-08-17 11:12:15 +01:00
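Note on the zero-initialised product entry above: a small sketch of why such a fold is presumably tied to fast-math flags. Without them, a product that starts at zero is not guaranteed to stay zero. `productFromZero` is a hypothetical helper, and the exact set of flags the patch checks is not stated in the message.

```cpp
#include <cmath>
#include <limits>

// Multiply an accumulator that starts at zero by every element,
// using strict IEEE semantics (no fast-math).
double productFromZero(const double *Arr, int N) {
  double Prod = 0.0;
  for (int I = 0; I < N; ++I)
    Prod *= Arr[I];
  return Prod;
}
```

For finite inputs the result compares equal to zero (its sign may flip when an element is negative), but a single infinity or NaN in the array turns the result into NaN. Only when such values and the sign of zero may be ignored can the whole loop be replaced by the constant 0.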
Graham Hunter 70d35443dc [LAA] Handle forked pointers with add/sub instructions
Handle cases where a forked pointer has an add or sub instruction
before reaching a select.

Reviewed By: fhahn
Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D130278
2022-08-17 09:51:13 +01:00
Simon Pilgrim 08d153d806 [ValueTracking] computeKnownBits - attempt to use a branch condition feeding a phi to improve known bits range (PR38280)
If computeKnownBits encounters a phi node, and we fail to determine any known bits through direct analysis, see if the incoming value is part of a branch condition feeding the phi.

Handle cases where icmp(IncomingValue PRED Constant) is driving a branch instruction feeding that phi node - at the moment this only handles EQ/ULT/ULE predicate cases as they are the most straightforward to handle and most likely for branch-loop 'max upper bound' cases - we can extend this if/when necessary.

I investigated a more general icmp(LHS PRED RHS) KnownBits system, but the hard limits we put on value tracking depth through phi nodes meant that we were mainly catching constants anyhow.

Fixes the pointless vectorization in PR38280 / Issue #37628 (excessive unrolling still needs handling though)

Differential Revision: https://reviews.llvm.org/D131838
2022-08-16 16:54:44 +01:00
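Note on the computeKnownBits entry above: a sketch of how an unsigned-less-than comparison against a constant bounds the bits of a value. The function names are hypothetical and this is not the ValueTracking code; it only illustrates the arithmetic for the ULT case on 32-bit values.

```cpp
#include <cstdint>

// Number of leading zero bits in a 32-bit value (32 for zero).
constexpr unsigned countLeadingZeros32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Mask of bits known to be zero in X, given X <u C with C non-zero:
// X <= C - 1, so X cannot set any bit above the top set bit of C - 1.
constexpr uint32_t knownZeroFromULT(uint32_t C) {
  unsigned LZ = countLeadingZeros32(C - 1);
  // Shifting a 32-bit value by 32 is undefined, so handle it separately.
  return LZ == 32 ? 0xFFFFFFFFu : ~(0xFFFFFFFFu >> LZ);
}
```

For example, on the edge where `X <u 16` holds, the upper 28 bits of X are known zero (`knownZeroFromULT(16)` is `0xFFFFFFF0`), which is the sort of fact that helps bound a loop's trip count.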
Max Kazantsev ebabd6bf18 Return "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 354fa0b480.

Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by `false` condition, and then someone
else managed to propagate the flag from dead code outside.

Returning the patch to be able to reproduce the issue.
2022-08-16 14:12:36 +07:00
Craig Topper ef8c34e954 [InstSimplify] sle on i1 also encodes implication
We already support SGE, so the same logic should hold for SLE with
the LHS and RHS swapped.

I didn't see this in the wild. Just happened to walk past this code
and thought it was odd that it was asymmetric in what condition
codes it handled.

Reviewed By: spatel, reames

Differential Revision: https://reviews.llvm.org/D131805
2022-08-15 08:27:23 -07:00
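Note on the i1 sle entry above: in a 1-bit signed integer, true is -1 and false is 0, so signed comparisons on i1 behave like implications. The sketch below checks all four input combinations; the helper names are hypothetical.

```cpp
// In a 1-bit signed integer, true is -1 and false is 0.
constexpr int asSignedI1(bool B) { return B ? -1 : 0; }

constexpr bool sge(bool A, bool B) { return asSignedI1(A) >= asSignedI1(B); }
constexpr bool sle(bool A, bool B) { return asSignedI1(A) <= asSignedI1(B); }
constexpr bool implies(bool A, bool B) { return !A || B; }

// Exhaustive check over the four input combinations.
constexpr bool checkI1Implication() {
  for (int A = 0; A < 2; ++A)
    for (int B = 0; B < 2; ++B) {
      if (sge(A, B) != implies(A, B))
        return false; // A sge B  <=>  A implies B
      if (sle(A, B) != implies(B, A))
        return false; // A sle B  <=>  B implies A
    }
  return true;
}
static_assert(checkI1Implication(), "sge/sle on i1 encode implication");
```

The `static_assert` passes, which matches the message: SLE is the SGE case with the operands swapped.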
Max Kazantsev 354fa0b480 Revert "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 34ae308c73.

Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
2022-08-15 18:51:59 +07:00
Wolfgang Pieb 7ddfb4dfeb [Inlining] Introduce the function attribute "inline-max-stacksize"
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stacksizes exceed the given value.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D129904
2022-08-12 11:07:18 -07:00
Max Kazantsev a3d1fb3b59 [SCEV] Prove condition invariance via context
Contextual knowledge may be used to prove invariance of some conditions.
For example, in this case:
```
  ; %len >= 0
  guard(%iv = {start,+,1}<nuw> <s %len)
  guard(%iv = {start,+,1}<nuw> <u %len)
```
the 2nd check always fails if `start` is negative and always passes otherwise.

It looks like there are more opportunities of this kind that are still to be
implemented in the future.

Differential Revision: https://reviews.llvm.org/D129753
Reviewed By: apilipenko
2022-08-12 14:23:35 +07:00
Mircea Trofin 3486b1b736 [mlgo][nfc] regalloc test model generator: prep for TFLite
Casting operator to make TFLite happy.

Reviewed By: yundiqian

Differential Revision: https://reviews.llvm.org/D131584
2022-08-11 15:53:23 -07:00
Fangrui Song 57f334d817 [Support] Remove Log2 workaround for Android API level < 18
The function added by D9467 is unneeded.
https://github.com/android/ndk/wiki/Changelog-r24 shows that the NDK has
moved forward to at least a minimum target API of 19.

Reviewed By: srhines

Differential Revision: https://reviews.llvm.org/D131656
2022-08-11 17:39:41 +00:00
Kevin P. Neal de64d0076e [FPEnv][InstSimplify] Fix formatting error.
My most recent change for D131607 had a formatting error that I didn't
notice until after I committed it. Let me fix it now so changes to this
file will be back-to-back from me.
2022-08-11 12:10:05 -04:00
Kevin P. Neal 7bdb010d7c [FPEnv][InstSimplify] 0.0 - -X ==> X
Another ticket split out of D107285, this extends the optimization
of 0.0 - -X to just X when using constrained intrinsics and the
optimization is allowed.

If the negation of X is done with fsub then the match fails because of
the lack of IR Matcher support for constrained intrinsics.

While I'm here, remove some TODO notices since the work is no longer
planned.

Differential Revision: https://reviews.llvm.org/D131607
2022-08-11 11:35:33 -04:00
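Note on the `0.0 - -X ==> X` entry above: a sketch of why this fold only applies when "the optimization is allowed". My reading, which the message does not spell out, is that the sign of zero must be ignorable, because the two sides differ for X = -0.0. `zeroMinusNeg` is a hypothetical helper evaluated with strict IEEE semantics.

```cpp
#include <cmath>

// Evaluate 0.0 - (-X) exactly as written, without simplification.
double zeroMinusNeg(double X) { return 0.0 - (-X); }
```

For X = 2.5 the expression gives 2.5 again. For X = -0.0, however, `-X` is +0.0 and `0.0 - 0.0` is +0.0, so the expression yields +0.0 while X itself is -0.0.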
Martin Sebor 0dcfe7aa35 [InstCombine] Tighten up known library function signature tests (PR #56463)
Replace a switch statement used to validate arguments to known library
functions with a more consistent table-driven approach and tighten it
up.
2022-08-10 14:15:46 -06:00
Simon Pilgrim 77d33f4c1b [Analysis] Remove unused CostModelAnalysis::getInstructionCost helper. NFCI.
Everything now uses TTI costs calls directly
2022-08-10 17:21:46 +01:00
Mohammed Nurul Hoque 30abc1a6a1 [ConstantFolding] Eliminate atan and atan2 calls
From the opengroup specifications, atan2 may fail if the result
underflows and atan may fail if the argument is subnormal, but
we assume that does not happen and eliminate the calls if we
can constant fold the result at compile-time.

Differential Revision: https://reviews.llvm.org/D127964
2022-08-10 11:01:50 -04:00
Dinar Temirbulatov cab6cd6834 [AArch64][LoopVectorize] Introduce trip count minimal value threshold to ignore tail-folding.
After D121595 was committed, I noticed regressions associated with
vectorisation of loops with small trip counts by tail folding with scalable
vectors. As a solution for those issues I propose to introduce a minimal trip
count threshold value.

Differential Revision: https://reviews.llvm.org/D130755
2022-08-09 22:10:17 +01:00
yundiqian 3edd8978c3 fix mlgo regalloc test model generation for tflite
While moving from the TF C API to TFLite, we found that the argmax op in TFLite does not work for int64 inputs, so we cast the int64 inputs to int32 to make the TFLite argmax op work.

Differential Revision: https://reviews.llvm.org/D131462
2022-08-09 12:36:28 -07:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
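Note on the `[[fallthrough]]` entry above: a minimal illustration of the standard C++17 attribute that replaces the macro. `weightSketch` is a made-up function used only to show the placement.

```cpp
// C++17 standard attribute marking an intentional fall-through.
int weightSketch(int Kind) {
  int W = 0;
  switch (Kind) {
  case 2:
    W += 10;
    [[fallthrough]]; // deliberately continue into case 1
  case 1:
    W += 1;
    break;
  default:
    break;
  }
  return W;
}
```

The attribute is a null statement placed directly before the next `case` label. It documents the intent and suppresses implicit-fallthrough warnings.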
Kazu Hirata e20d210eef [llvm] Qualify auto (NFC)
Identified with readability-qualified-auto.
2022-08-07 23:55:27 -07:00
Sanjay Patel 74b5e797d5 [InstSimplify] fold scalable vectors with over-shift splat constant to poison
Fixes #56968
2022-08-07 16:26:05 -04:00
Sanjay Patel 8148c28fad [ConstFolding] fix overzealous assert when converting FP half
Fixes #56981
2022-08-07 13:34:51 -04:00
Kazu Hirata a2d4501718 [llvm] Fix comment typos (NFC)
2022-08-07 00:16:14 -07:00
Kazu Hirata c8e6ebd74e Use value instead of getValue (NFC)
2022-08-06 11:21:39 -07:00
Vitaly Buka 8d2901d537 [NFC][Inliner] Add Load/Store handler
This is an additional signal which may benefit sanitizers.

Reviewed By: kda

Differential Revision: https://reviews.llvm.org/D131129
2022-08-05 13:42:17 -07:00
Sanjay Patel b63fc26d33 [InstSimplify] make uses of isImpliedCondition more efficient (NFCI)
As suggested in the post-commit comments for 019d76196f,
this makes the usage symmetric with the 'and' patterns and should
be more efficient.
2022-08-05 12:06:47 -04:00