Commit Graph

11752 Commits

Author SHA1 Message Date
Nikita Popov 31cc0ab321 [BasicAA] Delay getAllocTypeSize() call (NFC)
This call is expensive, so don't perform it for zero indices.

Also rename the variable to use Alloc rather than Alloca, this
doesn't have anything to do with allocas in particular.
2022-09-13 10:24:50 +02:00
Matthias Gehre c1502425ba Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth
Also remove new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.

Differential Revision: https://reviews.llvm.org/D133691
2022-09-12 17:06:16 +01:00
zhongyunde 8a15695be2 [AA] Improve the BasicAA analysis capability
According https://discourse.llvm.org/t/memoryssa-does-the-accessedbetween-support-scalable-vector-pointer/65052,
scalable vector support in BasicAA is currently essentially limited,
and should be improved effectively for a constant offset GEP if the scalable index is zero, eg:
  getelementptr <vscale x 4 x i32>, ptr %p, i64 0, i64 %i

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D133567
2022-09-12 19:41:17 +08:00
Junduo Dong 6975ab7126 [Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messive.

The key codes are like the demo below:
```
  /// Runs the function pass across every function in the module.
  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                        LazyCallGraph &CG, CGSCCUpdateResult &UR) {
      /// ...
      PreservedAnalyses PassPA;
      {
        TimeTraceScope TimeScope(Pass.name());
        PassPA = Pass.run(F, FAM);
      }
      /// ...
 }
```

It can be bothered to judge where should we add the tracing codes by hands.

With the PassInstrumentation framework, we can easily add `Before/After` callback
functions to add time tracing codes.

Differential Revision: https://reviews.llvm.org/D131960
2022-09-11 05:42:55 -07:00
Aiden Grossman ec83c7e358 [MLGO] Make TFLiteUtils throw an error if some features haven't been passed to the model
In the Tensorflow C lib utilities, an error gets thrown if some features
haven't gotten passed into the model (due to differences in ordering
which now don't exist with the transition to TFLite). However, this is
not currently the case when using TFLiteUtils. This patch makes some
minor changes to throw an error when not all inputs of the model have
been passed, which when not handled will result in a seg fault within
TFLite.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D133451
2022-09-10 22:59:03 +00:00
Nikita Popov a9f312c7f4 [AST] Use BatchAA in aliasesUnknownInst() (NFCI) 2022-09-09 15:54:48 +02:00
Mircea Trofin a219a8a822 [mlgo][nfc] Set logging level to warning or higher for TFLite 2022-09-08 12:10:56 -07:00
Nikita Popov 98a3a340c3 [ConstantExpr] Don't create fneg expressions
Don't create fneg expressions unless explicitly requested by IR or
bitcode.
2022-09-07 11:27:25 +02:00
Matthias Gehre 2090e85fee [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
This adds the ExpandLargeDivRem to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.

Differential Revision: https://reviews.llvm.org/D130076
2022-09-06 15:32:04 +01:00
Sanjay Patel a8fcb51242 [InstSimplify] allow poison/undef in constant match for "C - X ==/!= X -> false/true"
This fold was added with 5e9522c311, but over-specified.
We can assume that an undef element is an odd number:
https://alive2.llvm.org/ce/z/djQmWU
2022-09-06 08:19:30 -04:00
eopXD ea3630e8d4 [CMake][MLGO] Fix cmake for MLGO
The if-statement should check whehter TFLITE is on or not rather than if the variable is specified.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D132902
2022-09-06 00:32:08 -07:00
Arthur Eubanks 7e3aa8f01a Revert "[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests"
This reverts commit 57fd866551.

Causes crashes, see comments in D132581.
2022-09-05 15:42:48 -07:00
LiaoChunyu 456c7ef68e [InstSimplify][NFC] shortened the code 2022-09-05 23:57:53 +08:00
LiaoChunyu 5e9522c311 [InstSimplify] Odd - X ==/!= X -> false/true
Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D132989
2022-09-05 23:51:45 +08:00
Kazu Hirata 3850edd9e0 Use llvm::count_if (NFC) 2022-09-03 11:17:35 -07:00
Simon Pilgrim e2d140e9c3 [TTI] Add isExpensiveToSpeculativelyExecute wrapper
CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if its better to move an expensive instruction used in a select behind a branch instead.

This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so its compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding a isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
2022-09-03 13:12:22 +01:00
Arthur Eubanks 57fd866551 [LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests
The current code is basically just emulating what the analysis manager does.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D132581
2022-09-02 10:55:53 -07:00
Rong Xu 0caa4a9559 [PGO] Support PGO annotation of CallBrInst
We currently instrument CallBrInst but do not annotate it with
the branch weight. This patch enables PGO annotation of CallBrInst.

Differential Revision: https://reviews.llvm.org/D133040
2022-09-01 14:13:50 -07:00
Nikita Popov 4f046bc8e0 [PHITranslateAddr] Require dominance when searching for translated address (PR57025)
This is a fix for PR57025 and an alternative to D131776. The problem
in the phi-translation-to-wrong-context.ll test case is that phi
translation of %gep.j into if2 pick %gep.i as the result. While this
instruction has the correct pointer address, it occurs in a context
where %i != 0. As such, we get a NoAlias result for the store in
if2, even though they do alias for %i == 0 (which is legal in the
original context of the pointer).

PHITranslateValue already has a MustDominate option, which can be
used to restrict PHI translation results to values that dominate the
translated-into block. However, this is more aggressive than what we
need and would significantly regress GVN results. In particular, if
we have a pointer value that does not require any translation, then
it is fine to continue using that value in the predecessor, because
the context is still correct for the original query. We only run into
problems if PHITranslateSubExpr() picks a completely random
instruction in a context that may have preconditions that do not hold.

Fix this by always performing the dominance checks in
PHITranslateSubExpr(), without enabling the more general MustDominate
requirement.

Fixes https://github.com/llvm/llvm-project/issues/57025. This also
fixes the test case for https://github.com/llvm/llvm-project/issues/30999,
but I'm not sure whether that's just the particular test case,
or a general solution to the problem.

Differential Revision: https://reviews.llvm.org/D132935
2022-09-01 16:26:42 +02:00
Pavel Samolysov 88581db62f [LazyCallGraph] Reformat the code in accordance with the code style. NFC
Also, some local variables were renamed in accordance with the code
style as well as `std::tie` occurrences and `.first`/`.second` member
uses were replaced with structure bindings.

Differential Revision: https://reviews.llvm.org/D132806
2022-08-30 11:06:42 +03:00
Kazu Hirata 6ed2cb4ad5 Revert "[llvm] Use llvm::is_contained (NFC)"
This reverts commit ebf574f59a.

This patch seems to cause build failures on Windows.
2022-08-28 18:52:49 -07:00
Kazu Hirata ebf574f59a [llvm] Use llvm::is_contained (NFC) 2022-08-28 17:35:03 -07:00
Kazu Hirata ce9f007c7c [llvm] Use llvm::find_if (NFC) 2022-08-28 10:41:48 -07:00
Kazu Hirata 7a617fdf39 Use std::gcd (NFC)
This patch replaces calls to GreatestCommonDivisor64 with std::gcd
where both arguments are known to be of unsigned types no larger than
64 bits in size.
2022-08-27 21:20:59 -07:00
Arthur Eubanks 7a94d189ad [LazyCallGraph] Update libcall list when replacing a libcall node's function
Otherwise when we visit all libcalls in
updateCGAndAnalysisManagerForPass(), the old libcall is dead and doesn't
have a node.

We treat libcalls conservatively in LazyCallGraph because any function
may introduce calls to them out of thin air.

It is weird to change the signature of a libcall since introducing calls
to the libcall with a different signature may break, but other passes
like deadargelim already do it, so let's preserve this behavior for now.

Fixes an issue found in D128830.

Reviewed By: psamolysov

Differential Revision: https://reviews.llvm.org/D132764
2022-08-27 10:57:53 -07:00
Mircea Trofin 3546b5c520 [mlgo] Fix flaky test
The source of the flakyness is internal uninitialized buffers due to
a dangling variable in the model.
2022-08-26 21:29:25 -07:00
Florian Hahn 9405af1c85
[LAA] Require AddRecs to be in the innermost loop for diff-checks.
The simpler diff-checks require pointers with add-recs from the same
innermost loop, but this property wasn't check completely. Add the
missing check to ensure both addrecs are in the innermost loop.

Fixes #57315.
2022-08-26 20:39:52 +01:00
Philip Reames 86b67a310d [LAA] Prune dependencies with distance large than access implied by trip count
When we have a dependency with a dependence distance which can only be hit on an iteration beyond the actual trip count of the loop, we can ignore that dependency when analyzing said loop. We already had this code, but had restricted it solely to unknown dependence distances. This change applies it to all dependence distances.

Without this code, we relied on the vectorizer reducing VF such that our infeasible dependence was respected. This usually worked out to about the same result, but not always. For fixed length vectorization, this could mean a smaller VF than optimal being chosen or additional runtime checks. For scalable vectorization - where the bounds on access implied by VF are broader - we could often not find a feasible VF at all.

Differential Revision: https://reviews.llvm.org/D131924
2022-08-25 14:24:13 -07:00
Sanjay Patel 4e44c22c97 [ValueTracking][InstCombine] restrict FP min/max matching to avoid miscompile
This is a long-standing FIXME with a non-FMF test that exposes
the bug as shown in issue #57357.

It's possible that there's still a way to miscompile by
mis-identifying/mis-folding FP min/max patterns, but
this patch only exposes a couple of seemingly minor
regressions while preventing the broken transform.
2022-08-25 16:52:40 -04:00
Florian Hahn c035efc814
[LAA] Cache PSE.getSE() in variable (NFC).
Preparation for follow-up patches will introduce additional uses
of SE.
2022-08-25 21:40:22 +01:00
Mircea Trofin b2b460b0a0 [mlgo] Fix tests
Missed a few tests in D119507
2022-08-24 17:31:40 -07:00
Mircea Trofin 5ce4c9aa04 [mlgo] Use TFLite for 'development' mode.
TLite is a lightweight, statically linkable[1], model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.

We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.

Using TFLite solves the following problems compared to using the full TF
C API:

- a compiler-friendly implementation for runtime-loadable (as opposed
  to AOT-embedded) models: it's statically linked; it can be built via
  cmake;
- solves an issue we had when building the compiler with both AOT and
  full TF C API support, whereby, due to a packaging issue on the TF
  side, we needed to have the pip package and the TF C API library at
  the same version. We have no such constraints now.

The main liability is it supporting a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert back to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).

Details:

This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.

The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.

The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.

[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
2022-08-24 16:07:24 -07:00
Jakub Kuderski 6fa87ec10f [ADT] Deprecate is_splat and replace all uses with all_equal
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132335
2022-08-23 11:36:27 -04:00
Philip Reames c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet.  The main point of this change is simple consistency with the recently changes getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Aditya Kumar 0af3ab02fd [NFC] LoopAccess: Move expressions close to usage
Avoids useless evaluation of these expressions.

Reviewed By: michaelmaitland, fhahn

Differential Revision: https://reviews.llvm.org/D132337
2022-08-23 07:08:42 -07:00
liqinweng 9181ab9223 [NFC]] Use llvm::all_of instead of std::all_of
Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D131886
2022-08-23 12:21:53 +08:00
Philip Reames 104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceeded it.  In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact.  This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance.  For instance, every parameter which changes type in this change also changes name.  This was intentional to make sure that every call site possible effected must show up in the diff.  This let me audit each one closely.
2022-08-22 15:16:39 -07:00
Philip Reames 27d3321c4f [TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc]
This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allow use of the same properties in store costing as arithmetic costing.
2022-08-22 11:26:31 -07:00
Philip Reames 274f86e7a6 [TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc]
This completes the client side transition to the OperandValueInfo version of this routine.  Backend TTI implementations still use the prior versions for now.
2022-08-22 11:06:32 -07:00
Philip Reames c42a5f1cc2 [TTI] Migrate getOperandInfo to OperandVaueInfo [nfc]
This is part of merging OperandValueKind and OperandValueProperties.
2022-08-22 10:19:02 -07:00
Philip Reames 5cd427106d [TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc]
OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling.  We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so.

This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works.  Target TTI implementations still use the split flags.  I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.
2022-08-22 09:48:15 -07:00
Max Kazantsev e587199a50 [SCEV] Prove condition invariance via context, try 2
Initial implementation had too weak requirements to positive/negative
range crossings. Not crossing zero with nuw is not enough for two reasons:

- If ArLHS has negative step, it may turn from positive to negative
  without crossing 0 boundary from left to right (and crossing right to
  left doesn't count for unsigned);
- If ArLHS crosses SINT_MAX boundary, it still turns from positive to
  negative;

In fact we require that ArLHS always stays non-negative or negative,
which an be enforced by the following set of preconditions:

- both nuw and nsw;
- positive step (looks liftable);

Because of positive step, boundary crossing is only possible from left
part to the right part. And because of no-wrap flags, it is guaranteed
to never happen.
2022-08-22 14:31:19 +07:00
Ting Wang d2d77e050b [PowerPC][Coroutines] Add tail-call check with call information for coroutines
Fixes #56679.

Reviewed By: ChuanqiXu, shchenz

Differential Revision: https://reviews.llvm.org/D131953
2022-08-21 22:20:40 -04:00
Simon Pilgrim 5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Kazu Hirata 258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Sanjay Patel 2981a94902 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0), part 2
Follow-up to 7f1262a322. That patch avoided removing the
call, but it still allowed the constant-folded result. This
makes the behavior consistent with 1-arg libm folding: if the
call potentially raises an exception, then we just bail out.

It seems likely that there are other corner-cases like this,
but the tests are incomplete, so we have lived with these
discrepancies for a long time. This was untested before the
the constant folding was expanded in D127964.
2022-08-20 10:16:06 -04:00
Philip Reames b0a2c48e9f [tti] Consolidate getOperandInfo without OperandValueProperties copies [nfc] 2022-08-19 16:22:22 -07:00
Sanjay Patel 7f1262a322 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0)
These may raise an error (set errno) as discussed in the post-commit
comments for D127964, so we can't fold away the call and potentially
alter that behavior.
2022-08-19 12:27:29 -04:00
Alexey Bataev d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Max Kazantsev f798c042f4 Revert "[SCEV] Prove condition invariance via context"
This reverts commit a3d1fb3b59.

Reverting until investigation of https://github.com/llvm/llvm-project/issues/57247
has concluded.
2022-08-19 21:02:06 +07:00