This commit adds a test for DAGCombiner commutative CSE on
nodes with multiple results (UMUL_LOHI). In this commit it
asserts the lack of CSE; a later commit will demonstrate
the CSE through changed assertions.
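
As a rough illustration of what commutative CSE means for a node with two results, here is a minimal Python sketch. It is not DAGCombiner code: `Node`, `Dag`, `get_node`, and the `COMMUTATIVE` set are hypothetical names chosen for the example, and the real combiner has its own rules for which opcodes commute.

```python
from dataclasses import dataclass

# Hypothetical opcode set for this sketch only.
COMMUTATIVE = {"umul_lohi", "add", "mul"}


@dataclass(frozen=True)
class Node:
    opcode: str
    operands: tuple
    num_results: int = 1


class Dag:
    def __init__(self):
        self._cse = {}  # canonical key -> existing Node

    def get_node(self, opcode, operands, num_results=1):
        operands = tuple(operands)
        key_ops = operands
        if opcode in COMMUTATIVE:
            # Canonicalize operand order so (a, b) and (b, a) share a key.
            key_ops = tuple(sorted(operands, key=repr))
        key = (opcode, key_ops, num_results)
        # Reuse the existing node if an equivalent one was already created.
        return self._cse.setdefault(key, Node(opcode, operands, num_results))
```

With this table, requesting `umul_lohi` on `(x, y)` and then on `(y, x)` yields the same two-result node, while a non-commutative opcode keeps the two orders distinct.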
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D129905
Clean up checks for alloc-like ops in analysis. Use the analysis
utility to properly check for the desired kind of effects. The previous
locality utility worked for all practical purposes but wasn't sound and
duplicated code locally. Instead, use mlir::hasSingleEffect.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D129439
```
// -----// IR Dump Before LowerLinalgMicrokernels (iree-vmvx-lower-linalg-microkernels) //----- //
```
I've been meaning to suggest this for a long time, and I think the only reason we don't have it is that we didn't use to have `getArgument()` handy when printing these comments. When debugging or putting a pipeline together based on such dumps, I often find myself grepping for the argument name of the pass (which is often, but not universally, related to the pass name).
This change allows the user of LivenessBlockInfo to specify an op within the block and get the set of all values that are live as of that op. Semantically, it relies on having a dominance-based region with ordered operations. For DFG regions, computing liveness statically this way doesn't really make sense; it likely needs to be done at runtime.
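
To make the idea concrete, here is a minimal Python sketch of a "live as of this op" query over a block with ordered operations. It is not the MLIR implementation: the `(results, operands)` tuple layout and the function name are hypothetical, and the liveness definition used is an assumption stated in the docstring.

```python
def live_values_at(ops, live_in, live_out, index):
    """Return the values live as of ops[index].

    ops: list of (results, operands) tuples in program order.
    Assumption for this sketch: a value is live at an op if it is defined
    before that op (or is live-in) and is used by that op or a later one
    (or is live-out).
    """
    # Everything available before the op: live-ins plus earlier results.
    defined_before = set(live_in)
    for results, _ in ops[:index]:
        defined_before.update(results)

    # Everything still needed from the op onward: later uses plus live-outs.
    used_from_here = set(live_out)
    for _, operands in ops[index:]:
        used_from_here.update(operands)

    return defined_before & used_from_here
```

The query depends on the ops having a fixed order, which is why it only makes sense for dominance-based regions.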
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D129447
Since the introduction of GoogleTest sharding in D122251
<https://reviews.llvm.org/D122251>, some of the Solaris sanitizer tests
have been running extremely long (up to an hour) while they took mere
seconds before. Initial investigation suggests that massive lock
contention in Solaris procfs is involved here.
However, there's an easy way to somewhat reduce the impact: while the
current `ReadProcMaps` uses `ReadFileToBuffer` to read `/proc/self/xmap`,
that function primarily caters to Linux procfs reporting file sizes of 0
while the size on Solaris is accurate. This patch makes use of that,
reducing the number of syscalls involved and reducing the runtime of
affected tests by a factor of 4.
Besides, it handles shared mappings and doesn't call `readlink` for unnamed
map entries.
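
The difference between the two reading strategies can be sketched in Python as follows. This is only an illustration of the syscall pattern, not the sanitizer runtime code; both function names are hypothetical, and the single-read variant assumes the size reported by `fstat` is accurate, as it is said to be on Solaris.

```python
import os


def read_with_known_size(path):
    """Read a file with a single read() call, trusting the fstat size."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # One read() suffices when the reported size is accurate.
        return os.read(fd, size)
    finally:
        os.close(fd)


def read_growing(path, chunk=4096):
    """Read until EOF in chunks, as needed when the reported size is 0."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            parts.append(data)
        return b"".join(parts)
    finally:
        os.close(fd)
```

The chunked variant issues one `read` per chunk plus a final one to detect EOF, which is the overhead the known-size path avoids.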
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D129837
While working on {D129830}, I realized that our handling of ICF +
eh_frame combined was untested. Additionally I realized that the comment
explaining why we were safely slicing away the functionAddress reloc
from our compact unwind entries was... insufficient and slightly
misleading. I've tried to clarify it.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D129894
Add a test for `-icp-inline` knob, which ensures that ICP is only performed for
functions that can be subsequently inlined.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D129803
A previous commit (f2b94bd) added some unnecessary statements that
dereferenced operations only to get the operations back. This patch
removes the unnecessary statements.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D129913
This is a follow-up to D107082, which enables vector support according to the psABI.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D127982
This patch allows custom attribute and type builders to return
something other than the C++ type of the attribute or type.
This is useful for attributes or types that may perform extra work during
construction (e.g. canonicalization) that could result in a different
kind of attribute or type being returned.
Reviewed By: rriddle, lattner
Differential Revision: https://reviews.llvm.org/D129792
This patch adds a pattern to decompose a `linalg.generic` operation
that
- has only parallel iterator types
- has more than 2 statements (including the yield)
into multiple `linalg.generic` operations such that each operation has
a single statement and a yield.
The pattern added here just splits the matching `linalg.generic` into
two `linalg.generic`s, one containing the first statement, and the
other containing the remaining. The same pattern can be applied
repeatedly on the second op to ultimately fully decompose the generic
op.
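
The repeated application can be sketched in Python as follows. This models only the splitting of the statement list: it ignores how intermediate results are threaded between the resulting ops, and `split_once` and `decompose` are hypothetical names, not part of the patch.

```python
def split_once(statements):
    """Split a body into (first statement + yield, remaining statements).

    The final entry is assumed to be the yield. Bodies with at most two
    entries (one statement plus the yield) do not match and return None.
    """
    if len(statements) <= 2:
        return None
    return [statements[0], "yield"], statements[1:]


def decompose(statements):
    """Apply split_once repeatedly to the remainder until nothing matches."""
    ops = []
    rest = statements
    while (parts := split_once(rest)) is not None:
        first, rest = parts
        ops.append(first)
    ops.append(rest)
    return ops
```

For a body `["a", "b", "c", "yield"]`, two applications produce three single-statement bodies.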
Differential Revision: https://reviews.llvm.org/D129704
This patch reports the number of counts being dropped when a hash-mismatch
happens. This information will be helpful to the users -- if the dropped
counts are large, the user should redo the instrumentation build and
recollect the profile.
Differential Revision: https://reviews.llvm.org/D129001
The visitor functions for `Region` and `Block` types did not always
check the value returned by recursive calls. This caused the top-level
visitor invocation to return `WalkResult::advance()` even if one or more
recursive invocations returned `WalkResult::interrupt()`. This patch
fixes the problem by checking whether any recursive call was interrupted
and, if so, returning `WalkResult::interrupt()`.
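
The intended propagation can be sketched in Python as follows. This is a simplified model, not the MLIR visitor code: `Tree` is a hypothetical stand-in for the nested region/block/operation structure, and the enum mirrors only the two `WalkResult` states mentioned above.

```python
from enum import Enum


class WalkResult(Enum):
    ADVANCE = 0
    INTERRUPT = 1


class Tree:
    """Hypothetical stand-in for a nested region/block/op structure."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk(node, callback):
    """Pre-order walk that stops as soon as any callback interrupts."""
    if callback(node) is WalkResult.INTERRUPT:
        return WalkResult.INTERRUPT
    for child in node.children:
        # Check the recursive result; dropping it would let the top-level
        # call report ADVANCE even though a nested call interrupted.
        if walk(child, callback) is WalkResult.INTERRUPT:
            return WalkResult.INTERRUPT
    return WalkResult.ADVANCE
```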
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D129718
The new driver generated offloading actions for each active toolchain.
However, for CUDA and HIP it is possible for the toolchain to be active
but one of the files is not a valid input. This can occur if the user
compiles both a CUDA and C source file in the same compiler invocation.
This patch adds some simple logic to quit if the input is not valid as
well.
Reviewed By: tra, MaskRay
Differential Revision: https://reviews.llvm.org/D129885
The reserve constructor was removed in 44f55509d7
but this one was missed. As a result, we attempt to iterate through 1024 threads
each time, most of which are 0.
Differential Revision: https://reviews.llvm.org/D129897
The device runtime uses the address space attribute to control the
placement of important constants on the GPU. The changes made in D126061
caused these to start emitting errors as they were not applied to the
type. This patch fixes the issues to make the warnings go away.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D129896
Don't cross reference CSFDO profile and non-CSFDO profile when
checking the function hash. Only return hash_mismatch when
CS bits match, and return unknown_function otherwise.
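
The lookup rule can be sketched in Python as follows. The record layout here is a hypothetical simplification (a separate `is_cs` flag stored next to each hash); it is meant only to show the decision logic, not how the profile reader stores its data.

```python
def lookup(profile_records, name, func_hash, is_cs):
    """Return 'ok', 'hash_mismatch', or 'unknown_function'.

    profile_records: dict mapping a function name to a list of
    (hash, is_cs) pairs. Hypothetical layout for this sketch.
    """
    saw_same_kind = False
    for rec_hash, rec_is_cs in profile_records.get(name, []):
        if rec_is_cs != is_cs:
            # Never compare hashes across CS and non-CS records.
            continue
        saw_same_kind = True
        if rec_hash == func_hash:
            return "ok"
    # A mismatch is only reported when a record of the same kind exists.
    return "hash_mismatch" if saw_same_kind else "unknown_function"
```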
Differential Revision: https://reviews.llvm.org/D129000
This patch improves FDO hash-mismatch handling:
(1) filter out warnings for weak functions.
A weak function's definition can be overridden by a strong definition at link time,
so a hash mismatch in the profile-use compilation is expected.
Put the profile hash-mismatch warning under the existing option (default true).
(2) add an option to trace the hash of functions with the specific string.
Note that an empty string parameter will trace all functions.
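
The empty-string behavior follows naturally if the filter is a substring match, as this minimal Python sketch shows. Substring matching is an assumption made for the example, and `should_trace` is a hypothetical name, not the option's implementation.

```python
def should_trace(function_name, trace_filter):
    """Return True if the function's hash should be traced.

    trace_filter is None when tracing is off. Because the empty string is a
    substring of every string, an empty filter selects all functions.
    """
    if trace_filter is None:
        return False
    return trace_filter in function_name
```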
Differential Revision: https://reviews.llvm.org/D129002
A new sparse_tensor operation allows for
custom reduction code to be injected during
linalg.generic lowering for sparse tensors.
An identity value is provided to indicate
the starting value of the reduction. A single
block region is required to contain the
custom reduce computation.
Reviewed by: aartbik
Differential Revision: https://reviews.llvm.org/D128004
Changes since initial commit:
* Wrapping a pointer in an SCEV unknown hides the base, and SCEV is only able to compute a subtraction when the bases are known to be equal. This results in a SCEVCouldNotCompute flowing forward and triggering asserts. Test case added in d767b392.
* isLoopInvariant returns true for instructions outside the loop, but not necessarily *above* the loop. Since this code is allowed to visit uses of an IV outside of a loop, we have to make sure the operands of the compare are both invariant and dominating the header. Test case added in 2aed3cdb.
Original commit message follows...
The ICmpZero matching is checking to see if the expression is loop invariant per SCEV and expandable. This allows expressions inside the loop which can be made loop invariant to be seamlessly expanded, but is overly conservative for expressions which already *are* loop invariant.
As a simple justification for why this is correct, consider a loop invariant urem as RHS vs an alternate function with that same urem wrapped inside a helper call. Why would it be legal to match the latter, but not the former?
Differential Revision: https://reviews.llvm.org/D129793
Enable stdio forwarding when non-stop mode is enabled, and disable
forwarding once non-stop mode is disabled. This makes it possible to cleanly
handle stdio forwarding while running multiple processes in non-stop mode.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128932