Yes, if operands are non-positive this comes at the extra cost
of two extra negations. But a. division is already just
ridiculously costly, two more subtractions can't hurt much :)
and b. we have better/more analyzes/folds for an unsigned division,
we could end up narrowing it's bitwidth, converting it to lshr, etc.
This is essentially a take two on 0fdcca07ad,
which didn't fix the potential regression i was seeing,
because ValueTracking's computeKnownBits() doesn't make use
of dominating conditions in it's analysis.
While i could teach it that, this seems like the more general fix.
This big hammer actually does catch said potential regression.
Over vanilla test-suite + RawSpeed + darktable
(10M IR instrs, 1M IR BB, 1M X86 ASM instrs), this fires/converts 5 more
(+2%) SDiv's, the total instruction count at the end of middle-end pipeline
is only +6, so out of +10 extra negations, ~half are folded away,
and asm instr count is only +1, so practically speaking all extra
negations are folded away and are therefore free.
Sadly, all these new UDiv's remained, none folded away.
But there are two less basic blocks.
https://rise4fun.com/Alive/VS6
Name: v0
Pre: C0 >= 0 && C1 >= 0
%r = sdiv i8 C0, C1
=>
%r = udiv i8 C0, C1
Name: v1
Pre: C0 <= 0 && C1 >= 0
%r = sdiv i8 C0, C1
=>
%t0 = udiv i8 -C0, C1
%r = sub i8 0, %t0
Name: v2
Pre: C0 >= 0 && C1 <= 0
%r = sdiv i8 C0, C1
=>
%t0 = udiv i8 C0, -C1
%r = sub i8 0, %t0
Name: v3
Pre: C0 <= 0 && C1 <= 0
%r = sdiv i8 C0, C1
=>
%r = udiv i8 -C0, -C1
Currently that fold requires both operands to be non-negative,
but the only real requirement for the fold is that we must know
the domains of the operands.
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
Summary:
This patch adds processing of binary operations when the def of operands are in
the same block (i.e. local processing).
Earlier we bailed out in such cases (the bail out was introduced in rL252032)
because LVI at that time was more precise about context at the end of basic
blocks, which implied local def and use analysis didn't benefit CVP.
Since then we've added support for LVI in presence of assumes and guards. The
test cases added show how local def processing in CVP helps adding more
information to the ashr, sdiv, srem and add operators.
Note: processCmp which suffers from the same problem will
be handled in a later patch.
Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38766
llvm-svn: 315634
The motivating example is this
for (j = n; j > 1; j = i) {
i = j / 2;
}
The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.
llvm-svn: 263406