This extends the work done in r233995 so that now getFragment (in addition to
getSection) also works for variable symbols.
With that the existing logic to decide if a-b can be computed works even if
a or b are variables. Given that, the expression evaluation can avoid expanding
variables as aggressively and that in turn lets the relocation code see the
original variable.
In order for this to work with the asm streamer, there is now a dummy fragment
per section. It is used to assign a section to a symbol when no other fragment
exists.
This patch is a joint work by Maxim Ostapenko andy myself.
llvm-svn: 249303
Summary:
The bitcode format is described in this document:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
For more info on ThinLTO see:
https://sites.google.com/site/llvmthinlto
The first customer is ThinLTO, however the data structures are designed
and named more generally based on prior feedback. There are a few
comments regarding how certain interfaces are used by ThinLTO, and the
options added here to gold currently have ThinLTO-specific names as the
behavior they provoke is currently ThinLTO-specific.
This patch includes support for generating per-module function indexes,
the combined index file via the gold plugin, and several tests
(more are included with the associated clang patch D11908).
Reviewers: dexonsmith, davidxl, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13107
llvm-svn: 249270
Track which basic blocks belong to which funclets. Permit branch
folding to fire but only if it can prove that doing so will not cause
code in one funclet to be reused in another.
llvm-svn: 249257
The most important part required to make clang
devirtualization works ( ͡°͜ʖ ͡°).
The code is able to find non local dependencies, but unfortunatelly
because the caller can only handle local dependencies, I had to add
some restrictions to look for dependencies only in the same BB.
http://reviews.llvm.org/D12992
llvm-svn: 249196
Summary:
This change teaches SCEV that to prove `A u< B` it is sufficient to
prove each of these facts individually:
- B >= 0
- A s< B
- A >= 0
In practice, SCEV sometimes finds it easier to prove these facts
individually than to prove `A u< B` as one atomic step.
Reviewers: reames, atrick, nlewycky, hfinkel
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13042
llvm-svn: 249168
We previously stopped producing Thumb2 relaxations when they weren't supported,
but only diagnosed the case where an actual relocation was produced. We should
also tell people if local symbols aren't going to work rather than silently
overflowing.
llvm-svn: 249164
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i31> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>
Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.
Originally reviewed here: http://reviews.llvm.org/D13347
llvm-svn: 249147
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i31> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>
Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.
Differential Revision: http://reviews.llvm.org/D13347
llvm-svn: 249121
When trying to optimize fortified library functions use the right
location to insert new instructions in order to preserve correct
def-use order.
This fixes an issue where a misplaced instruction definition would
happen to be *after* one of its use after a RAUW, forming invalid IR.
This behavior was introduced by r227250.
Differential Revision: http://reviews.llvm.org/D13301
rdar://problem/22802369
llvm-svn: 249092
AArch64 uses $d* and $x* to interleave between text and data.
llvm-objdump didn't know about this so it ended up printing garbage.
This patch is a first step towards a solution of the problem.
Differential Revision: http://reviews.llvm.org/D13360
llvm-svn: 249083
We emit denormalized tables, where every range of invokes in the same
state gets a complete list of EH action entries. This is significantly
simpler than trying to infer the correct nested scoping structure from
the MI. Fortunately, for SEH, the nesting structure is really just a
size optimization.
With this, some basic __try / __except examples work.
llvm-svn: 249078
Summary:
Instead of asserting when the kernel metadata is different than we expect,
we should just skip lowering that function. This fixes assertion
failures with OpenCL argument metadata from older LLVM releases.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D13356
llvm-svn: 249073
Catchret transfers control from a catch funclet to an earlier funclet.
However, it is not completely clear which funclet the catchret target is
part of. Make this clear by stapling the catchret target's funclet
membership onto the CATCHRET SDAG node.
llvm-svn: 249052
Add generic instructions for load complement, load negative and load positive
for fp32 and fp64, and let isel prefer them. They do not clobber CC, and so
give scheduler more freedom. SystemZElimCompare pass will convert them when it
can to the CC-setting variants.
Regression tests updated to expect the new opcodes in places where the old ones
where used. New test case SystemZ/fp-cmp-05.ll checks that
SystemZCompareElim.cpp can handle the new opcodes.
README.txt updated (bullet removed).
Note that fp128 is not yet handled, because it is relatively rare, and is a
bit trickier, because of the fact that l.dfr would operate on the sign bit of
one of the subregisters of a fp128, but we would not want to copy the other
sub-reg in case src and dst regs are not the same.
Reviewed by Ulrich Weigand.
llvm-svn: 249046
v2: Add test (Matt).
Fix capitalization of isEOP (Matt).
Move pattern to class parameter (Matt).
Make the instruction available to Cayman (Matt).
Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED.
Patch by: Zoltan Gilian
llvm-svn: 249042
Summary:
Some passes may open up opportunities for optimizations, leaving empty
lifetime start/end ranges. For example, with the following code:
void foo(char *, char *);
void bar(int Size, bool flag) {
for (int i = 0; i < Size; ++i) {
char text[1];
char buff[1];
if (flag)
foo(text, buff); // BBFoo
}
}
the loop unswitch pass will create 2 versions of the loop, one with
flag==true, and the other one with flag==false, but always leaving
the BBFoo basic block, with lifetime ranges covering the scope of the for
loop. Simplify CFG will then remove BBFoo in the case where flag==false,
but will leave the lifetime markers.
This patch teaches InstCombine to remove trivially empty lifetime marker
ranges, that is ranges ending right after they were started (ignoring
debug info or other lifetime markers in the range).
This fixes PR24598: excessive compile time after r234581.
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13305
llvm-svn: 249018