This ensures that symbolic relocations are generated for stack
pointer manipulations.
These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB.
This change also adds support for reading relocations of this
type in WasmObjectFile.cpp.
Since its a globally imported symbol this does mean that
the get_global/set_global instruction won't be valid until
the objects are linked that global used in no longer an
imported global.
Differential Revision: https://reviews.llvm.org/D34172
llvm-svn: 305616
Suppose we had a type index offsets array with a boundary at type index
N. Then you request the name of the type with index N+1, and that name
requires the name of index N-1 (think a parameter list, for example). We
didn't handle this, and we would print something like (<unknown UDT>,
<unknown UDT>).
The fix for this is not entirely trivial, and speaks to a larger
problem. I think we need to kill TypeDatabase, or at the very least kill
TypeDatabaseVisitor. We need a thing that doesn't do any caching
whatsoever, just given a type index it can compute the type name "the
slow way". The reason for the bug is that we don't have anything like
that. Everything goes through the type database, and if we've visited a
record, then we're "done". It doesn't know how to do the expensive thing
of re-visiting dependent records if they've not yet been visited.
What I've done here is more or less copied the code (albeit greatly
simplified) from TypeDatabaseVisitor, but wrapped it in an interface
that just returns a std::string. The logic of caching the name is now in
LazyRandomTypeCollection. Eventually I'd like to move the record
database here as well and the visited record bitfield here as well, at
which point we can actually just delete TypeDatabase. I don't see any
reason for it if a "sequential" collection is just a special case of a
random access collection with an empty partial offsets array.
Differential Revision: https://reviews.llvm.org/D34297
llvm-svn: 305612
In this patch, I flip the switch in DriverUtils from using the external
cvtres.exe tool to using the Windows Resource library in llvm.
I also fixed a bug where .rsrc sections were marked as discardable
memory and therefore were placed in the wrong order in the final PE.
Furthermore, I modified WindowsResource to write the coff directly to a
memory buffer instead of to file, also had it use the machine types
already declared in COFF.h instead creating my own enum.
Finally, I flipped the switch to allow all unit tests that had
previously run only on windows due to a winres dependency to run
cross-platform.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34265
llvm-svn: 305592
The recommit fixes two bugs: The first one is to use CurrentBlock instead of
PREInstr's Parent as param of performScalarPREInsertion because the Parent
of a clone instruction may be uninitialized. The second one is stop PRE when
CurrentBlock to its predecessor is a backedge and an operand of CurInst is
defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
g1 = a * b;
if (__builtin_expect(g2 > 3, 0)) {
a = c;
b = d;
g2 = a * b;
}
g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Differential Revision: https://reviews.llvm.org/D32252
llvm-svn: 305578
Revert because of reports of some PPC input starting to spill when it
was predicted that it wouldn't and no spillslot was reserved.
This reverts commit r305516.
llvm-svn: 305566
Summary:
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic.
Reviewers: reames, sanjoy, efriedma
Reviewed By: reames
Subscribers: mzolotukhin, anna, llvm-commits, skatkov
Differential Revision: https://reviews.llvm.org/D33240
llvm-svn: 305558
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.
It was broken due to some weird template issues, which have
since been fixed.
llvm-svn: 305517
Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64
problems reported in the stage2 build last time, which I cannot
reproduce right now.
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
llvm-svn: 305516
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.
This is failing due to some strange template problems, so reverting
until it can be straightened out.
llvm-svn: 305505
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.
One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.
This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.
See the tests for some examples of what the new output looks like.
Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:
The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.
Differential Revision: https://reviews.llvm.org/D34191
llvm-svn: 305495
Add condition for MachineLICM to safely hoist instructions that utilize
non constant registers that are reserved.
On PPC, global variable access is done through the table of contents (TOC)
which is always in register X2. The ABI reserves this register in any
functions that have calls or access global variables.
A call through a function pointer involves saving, changing and restoring
this register around the call and thus MachineLICM does not consider it to
be invariant. We can however guarantee the register is preserved across the
call and thus is invariant.
Differential Revision: https://reviews.llvm.org/D33562
llvm-svn: 305490
The code assumed that we process instructions in basic block order. FastISel
processes instructions in reverse basic block order. We need to pre-assign
virtual registers before selecting otherwise we get def-use relationships wrong.
This only affects code with swifterror registers.
rdar://32659327
llvm-svn: 305484
If a regular LTO module has a summary index, then instead of linking
it into the combined regular LTO module right away, add it to the
combined summary index and associate it with a special module that
represents the combined regular LTO module.
Any such modules are linked during LTO::run(), at which time we use
the results of summary-based dead stripping to control whether to
link prevailing symbols.
Differential Revision: https://reviews.llvm.org/D33922
llvm-svn: 305482
This is a fix for PR33292 that shows a case of extremely long compilation
of a single .c file with clang, with most time spent within SCEV.
We have a mechanism of limiting recursion depth for getAddExpr to avoid
long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr
and back that do not propagate the info about depth. As result of this, a chain
getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr
can be extremely long, with every segment of getAddExpr's being up to max depth long.
This leads either to long compilation or crash by stack overflow. We face this situation while
analyzing big SCEVs in the test of PR33292.
This patch applies the same limit on max expression depth for getAddExpr and getMulExpr.
Differential Revision: https://reviews.llvm.org/D33984
llvm-svn: 305463
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.
The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.
llvm-svn: 305459
Previously if you used fmt_align(7, Center) you would get the
output ' 7 '. It may be desirable for the user to specify
the fill character though, for example producing '---7---'. This
patch adds that.
llvm-svn: 305449
The current name (addModulePath) and return value
(ModulePathStringTableTy::iterator) is a little confusing. This
API adds a module, not just a path. And the iterator is basically
just an implementation detail of the summary index. Address
both of those issues by renaming to addModule and introducing a
ModuleSummaryIndex::ModuleInfo type that the function returns.
Differential Revision: https://reviews.llvm.org/D34124
llvm-svn: 305422
Many times unit tests for different libraries would like to use
the same helper functions for checking common types of errors.
This patch adds a common library with helpers for testing things
in Support, and introduces helpers in here for integrating the
llvm::Error and llvm::Expected<T> classes with gtest and gmock.
Normally, we would just be able to write:
EXPECT_THAT(someFunction(), succeeded());
but due to some quirks in llvm::Error's move semantics, gmock
doesn't make this easy, so two macros EXPECT_THAT_ERROR() and
EXPECT_THAT_EXPECTED() are introduced to gloss over the difficulties.
Consider this an exception, and possibly only temporary as we
look for ways to improve this.
Differential Revision: https://reviews.llvm.org/D33059
llvm-svn: 305395
This was originally reverted because of some non-deterministic
failures on certain buildbots. Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.
llvm-svn: 305393
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.
Patch by Sander de Smalen.
Reviewers: echristo, pcc, aprantl
Reviewed By: aprantl
Subscribers: fhahn, javed.absar, aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D33894
llvm-svn: 305386
This is causing failures on linux bots with an invalid stream
read. It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.
llvm-svn: 305371
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.
Differential Revision: https://reviews.llvm.org/D34127
llvm-svn: 305366
This patch adds code which verifies that each bucket in the .apple_names
accelerator table is either empty or has a valid hash index.
Differential Revision: https://reviews.llvm.org/D34177
llvm-svn: 305344
Summary:
When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting
%0 = G_CONSTANT ty 0
%1 = G_GEP %x, %0
since it's cheaper to not emit the redundant instructions than it is to fold them
away later.
Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls
Reviewed By: qcolombet
Subscribers: javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D32746
llvm-svn: 305340
Summary:
Previously, when D33102 landed, this broke -Werror buildbots.
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/3249
```
FAILED: /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/install/stage1/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_GLOBAL_ISEL -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/CodeGen/AsmPrinter -I/home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter -Iinclude -I/home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fcolor-diagnostics -ffunction-sections -fdata-sections -O3 -UNDEBUG -fno-exceptions -fno-rtti -MMD -MT lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/CodeViewDebug.cpp.o -MF lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/CodeViewDebug.cpp.o.d -o lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/CodeViewDebug.cpp.o -c /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
In file included from /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:14:
In file included from /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter/CodeViewDebug.h:17:
In file included from /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h:15:
In file included from /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/include/llvm/IR/DebugInfoMetadata.h:26:
In file included from /home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/include/llvm/IR/Metadata.h:23:
/home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/include/llvm/ADT/PointerUnion.h:161:19: error: cast from 'void **' to 'const llvm::DISubprogram **' must have all intermediate pointers const qualified to be safe [-Werror,-Wcast-qual]
return (PT1 *)Val.getAddrOfPointer();
^
/home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/include/llvm/ADT/TinyPtrVector.h:177:18: note: in instantiation of member function 'llvm::PointerUnion<const llvm::DISubprogram *, llvm::SmallVector<const llvm::DISubprogram *, 4> *>::getAddrOfPtr1' requested here
return Val.getAddrOfPtr1();
^
/home/buildbot/Buildbot/Slave1a/clang-with-lto-ubuntu/llvm.src/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:1885:33: note: in instantiation of member function 'llvm::TinyPtrVector<const llvm::DISubprogram *>::begin' requested here
for (const DISubprogram *SP : MethodItr.second) {
^
1 error generated.
```
Reviewers: dblaikie, akyrtzi
Reviewed By: dblaikie
Subscribers: joerg, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D34153
llvm-svn: 305319
Summary: Added output to stderr so that we can actually see what is happening when the test fails on big endian.
Reviewers: zturner
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34155
llvm-svn: 305314
Previously, the matching was done incorrectly for the case where
operands for FCmpInst and SelectInst were in opposite order.
Patch by Andrei Elovikov.
Differential Revision: https://reviews.llvm.org/D33185
llvm-svn: 305308
Summary: Fixes an issue using RegisterStandardPasses from a statically linked object before PassManagerBuilder::addGlobalExtension is called from a dynamic library.
Reviewers: efriedma, theraven
Reviewed By: efriedma
Subscribers: mehdi_amini, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33515
llvm-svn: 305303
Summary:
Expose the module descriptor index and fill it in for section
contributions.
Reviewers: zturner
Subscribers: llvm-commits, ruiu, hiraditya
Differential Revision: https://reviews.llvm.org/D34126
llvm-svn: 305296
Summary: Added test cases for multiple machine types, file merging, multiple languages, and more resource types. Also fixed new bugs these tests exposed.
Subscribers: javed.absar, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34047
llvm-svn: 305258
I accidentally combined this patch with one for adding more tests, they
should be separated.
This reverts commit 3da218a523be78df32e637d3446ecf97c9ea0465.
llvm-svn: 305257
This just forwarded to the same signature in User. The version in User is protected so there's no danger of anyone outside of PHINode constructing with the wrong operator new. All PHINodes are created by a static Create function in PHINode.
I believe at one point in history this called User::operator new(s, 0) so it was useful then.
llvm-svn: 305255
User has 3 signatures for operator new today. They take a single size, a size and a number of users, and a size, number of users, and descriptor size.
Historically there used to only be one signature that took size and a number of uses. Long ago derived classes implemented their own versions that took just a size and would call the size and use count version. Then they left an unimplemented signature for the size and use count signature from User. As we moved to C++11 this unimplemented signature because = delete.
Since then operator new has picked up two new signatures for operator new. But when the 3 argument version was added it was never added to the delete list in all of the derived classes where the 2 argument version is deleted. This makes things inconsistent.
I believe once one version of operator new is created in a derived class name hiding will take care of making all of the base class signatures unavailable. So I don't think the deleted lines are needed at all.
This patch removes all of the deletes in cases where there is an override or there is already a delete of another signature (that should trigger name hiding too).
Differential Revision: https://reviews.llvm.org/D34120
llvm-svn: 305251
The last fix required the user to manually add the required
feature. This caused an LLD test to fail because I failed to
update LLD. In practice we can hide this logic so it can just
be transparently added when we write the PDB.
llvm-svn: 305236
Older PDBs don't have this. Its presence is detected by using
the various "feature" flags that come at the end of the PDB
Stream. Detect this, and don't try to dump the ID stream if the
features tells us it's not present.
llvm-svn: 305235
This is a precursor to another change (coming soon) that aims to make
FoldingSet's API more type-safe. Without this, the type-safety change
would just duplicate 4 more public methods between the already very
similar classes.
This renames FoldingSetImpl to FoldingSetBase so it's consistent with
the FooBase -> FooImpl<T> -> Foo<T> convention we seem to have with
other containers.
llvm-svn: 305231
Summary:
Use the filepath used to open the archive member as the archive member
name instead of the file basename. This path might be absolute or
relative. This is important because the archive member name will show
up in the PDB, and we want our PDBs to look as much like MSVC's as
possible.
This also helps avoid an issue in our PDB module descriptor writing
code, which assumes that all module names are unique. Relative paths
still aren't guaranteed to be unique, but they're much better than
basenames, which definitely aren't unique.
Reviewers: ruiu, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33575
llvm-svn: 305223
This step is just intended to reduce code duplication rather than change any functionality.
A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper.
Differential Revision: https://reviews.llvm.org/D33649
llvm-svn: 305192
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.
Reviewers: chandlerc, rnk, reames
Reviewed By: reames
Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D33903
llvm-svn: 305189
Summary:
LLDB built with asan on NetBSD detected issues in the following code:
```
void ArchSpec::Clear() {
m_triple = llvm::Triple();
m_core = kCore_invalid;
m_byte_order = eByteOrderInvalid;
m_distribution_id.Clear();
m_flags = 0;
}
```
--- lldb/source/Core/ArchSpec.cpp
Runtime error messages:
/public/pkgsrc-tmp/wip/lldb-netbsd/work/.buildlink/include/llvm/ADT/Triple.h:44:7: runtime error: load of value 32639, which is not a valid value for type 'SubArchType'
/public/pkgsrc-tmp/wip/lldb-netbsd/work/.buildlink/include/llvm/ADT/Triple.h:44:7: runtime error: load of value 3200171710, which is not a valid value for type 'SubArchType'
/public/pkgsrc-tmp/wip/lldb-netbsd/work/.buildlink/include/llvm/ADT/Triple.h:44:7: runtime error: load of value 3200171710, which is not a valid value for type 'SubArchType'
Correct this issue with initialization of SubArch() in the class Triple constructor.
Sponsored by <The NetBSD Foundation>
Reviewers: chandlerc, zturner
Reviewed By: zturner
Subscribers: llvm-commits, zturner
Differential Revision: https://reviews.llvm.org/D33845
llvm-svn: 305178
Summary:
This prevents the iterator overrides from being selected in
the case where non-iterator types are used as arguments, which
is of particular importance in cases where other overrides with
identical types exist.
Reviewers: dblaikie, bkramer, rafael
Subscribers: llvm-commits, efriedma
Differential Revision: https://reviews.llvm.org/D33919
llvm-svn: 305105
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method. This led to quite cumbersome
implementation challenges and awkwardness of use. This patch
brings back support for stateful extractors, making the
implementation and usage simpler.
llvm-svn: 305093
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.
Reviewers: hiraditya!, zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34020
llvm-svn: 305092
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.
Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.
By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
The previous version of this patch had a "conditional move or jump depends on
uninitialized value".
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
llvm-svn: 305083
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.
Fixes PR33335.
Reviewers: zturner, hiraditya!
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33968
llvm-svn: 305043
This is a preparatory change to expose the debug compression style to
clang. It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.
Minor tweak to the ELF Object Writer to use a variable for re-used
values. Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.
llvm-svn: 305038
This adds support for Symbols, StringTable, and FrameData subsection
types. Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB. The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them. And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time. Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.
llvm-svn: 305037
This is the same change for the YAML Output style applied to the
raw output style. Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order. This was because some subsections need to be
read first in order to properly dump later subsections. This
patch allows them to be dumped in the order they appear.
Differential Revision: https://reviews.llvm.org/D34015
llvm-svn: 305034
These used to be virtual methods that would enable doing the right thing with only a TerminatorInst pointer. I believe they were also acting as vtable anchors in my cases. I think the fact that they had a separate name ending in V was to allow a version without V to be called without a virtual call in a pre-C++11 final keyword world.
Where possible the base methods in TerminatorInst dispatch directly to the public methods in the classes that have the same signature. For some classes this wasn't possible so I've left private method versions that match the name and signature of the version in TerminatorInst. All versions have been moved into the class definitions since we no longer need vtable anchors here.
Differential Revision: https://reviews.llvm.org/D34011
llvm-svn: 305028
This is to prepare to allow for dead stripping of globals in the
merged modules.
Differential Revision: https://reviews.llvm.org/D33921
llvm-svn: 305027
This data type includes the contents of a bitcode file.
Right now a bitcode file can only contain modules, but
a later change will add a symbol table.
Differential Revision: https://reviews.llvm.org/D33969
llvm-svn: 305019
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.
(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.
(2) MachineVerifier: updated the test per MatzeB's review.
(3) +testcase
Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!
Differential Revision: https://reviews.llvm.org/D33947
llvm-svn: 305016
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value. This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.
llvm-svn: 304982
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.
Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.
Differential Revision: https://reviews.llvm.org/D33934
llvm-svn: 304970
This code now lives in lib/Object. The idea is that it can now be reused by
IRObjectFile among other things.
Differential Revision: https://reviews.llvm.org/D31921
llvm-svn: 304958
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
Previously you would have to use operator==(uint64_t) which does the getActiveBits call and a uint64_t comparison. But we can get all we need to know from the getActiveBits call.
This method will be used in another commit.
llvm-svn: 304854
This patch introduces a new command line option, called brief, to
llvm-dwarfdump. When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.
Patch by Spyridoula Gravani!
rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867
llvm-svn: 304844
Summary:
I would like to add printing of registered targets to clang's version
information. For this to work correctly, the VersionPrinter logic in
CommandLine.cpp should support printing to arbitrary raw_ostreams,
instead of always defaulting to outs().
Add a raw_ostream& parameter to the function pointer type used for
VersionPrinter, and while doing so, introduce a typedef for convenience.
Note that VersionPrinter::print() will still default to using outs(),
the clang part will necessarily go into a separate review.
Reviewers: beanz, chandlerc, dberris, mehdi_amini, zturner
Reviewed By: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33899
llvm-svn: 304835
Summary:
LVIPrinter pass was previously relying on the LVICache. We now directly call the
the LVI functions which solves the value if the LVI information is not already
available in the cache. This has 2 benefits over the printing of LVI cache:
1. higher coverage (i.e. catches errors) in LVI code when cache value is
invalidated.
2. relies on the core functions, and not dependent on the LVI cache (which may
be scrapped at some point).
It would still catch any cache invalidation errors, since we first go through
the cache.
Reviewers: reames, dberlin, sanjoy
Reviewed by: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32135
llvm-svn: 304819
The change cleans up and unifies the handling of relocation
entries in WasmObjectWriter. Type index relocation no longer
need to be handled separately.
The only externally visible change should be that type
index relocations are no longer grouped at the end.
Differential Revision: https://reviews.llvm.org/D33918
llvm-svn: 304816
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.
llvm-svn: 304807
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames, anna
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304806
These methods looks like they were originally came from
MCELFObjectTargetWriter but they are never called by the
WasmObjectWriter.
Remove these methods meant the declaration of WasmRelocationEntry
could also move into the cpp file.
Differential Revision: https://reviews.llvm.org/D33905
llvm-svn: 304804
This was masked by lucky #include ordering in the .cpp files and
uncovered when we moved to the canonical ordering because the primary
header was included first (yay!). Unfortunately, I can't build this
locally so took a build-bot iteration to find it.
llvm-svn: 304789
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D32304
llvm-svn: 304779
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.
This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.
This is a preparation patch for https://reviews.llvm.org/D32720
Reviewers: dberlin, hfinkel, davide
Subscribers: davide, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33380
llvm-svn: 304764
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.
This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.
Differential Revision: https://reviews.llvm.org/D33809
llvm-svn: 304758
Summary:
Reverse iteration can be turned on, by default, by setting -DLLVM_REVERSE_ITERATION:BOOL=ON during cmake.
With this enabled, we can uncover lots of cases of non-determinism in codegen by simply running our tests (without any other change).
We can then setup a buildbot which will have this turned on by default. Initially, a lot of unit tests will fail in this configuration.
Once we start fixing non-determinism issues, we can gradually make this a blocker for patches.
Reviewers: davide, dblaikie, mehdi_amini, dberlin
Reviewed By: dblaikie
Subscribers: probinson, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33908
llvm-svn: 304757
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function
Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.
llvm-svn: 304754
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.
This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.
llvm-svn: 304738
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well. The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.
llvm-svn: 304736
This removes a quadratic behavior in assert-enabled builds.
GVN propagates the equivalence from a condition into the blocks guarded by the
condition. E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7. It does this by replacing all the uses of 'a' that are dominated by
the true edge.
For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'. Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.
The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.
This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.
Differential Revision: https://reviews.llvm.org/D33584
llvm-svn: 304721
procedural optimizations to prevent dropping symbols and allow the linker
to process re-directs.
PR33145: --wrap doesn't work with lto.
Differential Revision: https://reviews.llvm.org/D33621
llvm-svn: 304719
Summary:
Since LLVM_NATIVE_ARCH, LLVM_NATIVE_ASMPARSER, LLVM_NATIVE_ASMPRINTER,
LLVM_NATIVE_DISASSEMBLER, LLVM_NATIVE_TARGET, LLVM_NATIVE_TARGETINFO and
LLVM_NATIVE_TARGETMC are already defined in llvm-config.h, there seems
to be no reason to also define them in config.h. Also, I can only find
usage of these macros in files that include llvm-config.h.
So let's remove the duplicated macros from config.h.
Reviewers: chandlerc, rnk, mehdi_amini, joerg
Reviewed By: rnk
Subscribers: chapuni, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33881
llvm-svn: 304714
The C functions added are LLVMGetNumContainedTypes and
LLVMGetSubtypes.
The OCaml function added is Llvm.subtypes.
Patch by Ekaterina Vaartis.
Differential Revision: https://reviews.llvm.org/D33677
llvm-svn: 304709
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""
Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413
llvm-svn: 304704
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary. This
way it could still return the user a contiguous buffer of the
requested size. However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.
Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view. And this buffer would get destroyed
as soon as the strema was closed.
The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.
Differential Revision: https://reviews.llvm.org/D33858
llvm-svn: 304623
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
llvm-svn: 304607
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.
Also adds some basic register scavenging tests.
llvm-svn: 304606
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.
llvm-svn: 304602
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Differential Revision: https://reviews.llvm.org/D33807
llvm-svn: 304588
This reverts commit r304561 and re-lands r303490 & co.
The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:
@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
std::vector<COFFShortExport> Exports;
for (Export &E1 : Config->Exports) {
COFFShortExport E2;
- E2.Name = E1.Name;
+ // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+ E2.Name = E1.SymbolName;
E2.ExtName = E1.ExtName;
E2.Ordinal = E1.Ordinal;
E2.Noname = E1.Noname;
llvm-svn: 304573
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).
The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and
don't have it - so, workaround it for now rather than bothering with the
added complexity.
(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)
llvm-svn: 304566
While doing so, clarify the comments and update them to reflect current reality.
Note: I'm going to let this sit for a week or so before adding further verification. I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
llvm-svn: 304565
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.
For those curious, the more complicated version of this patch already found one very suspicious thing.
Differential Revision: https://reviews.llvm.org/D33819
llvm-svn: 304564
This reverts commits r303490, r303491, r303493, and r303494.
This caused http://crbug.com/728726. Essentially, exporting stdcall
functions doesn't appear to work after this change. Reduced test case
soon.
llvm-svn: 304561
Summary:
As we teach Clang to use ThinkLTO + new PM, it's good for the users to
inject through Config, instead of setting a flag in the LTOBackend
library. Move the flag to llvm-lto2.
As it moves to llvm-lto2, a new name -use-new-pm seems simpler and as
clear.
Reviewers: davide, tejohnson
Subscribers: mehdi_amini, Prazek, inglorion, eraman, chandlerc, llvm-commits
Differential Revision: https://reviews.llvm.org/D33799
llvm-svn: 304492
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.
llvm-svn: 304488
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.
Differential Revision: https://reviews.llvm.org/D33785
llvm-svn: 304484
Summary:
Clang wants to clone a function before it is done building the entire
compilation unit. As of now, there is no good way to do that, because
CloneFunction doesn't like dealing with temporary metadata. However,
as long as clang doesn't want to add any variables to this SP, it
should be fine to just prematurely finalize it. Add an API to allow this.
This is done in preparation of a clang commit to fix the assertion that
necessitated the revert of D33655.
Reviewers: aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33704
llvm-svn: 304467
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of
all the DeadSymbols sets. This is refactoring in order to make
liveness information available in the RegularLTO pipeline.
llvm-svn: 304466
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.
Patch by Spyridoula Gravani!
Differential Revision: https://reviews.llvm.org/D33749
llvm-svn: 304446
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.
This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.
I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:
1) I sunk the PGO indirect call promotion to only be run when we have
PGO enabled (or as part of the special ThinLTO pipeline).
2) I made the extra GlobalOpt run in ThinLTO just happen all the time
and at a slightly more powerful place (before we remove available
externaly functions). This seems like general goodness and not a big
compile time sink, so it didn't make sense to *only* use it in
ThinLTO. Fewer differences in the pipeline makes everything simpler
IMO.
3) I hoisted the ThinLTO stop point pre-link above the the RPO function
attr inference. The RPO inference won't infer anything terribly
meaningful pre-link (recursiveness?) so it didn't make a lot of
sense. But if the placement of RPO inference starts to matter, we
should move it to the canonicalization phase anyways which seems like
a better place for it (and there is a FIXME to this effect!). But
that seemed a bridge too far for this patch.
If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.
I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.
Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.
Differential Revision: https://reviews.llvm.org/D33540
llvm-svn: 304407
Summary:
This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process.
This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33374
llvm-svn: 304404
They weren't used often enough to justify having two different interfaces. Push the responsiblity of creating a StringInit up to the caller.
llvm-svn: 304388
Summary: Also see D33429 for other ThinLTO + New PM related changes.
Reviewers: davide, chandlerc, tejohnson
Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D33525
llvm-svn: 304378
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 304371
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
the FP registers still being present in the livein list.)
llvm-svn: 304342
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern to access these attributes to within the
AttributeList.
Patch by Daniel Neilson!
Reviewers: rnk, chandlerc, pete, javed.absar, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33355
llvm-svn: 304329
Internally both these methods just return the result of getValue on either a StringInit or a CodeInit object. In both cases this returns a StringRef pointing to a string allocated in the BumpPtrAllocator so its not going anywhere. So we can just pass that StringRef along.
This is a fairly naive patch that targets just the build failures caused by this change. There's additional work that can be done to avoid creating std::string at call sites that still think getValueAsString returns a std::string. I'll try to clean those up in future patches.
Differential Revision: https://reviews.llvm.org/D33710
llvm-svn: 304325
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
This reverts commit r304310.
It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.
llvm-svn: 304315
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304310
The code was a mess and disorganized due to the sheer amount
of it being in one file. So I'm splitting this into three files.
One for CodeView types, one for CodeView symbols, and one for
CodeView debug subsections. NFC.
llvm-svn: 304278
CodeViewYAML.h attempts to hide the details of many of the
CodeView yaml structures and types, but at the same time it
exposes the mapping traits for them to external users of the
header.
This patch just hides these in the implementation files so that
the interface is kept as simple as possible.
llvm-svn: 304263
This continues the effort to get the CodeView YAML parsing logic
into ObjectYAML. After this patch, the only missing piece will
be the CodeView debug symbol subsections.
llvm-svn: 304256
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files. Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.
By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.
This patch only adds support for CodeView type information
to ObjectYAML. Subsequent patches will add support for
CodeView symbol information.
llvm-svn: 304248
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
llvm-svn: 304247
Summary:
In rL302576, DISubprograms gained the constraint that a !dbg attachments to functions must
have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support
was adjusted to attempt to enforce this invariant during cloning. However, there
were several problems with the implementation. Part of these were fixed in rL304079.
However, there was a more fundamental problem with these changes, namely that it
bypasses the matadata value map, causing the cloned metadata to be a mix of metadata
pointing to the new suprogram (where manual code was added to fix those up) and the
old suprogram (where this was not the case). This mismatch could cause a number of
different assertion failures in the DWARF emitter. Some of these are given at
https://github.com/JuliaLang/julia/issues/22069, but some others have been observed
as well. Attempt to rectify this by partially reverting the manual DI metadata fixup,
and instead using the standard value map approach. To retain the desired semantics
of not duplicating the compilation unit and inlined subprograms, explicitly freeze
these in the value map.
Reviewers: dblaikie, aprantl, GorNishanov, echristo
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33655
llvm-svn: 304226
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.
llvm-svn: 304222
Params DT and LI are redundant, because these values are contained in fields anyways.
Differential Revision: https://reviews.llvm.org/D33668
llvm-svn: 304204
The MC ConstantPool class uses a DenseMap to track generated constants, with
the int64_t value of the constant as the key. This fails when values of
0x7fffffffffffffff or 0x7ffffffffffffffe are inserted into the constant pool, as
these are sentinel values for DenseMap.
The fix is to use std::map instead, which doesn't use sentinel values.
Differential revision: https://reviews.llvm.org/D33667
llvm-svn: 304199
This is super awkward, but GCC doesn't let us have template visible when
an argument is an inline function and -fvisibility-inlines-hidden is
used.
llvm-svn: 304175
With fix of uninitialized variable.
Original commit message:
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304078
DagInits are allocated in a BumpPtrAllocator so they are never destructed. This means the destructor for the SmallVector never runs.
To fix this we now allocate the vectors in the BumpPtrAllocator too using TrailingObjects.
llvm-svn: 304077
Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code
and fixes a bug where the CSR registers in return blocks where missed
leading to invalid kill flags. Also remove the unnecessary rule that we
wouldn't set kill flags on tied operands.
No tests as I have an upcoming commit improving MachineVerifier checks
to catch these cases in multiple existing lit tests.
llvm-svn: 304055
This reverts commit r299287 plus clean-ups.
The localizer pass is a helper pass that could be run at O0 in the GISel
pipeline to work around the deficiency of the fast register allocator.
It basically shortens the live-ranges of the constants so that the
allocator does not spill all over the place.
Long term fix would be to make the greedy allocator fast.
llvm-svn: 304051
The recommit is to fix a bug about ExtractValue and InsertValue ops. For those
ops, some varargs inside GVN::Expression are not value numbers but raw index
numbers. It is wrong to do phi-translate for raw index numbers, and the fix is
to stop doing that.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
g1 = a * b;
if (__builtin_expect(g2 > 3, 0)) {
a = c;
b = d;
g2 = a * b;
}
g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Differential Revision: https://reviews.llvm.org/D32252
llvm-svn: 304050
[AMDGPU] add intrinsic for s_getpc
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.
Patch by Tim Corringham
llvm-svn: 304031
Consistent with GCC and addresses a shortcoming with ThinLTO where many
imported CUs may end up being empty (because the functions imported from
them either ended up not being used (and were then discarded, since
they're imported as available_externally) or optimized away entirely).
Test cases previously testing empty CUs (either intentionally, or
because they didn't need anything more complicated) had a trivial 'int'
or similar basic type added to their retained types list.
This is a first order approximation - a deeper implementation could do
things like:
1) Be more lazy about construction of the CU - for example if two CUs
containing a single identical retained type are linked together, with
this change one of the two CUs will be produced but empty (since a
duplicate type won't be produced).
2) Go further and invert all the CU links the same way the subprogram
link is inverted - keep named CU lists of retained types, macros, etc,
and have those link back to the CU. Then if they're emitted, the CU is
emitted, but never otherwise - this would allow the metadata itself to
be dropped earlier too, though it seems unlikely that's an important
optimization as there shouldn't be many CUs relative to the number of
other entities.
llvm-svn: 304020
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304002
With fix of test compilation.
Initial commit message:
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section
which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason
of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well.
That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303983
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section
which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason
of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well.
That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303978
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on
many non-X86 configs with this patch. The reason of this is that the patch
makes a correctless fix that changes optimizer's behavior for this test.
Without the change, LSR was making an overconfident simplification basing on a
wrong SCEV. Apparently it did not need the IV analysis to do this. With the
change, it chose a different way to simplify (that wasn't so confident), and
this way required the IV analysis. Now, following the right execution path,
LSR tries to make a transformation relying on IV Users analysis. This analysis
is target-dependent due to this code:
// LSR is not APInt clean, do not touch integers bigger than 64-bits.
// Also avoid creating IVs of non-native types. For example, we don't want a
// 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
uint64_t Width = SE->getTypeSizeInBits(I->getType());
if (Width > 64 || !DL.isLegalInteger(Width))
return false;
To make a proper transformation in this test case, the type i32 needs to be
legal for the specified data layout. When the test runs on some non-X86
configuration (e.g. pure ARM 64), opt gets confused by the specified target
and does not use it, rejecting the specified data layout as well. Instead,
it uses some default layout that does not treat i32 as a legal type
(currently the layout that is used when it is not specified does not have
legal types at all). As result, the transformation we expect to happen does
not happen for this test.
This re-enabling patch does not have any source code changes compared to the
original patch rL303730. The only difference is that the failing test is
moved to X86 directory and now has requirement of running on x86 only to comply
with the specified target triple and data layout.
Differential Revision: https://reviews.llvm.org/D33543
llvm-svn: 303971
Re-commit r303937 + r303949 as they were not the cause for the build
failures.
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.
llvm-svn: 303970
block.
This allows writing much more natural and readable range based for loops
directly over the PHI nodes. It also takes advantage of the same tricks
for terminating the sequence as the hand coded versions.
I've replaced one example of this mostly to showcase the difference and
I've added a unit test to make sure the facilities really work the way
they're intended. I want to use this inside of SimpleLoopUnswitch but it
seems generally nice.
Differential Revision: https://reviews.llvm.org/D33533
llvm-svn: 303964
Summary:
RelocVisitor had too many, too small functions. This patch group them
by architecture rather than each relocation type.
Reviewers: grimar, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33580
llvm-svn: 303950
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible. The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.
This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level. This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out". Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps. We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.
This is worth a 50-60% performance increase. On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb
MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s
So this is a huge performance win.
Differential Revision: https://reviews.llvm.org/D33564
llvm-svn: 303935
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
g1 = a * b;
if (__builtin_expect(g2 > 3, 0)) {
a = c;
b = d;
g2 = a * b;
}
g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Differential Revision: https://reviews.llvm.org/D32252
llvm-svn: 303923
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.
Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.
This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.
llvm-svn: 303920
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.
This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.
Differential Revision: https://reviews.llvm.org/D33506
llvm-svn: 303919
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.
Reviewers: amccarth, rnk, zturner
Reviewed By: amccarth, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33475
llvm-svn: 303917
It was using the number of blocks of the entire PDB file as the number
of blocks of each stream that was created. This was only an issue in
the readLongestContiguousChunk function, which was never called prior.
This bug surfaced when I updated an algorithm to use this function and
the algorithm broke.
llvm-svn: 303916
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.
Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.
In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.
Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.
This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.
Differential Revision: https://reviews.llvm.org/D33480
llvm-svn: 303914
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts.
This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:
[ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ]
[ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ]
\ /
[ %e = phi i32 %a2, %c2 ]
[ add i32 %e, 4 ]
GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.
What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.
This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG.
Differential revision: https://reviews.llvm.org/D24805
llvm-svn: 303850
If Op is equal to array_lengthof, the lookup would be out of bounds, but we were only checking for greater than. I suspect nothing ever passes in the equal value because its a sentinel to mark the end of the builtin opcodes and not a real opcode.
So really this fix is just so that the code looks right and makes sense.
llvm-svn: 303840
having it internally allocate the loop.
This is a much more flexible API and necessary in the new loop unswitch
to reasonably support both new and old PMs in common code. It also just
seems like a cleaner separation of concerns.
NFC, this should just be a pure refactoring.
Differential Revision: https://reviews.llvm.org/D33528
llvm-svn: 303834
This change allows llvm-nm to print symbols found in import libraries,
in part by allowing COFFImportFiles to be casted to SymbolicFiles.
Patch by Dave Lee!
llvm-svn: 303821
The loop vectorizer usually vectorizes any instruction it can and then
extracts the elements for a scalarized use. On SystemZ, all elements
containing addresses must be extracted into address registers (GRs). Since
this extraction is not free, it is better to have the address in a suitable
register to begin with. By forcing address arithmetic instructions and loads
of addresses to be scalar after vectorization, two benefits result:
* No need to extract the register
* LSR optimizations trigger (LSR isn't handling vector addresses currently)
Benchmarking show improvements on SystemZ with this new behaviour.
Any other target could try this by returning false in the new hook
prefersVectorizedAddressing().
Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand
https://reviews.llvm.org/D32422
llvm-svn: 303744
When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrency is the bottom-lost in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.
Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike
other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:
for (...) { // loop
phi = {A,+,B}
}
X = load ...
Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may be existant.
This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.
llvm-svn: 303730
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types. In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted. Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.
llvm-svn: 303707
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever. Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.
These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other. So all of this hashing was
unnecessary.
To solve this, two fixes are made:
1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one. When it's finished, all of
its memory is freed.
2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed. This way the hash table will not grow as other
records from the same type stream get inserted. Further improvements
could eliminate hashing entirely from this codepath.
This reduces the link time by 85% in my test, from 1 minute to 9
seconds.
llvm-svn: 303676
Summary:
This is a first patch for GSoC project, bash-completion for clang.
To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.
Simple flag completion and path completion is available in this patch.
Reviewers: teemperor, v.g.vassilev, ruiu, Bigcheese, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33237
llvm-svn: 303670
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.
Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.
This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.
Reviewers: zturner, inglorion, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33428
llvm-svn: 303665
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.
This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995
Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
13,274,181,184 cycles # 3.047 GHz ( +- 0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
12,906,927,263 cycles # 3.043 GHz ( +- 0.51% )
```
This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.
Reviewers: pete, chandlerc
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D32819
llvm-svn: 303654
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.
We should not blindly RAUW the condition if experimental.guard or assume
is a use of that
condition. This is because LVI may have used the guard/assume to
identify the
value of the condition, and RUAWing will fold the guard/assume and uses
before the guards/assumes.
Reviewers: sanjoy, reames, trentxintong, mkazantsev
Reviewed by: sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33257
llvm-svn: 303633
Summary:
Add Max ModFlagBehavior, which can be used to take the max of two
module flag values when merging modules. Use it for the PIE and PIC
levels.
This avoids an error when we try to import from a module built -fpic
into a module built -fPIC, for example. For both PIE and PIC levels,
this will be legal, since the code generation gets more conservative
as the level is increased. Therefore we can take the max instead of
somehow trying to block importing between modules compiled with
different levels.
Reviewers: tmsriram, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33418
llvm-svn: 303590
The forward declarations and the SimplifyQuery class at the beginning of the namespace weren't indented. But the closing brace for SimplifyQuery and everything after it were indented.
This commit makes the whole file consistent to no identation per coding standards. The signature of every function in this file changed a few weeks ago so this isn't a big disturbance to the revision history.
llvm-svn: 303588
Previous algotirhm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.
Differential Revision: https://reviews.llvm.org/D33417
llvm-svn: 303577
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.
llvm-svn: 303575
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
llvm-svn: 303566
It's causing some buildbots to timeout whenever tablegen needs re-compilation,
particularly those with -fsanitize=memory but not only them. A compile time
regression was expected since it triples the amount of SelectionDAG rules we
are able to import but it's currently too high.
llvm-svn: 303542
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.
Original commit message:
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.
This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.
This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
Differential Revision: https://reviews.llvm.org/D32847
llvm-svn: 303540
Re-applying now that the open bug on this commit, PR32825, is known to be fixed.
Original commit message:
Summary: This patch returns the same label if the CP entry with the same value has been created.
Reviewers: eli.friedman, rengolin, jmolloy
Subscribers: majnemer, jmolloy, llvm-commits
Differential Revision: https://reviews.llvm.org/D25804
llvm-svn: 303539
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).
This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.
llvm-svn: 303536
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.
Fix that so object files can be symbolized the same as executables.
llvm-svn: 303532
This is a re-application of a r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
llvm-svn: 303531
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
llvm-svn: 303497
Summary: This allows pthread_self to be pulled out of a loop by LICM.
Reviewers: hfinkel, arsenm, davide
Reviewed By: davide
Subscribers: davide, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D32782
llvm-svn: 303495
This is split up into two commits.
The will create the DEF parser in LLVM.
Check the following commit to see the removal from LLD
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D32689
llvm-svn: 303490
Summary: Added the new modules in the Object/ folder. Updated the
llvm-cvtres interface as well, and added additional tests.
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D33180
llvm-svn: 303480
This reapplies commit r303438 modified to not verify cross-imported
bitcode in FunctionImporter.
rdar://problem/31233625
Differential Revision: https://reviews.llvm.org/D33370
llvm-svn: 303470
Refactor the strlen optimization code to work for both strlen and wcslen.
This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.
This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.
Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.
The refactoring "accidentally" fixes http://llvm.org/PR32124
Differential Revision: https://reviews.llvm.org/D32839
llvm-svn: 303461
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argumnet?
llvm-svn: 303453