We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
llvm-svn: 304607
This pass allows running the register scavenging independently of
PrologEpilogInserter, for targeted testing.
Also adds some basic register scavenging tests.
llvm-svn: 304606
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.
llvm-svn: 304602
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Differential Revision: https://reviews.llvm.org/D33807
llvm-svn: 304588
This reverts commit r304561 and re-lands r303490 & co.
The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:
@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
std::vector<COFFShortExport> Exports;
for (Export &E1 : Config->Exports) {
COFFShortExport E2;
- E2.Name = E1.Name;
+ // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+ E2.Name = E1.SymbolName;
E2.ExtName = E1.ExtName;
E2.Ordinal = E1.Ordinal;
E2.Noname = E1.Noname;
llvm-svn: 304573
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).
The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it too and
don't have it - so, work around it for now rather than bothering with the
added complexity.
(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)
llvm-svn: 304566
While doing so, clarify the comments and update them to reflect current reality.
Note: I'm going to let this sit for a week or so before adding further verification. I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
llvm-svn: 304565
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.
For those curious, the more complicated version of this patch already found one very suspicious thing.
Differential Revision: https://reviews.llvm.org/D33819
llvm-svn: 304564
This reverts commits r303490, r303491, r303493, and r303494.
This caused http://crbug.com/728726. Essentially, exporting stdcall
functions doesn't appear to work after this change. Reduced test case
soon.
llvm-svn: 304561
Summary:
As we teach Clang to use ThinLTO + the new PM, it's better for users to
inject the setting through Config instead of setting a flag in the
LTOBackend library. Move the flag to llvm-lto2.
As it moves to llvm-lto2, the new name -use-new-pm seems simpler and just
as clear.
Reviewers: davide, tejohnson
Subscribers: mehdi_amini, Prazek, inglorion, eraman, chandlerc, llvm-commits
Differential Revision: https://reviews.llvm.org/D33799
llvm-svn: 304492
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.
llvm-svn: 304488
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.
Differential Revision: https://reviews.llvm.org/D33785
llvm-svn: 304484
Summary:
Clang wants to clone a function before it is done building the entire
compilation unit. As of now, there is no good way to do that, because
CloneFunction doesn't like dealing with temporary metadata. However,
as long as clang doesn't want to add any variables to this SP, it
should be fine to just prematurely finalize it. Add an API to allow this.
This is done in preparation for a clang commit to fix the assertion that
necessitated the revert of D33655.
Reviewers: aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33704
llvm-svn: 304467
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of
all the DeadSymbols sets. This is a refactoring in order to make
liveness information available in the RegularLTO pipeline.
llvm-svn: 304466
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.
Patch by Spyridoula Gravani!
Differential Revision: https://reviews.llvm.org/D33749
llvm-svn: 304446
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.
This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place to comment about the
restrictions that are imposed on them.
I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:
1) I sunk the PGO indirect call promotion to only be run when we have
PGO enabled (or as part of the special ThinLTO pipeline).
2) I made the extra GlobalOpt run in ThinLTO just happen all the time
and at a slightly more powerful place (before we remove
available_externally functions). This seems like general goodness and not a big
compile time sink, so it didn't make sense to *only* use it in
ThinLTO. Fewer differences in the pipeline makes everything simpler
IMO.
3) I hoisted the ThinLTO stop point pre-link above the RPO function
attr inference. The RPO inference won't infer anything terribly
meaningful pre-link (recursiveness?) so it didn't make a lot of
sense. But if the placement of RPO inference starts to matter, we
should move it to the canonicalization phase anyways which seems like
a better place for it (and there is a FIXME to this effect!). But
that seemed a bridge too far for this patch.
If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor enough
that we could possibly get away without the parameters entirely.
I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.
Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.
Differential Revision: https://reviews.llvm.org/D33540
llvm-svn: 304407
Summary:
This is a continuation of the work started in D29872. Passing the carry down as a value rather than as glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unnecessary and we can start the removal process.
This patch only introduces the optimization strictly required to get the same level of optimization as was available before, nothing more.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33374
llvm-svn: 304404
They weren't used often enough to justify having two different interfaces. Push the responsibility of creating a StringInit up to the caller.
llvm-svn: 304388
Summary: Also see D33429 for other ThinLTO + New PM related changes.
Reviewers: davide, chandlerc, tejohnson
Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D33525
llvm-svn: 304378
Summary: The LiveRangeShrink pass moves an instruction to right after its definition within the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-ranges within a BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 304371
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
the FP registers still being present in the livein list.)
llvm-svn: 304342
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern for accessing these attributes within the
AttributeList.
Patch by Daniel Neilson!
Reviewers: rnk, chandlerc, pete, javed.absar, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33355
llvm-svn: 304329
Internally both these methods just return the result of getValue on either a StringInit or a CodeInit object. In both cases this returns a StringRef pointing to a string allocated in the BumpPtrAllocator, so it's not going anywhere. So we can just pass that StringRef along.
This is a fairly naive patch that targets just the build failures caused by this change. There's additional work that can be done to avoid creating std::string at call sites that still think getValueAsString returns a std::string. I'll try to clean those up in future patches.
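For illustration, a hedged call-site sketch of the updated usage (the field name "Name" and the surrounding function are made up; getValueAsString is the TableGen Record accessor discussed above):
```
#include "llvm/TableGen/Record.h"
#include <string>
using namespace llvm;

// Illustrative only: after this change the StringRef points into TableGen's
// BumpPtrAllocator, so a std::string copy is only made where ownership is
// actually needed.
std::string copyNameIfNeeded(const Record &Rec) {
  StringRef Name = Rec.getValueAsString("Name"); // no temporary std::string
  return Name.str();                             // explicit copy when required
}
```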
Differential Revision: https://reviews.llvm.org/D33710
llvm-svn: 304325
This adds a callback to the LLVMTargetMachine that lets targets indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed,
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
This reverts commit r304310.
It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.
llvm-svn: 304315
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
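To make the shape of the expansion concrete, here is a rough standalone C++ sketch of what the straight-line code is morally equivalent to for the "compare to zero equality" special case, assuming a known 16-byte size and an 8-byte maximum load (illustrative only, not the emitted IR):
```
#include <cstdint>
#include <cstring>

// Conceptual equivalent of expanding "memcmp(p, q, 16) == 0" with an 8-byte
// maximum load size: load a block from each source and exit early on the
// first difference.
bool memcmp16_eq_expanded(const void *P, const void *Q) {
  for (unsigned Off = 0; Off < 16; Off += 8) {
    uint64_t A, B;
    std::memcpy(&A, static_cast<const char *>(P) + Off, 8);
    std::memcpy(&B, static_cast<const char *>(Q) + Off, 8);
    if (A != B)
      return false;   // early exit: blocks differ
  }
  return true;
}
```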
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference when recognizing
an unordered atomic memcpy instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
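For illustration, a hedged sketch of the extra check this boils down to on the memory accesses (the helper name is made up; the LoadInst/StoreInst calls are the existing IR APIs):
```
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Hypothetical helper: accept the access if it is a plain simple load/store,
// or an atomic one whose ordering is exactly 'unordered'.
static bool isSimpleOrUnorderedAtomic(const Instruction *I) {
  if (auto *LI = dyn_cast<LoadInst>(I))
    return LI->isSimple() || (LI->isAtomic() &&
                              LI->getOrdering() == AtomicOrdering::Unordered);
  if (auto *SI = dyn_cast<StoreInst>(I))
    return SI->isSimple() || (SI->isAtomic() &&
                              SI->getOrdering() == AtomicOrdering::Unordered);
  return false;
}
```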
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304310
The code was a mess and disorganized due to the sheer amount
of it being in one file. So I'm splitting this into three files.
One for CodeView types, one for CodeView symbols, and one for
CodeView debug subsections. NFC.
llvm-svn: 304278
CodeViewYAML.h attempts to hide the details of many of the
CodeView yaml structures and types, but at the same time it
exposes the mapping traits for them to external users of the
header.
This patch just hides these in the implementation files so that
the interface is kept as simple as possible.
llvm-svn: 304263
This continues the effort to get the CodeView YAML parsing logic
into ObjectYAML. After this patch, the only missing piece will
be the CodeView debug symbol subsections.
llvm-svn: 304256
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files. Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.
By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.
This patch only adds support for CodeView type information
to ObjectYAML. Subsequent patches will add support for
CodeView symbol information.
llvm-svn: 304248
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
llvm-svn: 304247
Summary:
In rL302576, DISubprograms gained the constraint that !dbg attachments to functions must
have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support
was adjusted to attempt to enforce this invariant during cloning. However, there
were several problems with the implementation. Some of these were fixed in rL304079.
However, there was a more fundamental problem with these changes, namely that it
bypasses the metadata value map, causing the cloned metadata to be a mix of metadata
pointing to the new subprogram (where manual code was added to fix those up) and the
old subprogram (where this was not the case). This mismatch could cause a number of
different assertion failures in the DWARF emitter. Some of these are given at
https://github.com/JuliaLang/julia/issues/22069, but some others have been observed
as well. Attempt to rectify this by partially reverting the manual DI metadata fixup,
and instead using the standard value map approach. To retain the desired semantics
of not duplicating the compilation unit and inlined subprograms, explicitly freeze
these in the value map.
Reviewers: dblaikie, aprantl, GorNishanov, echristo
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33655
llvm-svn: 304226
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.
llvm-svn: 304222
Params DT and LI are redundant, because these values are contained in fields anyway.
Differential Revision: https://reviews.llvm.org/D33668
llvm-svn: 304204
The MC ConstantPool class uses a DenseMap to track generated constants, with
the int64_t value of the constant as the key. This fails when values of
0x7fffffffffffffff or 0x7ffffffffffffffe are inserted into the constant pool, as
these are sentinel values for DenseMap.
The fix is to use std::map instead, which doesn't use sentinel values.
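As a rough illustration (not the actual MC code; the struct and names are made up), an std::map-keyed cache can hold any int64_t key, including the values DenseMap reserves as its empty/tombstone sentinels:
```
#include <cstdint>
#include <map>

// Illustrative cache keyed by the constant's value. std::map imposes no
// sentinel keys, so 0x7fffffffffffffff and 0x7ffffffffffffffe are fine.
struct ConstantPoolCacheSketch {
  std::map<int64_t, unsigned> Entries; // value -> index of the pool entry

  unsigned getOrCreateEntry(int64_t Value) {
    auto It = Entries.find(Value);
    if (It != Entries.end())
      return It->second;              // reuse the existing entry
    unsigned Index = Entries.size();  // any key is allowed, even INT64_MAX
    Entries.emplace(Value, Index);
    return Index;
  }
};
```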
Differential revision: https://reviews.llvm.org/D33667
llvm-svn: 304199
This is super awkward, but GCC doesn't let us have the template visible when
an argument is an inline function and -fvisibility-inlines-hidden is
used.
llvm-svn: 304175
With a fix for an uninitialized variable.
Original commit message:
This change is intended for use by LLD in D33183.
The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to.
Previously it was solved on the LLD side by providing fake section addresses with use of the llvm::LoadedObjectInfo
interface. We assigned file offsets as addresses. Then, after obtaining the range lists, for each range we had to find the section IDs.
That was not only slow, but also complicated the implementation and was the reason for incorrect behavior when
sections share the same offsets, as D33176 shows.
This patch makes the DWARF parsers return the section index as well. That solves the problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304078
DagInits are allocated in a BumpPtrAllocator so they are never destructed. This means the destructor for the SmallVector never runs.
To fix this we now allocate the vectors in the BumpPtrAllocator too, using TrailingObjects.
llvm-svn: 304077
Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code
and fixes a bug where the CSR registers in return blocks were missed,
leading to invalid kill flags. Also remove the unnecessary rule that we
wouldn't set kill flags on tied operands.
No tests as I have an upcoming commit improving MachineVerifier checks
to catch these cases in multiple existing lit tests.
llvm-svn: 304055
This reverts commit r299287 plus clean-ups.
The localizer pass is a helper pass that could be run at O0 in the GISel
pipeline to work around the deficiency of the fast register allocator.
It basically shortens the live-ranges of the constants so that the
allocator does not spill all over the place.
Long term fix would be to make the greedy allocator fast.
llvm-svn: 304051
The recommit is to fix a bug about ExtractValue and InsertValue ops. For those
ops, some varargs inside GVN::Expression are not value numbers but raw index
numbers. It is wrong to do phi-translate for raw index numbers, and the fix is
to stop doing that.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundant because a and b used by the last
"a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Differential Revision: https://reviews.llvm.org/D32252
llvm-svn: 304050
[AMDGPU] add intrinsic for s_getpc
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.
Patch by Tim Corringham
llvm-svn: 304031
Consistent with GCC and addresses a shortcoming with ThinLTO where many
imported CUs may end up being empty (because the functions imported from
them either ended up not being used (and were then discarded, since
they're imported as available_externally) or optimized away entirely).
Test cases previously testing empty CUs (either intentionally, or
because they didn't need anything more complicated) had a trivial 'int'
or similar basic type added to their retained types list.
This is a first order approximation - a deeper implementation could do
things like:
1) Be more lazy about construction of the CU - for example if two CUs
containing a single identical retained type are linked together, with
this change one of the two CUs will be produced but empty (since a
duplicate type won't be produced).
2) Go further and invert all the CU links the same way the subprogram
link is inverted - keep named CU lists of retained types, macros, etc,
and have those link back to the CU. Then if they're emitted, the CU is
emitted, but never otherwise - this would allow the metadata itself to
be dropped earlier too, though it seems unlikely that's an important
optimization as there shouldn't be many CUs relative to the number of
other entities.
llvm-svn: 304020
This change is intended for use by LLD in D33183.
The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to.
Previously it was solved on the LLD side by providing fake section addresses with use of the llvm::LoadedObjectInfo
interface. We assigned file offsets as addresses. Then, after obtaining the range lists, for each range we had to find the section IDs.
That was not only slow, but also complicated the implementation and was the reason for incorrect behavior when
sections share the same offsets, as D33176 shows.
This patch makes the DWARF parsers return the section index as well. That solves the problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304002
With a fix for the test compilation.
Initial commit message:
This change is intended for use by LLD in D33183.
The problem we have in LLD when building .gdb_index is that we need to know
which section an address range belongs to.
Previously it was solved on the LLD side by providing fake section addresses
with use of the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses.
Then, after obtaining the range lists, for each range we had to find the section IDs.
That was not only slow, but also complicated the implementation and was the reason
for incorrect behavior when
sections share the same offsets, as D33176 shows.
This patch makes the DWARF parsers return the section index as well.
That solves the problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303983
This change is intended for use by LLD in D33183.
The problem we have in LLD when building .gdb_index is that we need to know
which section an address range belongs to.
Previously it was solved on the LLD side by providing fake section addresses
with use of the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses.
Then, after obtaining the range lists, for each range we had to find the section IDs.
That was not only slow, but also complicated the implementation and was the reason
for incorrect behavior when
sections share the same offsets, as D33176 shows.
This patch makes the DWARF parsers return the section index as well.
That solves the problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303978
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on
many non-X86 configs with this patch. The reason for this is that the patch
makes a correctness fix that changes the optimizer's behavior for this test.
Without the change, LSR was making an overconfident simplification based on a
wrong SCEV. Apparently it did not need the IV analysis to do this. With the
change, it chose a different way to simplify (that wasn't so confident), and
this way required the IV analysis. Now, following the right execution path,
LSR tries to make a transformation relying on IV Users analysis. This analysis
is target-dependent due to this code:
// LSR is not APInt clean, do not touch integers bigger than 64-bits.
// Also avoid creating IVs of non-native types. For example, we don't want a
// 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
uint64_t Width = SE->getTypeSizeInBits(I->getType());
if (Width > 64 || !DL.isLegalInteger(Width))
  return false;
To make a proper transformation in this test case, the type i32 needs to be
legal for the specified data layout. When the test runs on some non-X86
configuration (e.g. pure ARM 64), opt gets confused by the specified target
and does not use it, rejecting the specified data layout as well. Instead,
it uses some default layout that does not treat i32 as a legal type
(currently the layout that is used when it is not specified does not have
legal types at all). As a result, the transformation we expect to happen does
not happen for this test.
This re-enabling patch does not have any source code changes compared to the
original patch rL303730. The only difference is that the failing test is
moved to the X86 directory and now requires running on x86 only, to comply
with the specified target triple and data layout.
Differential Revision: https://reviews.llvm.org/D33543
llvm-svn: 303971
Re-commit r303937 + r303949 as they were not the cause for the build
failures.
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.
llvm-svn: 303970
This allows writing much more natural and readable range-based for loops
directly over the PHI nodes. It also takes advantage of the same tricks
for terminating the sequence as the hand coded versions.
I've replaced one example of this mostly to showcase the difference and
I've added a unit test to make sure the facilities really work the way
they're intended. I want to use this inside of SimpleLoopUnswitch but it
seems generally nice.
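A small usage sketch of the new range accessor (assuming it is spelled phis(), matching the range-based form described above; the counting helper is made up):
```
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Count the PHI nodes at the top of a block with a natural range-based loop;
// the range terminates at the first non-PHI instruction.
static unsigned countPHIs(BasicBlock &BB) {
  unsigned N = 0;
  for (PHINode &PN : BB.phis()) {
    (void)PN;
    ++N;
  }
  return N;
}
```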
Differential Revision: https://reviews.llvm.org/D33533
llvm-svn: 303964
Summary:
RelocVisitor had too many, too-small functions. This patch groups them
by architecture rather than by relocation type.
Reviewers: grimar, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33580
llvm-svn: 303950
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible. The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.
This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level. This enables many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out". Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps. We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.
This is worth a 50-60% performance increase. On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb
MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s
So this is a huge performance win.
Differential Revision: https://reviews.llvm.org/D33564
llvm-svn: 303935
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundant because a and b used by the last
"a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Differential Revision: https://reviews.llvm.org/D32252
llvm-svn: 303923
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.
Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.
This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.
llvm-svn: 303920
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.
This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.
Differential Revision: https://reviews.llvm.org/D33506
llvm-svn: 303919
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.
Reviewers: amccarth, rnk, zturner
Reviewed By: amccarth, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33475
llvm-svn: 303917
It was using the number of blocks of the entire PDB file as the number
of blocks of each stream that was created. This was only an issue in
the readLongestContiguousChunk function, which had never been called before.
This bug surfaced when I updated an algorithm to use this function and
the algorithm broke.
llvm-svn: 303916
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.
Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.
In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.
Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.
This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.
Differential Revision: https://reviews.llvm.org/D33480
llvm-svn: 303914
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for committing but I've uploaded it to gather some initial thoughts.
This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:
  [ %a1 = add i32 %b, 1 ]   [ %c1 = add i32 %d, 1 ]
  [ %a2 = xor i32 %a1, 1 ]  [ %c2 = xor i32 %c1, 1 ]
                  \           /
             [ %e = phi i32 %a2, %c2 ]
             [ add i32 %e, 4 ]
GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.
What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.
This pass has already shown some really impressive improvements on internal benchmarks, especially for code size, so I have high hopes it can replace all the sinking logic in SimplifyCFG.
Differential revision: https://reviews.llvm.org/D24805
llvm-svn: 303850
If Op is equal to array_lengthof, the lookup would be out of bounds, but we were only checking for greater than. I suspect nothing ever passes in the equal value because it's a sentinel to mark the end of the builtin opcodes and not a real opcode.
So really this fix is just so that the code looks right and makes sense.
llvm-svn: 303840
This is a much more flexible API and necessary in the new loop unswitch
to reasonably support both new and old PMs in common code. It also just
seems like a cleaner separation of concerns.
NFC, this should just be a pure refactoring.
Differential Revision: https://reviews.llvm.org/D33528
llvm-svn: 303834
This change allows llvm-nm to print symbols found in import libraries,
in part by allowing COFFImportFiles to be cast to SymbolicFiles.
Patch by Dave Lee!
llvm-svn: 303821
The loop vectorizer usually vectorizes any instruction it can and then
extracts the elements for a scalarized use. On SystemZ, all elements
containing addresses must be extracted into address registers (GRs). Since
this extraction is not free, it is better to have the address in a suitable
register to begin with. By forcing address arithmetic instructions and loads
of addresses to be scalar after vectorization, two benefits result:
* No need to extract the register
* LSR optimizations trigger (LSR isn't handling vector addresses currently)
Benchmarking shows improvements on SystemZ with this new behaviour.
Any other target could try this by returning false in the new hook
prefersVectorizedAddressing().
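A standalone sketch of the decision this hook drives (this is not the actual TTI plumbing; the class and function names here are illustrative):
```
// A target answering the new hook, and the vectorizer-side decision it feeds.
struct TargetHooksSketch {
  // SystemZ-style answer: keep address computation scalar.
  bool prefersVectorizedAddressing() const { return false; }
};

// Loads of addresses and address arithmetic stay scalar after vectorization
// when the target opts out of vectorized addressing.
inline bool shouldScalarizeAddressComputation(const TargetHooksSketch &TTI) {
  return !TTI.prefersVectorizedAddressing();
}
```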
Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand
https://reviews.llvm.org/D32422
llvm-svn: 303744
When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrence is the bottom-most in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such a Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.
Another reason why it might be invalid to fold SCEVUnknown into a Phi start value is that, unlike
other SCEVs, SCEVUnknowns are sometimes position-bound. For example, here:
for (...) { // loop
  phi = {A,+,B}
}
X = load ...
Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in the loop (this memory may not even be allocated and filled by that moment).
It is only valid to make such a folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may exist.
This patch prohibits folding of SCEVUnknown (and those that use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.
llvm-svn: 303730
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types. In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted. Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.
llvm-svn: 303707
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever. Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.
These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other. So all of this hashing was
unnecessary.
To solve this, two fixes are made:
1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one. When it's finished, all of
its memory is freed.
2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed. This way the hash table will not grow as other
records from the same type stream get inserted. Further improvements
could eliminate hashing entirely from this codepath.
This reduces the link time by 85% in my test, from 1 minute to 9
seconds.
llvm-svn: 303676
Summary:
This is a first patch for the GSoC project, bash-completion for clang.
To use this on bash, please run `source clang/utils/bash-autocomplete.sh`.
bash-autocomplete.sh is code for bash-completion.
Simple flag completion and path completion are available in this patch.
Reviewers: teemperor, v.g.vassilev, ruiu, Bigcheese, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33237
llvm-svn: 303670
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.
Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.
This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.
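Roughly, the wrapper key described above looks something like the following (illustrative field layout, not the exact CodeView code):
```
#include <cstdint>

// The map key is a small value wrapping a pointer to the key data; growth of
// the hash table only moves these small wrappers, never the record bytes,
// and the hash is computed exactly once.
struct HashedTypeRecordKey {
  uint32_t Hash;             // precomputed hash of the record bytes
  const uint8_t *RecordData; // original record data
  uint32_t RecordSize;
  uint32_t TypeIndexValue;   // the "value" of the original map
};
```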
Reviewers: zturner, inglorion, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33428
llvm-svn: 303665
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.
This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995
Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
13,274,181,184 cycles # 3.047 GHz ( +- 0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
12,906,927,263 cycles # 3.043 GHz ( +- 0.51% )
```
This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.
Reviewers: pete, chandlerc
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D32819
llvm-svn: 303654
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.
We should not blindly RAUW the condition if experimental.guard or assume
is a use of that condition. This is because LVI may have used the
guard/assume to identify the value of the condition, and RAUWing will
fold the guard/assume and uses before the guards/assumes.
Reviewers: sanjoy, reames, trentxintong, mkazantsev
Reviewed by: sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33257
llvm-svn: 303633
Summary:
Add Max ModFlagBehavior, which can be used to take the max of two
module flag values when merging modules. Use it for the PIE and PIC
levels.
This avoids an error when we try to import from a module built -fpic
into a module built -fPIC, for example. For both PIE and PIC levels,
this will be legal, since the code generation gets more conservative
as the level is increased. Therefore we can take the max instead of
somehow trying to block importing between modules compiled with
different levels.
Reviewers: tmsriram, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33418
llvm-svn: 303590
The forward declarations and the SimplifyQuery class at the beginning of the namespace weren't indented. But the closing brace for SimplifyQuery and everything after it were indented.
This commit makes the whole file consistently use no indentation, per coding standards. The signature of every function in this file changed a few weeks ago, so this isn't a big disturbance to the revision history.
llvm-svn: 303588
The previous algorithm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
operate on independent streams of types and ids that
refer to each other across stream boundaries.
Differential Revision: https://reviews.llvm.org/D33417
llvm-svn: 303577
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.
llvm-svn: 303575
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
llvm-svn: 303566
It's causing some buildbots to time out whenever tablegen needs re-compilation,
particularly (but not only) those with -fsanitize=memory. A compile-time
regression was expected since it triples the number of SelectionDAG rules we
are able to import, but it's currently too high.
llvm-svn: 303542
Re-applying now that PR32825, which was raised on the commit this fixed up, is known to have also been fixed by this commit.
Original commit message:
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.
This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.
This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
Differential Revision: https://reviews.llvm.org/D32847
llvm-svn: 303540
Re-applying now that the open bug on this commit, PR32825, is known to be fixed.
Original commit message:
Summary: This patch returns the same label if the CP entry with the same value has been created.
Reviewers: eli.friedman, rengolin, jmolloy
Subscribers: majnemer, jmolloy, llvm-commits
Differential Revision: https://reviews.llvm.org/D25804
llvm-svn: 303539
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).
This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.
llvm-svn: 303536
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.
Fix that so object files can be symbolized the same as executables.
llvm-svn: 303532
This is a re-application of r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
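For reference, a hedged sketch of how the two quantities are queried (API spellings as of this point in the tree; the helper itself is illustrative):
```
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Both counts are upper bounds on the number of header executions; only the
// "max" one is guaranteed to be a constant SCEV.
static void inspectBackedgeCounts(ScalarEvolution &SE, const Loop *L) {
  const SCEV *Exact = SE.getBackedgeTakenCount(L);
  const SCEV *Max = SE.getMaxBackedgeTakenCount(L);
  // The open question from the message: does this lose precision vs. Max?
  APInt UnsignedUpperBound = SE.getUnsignedRange(Exact).getUnsignedMax();
  (void)Max;
  (void)UnsignedUpperBound;
}
```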
llvm-svn: 303531
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious. Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.
There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.
At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision. If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.
llvm-svn: 303497
Summary: This allows pthread_self to be pulled out of a loop by LICM.
Reviewers: hfinkel, arsenm, davide
Reviewed By: davide
Subscribers: davide, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D32782
llvm-svn: 303495
This is split up into two commits.
This will create the DEF parser in LLVM.
Check the following commit to see the removal from LLD.
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D32689
llvm-svn: 303490
Summary: Added the new modules in the Object/ folder. Updated the
llvm-cvtres interface as well, and added additional tests.
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D33180
llvm-svn: 303480
This reapplies commit r303438 modified to not verify cross-imported
bitcode in FunctionImporter.
rdar://problem/31233625
Differential Revision: https://reviews.llvm.org/D33370
llvm-svn: 303470
Refactor the strlen optimization code to work for both strlen and wcslen.
This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.
This also fixes a lingering API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer, which
did not work well in the TrimAtNul==false case, causing the PR mentioned
below.
Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases of out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.
The refactoring "accidentally" fixes http://llvm.org/PR32124
Differential Revision: https://reviews.llvm.org/D32839
llvm-svn: 303461
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argument?
llvm-svn: 303453
This was originally reverted because it was breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
llvm-svn: 303446
Summary:
This patch adds udiv/sdiv/urem/srem/udivrem/sdivrem methods that can divide by a uint64_t. This makes division consistent with all the other arithmetic operations.
This modifies the interface of the divide helper method to work on raw arrays instead of APInts. This way we can pass the uint64_t in for the RHS without wrapping it in an APInt. This required moving all the Quotient and Remainder allocation handling up to the callers. For udiv/urem this was as simple as just creating the Quotient/Remainder with the right size when they were declared. For udivrem we have to rely on reallocate not changing the contents of the variable if LHS or RHS is aliased with the Quotient or Remainder APInts. We also have to zero the upper bits of Remainder and Quotient that divide doesn't write to if lhsWords/rhsWords is smaller than the width.
I've updated the toString method to use the new udivrem.
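A hedged usage sketch of the new overloads (return types are deliberately left to auto since the message doesn't spell them out):
```
#include "llvm/ADT/APInt.h"
using namespace llvm;

// Dividing a wide APInt by a plain uint64_t without wrapping the RHS in an
// APInt first; quotient/remainder types are left to the compiler here.
void divideByWordExample() {
  APInt X(128, 1000);
  auto Quotient = X.udiv(7);
  auto Remainder = X.urem(7);
  (void)Quotient;
  (void)Remainder;
}
```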
Reviewers: hans, dblaikie, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33310
llvm-svn: 303431
This is a squash of ~5 reverts of, well, pretty much everything
I did today. Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures. I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!
At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
fix the remaining set of problems for real.
llvm-svn: 303409
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table. If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary. At the cost of an extra hash
computation, check first if it exists, and only if it does not do
we allocate and insert.
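Conceptually (the container, key type, and helper name here are illustrative, not the actual hash table in the type serializer):
```
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Allocator.h"
#include <cstring>
using namespace llvm;

// Look the bytes up first; only on a miss do we copy them into stable
// BumpPtrAllocator storage and insert, so collisions cost a hash but no
// allocation.
static StringRef internRecord(StringRef Bytes,
                              DenseMap<StringRef, StringRef> &Seen,
                              BumpPtrAllocator &Alloc) {
  auto It = Seen.find(Bytes);
  if (It != Seen.end())
    return It->second;
  char *Stable = static_cast<char *>(Alloc.Allocate(Bytes.size(), 1));
  std::memcpy(Stable, Bytes.data(), Bytes.size());
  StringRef Copy(Stable, Bytes.size());
  Seen.insert({Copy, Copy});
  return Copy;
}
```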
llvm-svn: 303407
This reverts commit r303383.
This breaks the modules-enabled macOS build with:
lib/Support/LockFileManager.cpp:86:7: error: declaration of 'gethostuuid' must be imported from module 'Darwin.POSIX.unistd' before it is required
llvm-svn: 303402
Merging PDBs is a feature that will be used heavily by
the linker. The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools. This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump. It takes arbitrarily
many PDBs and outputs a single PDB.
Using this new functionality, a test is added for merging
type records. Future patches will add the ability to merge
symbol records, module information, etc.
llvm-svn: 303389
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).
But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.
This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.
This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.
The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.
Differential Revision: https://reviews.llvm.org/D33293
llvm-svn: 303388
Currently m_Not only works with the canonical xor X, -1 form that InstCombine produces. InstSimplify can't rely on this canonicalization.
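For illustration, a usage sketch of the matcher in question (the helper is made up; the PatternMatch names are the existing ones):
```
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

// Match a bitwise 'not' of some value regardless of which operand order the
// xor happens to use.
static bool matchNotOf(Value *V, Value *&X) {
  return match(V, m_Not(m_Value(X)));
}
```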
Differential Revision: https://reviews.llvm.org/D33331
llvm-svn: 303379
This also reverts follow-ups r303292 and r303298.
It broke some Chromium tests under MSan, and apparently also internal
tests at Google.
llvm-svn: 303369
Summary:
Implements PR889
Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:
https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing
This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call to Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal. However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue. I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.
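A minimal sketch of the rewrite this implies at a call site that used to 'delete' a Value directly (the helper name is made up):
```
#include "llvm/IR/Value.h"
using namespace llvm;

// Hypothetical helper standing in for code that previously did 'delete V;'.
static void disposeValue(Value *V) {
  if (V)
    V->deleteValue(); // direct 'delete' on a Value is no longer allowed
}
```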
I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.
Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv
Reviewed By: chandlerc
Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D31261
llvm-svn: 303362
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.
The patterns replaced here are:
* Passes handling a null TargetMachine call
`getAnalysisIfAvailable<TargetPassConfig>`.
* Passes not handling a null TargetMachine
`addRequired<TargetPassConfig>` and call
`getAnalysis<TargetPassConfig>`.
* MachineFunctionPasses now use MF.getTarget().
* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.
This fixes a crash when running `llc -start-before prologepilog`.
PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.
Related to PR30324.
Differential Revision: https://reviews.llvm.org/D33222
llvm-svn: 303360
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.
Depends on D32275
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: dberris, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32278
The previous commit failed on test-suite/Bitcode/simd_ops/AArch64_halide_runtime.bc
because isImmOperandEqual() assumed MO was a register operand and that's not
always true.
llvm-svn: 303341
We do not need to store the relocation width field.
The patch removes the related code, which simplifies the implementation.
Differential revision: https://reviews.llvm.org/D33274
llvm-svn: 303335
I revisited the Decompressor API (an issue with it was triggered during the D32865 review)
and found it probably provides more than we really need.
The issue was with the following method's signature:
Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible enough because it sticks to SmallVector.
During review it was suggested to use templating to simplify the code. This patch does that.
Differential revision: https://reviews.llvm.org/D33200
llvm-svn: 303331
compatible target triple
Currently, an assertion fails in ThinLTOCodeGenerator::addModule when
the target triple of the module being added doesn't match that of the
one stored in TMBuilder. This patch relaxes the constraint and makes
changes to allow target triples that only differ in their version
numbers on Apple platforms, similarly to what r228999 did.
rdar://problem/30133904
Differential Revision: https://reviews.llvm.org/D33291
llvm-svn: 303326
Summary:
There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways:
MaxNumFoo = std::max(MaxNumFoo, NumFoo);
or
MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo;
The first version reads from MaxNumFoo once and unconditionally rewrites to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes.
This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again.
This spawned from an audit I'm trying to do of all places where we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ
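A rough sketch of the compare-exchange idiom described above, written against std::atomic rather than the Statistic class itself (the actual member name is not shown here):
#include <atomic>

void updateMax(std::atomic<unsigned> &Max, unsigned V) {
  unsigned Prev = Max.load(std::memory_order_relaxed);
  // Only store if V is still the max; a failed CAS refreshes Prev so we can
  // re-check whether V should still win before retrying.
  while (Prev < V &&
         !Max.compare_exchange_weak(Prev, V, std::memory_order_relaxed))
    ;
}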
Reviewers: dberlin, chandlerc, hfinkel, dblaikie
Reviewed By: chandlerc
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D33301
llvm-svn: 303318
Often you have an array and you just want to use it. With the current
design, you have to first construct a `BinaryByteStream`, and then create
a `BinaryStreamRef` from it. Worse, the `BinaryStreamRef` holds a pointer
to the `BinaryByteStream`, so you can't just create a temporary one to
appease the compiler, you have to actually hold onto both the `ArrayRef`
as well as the `BinaryByteStream` *AND* the `BinaryStreamReader` on top of
that. This makes for very cumbersome code, often requiring one to store a
`BinaryByteStream` in a class just to circumvent this.
At the cost of some added complexity (not exposed to users, but internal
to the library), we can do better than this. This patch allows us to
construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from
source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc). Not only
does this reduce the amount of code you have to type and make it more
obvious how to use it, but it solves real lifetime issues when it's
inconvenient to hold onto a `BinaryByteStream` for a long time.
The additional complexity is in the form of an added layer of indirection.
Whereas before we simply stored a `BinaryStream*` in the ref, we now store
both a `BinaryStream*` **and** a `std::shared_ptr<BinaryStream>`. When
the user wants to construct a `BinaryStreamRef` directly from an
`ArrayRef` etc, we allocate an internal object that holds ownership over a
`BinaryByteStream` and forwards all calls, and store this in the
`shared_ptr<>`. This also maintains the ref semantics, as you can copy it
by value and references refer to the same underlying stream -- the one
being held in the object stored in the `shared_ptr`.
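A hedged before/after sketch of the usage this enables; header locations and exact constructor signatures are assumed from the description rather than copied from the patch:
#include "llvm/ADT/ArrayRef.h"
#include "llvm/Support/BinaryByteStream.h"
#include "llvm/Support/BinaryStreamReader.h"
using namespace llvm;

// Before: the BinaryByteStream had to outlive the reader, so it often ended
// up stored in a class just to keep it alive.
void before(ArrayRef<uint8_t> Data) {
  BinaryByteStream Stream(Data, support::little);
  BinaryStreamReader Reader(Stream);
  // ... read from Reader while Stream stays alive ...
}

// After: construct the reader straight from the data; an internal
// shared_ptr-owned adapter keeps the underlying stream alive.
void after(ArrayRef<uint8_t> Data) {
  BinaryStreamReader Reader(Data, support::little);
  // ... read from Reader ...
}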
Differential Revision: https://reviews.llvm.org/D33293
llvm-svn: 303294
There is often a lot of boilerplate code required to visit a type
record or type stream. The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine. Currently this
requires at least 6 lines of code:
codeview::TypeVisitorCallbackPipeline Pipeline;
Pipeline.addCallbackToPipeline(Deserializer);
Pipeline.addCallbackToPipeline(MyCallbacks);
codeview::CVTypeVisitor Visitor(Pipeline);
consumeError(Visitor.visitTypeRecord(Record));
With this patch, it becomes one line of code:
consumeError(codeview::visitTypeRecord(Record, MyCallbacks));
This is done by having the deserialization happen internally inside
of the visitTypeRecord function. Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.
Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.
Differential Revision: https://reviews.llvm.org/D33245
llvm-svn: 303271
A lot of code is duplicated between the first_last and the
next / prev methods. All of this code can be shared if they
are implemented in terms of find_first_in(Begin, End) etc,
in which case find_first = find_first_in(0, Size) and find_next
is find_first_in(Prev+1, Size), with similar reductions for
the other methods.
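As a sketch of that reduction (member names taken from the description above; the surrounding class is omitted):
// Shared range-based scanner; the other finds forward to it.
int find_first_in(unsigned Begin, unsigned End) const;   // scans [Begin, End)

int find_first() const { return find_first_in(0, Size); }
int find_next(unsigned Prev) const { return find_first_in(Prev + 1, Size); }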
Differential Revision: https://reviews.llvm.org/D33104
llvm-svn: 303269
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.
Depends on D32275
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: dberris, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32278
llvm-svn: 303259
RelocAddrMap was a pair of <width, address>, where width is the relocation size (4/8/x, x < 8),
and the width field was never used in code.
The relocation-processing loop had checks for the width field, but it does not look like the
DWARF parser should do that; there is probably not much sense in validating relocations while
processing them in the parser.
This patch removes the relocation-width-related code from DWARFContext.
Differential revision: https://reviews.llvm.org/D33194
llvm-svn: 303251
Summary:
This fixes pr32392.
The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.
The reason why expandPostRAPseudo is chosen is because previous passes
are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne-
7, .+4 (some branch pass(s)).
Differential Revision: https://reviews.llvm.org/D32763
llvm-svn: 303205
ProfileSummaryInfo already checks whether the module has sample profile
in determining profile counts. This will also be useful in inliner to
clean up threshold updates.
llvm-svn: 303204
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places have been switched to use DWARFAddressRange now.
Suggested during review of D33184.
llvm-svn: 303163
The information collected when requested by -time-passes is only printed when
llvm_shutdown is called at the moment. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and thin LTO to explicitly print and
reset the timing information.
I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.
Reviewed by: mehdi_amini
Differential Revision: https://reviews.llvm.org/D32803
llvm-svn: 303152
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.
Differential Revision: https://reviews.llvm.org/D33162
llvm-svn: 303134
ARM Neon has native support for half-sized vector registers (64 bits). This
is beneficial for example for 2D and 3D graphics. This patch adds the option
to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer.
*** Performance Analysis
This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.
The results are with -O3 and PGO. A negative percentage is an improvement.
The testsuite was run with a sample size of 4.
** SPEC
* CFP2006/482.sphinx3 -3.34%
A pretty hot loop is SLP vectorized resulting in nice instruction reduction.
This used to be a +22% regression before rL299482.
* CFP2000/177.mesa -3.34%
* CINT2000/256.bzip2 +6.97%
My current plan is to extend the fix in rL299482 to i16 which brings the
regression down to +2.5%. There are also other problems with the codegen in
this loop so there is further room for improvement.
** LLVM testsuite
* SingleSource/Benchmarks/Misc/ReedSolomon -10.75%
There are multiple small SLP vectorizations outside the hot code. It's a bit
surprising that it adds up to 10%. Some of this may be code-layout noise.
* MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40%
The opt-viewer screenshot can be seen at F3218284. We start at a colder store
but the tree leads us into the hottest loop.
* MultiSource/Applications/lambda-0.1.3/lambda -2.68%
* MultiSource/Benchmarks/Bullet/bullet -2.18%
This is using 3D vectors.
* SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67%
Noise, binary is unchanged.
* MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90%
There is an additional SLP in the cold code. The test runs for ~1sec and
prints out over 2000 lines. This is most likely noise.
* MultiSource/Applications/aha/aha +1.63%
* MultiSource/Applications/JM/lencod/lencod +1.41%
* SingleSource/Benchmarks/Misc/richards_benchmark +1.15%
Differential Revision: https://reviews.llvm.org/D31965
llvm-svn: 303116
Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code - this is a functional transformation, not an optimization (it is not optional).
CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
A functional transformation should run at all optimization levels, so here I created a new pass which runs at all optimization levels and does no more than this transformation.
Differential Revision: https://reviews.llvm.org/D32487
llvm-svn: 303050
This is a very thin wrapper around StringRef::getAsInteger.
It serves three purposes.
1) It allows a cleaner syntax when you have something other than
a StringRef - for example, a std::string or an llvm::SmallString.
Previously, in this case you would have to write something like:
StringRef(MyStr).getAsInteger(0, Result)
by explicitly constructing a temporary StringRef. This can be
done implicitly however with the new function by just writing:
to_integer(MyStr, ...).
2) Correcting the travesty that is getAsInteger's return value.
This function returns true on success, and false on failure.
While this may cause confusion with people familiar with the
getAsInteger API, there seems to be widespread agreement that
the return semantics of getAsInteger was a mistake.
3) It allows the Radix to be deduced as a default argument by
putting it last in the parameter list. Most uses of getAsInteger
pass 0 for the first argument. With this syntax it can just be
omitted.
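A hedged sketch of what such a wrapper looks like (the real helper's exact location and template form are assumed):
#include "llvm/ADT/StringRef.h"

template <typename T>
bool to_integer(llvm::StringRef S, T &Result, unsigned Radix = 0) {
  // Success/failure sense is inverted relative to getAsInteger, and the
  // radix moves to the end so it can default to 0.
  return !S.getAsInteger(Radix, Result);
}

// Usage: a std::string converts to StringRef implicitly at the call site.
//   std::string MyStr = "42";
//   int Value;
//   if (to_integer(MyStr, Value)) { /* ... */ }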
llvm-svn: 303011
This reorganisation prevents us from cluttering up the top-level lib directory
with more driver libraries such as llvm-dlltool (see D29892).
llvm-svn: 302995
- MIRYamlMapping: Provide default values for fields which have optional
mappings, and implement == operators for the required classes. When a field's
value is the same as the specified default value, the YAML IO class will not print it.
- MIRPrinter: The above behaviour is not on by default. If the -simplify-mir
option is not specified, yaml::Output is made to print fields with default values
too.
Differential Revision: https://reviews.llvm.org/D32304
llvm-svn: 302984
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case
llvm-svn: 302963
Summary:
As discussed in the D32195 review thread and on IRC, remove this option
and replace with parameter, which will be set to true when invoked
from clang in the context of a ThinLTO distributed backend.
Reviewers: pcc
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D33133
llvm-svn: 302939
Summary: The LiveRangeShrink pass moves an instruction right after its definition within the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees an optimal live range within the BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 302938
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.
However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.
The tests in shift_mask.ll still pass.
Reviewers: echristo, hfinkel, efriedma, iteratee
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D33076
llvm-svn: 302937
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans. This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.
Differential Revision: https://reviews.llvm.org/D33009
llvm-svn: 302936
This patch adds min/max population count, leading/trailing zero/one bit counting methods.
The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
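As a sketch of those min/max semantics in terms of the Zero/One masks a KnownBits carries (written here as free functions; the real additions are members):
#include "llvm/Support/KnownBits.h"

unsigned countMinPopulation(const llvm::KnownBits &K) {
  return K.One.countPopulation();                       // bits known to be one
}
unsigned countMaxPopulation(const llvm::KnownBits &K) {
  return K.getBitWidth() - K.Zero.countPopulation();    // everything not known zero
}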
Differential Revision: https://reviews.llvm.org/D32931
llvm-svn: 302925
The functions in MathExtras.h uses a safer memcpy instead of going through a union.
Differential Revision: https://reviews.llvm.org/D33116
llvm-svn: 302916
I tried to compile LLD using GCC 7.1.0 and got warnings like
"warning: this statement may fall through [-Wimplicit-fallthrough=]"
(some more details are here: D32907)
GCC's __cplusplus value is 201402L by default, so the macro expands to nothing,
even though GCC 7 has support for [[fallthrough]].
The patch uses gnu::fallthrough when it is available and fixes the warnings I am observing.
The initial idea for the fix belongs to Davide Italiano.
Differential revision: https://reviews.llvm.org/D33036
llvm-svn: 302878
Summary:
This adds a resize method to APInt that manages deleting/allocating storage for an APInt and changes its bit width. Use this to simplify code in copy assignment and divide.
The assignment code in particular was overly complicated, treating every possible case as a separate implementation. I'm also pretty sure the clearUnusedBits code at the end was unnecessary: since we always copy whole words from the source APInt, all unused bits should already be clear in the source.
Reviewers: hans, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33073
llvm-svn: 302863
Summary: This patch changes the function profile output order to be deterministic. In order to make it easier to understand, the hottest functions (those with the most total samples) are ordered first.
Reviewers: dnovillo, davidxl
Reviewed By: dnovillo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33111
llvm-svn: 302851
The erase/remove from parent methods now use a switch table to remove
themselves from their appropriate parent ilist.
The copyAttributesFrom method is now completely non-virtual, since we
only ever copy attributes from a global of the appropriate type.
Pre-requisite to de-virtualizing Value to save a vptr
(https://reviews.llvm.org/D31261).
NFC
llvm-svn: 302823
Updates the MSP430 target to generate EABI-compatible libcall names.
As a byproduct, adjusts the hardware multiplier options available in
the MSP430 target, adds support for promotion of the ISD::MUL operation
for 8-bit integers, and correctly marks R11 as used by call instructions.
Patch by Andrew Wygle.
Differential Revision: https://reviews.llvm.org/D32676
llvm-svn: 302820
The approach I followed was to emit the remark after getTreeCost concludes
that SLP is profitable. I initially tried emitting them after the
vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely
remember missing a few cases for example in HorizontalReduction::tryToReduce.
ORE is placed in BoUpSLP so that it's available from everywhere (notably
HorizontalReduction::tryToReduce).
We use the first instruction in the root bundle as the locator for the remark.
In order to get a sense of how far the tree is spanning, I've included the size of
the tree in the remark. This is not perfect of course, but it gives you at
least a rough idea about the tree. Then you can follow up with -view-slp-tree
to really see the actual tree.
llvm-svn: 302811
A recent commit made GlobalVariable.h depend on intrinsics generation, so (I
think) it needs to be in the lower-level module. I'll confirm with others, but
this should fix the bots.
llvm-svn: 302803
This patch extends llvm-ir to allow attributes to be set on global variables.
An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053100.html
A key part of that proposal was to extend LLVM-IR to carry attributes on global variables.
This generic feature could be useful for multiple purposes.
In our present context, it would be useful to carry user specified sections for bss/rodata/data.
Reviewed by: Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D32009
llvm-svn: 302794
This reverts r302712.
The change fails with ASAN enabled:
ERROR: AddressSanitizer: use-after-poison on address ... at ...
READ of size 2 at ... thread T0
#0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42
#1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3
#2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17
#3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7
Reviewers: niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33081
llvm-svn: 302746
Summary:
Allow consecutive stores whose values come from consecutive loads to
merged in the presense of other uses of the loads. Previously this was
disallowed as in general the merged load cannot be shared with the
other uses. Merging N stores into 1 may cause as many as N redundant
loads. However, in the context of caching this should have a negligible
effect on memory pressure and reduces instruction count, making it
almost always a win.
Fixes PR32086.
Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30471
llvm-svn: 302712
This pass uses a new target hook to decide whether or not to expand a particular
intrinsic to the shufflevector sequence.
Differential Revision: https://reviews.llvm.org/D32245
llvm-svn: 302631
Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.
Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.
But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.
Don't require AA at CodeGenOpt::None, and only use it otherwise.
This does have a functional impact (and one testcase is pessimized
because we can't reuse a load). But I think that's desirable no matter
what.
Note that this alone doesn't result in fewer DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG. That will be
fixed separately.
Differential Revision: https://reviews.llvm.org/D32766
llvm-svn: 302611
This lets the pass focus on gathering the required analyzes, and the
utility class focus on the transformation.
Differential Revision: https://reviews.llvm.org/D31303
llvm-svn: 302609
This warning didn't show up on my local build
but is causing the bots to fail. Seems like a
bad idea to have types and variables with the
same name anyhow.
Differential Revision: https://reviews.llvm.org/D33022
llvm-svn: 302606
Previously we had only supported the importing and
exporting of functions and globals.
Also, add useful overloads of getWasmSymbol() and
getNumberOfSymbols() in support of the lld port.
Differential Revision: https://reviews.llvm.org/D33011
llvm-svn: 302601
This change is required because the notion of count is different for
sample profiling and getProfileCount will need to determine the
underlying profile type.
Differential revision: https://reviews.llvm.org/D33012
llvm-svn: 302597
frames.
RuntimeDyld was previously responsible for tracking allocated EH frames, but it
makes more sense to have the RuntimeDyld::MemoryManager track them (since the
frames are allocated through the memory manager, and written to memory owned by
the memory manager). This patch moves the frame tracking into
RTDyldMemoryManager, and changes the deregisterFrames method on
RuntimeDyld::MemoryManager from:
void deregisterEHFrames(uint8_t *Addr, uint64_t LoadAddr, size_t Size);
to:
void deregisterEHFrames();
Separating this responsibility will allow ORC to continue to throw the
RuntimeDyld instances away post-link (saving a few dozen bytes per lazy
function) while properly deregistering frames when modules are unloaded.
This patch also updates ORC to call deregisterEHFrames when modules are
unloaded. This fixes a bug where an exception that tears down the JIT can then
unwind through dangling EH frames that have been deallocated but not
deregistered, resulting in UB.
For people using SectionMemoryManager this should be pretty much a no-op. For
people with custom allocators that override registerEHFrames/deregisterEHFrames,
you will now be responsible for tracking allocated EH frames.
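A hedged sketch of what that tracking might look like in a custom manager (a hypothetical class, with the base-class plumbing and the actual frame registration/deregistration calls left out):
#include <cstdint>
#include <utility>
#include <vector>

class MyMemoryManager /* : public RuntimeDyld::MemoryManager */ {
  std::vector<std::pair<uint8_t *, size_t>> EHFrames;

public:
  void registerEHFrames(uint8_t *Addr, uint64_t LoadAddr, size_t Size) {
    EHFrames.push_back({Addr, Size});   // remember what we registered
    // ... perform the actual registration (e.g. __register_frame) ...
    (void)LoadAddr;
  }

  void deregisterEHFrames() {
    // No addresses are passed in any more, so walk our own record instead.
    for (auto &F : EHFrames) {
      // ... undo the registration for F.first / F.second ...
      (void)F;
    }
    EHFrames.clear();
  }
};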
Reviewed in https://reviews.llvm.org/D32829
llvm-svn: 302589
Fixes inalloca parameters, which previously all pointed to the same
offset. Extend the test to use llvm-readobj so that we can test the
offset in a readable way.
llvm-svn: 302578
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>
Differential Revision: https://reviews.llvm.org/D32975
This reapplies r302469 with a fix for a bot failure (reparentDebugInfo
now checks for the case the orig and new function are identical).
llvm-svn: 302576
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.
Differential Revision: https://reviews.llvm.org/D32541
llvm-svn: 302571
The description says it returns the number of words needed to represent the result. But the way it was coded, it always returned (lhsWords + rhsWords) or (lhsWords + rhsWords - 1). The result could be even smaller than that and it wouldn't tell you.
No one uses the result today so rather than try to fix it, just remove it.
llvm-svn: 302551
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instruction::mayWriteToMemory are used
for determining those two booleans.
The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.
Differential Revision: https://reviews.llvm.org/D32762
llvm-svn: 302539
This caused PR32977.
Original commit message:
> Make it illegal for two Functions to point to the same DISubprogram
>
> As recently discussed on llvm-dev [1], this patch makes it illegal for
> two Functions to point to the same DISubprogram and updates
> FunctionCloner to also clone the debug info of a function to conform
> to the new requirement. To simplify the implementation it also factors
> out the creation of inlineAt locations from the Inliner into a
> general-purpose utility in DILocation.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
> <rdar://problem/31926379>
>
> Differential Revision: https://reviews.llvm.org/D32975
llvm-svn: 302533
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.
This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.
The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
affects all targets that use frame pseudo instructions and touched many
files although the changes are uniform.
- Access to frame properties is implemented using special methods
rather than calls to getOperand(N).getImm(). For X86 and ARM such
replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
instruction. These involve proper instruction initialization and
methods that access instruction arguments.
- MachineVerifier retrieves the frame size using a method which reports the sum
of the frame parts initialized inside the frame instruction pair and outside it.
The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.
Differential Revision: https://reviews.llvm.org/D32394
llvm-svn: 302527
- This change allows targets to opt-in to using them instead of the log2
shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
factored out into LoopUtils, and now have a unified interface for generating
reductions regardless of the preference of the target. LoopUtils now uses TTI
to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.
Differential Revision: https://reviews.llvm.org/D30086
llvm-svn: 302514
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>
Differential Revision: https://reviews.llvm.org/D32975
llvm-svn: 302469
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record. This works fine for situations like dumping,
but not when you want to visit types in random order. For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.
In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline. This
patch adds such a mode. In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.
Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.
Differential Revision: https://reviews.llvm.org/D32928
llvm-svn: 302454
This introduces a new interface for computeKnownBits that returns the KnownBits object instead of requiring it to be pre-constructed and passed in by reference.
This is a much more convenient interface as it doesn't require the caller to figure out the BitWidth to pre-construct the object. It's so convenient that I believe we can use this interface to remove the special ComputeSignBit flavor of computeKnownBits.
As a step towards that idea, this patch replaces all of the internal usages of ComputeSignBit with this new interface. As you can see from the patch there were a couple places where we called ComputeSignBit which really called computeKnownBits, and then called computeKnownBits again directly. I've reduced those places to only making one call to computeKnownBits. I bet there are probably external users that do it too.
A future patch will update the external users and remove the ComputeSignBit interface. I'll also work on moving more locations to the KnownBits-returning interface for computeKnownBits.
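A hedged sketch of the calling-convention difference (signatures simplified and optional parameters omitted):
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/KnownBits.h"
using namespace llvm;

void example(const Value *V, const DataLayout &DL) {
  // Old style: the caller sizes the KnownBits and passes it by reference.
  KnownBits Old(DL.getTypeSizeInBits(V->getType()));
  computeKnownBits(V, Old, DL);

  // New style: the width comes back with the result, and the sign bit can be
  // read off it directly instead of calling ComputeSignBit.
  KnownBits New = computeKnownBits(V, DL);
  bool NonNegative = New.isNonNegative();
  (void)NonNegative;
}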
Differential Revision: https://reviews.llvm.org/D32848
llvm-svn: 302437
Summary:
Following up on Sanjay's suggestion in D32955, move this functionality
into ShuffleVectorInst.
Reviewers: spatel, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32956
llvm-svn: 302420
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.
This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.
This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
Differential Revision: https://reviews.llvm.org/D32847
llvm-svn: 302416
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.
Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.
Reviewers: timshen, dberris
Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits
Differential Revision: https://reviews.llvm.org/D27503
llvm-svn: 302405
Summary: Continue making updates to llvm-readobj to display resource sections. This is necessary for testing the up and coming cvtres tool.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32609
llvm-svn: 302399
Summary:
This reverts commit 56beec1b1cfc6d263e5eddb7efff06117c0724d2.
Revert "Quick fix to D32609, it seems .o files are not transferred in all cases."
This reverts commit 7652eecd29cfdeeab7f76f687586607a99ff4e36.
Revert "Update llvm-readobj -coff-resources to display tree structure."
This reverts commit 422b62c4d302cfc92401418c2acd165056081ed7.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32958
llvm-svn: 302397
Summary: Continue making updates to llvm-readobj to display resource sections. This is necessary for testing the up and coming cvtres tool.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32609
llvm-svn: 302386
Previously SimplifyCFG used getSetSize, which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64 bits wide, this requires returning a 65-bit APInt. APInts can only store 64 bits without a memory allocation, so this is inefficient.
The new method takes the 8 as an input and tells whether the range contains more than that many elements, without requiring any wider math.
llvm-svn: 302385
Currently llvm-rtdyld in -check mode will map sections to back-to-back 4k
aligned slabs starting at 0x1000. Automatically remapping sections by default is
helpful because it quickly exposes relocation bugs due to use of local addresses
rather than load addresses (these would silently pass if the load address was
not remapped). These mappings can be explicitly overridden on a per-section
basis using llvm-rtdyld's -map-section option. This patch extends this scheme to
also preserve any mappings made by RuntimeDyld itself. Preserving RuntimeDyld's
automatic mappings allows us to write test cases to verify that these automatic
mappings have been applied.
This will allow the fix in https://reviews.llvm.org/D32899 to be tested with
llvm-rtdyld -check.
llvm-svn: 302372
Summary: This makes setRange take ConstantRange by rvalue reference since most callers were passing an unnamed temporary ConstantRange. We can then move that ConstantRange into the DenseMap caches. For the callers that weren't passing a temporary, I've added std::move to the local variable being passed.
Reviewers: sanjoy, mzolotukhin, efriedma
Reviewed By: sanjoy
Subscribers: takuto.ikuta, llvm-commits
Differential Revision: https://reviews.llvm.org/D32943
llvm-svn: 302371
Summary:
When writing a loop pass I made a mistake and hit the assertion
"Unreachable block in loop". Later, I hit an assertion when I called
`BasicBlock::eraseFromParent()` incorrectly: "Use still stuck around
after Def is destroyed". This latter assertion, however, printed out
exactly which value is being deleted and what uses remain, which helped
me debug the issue.
To help people debugging their loop passes in the future, print out
exactly which basic block is unreachable in a loop.
Reviewers: sanjoy, hfinkel, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: mzolotukhin
Differential Revision: https://reviews.llvm.org/D32878
llvm-svn: 302354
This is a step toward having statically allocated instruction mappings.
We are going to tablegen them eventually, so let us reflect that in
the API.
NFC.
llvm-svn: 302316
This exposes a method in MachineFrameInfo that calculates
MaxCallFrameSize and calls it after instruction selection in the ARM
target.
This avoids
ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame()
giving different answers in early/late phases of codegen.
The testcase shows a particular nasty example result of that where we
would fail to properly align an alloca.
Differential Revision: https://reviews.llvm.org/D32622
llvm-svn: 302303
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database. Previously we were
just using push_back() every time without reserving the space
up front in the vector. This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degradation was quite noticeable.
llvm-svn: 302302
When randomly accessing an element by offset, we weren't passing
the offset through so if you called .offset() it would return a
value of 0.
llvm-svn: 302292
- MIParser: If the successor list is not specified, successors will be
added based on basic block operands in the block and possible
fallthrough.
- MIRPrinter: Adds a new `simplify-mir` option, with that option set:
Skip printing of block successor lists in cases where the
parser is guaranteed to reconstruct it. This means we still print the
list if some successor cannot be determined (happens for example for
jump tables), if the successor order changes, or if the branch probabilities
are unequal.
Differential Revision: https://reviews.llvm.org/D31262
llvm-svn: 302289
wcslen is part of the C99 and C++98 standards.
- This introduces the function to TargetLibraryInfo.
- Also set attributes for wcslen in llvm::inferLibFuncAttributes().
Differential Revision: https://reviews.llvm.org/D32837
llvm-svn: 302278
This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.
Differential Revision: https://reviews.llvm.org/D32637
llvm-svn: 302262
This is similar to my recent fix for VarStreamArrayIterator, but the cause
(and thus the fix) is subtly different. The FixedStreamArrayIterator
iterates over a const Array, so the iterator's value type must be const.
llvm-svn: 302257
This almost completes the matrix of all possible find
functions.
*EXISTING*
----------
find_first
find_first_unset
find_next
find_next_unset
find_last
find_last_unset
*NEW*
----
find_prev
*STILL MISSING*
---------------
find_prev_unset
Differential Revision: https://reviews.llvm.org/D32885
llvm-svn: 302254
llvm-dwarfdump currently prints no message if decompression fails
for some reason. I noticed that during work on one of the LLD patches,
where LLD produced a broken output. It was a bit confusing to see
no output for the dumped section and no error message at all.
The patch adds an error message for such cases.
Differential revision: https://reviews.llvm.org/D32865
llvm-svn: 302221
Verifying the hash values as we are currently doing
results in iterating every type record before the user
even tries to access the first one, and the API user
has no control over, or ability to hook into this
process.
As a result, when the user wants to iterate over types
to print them or index them, this results in a second
iteration over the same list of types. When there's
upwards of 1,000,000 type records, this is obviously
quite undesirable.
This patch moves the verification out of TpiStream,
and llvm-pdbdump hooks a hash verification visitor
into the normal dumping process. So we still verify
the hash records, but we can do it while not requiring
a second iteration over the type stream.
Differential Revision: https://reviews.llvm.org/D32873
llvm-svn: 302206
I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to
try and identify show-stopping performance problems. This
patch addresses the first such problem.
When loading the DBI stream, before anyone has even tried to
access a single record, we build an in memory map of every
source file for every module. In the particular PDB I was
using, this was over 85 million files. Specifically, the
complexity is O(m*n) where m is the number of modules and
n is the average number of source files (including headers)
per module.
The whole reason for doing this was so that we could have
constant time access to any module and any of its source
file lists. However, we can still get O(1) access to the
source file list for a given module with a simple O(m)
precomputation, and access to the list of modules is
already O(1) anyway.
So this patch reduces the O(m*n) up-front precomputation
to an O(m) one, where n is ~6,500 and n*m is about 85 million
in my pathological test case.
Differential Revision: https://reviews.llvm.org/D32870
llvm-svn: 302205
Summary:
As of this patch, 350 out of 3938 rules are currently imported.
Depends on D32229
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: ab
Subscribers: dberris, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D32275
llvm-svn: 302154
Added the integer data processing intrinsics from ACLE v2.1 Chapter 9
but I have missed out the saturation_occurred intrinsics for now. For
the instructions that read and write the GE bits, a chain is included
and the only instruction that reads these flags (sel) is only
selectable via the implemented intrinsic.
Differential Revision: https://reviews.llvm.org/D32281
llvm-svn: 302126
Summary:
This change adds a new section to the xray-instrumented binary that
stores an index into ranges of the instrumentation map, where sleds
associated with the same function can be accessed as an array. At
runtime, we can get access to this index by function ID offset allowing
for selective patching and unpatching by function ID.
Each entry in this new section (xray_fn_idx) will include two pointers
indicating the start and one past the end of the sleds associated with
the same function. These entries will be 16 bytes long on x86 and
aarch64. On arm, we align to 16 bytes anyway so the runtime has to take
that into consideration.
__{start,stop}_xray_fn_idx will be the symbols that the runtime will
look for when we implement the selective patching/unpatching by function
id APIs. Because XRay synthesizes the function id's in a monotonically
increasing manner at runtime now, implementations (and users) can use
this table to look up the sleds associated with a specific function.
This is useful in implementations that want to do things like:
- Implement coverage mode for functions by patching everything
pre-main, then as functions are encountered, the installed handler
can unpatch the function that's been encountered after recording
that it's been called.
- Do "learning mode", so that the implementation can figure out some
statistical information about function calls by function id for a
time being, and then determine which functions are worth
uninstrumenting at runtime.
- Do "selective instrumentation" where an implementation can
specifically instrument only certain function id's at runtime
(either based on some external data, or through some other
heuristics) instead of patching all the instrumented functions at
runtime.
Reviewers: dblaikie, echristo, chandlerc, javed.absar
Subscribers: pelikan, aemerson, kpw, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D32693
llvm-svn: 302109
When profiling a no-op incremental link of Chromium I found that the functions
computeImportForFunction and computeDeadSymbols were consuming roughly 10% of
the profile. The goal of this change is to improve the performance of those
functions by changing the map lookups that they were previously doing into
pointer dereferences.
This is achieved by changing the ValueInfo data structure to be a pointer to
an element of the global value map owned by ModuleSummaryIndex, and changing
reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs.
This means that a ValueInfo will take a client directly to the summary list
for a given GUID.
Differential Revision: https://reviews.llvm.org/D32471
llvm-svn: 302108
Summary:
The existing implementation creates a symbolic SCEV expression every
time we analyze a phi node and then has to remove it, when the analysis
is finished. This is very expensive, and in most of the cases it's also
unnecessary. According to the data I collected, ~60-70% of analyzed phi
nodes (measured on SPEC) have the following form:
PN = phi(Start, OP(Self, Constant))
Handling such cases separately significantly speeds this up.
Reviewers: sanjoy, pete
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32663
llvm-svn: 302096
This patch adds isConstant and getConstant for determining if KnownBits represents a constant value and to retrieve the value. Use them to simplify code.
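A sketch of the (assumed) semantics of the two new helpers, expressed on the Zero/One masks rather than as the real members:
#include "llvm/Support/KnownBits.h"
#include <cassert>

bool isConstantKnown(const llvm::KnownBits &K) {
  // Constant means every bit is known one way or the other.
  return (K.Zero | K.One).countPopulation() == K.getBitWidth();
}

llvm::APInt getConstantKnown(const llvm::KnownBits &K) {
  assert(isConstantKnown(K) && "all bits must be known");
  return K.One;   // the known-one mask is then the value itself
}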
Differential Revision: https://reviews.llvm.org/D32785
llvm-svn: 302091
This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible.
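A sketch (assumed semantics, not the patch's code) of what zero extension on a KnownBits does:
#include "llvm/Support/KnownBits.h"

// Assumes NewWidth is strictly larger than K.getBitWidth().
llvm::KnownBits zeroExtend(const llvm::KnownBits &K, unsigned NewWidth) {
  llvm::KnownBits Result(NewWidth);
  Result.Zero = K.Zero.zext(NewWidth);
  Result.One = K.One.zext(NewWidth);
  Result.Zero.setBitsFrom(K.getBitWidth());  // the new high bits are known zero
  return Result;
}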
Differential Revision: https://reviews.llvm.org/D32784
llvm-svn: 302088
Otherwise, each CPU has to manually specify the extensions it supports,
even though they have to be a superset of the base arch extensions.
And when there's redundant data there's stale data, so most of the CPUs
lie about the features they support (almost none lists AEK_FP).
Instead, do the saner thing: add the optional extensions on top of the
base extensions provided by the architecture.
The ARM TargetParser has the same behavior.
Differential Revision: https://reviews.llvm.org/D32780
llvm-svn: 302078
That's only a required extension as of v8.1a.
Remove it from the "generic" CPU as well: it should only support the
base ISA (and binutils agrees).
Also unify the MC tests into crc.s and arm64-crc32.s
llvm-svn: 302077
Adrian requested that we break things down to make things clean in the DWARFVerifier. This patch breaks everything down into nice individual functions and cleans up the code quite a bit and prepares us for the next round of verifiers.
Differential Revision: https://reviews.llvm.org/D32812
llvm-svn: 302062
Summary:
Do three things to help with that:
- Add AttributeList::FirstArgIndex, which is an enumerator currently set
to 1. It allows us to change the indexing scheme with fewer changes.
- Add addParamAttr/removeParamAttr. This just shortens addAttribute call
sites that would otherwise need to spell out FirstArgIndex.
- Remove some attribute-specific getters and setters from Function that
take attribute list indices. Most of these were only used from
BuildLibCalls, and doesNotAlias was only used to test or set if the
return value is malloc-like.
I'm happy to split the patch, but I think they are probably easier to
review when taken together.
This patch should be NFC, but it sets the stage to change the indexing
scheme to this, which is more convenient when indexing into an array:
0: func attrs
1: retattrs
2...: arg attrs
Reviewers: chandlerc, pete, javed.absar
Subscribers: david2050, llvm-commits
Differential Revision: https://reviews.llvm.org/D32811
llvm-svn: 302060
The raw CodeView format references strings by "offsets", but it's
confusing what table the offset refers to. In the case of line
number information, it's an offset into a buffer of records,
and an indirection is required to get another offset into a
different table to find the final string. And in the case of
checksum information, there is no indirection, and the offset
refers directly to the location of the string in another buffer.
This would be less confusing if we always just referred to the
strings by their value, and have the library be smart enough
to correctly resolve the offsets on its own from the right
location.
This patch makes that possible. When either reading or writing,
all the user deals with are strings, and the library does the
appropriate translations behind the scenes.
llvm-svn: 302053
llvm-readobj hand rolls some CodeView parsing code for string
tables, so this patch updates it to re-use some of the newly
introduced parsing code in LLVMDebugInfoCodeView.
Differential Revision: https://reviews.llvm.org/D32772
llvm-svn: 302052
Adrian requested we create a DWARFVerifier.cpp file to contain all of the DWARF verification stuff. This change simply moves the functionality over into DWARFVerifier.h and DWARFVerifier.cpp, renames the DWARFVerifier methods to start with lower case, and switches DWARFContext.cpp over to using the new functionality.
Differential Revision: https://reviews.llvm.org/D32809
llvm-svn: 302044
This was reverted due to a "missing" file, but in reality
what happened was that I renamed a file, and then due to
a merge conflict both the old file and the new file got
added to the repository. This led to an unused cpp file
being in the repo and not referenced by any CMakeLists.txt
but #including a .h file that wasn't in the repo. In an
even more unfortunate coincidence, CMake didn't report the
unused cpp file because it was in a subdirectory of the
folder with the CMakeLists.txt, and not in the same directory
as any CMakeLists.txt.
The presence of the unused file was then breaking certain
tools that determine file lists by globbing rather than
by what's specified in CMakeLists.txt
In any case, the fix is to just remove the unused file from
the patch set.
llvm-svn: 302042
This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Reapplied - this time without changing line endings of existing files.
Differential Revision: https://reviews.llvm.org/D32769
llvm-svn: 302041
Currently several places assume the VAL member is always at least the same size as pVal. In particular for a memcpy in the move assignment operator. While this is a true assumption, it isn't good practice to assume this.
This patch gives the union a name so we can write the memcpy in terms of the union itself. This also adds a similar memcpy to the move constructor where we previously just copied using VAL directly.
This patch is mostly just a mechanical addition of the U in front of VAL and pVal everywhere. But several constructors had to be modified since we can't directly initialize a field of a named union from the initializer list.
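A sketch of the shape of the change (field names from the description; the surrounding APInt class is reduced to a stand-in struct):
#include <cstdint>

struct APIntSketch {
  unsigned BitWidth;
  union {
    uint64_t VAL;    // value when it fits in a single word
    uint64_t *pVal;  // heap storage for multi-word values
  } U;               // named, so memcpy(&U, &That.U, sizeof(U)) copies the whole union
};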
Differential Revision: https://reviews.llvm.org/D30629
llvm-svn: 302040
This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Differential Revision: https://reviews.llvm.org/D32769
llvm-svn: 302028
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.
Differential revision: https://reviews.llvm.org/D31490
llvm-svn: 302018
The patch is failing to add StringTableStreamBuilder.h, but that isn't
even discovered because the corresponding StringTableStreamBuilder.cpp
isn't added to any CMakeLists.txt file and thus never built. I think
this patch is just incomplete.
llvm-svn: 302002
This was reported by the ASAN bot, and it turned out to be
a fairly fundamental problem with the design of VarStreamArray
and the way it passes context information to the extractor.
The fix was cumbersome, and I'm not entirely pleased with it,
so I plan to revisit this design in the future when I'm not
pressed to get the bots green again. For now, this fixes
the issue by storing the context information by value instead
of by reference, and introduces some impossibly-confusing
template magic to make things "work".
llvm-svn: 301999
Summary:
This is the corresponding llvm change to D28037 to ensure no performance
regression.
Reviewers: bogner, kbarton, hfinkel, iteratee, echristo
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28329
llvm-svn: 301990
Previously we had knowledge of how to serialize and deserialize
a string table inside of DebugInfo/PDB, but the string table
that it serializes contains a piece that is actually considered
CodeView and can appear outside of a PDB. We already have logic
in llvm-readobj and MCCodeView to read and write this format,
so it doesn't make sense to duplicate the logic in DebugInfoPDB
as well.
This patch makes codeview::StringTable (for writing) and
codeview::StringTableRef (for reading), updates DebugInfoPDB
to use these classes for its own writing, and updates llvm-readobj
to additionally use StringTableRef for reading.
It's a bit more difficult to get MCCodeView to use this for
writing, but it's a logical next step.
llvm-svn: 301986
This patch verifies the .debug_line:
- verify all addresses in a line table sequence have ascending addresses
- verify that all line table file indexes are valid
Unit tests added for both cases.
Differential Revision: https://reviews.llvm.org/D32765
llvm-svn: 301984
Remove "_NC" suffix and semantics from TLSDESC_LD{64,32}_LO12 and
TLSDESC_ADD_LO12 relocations
Rearrange ordering in AArch64.def to follow relocation encoding
Fix name:
R_AARCH64_P32_LD64_GOT_LO12_NC => R_AARCH64_P32_LD32_GOT_LO12_NC
Add support for several "TLS", "TLSGD", and "TLSLD" relocations for
ILP32
Fix return values from isNonILP32reloc
Add implementations for
R_AARCH64_ADR_PREL_PG_HI21_NC, R_AARCH64_P32_LD32_GOT_LO12_NC,
R_AARCH64_P32_TLSIE_LD32_GOTTPREL_LO12_NC,
R_AARCH64_P32_TLSDESC_LD32_LO12, R_AARCH64_LD64_GOT_LO12_NC,
*TLSLD_LDST128_DTPREL_LO12, *TLSLD_LDST128_DTPREL_LO12_NC,
*TLSLE_LDST128_TPREL_LO12, *TLSLE_LDST128_TPREL_LO12_NC
Modify error messages to give name of equivalent relocation in the
ABI not being used, along with better checking for non-existent
requested relocations.
Added assembler support for "pg_hi21_nc"
Relocation definitions added without implementations:
R_AARCH64_P32_TLSDESC_ADR_PREL21, R_AARCH64_P32_TLSGD_ADR_PREL21,
R_AARCH64_P32_TLSGD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_ADR_PREL21,
R_AARCH64_P32_TLSLD_ADR_PAGE21, R_AARCH64_P32_TLSLD_ADD_LO12_NC,
R_AARCH64_P32_TLSLD_LD_PREL19, R_AARCH64_P32_TLSDESC_LD_PREL19,
R_AARCH64_P32_TLSGD_ADR_PAGE21, R_AARCH64_P32_TLS_DTPREL,
R_AARCH64_P32_TLS_DTPMOD, R_AARCH64_P32_TLS_TPREL,
R_AARCH64_P32_TLSDESC
Fix encoding:
R_AARCH64_P32_TLSDESC_ADR_PAGE21
Reviewers: Peter Smith
Patch by: Joel Jones (jjones@cavium.com)
Differential Revision: https://reviews.llvm.org/D32072
llvm-svn: 301980
The directory and file tables now have form-based content descriptors.
Parse these and extract the per-directory/file records based on the
descriptors. For now we support only DW_FORM_string (inline) for the
path names; follow-up work will add support for indirect forms (i.e.,
DW_FORM_strp, strx<N>, and line_strp).
Differential Revision: http://reviews.llvm.org/D32713
llvm-svn: 301978
LTO and other fancy linking previously led to DWARF that contained invalid references. We already validate that CU relative references fall into the CU, and the DW_FORM_ref_addr references fall inside the .debug_info section, but we didn't validate that the references pointed to correct DIE offsets. This new verification will ensure that all references refer to actual DIEs and not an offset in between.
This caught a bug in DWARFUnit::getDIEForOffset() where if you gave it any offset, it would match the DIE that matches the offset _or_ the next DIE. This has been fixed.
Differential Revision: https://reviews.llvm.org/D32722
llvm-svn: 301971
With the forthcoming codeview::StringTable which a pdb::StringTable
would hold an instance of as one member, this ambiguity becomes
confusing. Rename to PDBStringTable to avoid this.
llvm-svn: 301948
Remove "_NC" suffix and semantics from TLSDESC_LD{64,32}_LO12 and
TLSDESC_ADD_LO12 relocations
Rearrange ordering in AArch64.def to follow relocation encoding
Fix name:
R_AARCH64_P32_LD64_GOT_LO12_NC => R_AARCH64_P32_LD32_GOT_LO12_NC
Add support for several "TLS", "TLSGD", and "TLSLD" relocations for
ILP32
Fix return values from isNonILP32reloc
Add implementations for
R_AARCH64_ADR_PREL_PG_HI21_NC, R_AARCH64_P32_LD32_GOT_LO12_NC,
R_AARCH64_P32_TLSIE_LD32_GOTTPREL_LO12_NC,
R_AARCH64_P32_TLSDESC_LD32_LO12, R_AARCH64_LD64_GOT_LO12_NC,
*TLSLD_LDST128_DTPREL_LO12, *TLSLD_LDST128_DTPREL_LO12_NC,
*TLSLE_LDST128_TPREL_LO12, *TLSLE_LDST128_TPREL_LO12_NC
Modify error messages to give name of equivalent relocation in the
ABI not being used, along with better checking for non-existent
requested relocations.
Added assembler support for "pg_hi21_nc"
Relocation definitions added without implementations:
R_AARCH64_P32_TLSDESC_ADR_PREL21, R_AARCH64_P32_TLSGD_ADR_PREL21,
R_AARCH64_P32_TLSGD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_ADR_PREL21,
R_AARCH64_P32_TLSLD_ADR_PAGE21, R_AARCH64_P32_TLSLD_ADD_LO12_NC,
R_AARCH64_P32_TLSLD_LD_PREL19, R_AARCH64_P32_TLSDESC_LD_PREL19,
R_AARCH64_P32_TLSGD_ADR_PAGE21, R_AARCH64_P32_TLS_DTPREL,
R_AARCH64_P32_TLS_DTPMOD, R_AARCH64_P32_TLS_TPREL,
R_AARCH64_P32_TLSDESC
Fix encoding:
R_AARCH64_P32_TLSDESC_ADR_PAGE21
Reviewers: Peter Smith
Patch by: Joel Jones (jjones@cavium.com)
Differential Revision: https://reviews.llvm.org/D32072
llvm-svn: 301939
Previously we wrote line information and file checksum
information, but we did not write information about inlinee
lines and functions. This patch adds support for that.
llvm-svn: 301936
This is motivated by https://reviews.llvm.org/D32488 where I am trying
to add printing of the section type for incompatible sections to LLD
error messages. This patch allows us to use the same code in
llvm-readobj and LLD instead of duplicating the function inside LLD.
Patch by Alexander Richardson!
llvm-svn: 301921
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.
This patch adds support for extension/truncation for both integer and float types.
Differential Revision: https://reviews.llvm.org/D32391
llvm-svn: 301910
r288279 mistakenly added it to all arches, but it's only available
from v8.1 onwards.
The testcase is awkward, because (I suspect) of PR32873.
Spotted by inspection.
llvm-svn: 301890
This tracks whether MaxCallFrameSize is computed yet. Ideally we would
assert and fail when the value is queried before it is computed, however
this fails various targets that need to be fixed first.
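A minimal sketch of the "computed yet" tracking described above, with hypothetical field and method names (the actual MachineFrameInfo interface may differ):
```
#include <cassert>
#include <cstdint>

// Sketch only: remember whether the value has been computed so a later
// change can enable the commented-out assert once all targets are fixed.
class FrameInfoSketch {
  uint64_t MaxCallFrameSize = 0;
  bool MaxCallFrameSizeComputed = false;

public:
  void setMaxCallFrameSize(uint64_t Size) {
    MaxCallFrameSize = Size;
    MaxCallFrameSizeComputed = true;
  }

  uint64_t getMaxCallFrameSize() const {
    // Cannot be enabled yet: some targets still query the value too early.
    // assert(MaxCallFrameSizeComputed && "queried before it was computed");
    return MaxCallFrameSize;
  }
};
```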
Differential Revision: https://reviews.llvm.org/D32570
llvm-svn: 301851
llvm-dwarfdump gets a new "--verify" option that will verify a single file's DWARF debug info and will print out any errors that it finds. It will return a non-zero exit status if verification fails, and a zero exit status if verification succeeds. Adding the --quiet option will suppress any output to STDOUT or STDERR.
The first part of the verify does the following:
- verifies that all CU relative references (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata) have valid CU offsets
- verifies that all DW_FORM_ref_addr references have valid .debug_info offsets
- verifies that all DW_AT_ranges attributes have valid .debug_ranges offsets
- verifies that all DW_AT_stmt_list attributes have valid .debug_line offsets
- verifies that all DW_FORM_strp attributes have valid .debug_str offsets
Unit tests were added for each of the above cases.
Differential Revision: https://reviews.llvm.org/D32707
llvm-svn: 301844
This is to prepare for an upcoming change which uses pointers instead of
GUIDs to represent references.
Differential Revision: https://reviews.llvm.org/D32469
llvm-svn: 301843
We were using operator==(uint64_t), which implicitly calls countLeadingZeros, but only to compare with 64 to determine whether we can compare VAL or pVal[0] to the uint64_t. By handling the multiword case with countLeadingZerosSlowCase==BitWidth we can prevent a load of pVal[0] from being inserted inline at each call site. This saves a little bit of code size.
llvm-svn: 301842
This was an omission in r301813. I had made the supporting changes to
make this happen, but I forgot to actually update the PrevPair
declaration.
llvm-svn: 301817
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, for
instance), we want the slot in UnknownInsts to point to the original
Instruction we wanted to track, not the value it got replaced by.
Fixes PR32587.
This relands r301426.
llvm-svn: 301814
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo. In other words, Foo is a writer and FooRef is a
reader. This patch renames some existing readers to conform to the
FooRef convention, while offering no functional change.
llvm-svn: 301810
Summary:
This frees up one slot in the HandleBaseKind enum, which I will use
later to add a new kind of value handle. The size of the
HandleBaseKind enum is important because we store a HandleBaseKind in
the low two bits of a (in the worst case) 4 byte aligned pointer.
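For context, a standalone sketch of the low-bit packing that this size constraint comes from (illustrative names only; LLVM's PointerIntPair implements the same idea generically):
```
#include <cassert>
#include <cstdint>

// With at least 4-byte alignment the two low bits of a pointer are always
// zero, so a 2-bit kind can be packed into them; four kinds is the limit.
enum class SketchKind : uintptr_t { A = 0, B = 1, C = 2, D = 3 };

inline uintptr_t pack(void *Ptr, SketchKind Kind) {
  uintptr_t Bits = reinterpret_cast<uintptr_t>(Ptr);
  assert((Bits & 0x3) == 0 && "pointer must be at least 4-byte aligned");
  return Bits | static_cast<uintptr_t>(Kind);
}

inline void *getPointer(uintptr_t Packed) {
  return reinterpret_cast<void *>(Packed & ~uintptr_t(0x3));
}

inline SketchKind getKind(uintptr_t Packed) {
  return static_cast<SketchKind>(Packed & 0x3);
}
```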
Reviewers: davide, chandlerc
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32634
llvm-svn: 301809
In this patch, I introduce a new alt macro feature.
This feature gives meaning to '%' when it is used as a prefix to macro call arguments.
In altmacro mode, a percent sign '%' before an absolute expression first converts the expression to a string.
As described at https://sourceware.org/binutils/docs-2.27/as/Altmacro.html:
"Expression results as strings
You can write `%expr' to evaluate the expression expr and use the result as a string."
expression assumptions:
1. '%' can only evaluate an absolute expression.
2. Altmacro '%' must be the first character of the evaluated expression.
3. If no '%' is located before the expression, a regular macro operation is expected.
4. The result of an absolute expression can only be an integer.
Differential Revision: https://reviews.llvm.org/D32526
llvm-svn: 301797
Summary:
programUndefinedIfPoison makes more sense, given what the function
does; and I'm about to add a function with a name similar to
isKnownNotFullPoison (so do the rename to avoid confusion).
Reviewers: broune, majnemer, bjarke.roune
Reviewed By: broune
Subscribers: mcrosier, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D30444
llvm-svn: 301776
Summary: As per discussion on how to get better codegen for large integer legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D29872
llvm-svn: 301775
This feature isn't used anywhere in tree. Its existence seems to be preventing selfhost builds from inlining any of the setBits methods, including setLowBits, setHighBits, and setBitsFrom. This is because the code makes the method recursive.
If anyone needs this feature in the future we could consider adding a setBitsWithWrap method. This way only the calls that need it would pay for it.
llvm-svn: 301769
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().
Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.
Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32491
llvm-svn: 301750
Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit.
Reviewers: RKSimon, spatel, davide
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32651
llvm-svn: 301747
There is a lot of duplicate code for printing line info between
YAML and the raw output printer. This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.
llvm-svn: 301728
This broke the Clang build. (Clang-side patch missing?)
Original commit message:
> [IR] Make add/remove Attributes use AttrBuilder instead of
> AttributeList
>
> This change cleans up call sites and avoids creating temporary
> AttributeList objects.
>
> NFC
llvm-svn: 301712
Fixes the issue highlighted in
http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.
The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can
prevent LLVM from uniquing types that are in the same namespace. They
also don't carry any meaningful information.
rdar://problem/17484998
Differential Revision: https://reviews.llvm.org/D32648
llvm-svn: 301706
I used Null rather than Zero to match the getNullValue method name.
There are some other places outside APInt where isNullValue would be more readable than isMinValue even though they do the same thing. I'll update those in future patches.
llvm-svn: 301695
Also, add test for data relocations and fix addend to
be signed.
Subscribers: jfb, dschuff
Differential Revision: https://reviews.llvm.org/D32513
llvm-svn: 301690
The IntrNoMem, IntrReadMem, IntrWriteMem, and IntrArgMemOnly intrinsic
properties differ from their corresponding LLVM IR attributes by specifying
that the intrinsic, in addition to its memory properties, has no other side
effects.
The IntrHasSideEffects flag used in combination with one of the memory flags
listed above, makes it possible to define an intrinsic such that its
properties at the CodeGen layer match its properties at the IR layer.
Patch by Tom Stellard
llvm-svn: 301685
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.
Avoids confusing code like:
IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
Alignment = CS->getParamAlignment(ArgIdx + 1);
Add getRetAlignment to handle the one case in Value.cpp that wants the
return value alignment.
This is a potentially breaking change for out-of-tree backends that do
their own call lowering.
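A hedged sketch of how such a call site looks after the change, based only on the description above (exact signatures may differ):
```
// Before: paramHasAttr took a 0-based argument index while getParamAlignment
// took a 1-based attribute index, hence the easy-to-miss "+ 1".
//   IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
//   Alignment    = CS->getParamAlignment(ArgIdx + 1);
//
// After: both calls take the 0-based argument index, and the return value
// alignment has its own accessor.
//   IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
//   Alignment    = CS->getParamAlignment(ArgIdx);
//   RetAlign     = CS->getRetAlignment();
```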
llvm-svn: 301682
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.
This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.
Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus,
GlobalISel no longer misses the hasCopyImplyingStackAdjustment()
handling with this change.
Differential Revision: https://reviews.llvm.org/D32621
llvm-svn: 301679
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.
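A small sketch of the kind of loop this enables, assuming per-argument attribute queries (illustrative; not taken from the actual diff):
```
#include "llvm/IR/Function.h"

// Iterate the arguments directly instead of threading a separate attribute
// index ('Idx') alongside the iterator.
static unsigned countByValArgs(const llvm::Function &F) {
  unsigned Count = 0;
  for (const llvm::Argument &Arg : F.args())
    if (Arg.hasByValAttr())
      ++Count;
  return Count;
}
```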
NFC
llvm-svn: 301666
This became no longer necessary after D19462 landed, and will be incompatible
with an upcoming change to the summary data structures that changes how we
represent references.
llvm-svn: 301660
Summary:
The motivating example is shown below; it has 13 cases but only 2 distinct targets:
```
lor.lhs.false2: ; preds = %if.then
switch i32 %Status, label %if.then27 [
i32 -7012, label %if.end35
i32 -10008, label %if.end35
i32 -10016, label %if.end35
i32 15000, label %if.end35
i32 14013, label %if.end35
i32 10114, label %if.end35
i32 10107, label %if.end35
i32 10105, label %if.end35
i32 10013, label %if.end35
i32 10011, label %if.end35
i32 7008, label %if.end35
i32 7007, label %if.end35
i32 5002, label %if.end35
]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)
```
.LBB853_9: // %lor.lhs.false2
mov w8, #10012
cmp w19, w8
b.gt .LBB853_14
// BB#10: // %lor.lhs.false2
mov w8, #5001
cmp w19, w8
b.gt .LBB853_18
// BB#11: // %lor.lhs.false2
mov w8, #-10016
cmp w19, w8
b.eq .LBB853_23
// BB#12: // %lor.lhs.false2
mov w8, #-10008
cmp w19, w8
b.eq .LBB853_23
// BB#13: // %lor.lhs.false2
mov w8, #-7012
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_14: // %lor.lhs.false2
mov w8, #14012
cmp w19, w8
b.gt .LBB853_21
// BB#15: // %lor.lhs.false2
mov w8, #-10105
add w8, w19, w8
cmp w8, #9 // =9
b.hi .LBB853_17
// BB#16: // %lor.lhs.false2
orr w9, wzr, #0x1
lsl w8, w9, w8
mov w9, #517
and w8, w8, w9
cbnz w8, .LBB853_23
.LBB853_17: // %lor.lhs.false2
mov w8, #10013
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_18: // %lor.lhs.false2
mov w8, #-7007
add w8, w19, w8
cmp w8, #2 // =2
b.lo .LBB853_23
// BB#19: // %lor.lhs.false2
mov w8, #5002
cmp w19, w8
b.eq .LBB853_23
// BB#20: // %lor.lhs.false2
mov w8, #10011
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_21: // %lor.lhs.false2
mov w8, #14013
cmp w19, w8
b.eq .LBB853_23
// BB#22: // %lor.lhs.false2
mov w8, #15000
cmp w19, w8
b.ne .LBB853_3
```
However, the inline cost model estimates the cost to be linear in the number
of distinct targets, so the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.
This change uses the general switch lowering approach for the inline heuristic. It
estimates the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear in the size of the balanced binary
tree. The model is off by default for now:
-inline-generic-switch-cost=false
This change was originally proposed by Haicheng in D29870.
Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier
Reviewed By: hans
Subscribers: joerg, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D31085
llvm-svn: 301649
This is a follow up to the fix in r298360 to improve the handling of debug
values when redundant LEAs are removed. The fix in r298360 effectively
discarded the debug values. This patch now attempts to preserve the debug
values by using the DWARF DW_OP_stack_value operation via prependDIExpr.
Moved functions appendOffset and prependDIExpr from Local.cpp to
DebugInfoMetadata.cpp and made them available as static member functions of
DIExpression.
Differential Revision: https://reviews.llvm.org/D31604
llvm-svn: 301630
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
This is largely a mechanical transformation from KnownZero to Known.Zero.
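Roughly, the mechanical shape of the change at a call site (a sketch based on the description above, not the literal diff):
```
// Before: two loose APInts threaded through every query.
//   APInt KnownZero, KnownOne;
//   DAG.computeKnownBits(Op, KnownZero, KnownOne, Depth);
//   if (KnownZero.isSignBitSet()) ...
//
// After: a single KnownBits carries both halves.
//   KnownBits Known;
//   DAG.computeKnownBits(Op, Known, Depth);
//   if (Known.Zero.isSignBitSet()) ...
```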
Differential Revision: https://reviews.llvm.org/D32569
llvm-svn: 301620
COFF Import libraries which use the obsolete CONSTANT export are
supposed to get two symbols, one with the `_imp_` prefix and one
without. Ensure that we expose both for iteration. This is necessary
to fix the librarian with COFF CONSTANT exports.
llvm-svn: 301614
When dumping raw data from a stream, you might know the offset
of a certain record you're interested in, as well as how long
that record is. Previously, you had to dump the entire stream
and wade through the bytes to find the interesting record.
This patch allows you to specify an offset and length on the
command line, and it will only dump the requested range.
llvm-svn: 301607
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.
Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.
This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.
At this moment it does not work on Gold (as in the globals are never
stripped).
This is a second re-land of r298158. This time, this feature is
limited to -fdata-sections builds.
llvm-svn: 301587
This patch dumps the raw bytes of the .rsrc sections that
are present in COFF object and executable files. Subsequent
patches will parse this information and dump in a more human
readable format.
Differential Revision: https://reviews.llvm.org/D32463
Patch By: Eric Beckmann
llvm-svn: 301578
Currently, this pass focuses only on *trivial* loop unswitching. On that
reduced problem it remains significantly better than the current loop
unswitch:
- Old pass is worse than cubic complexity. New pass is (I think) linear.
- New pass is much simpler in its design by focusing on full unswitching. (See
below for details on this).
- New pass doesn't carry state for thresholds between pass iterations.
- New pass doesn't carry state for correctness (both miscompile and
infloop) between pass iterations.
- New pass produces substantially better code after unswitching.
- New pass can handle more trivial unswitch cases.
- New pass doesn't recompute the dominator tree for the entire function
and instead incrementally updates it.
I've ported all of the trivial unswitching test cases from the old pass
to the new one to make sure that major functionality isn't lost in the
process. For several of the test cases I've worked to improve the
precision and rigor of the CHECKs, but for many I've just updated them
to handle the new IR produced.
My initial motivation was the fact that the old pass carried state in
very unreliable ways between pass iterations, and these mechanisms were
incompatible with the new pass manager. However, I discovered many more
improvements to make along the way.
This pass makes two very significant assumptions that enable most of these
improvements:
1) Focus on *full* unswitching -- that is, completely removing whatever
control flow construct is being unswitched from the loop. In the case
of trivial unswitching, this means removing the trivial (exiting)
edge. In non-trivial unswitching, this means removing the branch or
switch itself. This is in opposition to *partial* unswitching where
some part of the unswitched control flow remains in the loop. Partial
unswitching only really applies to switches and to folded branches.
These are very similar to full unrolling and partial unrolling. The
full form is an effective canonicalization, the partial form needs
a complex cost model, cannot be iterated, isn't canonicalizing, and
should be a separate pass that runs very late (much like unrolling).
2) Leverage LLVM's Loop machinery to the fullest. The original unswitch
dates from a time when a great deal of LLVM's loop infrastructure was
missing, ineffective, and/or unreliable. As a consequence, a lot of
complexity was added which we no longer need.
With these two overarching principles, I think we can build a fast and
effective unswitcher that fits in well in the new PM and in the
canonicalization pipeline. Some of the remaining functionality around
partial unswitching may not be relevant today (not many test cases or
benchmarks I can find) but if they are I'd like to add support for them
as a separate layer that runs very late in the pipeline.
Purely to make reviewing and introducing this code more manageable, I've
split this into first a trivial-unswitch-only pass and in the next patch
I'll add support for full non-trivial unswitching against a *fixed*
threshold, exactly like full unrolling. I even plan to re-use the
unrolling thresholds, as these are incredibly similar cost tradeoffs:
we're cloning a loop body in order to end up with simplified control
flow. We should only do that when the total growth is reasonably small.
One of the biggest changes with this pass compared to the previous one
is that previously, each individual trivial exiting edge from a switch
was unswitched separately as a branch. Now, we unswitch the entire
switch at once, with cases going to the various destinations. This lets
us unswitch multiple exiting edges in a single operation and also avoids
numerous extremely bad behaviors, where we would introduce 1000s of
branches to test for thousands of possible values, all of which would
take the exact same exit path bypassing the loop. Now we will use
a switch with 1000s of cases that can be efficiently lowered into
a jumptable. This avoids relying on somehow forming a switch out of the
branches or getting horrible code if that fails for any reason.
Another significant change is that this pass actively updates the CFG
based on unswitching. For trivial unswitching, this is actually very
easy because of the definition of loop simplified form. Doing this makes
the code coming out of loop unswitch dramatically more friendly. We
still should run loop-simplifycfg (at the least) after this to clean up,
but it will have to do a lot less work.
Finally, this pass makes far fewer attempts to simplify instructions
based on the unswitch. Something like loop-instsimplify, instcombine, or
GVN can be used to do increasingly powerful simplifications based on the
now dominating predicate. The old simplifications are things that
something like loop-instsimplify should get today or a very, very basic
loop-instcombine could get. Keeping that logic separate is a big
simplifying technique.
Most of the code in this pass that isn't in the old one has to do with
achieving specific goals:
- Updating the dominator tree as we go
- Unswitching all cases in a switch in a single step.
I think it is still shorter than just the trivial unswitching code in
the old pass despite having this functionality.
Differential Revision: https://reviews.llvm.org/D32409
llvm-svn: 301576
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This led to disagreement between the
JIT and native code as to the address and implementation of symbols, particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301562
Previously parsing of these were all grouped together into a
single master class that could parse any type of debug info
fragment.
With writing forthcoming, the complexity of each individual
fragment is enough to warrant them having their own classes so
that reading and writing of each fragment type can be grouped
together, but isolated from the code for reading and writing
other fragment types.
In doing so, I found a place where parsing code was duplicated
for the FileChecksums fragment, across llvm-readobj and the
CodeView library, and one of the implementations had a bug.
Now that the codepaths are merged, the bug is resolved.
Differential Revision: https://reviews.llvm.org/D32547
llvm-svn: 301557
We have a lot of very similarly named classes related to
dealing with module debug info. This patch is NFC; it just
renames some classes to be more descriptive (albeit slightly
more to type). The mapping from old to new class names is as
follows:
Old | New
ModInfo | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream | ModuleDebugStream
With the corresponding Builder classes renamed accordingly.
Differential Revision: https://reviews.llvm.org/D32506
llvm-svn: 301555
This changes code that touches ValueHandleBase::V to go through
getValPtr and (newly added) setValPtr. This functionality will be
used later, but also seemed like a generally good cleanup.
I also renamed the field to Val, but that's just to make it obvious
that I fixed all the uses.
llvm-svn: 301518
Previously, an expression such as Saver.save(std::string("foo") + "bar")
didn't compile because there was an ambiguity as to whether the argument
is a const Twine& or a StringRef.
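One way to picture the ambiguity and a possible fix, using stand-in types rather than the actual StringSaver change:
```
#include <string>

// Stand-ins for StringRef and Twine: both are implicitly constructible from
// std::string, so an overload set taking either one is ambiguous for a
// std::string argument.
struct RefLike   { RefLike(const std::string &) {} };
struct TwineLike { TwineLike(const std::string &) {} };

struct SaverSketch {
  void save(RefLike) {}
  void save(const TwineLike &) {}
  // Adding a dedicated std::string overload is one way to remove the
  // ambiguity for calls like save(std::string("foo") + "bar").
  void save(const std::string &S) { save(RefLike(S)); }
};
```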
llvm-svn: 301512
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
This reapplies r301498 with an attempted workaround for g++.
Differential Revision: https://reviews.llvm.org/D32560
llvm-svn: 301501
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
llvm-svn: 301498
For Swift we would like to be able to encode the error types that a
function may throw, so the debugger can display them alongside the
function's return value when finish-ing a function.
DWARF defines DW_TAG_thrown_type (intended to be used for C++ throw()
declarations) that is a perfect fit for this purpose. This patch wires
up support for DW_TAG_thrown_type in LLVM by adding a list of thrown
types to DISubprogram.
To offset the cost of the extra pointer, there is a follow-up patch
that turns DISubprogram into a variable-length node.
rdar://problem/29481673
Differential Revision: https://reviews.llvm.org/D32559
llvm-svn: 301489
The previous algorithm processed one character at a time, which is very
painful on a modern CPU. Replace it with xxHash64, which both already
exists in the codebase and is fairly fast.
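A usage sketch of the replacement, assuming llvm::xxHash64 from llvm/Support/xxhash.h (the actual call site lives in LLD and is not shown here):
```
#include <cstdint>
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/xxhash.h"

// Hash the whole name in one call rather than folding it in a
// character-at-a-time loop.
static uint64_t hashName(llvm::StringRef Name) {
  return llvm::xxHash64(Name);
}
```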
Patch from Scott Smith!
Differential Revision: https://reviews.llvm.org/D32509
llvm-svn: 301487
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. One place default constructed the APInts, so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
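A minimal sketch of the wrapper as described above (the in-tree class has additional helpers and may differ in detail):
```
#include "llvm/ADT/APInt.h"

struct KnownBitsSketch {
  llvm::APInt Zero; // bits known to be 0
  llvm::APInt One;  // bits known to be 1

  KnownBitsSketch() = default; // for the default-constructed uses mentioned above
  explicit KnownBitsSketch(unsigned BitWidth)
      : Zero(BitWidth, 0), One(BitWidth, 0) {} // both halves start all-unknown

  unsigned getBitWidth() const { return Zero.getBitWidth(); }
  // A bit claimed both 0 and 1 means the analysis went wrong somewhere.
  bool hasConflict() const { return (Zero & One) != 0; }
  // Everything is known when no bit is left unset in Zero|One.
  bool isAllKnown() const { return (~(Zero | One)) == 0; }
};
```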
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
Commits were:
"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"
The changes assumed pointers are 8 byte aligned on all architectures.
llvm-svn: 301429
Summary:
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, amongst
other things), we want the slot in UnknownInsts to point to the
original Instruction we wanted to track, not the value it got replaced
by.
Fixes PR32587.
Reviewers: davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32268
llvm-svn: 301426
Summary:
WeakVH nulls itself out if the value it was tracking gets deleted, but
it does not track RAUW.
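A usage sketch of the intended split in behaviour, assuming the WeakTrackingVH name from the rename below (illustrative only):
```
#include "llvm/IR/Instruction.h"
#include "llvm/IR/ValueHandle.h"
using namespace llvm;

void demo(Instruction *I, Value *Replacement) {
  WeakVH NullsOnDelete(I);       // nulls itself out only when I is deleted
  WeakTrackingVH FollowsRAUW(I); // additionally follows replaceAllUsesWith

  I->replaceAllUsesWith(Replacement);
  // NullsOnDelete still refers to I; FollowsRAUW now refers to Replacement.
}
```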
Reviewers: dblaikie, davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32267
llvm-svn: 301425
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
Reviewers: dblaikie, davide
Reviewed By: davide
Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D32266
llvm-svn: 301424
Summary:
Expose the internal query structure, start using it.
Note: This is the most minimal change I could create. I have
trivial followups, like fixing the one use of const FastMathFlags &,
the renaming of CtxI to be consistent, etc.
This should be NFC.
Reviewers: majnemer, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32448
llvm-svn: 301379
Summary:
Addends are used as offsets to addresses of globals
and can be both positive and negative. This change
makes libObject's printing consistent with the spec and the MC
layer.
Subscribers: jfb, dschuff
Differential Revision: https://reviews.llvm.org/D32507
llvm-svn: 301369
We were already parsing and dumping this to the human readable
format, but not to the YAML format. This does so, in preparation
for reading it in and reconstructing the line information from
YAML.
llvm-svn: 301357
We already have a function toHex that will convert a string like
"\xFF\xFF" to the string "FFFF", but we do not have one that goes
the other way - i.e. to convert a textual string representing a
sequence of hexadecimal characters into the corresponding actual
bytes. This patch adds such a function.
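A self-contained sketch of the direction added here (names and error handling are illustrative, not the in-tree interface):
```
#include <cassert>
#include <string>

static unsigned hexDigitValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  assert(false && "not a hexadecimal digit");
  return 0;
}

// "FFFF" -> "\xFF\xFF": the inverse of the existing toHex.
static std::string fromHexSketch(const std::string &Input) {
  assert(Input.size() % 2 == 0 && "expected an even number of hex digits");
  std::string Bytes;
  Bytes.reserve(Input.size() / 2);
  for (size_t I = 0; I != Input.size(); I += 2)
    Bytes.push_back(static_cast<char>(hexDigitValue(Input[I]) * 16 +
                                      hexDigitValue(Input[I + 1])));
  return Bytes;
}
```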
llvm-svn: 301356
Remove the temporary, poorly named getSlotSet method which did the same
thing. Also remove getSlotNode, which is a hold-over from when we were
dealing with AttributeSetNode* instead of AttributeSet.
llvm-svn: 301267
Summary:
That API creates a temporary AttributeList to carry an index and a
single AttributeSet. We need to carry the index in addition to the set,
because that is how attribute groups are currently encoded.
NFC
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32262
llvm-svn: 301245
Use constant references rather than `const auto`, which invokes the
copy constructor. These particular cases cause issues for the Swift
compiler.
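For illustration, the difference in a range-for (a generic example, not the actual code in question):
```
#include <map>
#include <string>

void iterate(const std::map<std::string, std::string> &M) {
  for (const auto Entry : M) {   // deduces a value: copy-constructs each pair
    (void)Entry;
  }
  for (const auto &Entry : M) {  // binds a const reference: no copy
    (void)Entry;
  }
}
```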
llvm-svn: 301237
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This led to disagreement between the
JIT and native code as to the address and implementation of symbols, particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301236