This fixes a problem introduced by r344334. A write from a non-zero move
eliminated at register renaming stage was not correctly handled by the PRF. This
would have led to an assertion failure if the processor model declares a PRF
that enables non-zero move elimination.
llvm-svn: 344392
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.
A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).
For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:
```
def : IsOptimizableRegisterMove<[
InstructionEquivalenceClass<[
// GPR variants.
MOV32rr, MOV64rr,
// MMX variants.
MMX_MOVQ64rr,
// SSE variants.
MOVAPSrr, MOVUPSrr,
MOVAPDrr, MOVUPDrr,
MOVDQArr, MOVDQUrr,
// AVX variants.
VMOVAPSrr, VMOVUPSrr,
VMOVAPDrr, VMOVUPDrr,
VMOVDQArr, VMOVDQUrr
], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```
Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:
```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```
By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).
Tablegen class RegisterFile has been extended with the following information:
- The set of register classes that allow move elimination.
- Maxium number of moves that can be eliminated every cycle.
- Whether move elimination is restricted to moves from registers that are
known to be zero.
This patch is structured in three part:
A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.
A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.
A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.
llvm-mca tests for btver2 show differences before/after this patch.
Differential Revision: https://reviews.llvm.org/D53134
llvm-svn: 344334
Summary:
This change adds support for the GNU --target flag, which sets both --input-target and --output-target.
GNU objcopy doesn't do any checking for whether both --target and --{input,output}-target are used, and so it allows both, e.g. "--target A --output-target B" is equivalent to "--input-target A --output-target B" since the later command line flag would override earlier ones. This may be error prone, so I chose to implement it as an error if both are used. I'm not sure if anyone is actually using both.
Reviewers: jakehehrlich, jhenderson, alexshap
Reviewed By: jakehehrlich, alexshap
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53029
llvm-svn: 344321
In this diff we move out CopyConfig from llvm-oobjcopy.cpp into a separate header CopyConfig.h
to enable us (in the future) reuse this class in the other implementations of objcopy (for coff, mach-o).
Additionally this enables us to unload the complexity from llvm-objcopy.cpp a little bit.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D53006
llvm-svn: 344307
error() in llvm-nm intentionally does not return so that the callee can move on to future files/slices. When printing the archive map, this is not currently handled (the caller assumes that error() returns), so processing continues despite there being an error.
Also, change one return to a break, so that symbols can be printed even if the archive map is corrupt.
llvm-svn: 344268
libtool requires this text to be present, in order to conclude that
the tool supports response files. Also add an explicit test of using
response files with llvm-nm.
Differential Revision: https://reviews.llvm.org/D53064
llvm-svn: 344222
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
This is the LLVM side of Clang r344199.
Reviewers: davidxl, tejohnson, dlj, erik.pilkington
Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51249
llvm-svn: 344200
Add a library that parses optimization remarks (currently YAML, so based
on the YAMLParser).
The goal is to be able to provide tools a remark parser that is not
completely dependent on YAML, in case we decide to change the format
later.
It exposes a C API which takes a handler that is called with the remark
structure.
It adds a libLLVMOptRemark.a static library, and it's used in-tree by
the llvm-opt-report tool (from which the parser has been mostly moved
out).
Differential Revision: https://reviews.llvm.org/D52776
Fixed the tests by removing the usage of C++11 strings, which seems not
to be supported by gcc 4.8.4 if they're used as a macro argument.
llvm-svn: 344171
Add a library that parses optimization remarks (currently YAML, so based
on the YAMLParser).
The goal is to be able to provide tools a remark parser that is not
completely dependent on YAML, in case we decide to change the format
later.
It exposes a C API which takes a handler that is called with the remark
structure.
It adds a libLLVMOptRemark.a static library, and it's used in-tree by
the llvm-opt-report tool (from which the parser has been mostly moved
out).
Differential Revision: https://reviews.llvm.org/D52776
llvm-svn: 344162
Summary: Simplify code by having LLVMState hold the RegisterAliasingTrackerCache.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D53078
llvm-svn: 344143
When fillMachineFunction generates a return on targets without a return opcode
(such as AArch64) it should pass an empty set of registers as the return
registers, not 0 which means register number zero.
Differential Revision: https://reviews.llvm.org/D53074
llvm-svn: 344139
sancov subtracts one from the address to get the previous instruction,
which makes sense on x86_64, but not on other platforms.
This change ensures that the offset is correct for different platforms.
The logic for computing the offset is copied from sanitizer_common.
Differential Revision: https://reviews.llvm.org/D53039
llvm-svn: 344103
Summary:
Before, "[options] <inputs>" is unconditionally appended to the `Name` parameter. It is more flexible to change its semantic to `Usage` and let user customize the usage line.
% llvm-objcopy
...
USAGE: llvm-objcopy <input> [ <output> ] [options] <inputs>
With this patch:
% llvm-objcopy
...
USAGE: llvm-objcopy input [output]
Reviewers: rupprecht, alexshap, jhenderson
Reviewed By: rupprecht
Subscribers: jakehehrlich, mehdi_amini, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51009
llvm-svn: 344097
We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but
it needs to be an ArrayRef<support::ulittle32_t>.
We also change ArrayRef<> to FixedStreamArray<>. Technically
an ArrayRef<> will work, but it can cause a copy in the underlying
implementation if the memory is not contiguous, and there's no
reason not to use a FixedStreamArray<>.
Thanks to nemanjai@ and thakis@ for helping me track this down
and confirm the fix.
llvm-svn: 344063
Summary:
This moves checking logic into the accessors and makes the structure smaller.
It will also help when/if Operand are generated from the TD files.
Subscribers: tschuett, courbet, llvm-commits
Differential Revision: https://reviews.llvm.org/D52982
llvm-svn: 344028
The Globals table is a hash table keyed on symbol name, so
it's possible to lookup symbols by name in O(1) time. Add
a function to the globals stream to do this, and add an option
to llvm-pdbutil to exercise this, then use it to write some
tests to verify correctness.
llvm-svn: 343951
Summary:
The POSIX spec says:
```
If the −t option is used with the −v option, the standard output format shall be:
"%s %u/%u %u %s %d %d:%d %d %s\n", <member mode>, <user ID>,
<group ID>, <number of bytes in member>,
<abbreviated month>, <day-of-month>, <hour>,
<minute>, <year>, <file>
where:
...
<abbreviated month>
Equivalent to the format of the %b conversion specification format in date.
<day-of-month>
Equivalent to the format of the %e conversion specification format in date.
<hour> Equivalent to the format of the %H conversion specification format in date.
<minute> Equivalent to the format of the %M conversion specification format in date.
<year> Equivalent to the format of the %Y conversion specification format in date.
```
This actually used to be the format printed by llvm-ar. It was apparently accidentally changed (see r207385 followed by comments in r207387). This makes it conform to GNU ar for easier replacement.
Reviewers: MaskRay
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52940
llvm-svn: 343901
This matches the output of binutils' nm and ensures that any scripts
or tools that use nm and expect empty output in case there no symbols
don't break.
Differential Revision: https://reviews.llvm.org/D52943
llvm-svn: 343887
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.
Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf
This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.
Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.
rdar://42001377
Differential Revision: https://reviews.llvm.org/D49887
llvm-svn: 343883
Flag 'AllowZeroMoveEliminationOnly' should have been a property of the PRF, and
not set at register granularity.
This change also restricts move elimination to writes that update a full
physical register. We assume that there is a strong correlation between
logical registers that allow move elimination, and how those same registers are
allocated to physical registers by the register renamer.
This is still a no functional change, because this experimental code path is
disabled for now. This is done in preparation for another patch that will add
the ability to describe how move elimination works in scheduling models.
llvm-svn: 343787
This should help with catching inconsistent definitions of instructions with
zero opcodes, which also declare to consume scheduler/pipeline resources.
llvm-svn: 343766
Summary:
GNU nm (and other nm implementations, such as "go tool nm") prints an explicit "no symbols" message when an object file has no symbols. Currently llvm-nm just doesn't print anything. Adding an explicit "no symbols" message will allow llvm-nm to be used in place of nm: some scripts and build processes use `nm <file> | grep "no symbols"` as a test to see if a file has no symbols. It will also be more familiar to anyone used to nm.
That said, the format implemented here is slightly different, in that it doesn't print the tool name in the message (which IMHO is not useful to include).
Demo:
```
$ for nm in nm bin/llvm-nm ; do echo "nm implementation: $nm"; $nm /tmp/foo{1,2}.o; echo; done
nm implementation: nm
/tmp/foo1.o:
nm: /tmp/foo1.o: no symbols
/tmp/foo2.o:
0000000000000000 T foo2
nm implementation: bin/llvm-nm
/tmp/foo1.o:
no symbols
/tmp/foo2.o:
0000000000000000 T foo2
```
Reviewers: MaskRay
Reviewed By: MaskRay
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52810
llvm-svn: 343742
MCContext does not destroy MCSymbols on shutdown. So, rather than putting
SmallVectors (which may heap-allocate) inside MCSymbolWasm, use unowned pointer
to a WasmSignature instead. The signatures are now owned by the AsmPrinter.
Also uses WasmSignature instead of param and result vectors in TargetStreamer,
and leaves some TODOs for further simplification.
Differential Revision: https://reviews.llvm.org/D52580
llvm-svn: 343733
This patch teaches class RegisterFile how to analyze register writes from
instructions that are move elimination candidates.
In particular, it teaches it how to check if a move can be effectively eliminated
by the underlying PRF, and (if necessary) how to perform move elimination.
The long term goal is to allow processor models to describe instructions that
are valid move elimination candidates.
The idea is to let register file definitions in tablegen declare if/when moves
can be eliminated.
This patch is a non functional change.
The logic that performs move elimination is currently disabled. A future patch
will add support for move elimination in the processor models, and enable this
new code path.
llvm-svn: 343691
deserializeMCOperand - ensure that we at least match the first character of the sscanf pattern before calling
This reduces llvm-exegesis uops analysis of the instructions supported from btver2 from 5m13s to 2m1s on debug builds.
llvm-svn: 343690
Two cases in a ThinLTO test were passing for the wrong reasons, since
rL340374. The tests were supposed to be testing that files were being
pruned due to the cache size, but they were in fact being pruned because
they were older than the default expiration period of 1 week.
This change fixes the tests by explicitly setting the expiration time to
the maximum value. This required the option to be exposed in llvm-lto.
By assigning all files in the cache a similar time, it is possible to see
that the newest files are still being kept, and that we aren't passing
for the wrong reason again. In the event that the entry expiration were
to expire for them, then the test would start failing, because these
files would be removed too.
Reviewed by: rnk, inglorion
Differential Revision: https://reviews.llvm.org/D51992
llvm-svn: 343687
Summary:
This is redundant, as FetchStage::getNextInstruction already checks this
and returns llvm::ErrorSuccess() as appropriate.
NFC.
Reviewers: andreadb
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D52642
llvm-svn: 343555
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types. So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.
llvm-svn: 343507
Summary:
Reporting this as an error required stat()ing every file, as well as seeming
semantically questionable.
Reviewers: vsk, bkramer
Subscribers: mgrang, kristina, llvm-commits, liaoyuke
Differential Revision: https://reviews.llvm.org/D52648
llvm-svn: 343460
Summary: This is prelimineary to moving random functions to SnippetGenerator.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52718
llvm-svn: 343456
Summary: I had added support for compressing dwarf sections in a prior commit,
this one adds support for decompressing. Usage is:
llvm-objcopy --decompress-debug-sections input.o output.o
Reviewers: jakehehrlich, jhenderson, alexshap
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D51841
llvm-svn: 343451
for the target machine.
This simplifies usage during setup of concurrent JIT stacks where the client
needs a DataLayout, but not a TargetMachine (TargetMachines are created on
the fly by the compile threads later).
llvm-svn: 343429
There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)
llvm-svn: 343426
(1) Adds comments for the API.
(2) Removes the setArch method: This is redundant: the setArchStr method on the
triple should be used instead.
(3) Turns EmulatedTLS on by default. This matches EngineBuilder's behavior.
llvm-svn: 343423
CompileOnDemandLayer2 now supports user-supplied partition functions (the
original CompileOnDemandLayer already supported these).
Partition functions are called with the list of requested global values
(i.e. global values that currently have queries waiting on them) and have an
opportunity to select extra global values to materialize at the same time.
Also adds testing infrastructure for the new feature to lli.
llvm-svn: 343396
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info. This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.
llvm-svn: 343393
(1) A const accessor for the LLVMContext held by a ThreadSafeContext.
(2) A const accessor for the ThreadSafeModules held by an IRMaterializationUnit.
(3) A const MaterializationResponsibility reference to IRTransformLayer2's
transform function. This makes IRTransformLayer2 useful for JIT debugging
(since it can inspect JIT state through the responsibility argument) as well
as program transformations.
llvm-svn: 343365
Summary: Adds missing debug information accessors to GlobalObject. This puts the finishing touches on cloning debug info in the echo tests.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins
Differential Revision: https://reviews.llvm.org/D51522
llvm-svn: 343330
- Add fix so that all code paths that create DWARFContext
with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method
llvm-svn: 343317
This change is in preparation for a future work on improving support for
optimizable register moves. We already know if a write is from a zero-idiom, so
we can propagate that bit of information to the PRF. We use an APInt mask to
identify registers that are set to zero.
llvm-svn: 343307
one SymbolLinkagePromoter utility.
SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.
llvm-svn: 343257
Summary: The key is now the resource name, not the resource id.
Reviewers: gchatelet
Subscribers: tschuett, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D52607
llvm-svn: 343208
The export file of libLTO should has all the interfaces declared in
llvm-c/lto.h and llvm-c/Disassembler.h but LLVMCreateDisasmCPUFeatures
is missing from the list. Export the C API to be consistant.
llvm-svn: 343124
Modifies lit to add a 'thread_support' feature that can be used in lit test
REQUIRES clauses. The thread_support flag is set if -DLLVM_ENABLE_THREADS=ON
and unset if -DLLVM_ENABLE_THREADS=OFF. The lit flag is used to disable the
multiple-compile-threads-basic.ll testcase when threading is disabled.
llvm-svn: 343122
Summary:
THis is a backwards-compatible change (existing files will work as
expected).
See PR39082.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52546
llvm-svn: 343108
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.
> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
> state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
> i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
> (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
> https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343103
This doesn't work well in builds configured with LLVM_ENABLE_THREADS=OFF,
causing the following assert when running
ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll:
lib/ExecutionEngine/Orc/Core.cpp:1748: Expected<llvm::JITEvaluatedSymbol>
llvm::orc::lookup(const llvm::orc::JITDylibList &, llvm::orc::SymbolStringPtr):
Assertion `ResultMap->size() == 1 && "Unexpected number of results"' failed.
> LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
> arguments. If this is non-zero then a thread-pool will be created with the
> given number of threads, and compile tasks will be dispatched to the thread
> pool.
>
> To enable testing of this feature, two new flags are added to lli:
>
> (1) -compile-threads=N (N = 0 by default) controls the number of compile threads
> to use.
>
> (2) -thread-entry can be used to execute code on additional threads. For each
> -thread-entry argument supplied (multiple are allowed) a new thread will be
> created and the given symbol called. These additional thread entry points are
> called after static constructors are run, but before main.
llvm-svn: 343099
Summary: This is is preparation of exploring value ranges.
Reviewers: courbet
Reviewed By: courbet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52542
llvm-svn: 343098
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
state the is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
(0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
https://patchwork.ozlabs.org/patch/800271/
Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343089
LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
arguments. If this is non-zero then a thread-pool will be created with the
given number of threads, and compile tasks will be dispatched to the thread
pool.
To enable testing of this feature, two new flags are added to lli:
(1) -compile-threads=N (N = 0 by default) controls the number of compile threads
to use.
(2) -thread-entry can be used to execute code on additional threads. For each
-thread-entry argument supplied (multiple are allowed) a new thread will be
created and the given symbol called. These additional thread entry points are
called after static constructors are run, but before main.
llvm-svn: 343058
compilation of IR in the JIT.
ThreadSafeContext is a pair of an LLVMContext and a mutex that can be used to
lock that context when it needs to be accessed from multiple threads.
ThreadSafeModule is a pair of a unique_ptr<Module> and a
shared_ptr<ThreadSafeContext>. This allows the lifetime of a ThreadSafeContext
to be managed automatically in terms of the ThreadSafeModules that refer to it:
Once all modules using a ThreadSafeContext are destructed, and providing the
client has not held on to a copy of shared context pointer, the context will be
automatically destructed.
This scheme is necessary due to the following constraits: (1) We need multiple
contexts for multithreaded compilation (at least one per compile thread plus
one to store any IR not currently being compiled, though one context per module
is simpler). (2) We need to free contexts that are no longer being used so that
the JIT does not leak memory over time. (3) Module lifetimes are not
predictable (modules are compiled as needed depending on the flow of JIT'd
code) so there is no single point where contexts could be reclaimed.
JIT clients not using concurrency can safely use one ThreadSafeContext for all
ThreadSafeModules.
JIT clients who want to be able to compile concurrently should use a different
ThreadSafeContext for each module, or call setCloneToNewContextOnEmit on their
top-level IRLayer. The former reduces compile latency (since no clone step is
needed) at the cost of additional memory overhead for uncompiled modules (as
every uncompiled module will duplicate the LLVM types, constants and metadata
that have been shared).
llvm-svn: 343055
Summary: This is a NFC in preparation of exporting the initial registers as part of the YAML dump
Reviewers: courbet
Reviewed By: courbet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52427
llvm-svn: 342967
Summary:
This is a step towards fixing PR38048.
Note that right now the measurements are given per instruction. We'll
need to give measurements a per code snippet and update the analysis (PR38731).
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52041
llvm-svn: 342947
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.
- PrintIR routines implement printing callbacks.
- StandardInstrumentations class provides a central place to manage all
the "standard" in-tree pass instrumentations. Currently it registers
PrintIR callbacks.
Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923
llvm-svn: 342896
Summary:
The `set` statements was incorrectly reading the value of the local variable and
setting the value of the parent variable.
Reviewers: tycho, gchatelet, john.brawn
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52343
llvm-svn: 342865
This allows the native reader to find records of class/struct/
union type and dump them. This behavior is tested by using the
diadump subcommand against golden output produced by actual DIA
SDK on the same PDB file, and again using pretty -native to
confirm that we actually dump the classes. We don't find class
members or anything like that yet, for now it's just the class
itself.
llvm-svn: 342779
Instead of indexing local variables by DIE offset, use the variable
name + the path through the lexical block tree. This makes the lookup
key consistent across duplicate abstract origins in different CUs.
llvm-svn: 342776
Summary:
There isn't any actual dependency - there's one #include from CodeGen
but nothing from the header is actually used.
With this change we can use the MCA library from CodeGen without
circular dependencies (e.g. for scheduling).
Reviewers: andreadb
Reviewed By: andreadb
Authored By: orodley
Subscribers: mgorny, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D52288
llvm-svn: 342706
Summary:
Implement --version for objcopy and strip.
I think there are LLVM utilities that automatically handle this, but that doesn't seem to work with custom parsing since this binary handles both objcopy and strip, so it uses custom parsing.
This fixes PR38298
Reviewers: jhenderson, alexshap, jakehehrlich
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52328
llvm-svn: 342702
Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM
which is a forward reference and doesn't contain complete debug
information. In these cases, we'd like to be able to quickly locate the
full record. The TPI stream stores an array of pre-computed record hash
values, one for each type record. If we pre-process this on startup, we
can build a mapping from hash value -> {list of possible matching type
indices}. Since hashes of full records are only based on the name and or
unique name and not the full record contents, we can then use forward
ref record to compute the hash of what *would* be the full record by
just hashing the name, use this to get the list of possible matches, and
iterate those looking for a match on name or unique name.
llvm-pdbutil is updated to resolve forward references for the purposes
of testing (plus it's just useful).
Differential Revision: https://reviews.llvm.org/D52283
llvm-svn: 342656
Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.
Reviewers: courbet, javed.absar
Subscribers: llvm-commits, tschuett, mgorny
Differential Revision: https://reviews.llvm.org/D52297
llvm-svn: 342644
This patch adds the ability for processor models to describe dependency breaking
instructions.
Different processors may specify a different set of dependency-breaking
instructions.
That means, we cannot assume that all processors of the same target would use
the same rules to classify dependency breaking instructions.
The main goal of this patch is to provide the means to describe dependency
breaking instructions directly via tablegen, and have the following
TargetSubtargetInfo hooks redefined in overrides by tabegen'd
XXXGenSubtargetInfo classes (here, XXX is a Target name).
```
virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
return false;
}
virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
return isZeroIdiom(MI);
}
```
An instruction MI is a dependency-breaking instruction if a call to method
isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to
true. Similarly, an instruction MI is a special case of zero-idiom dependency
breaking instruction if a call to STI.isZeroIdiom(MI) returns true.
The extra APInt is used for those targets that may want to select which machine
operands have their dependency broken (see comments in code).
Note that by default, subtargets don't know about the existence of
dependency-breaking. In the absence of external information, those method calls
would always return false.
A new tablegen class named STIPredicate has been added by this patch to let
processor models classify instructions that have properties in common. The idea
is that, a MCInstrPredicate definition can be used to "generate" an instruction
equivalence class, with the idea that instructions of a same class all have a
property in common.
STIPredicate definitions are essentially a collection of instruction equivalence
classes.
Also, different processor models can specify a different variant of the same
STIPredicate with different rules (i.e. predicates) to classify instructions.
Tablegen backends (in this particular case, the SubtargetEmitter) will be able
to process STIPredicate definitions, and automatically generate functions in
XXXGenSubtargetInfo.
This patch introduces two special kind of STIPredicate classes named
IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a
definition for those in the BtVer2 scheduling model only.
This patch supersedes the one committed at r338372 (phabricator review: D49310).
The main advantages are:
- We can describe subtarget predicates via tablegen using STIPredicates.
- We can describe zero-idioms / dep-breaking instructions directly via
tablegen in the scheduling models.
In future, the STIPredicates framework can be used for solving other problems.
Examples of future developments are:
- Teach how to identify optimizable register-register moves
- Teach how to identify slow LEA instructions (each subtarget defining its own
concept of "slow" LEA).
- Teach how to identify instructions that have undocumented false dependencies
on the output registers on some processors only.
It is also (in my opinion) an elegant way to expose knowledge to both external
tools like llvm-mca, and codegen passes.
For example, machine schedulers in LLVM could reuse that information when
internally constructing the data dependency graph for a code region.
This new design feature is also an "opt-in" feature. Processor models don't have
to use the new STIPredicates. It has all been designed to be as unintrusive as
possible.
Differential Revision: https://reviews.llvm.org/D52174
llvm-svn: 342555
There were several issues with the previous implementation.
1) There were no tests.
2) We didn't support creating PDBSymbolTypePointer records for
builtin types since those aren't described by LF_POINTER
records.
3) We didn't support a wide enough variety of builtin types even
ignoring pointers.
This patch fixes all of these issues. In order to add tests,
it's helpful to be able to ignore the symbol index id hierarchy
because it makes the golden output from the DIA version not match
our output, so I've extended the dumper to disable dumping of id
fields.
llvm-svn: 342493
rL342465 is breaking the MSVC buildbots, but I need to revert this dependent revision as well.
Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.
Reviewers: courbet, javed.absar
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D51856
llvm-svn: 342489
This patch adds two new boolean fields:
- Field `ReadState::IndependentFromDef`.
- Field `WriteState::WritesZero`.
Field `IndependentFromDef` is set for ReadState objects associated with
dependency-breaking instructions. It is used by the simulator when updating data
dependencies between registers.
Field `WritesZero` is set by WriteState objects associated with dependency
breaking zero-idiom instructions. It helps the PRF identify which writes don't
consume any physical registers.
llvm-svn: 342483
Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.
Reviewers: courbet, javed.absar
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D51856
llvm-svn: 342466
top argument when superior to the instrumentated code list capacity can lead to a segfault.
Reviewers: dberris
Reviewed By: dberris
Differential Revision: https://reviews.llvm.org/D52224
llvm-svn: 342461
Previously we would dump the names of enum types, but not their
enumerator values. This adds support for enumerator values. In
doing so, we have to introduce a general purpose mechanism for
caching symbol indices of field list members. Unlike global
types, FieldList members do not have a TypeIndex. So instead,
we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}.
llvm-svn: 342415
Summary: This will be useful to generate many configurations and test instruction regimes (NaN, Inf, subnormal, normal).
Reviewers: courbet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D51858
llvm-svn: 342369
The original was reverted due to an apparent build-bot test failure,
but it looks like this is just a flaky test.
Also added a C-interface function for large values, and updated
llvm-lto's --thinlto-cache-max-size-bytes switch to take a type larger
than int.
The maximum cache size in terms of bytes is a 64-bit number. However,
the methods to set it only took unsigned previously, which meant that
the maximum cache size could not be specified above 4GB. That's quite
small compared to the output of some projects, so it makes sense to
provide the ability to set larger values in that field.
We also needed a C-interface function that provides a greater range
than the existing thinlto_codegen_set_cache_size_bytes, which also only
takes an unsigned, so this change also adds
hinlto_codegen_set_cache_size_megabytes.
Reviewed by: mehdi_amini, tejohnson, steven_wu
Differential Revision: https://reviews.llvm.org/D52023
llvm-svn: 342366
This diff adds -S as an alias for --strip-all-gnu
(for compatibility with binutils' objcopy).
Patch by Dmitry Golovin!
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D52163
llvm-svn: 342364
For people who use llvm-readelf as a replacement of GNU readelf, they would like to see -d -r ... listed in llvm-readelf -help. It also helps understanding the confusing -s (which is unfortunately different in semantics).
Reviewers: phosek, ruiu, echristo
Reviewed By: ruiu, echristo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52129
llvm-svn: 342339
Naively computing the hash after the PDB data has been generated is in practice
as fast as other approaches I tried. I also tried online-computing the hash as
parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also
where all the measuring data is) and computing the hash in parallel
(https://reviews.llvm.org/D51957). This approach here is simplest, without
being slower.
Differential Revision: https://reviews.llvm.org/D51956
llvm-svn: 342333
Currently if we got something like `const Foo` we'd ignore it and
just rely on printing the unmodified `Foo` later on. However,
for testing the native reading code we really would like to be able
to see these so that we can verify that the native reader can
actually handle them. Instead of printing out the full type though,
just print out the header.
llvm-svn: 342295
Also added a C-interface function for large values, and updated
llvm-lto's --thinlto-cache-max-size-bytes switch to take a type larger
than int.
The maximum cache size in terms of bytes is a 64-bit number. However,
the methods to set it only took unsigned previously, which meant that
the maximum cache size could not be specified above 4GB. That's quite
small compared to the output of some projects, so it makes sense to
provide the ability to set larger values in that field.
We also needed a C-interface function that provides a greater range
than the existing thinlto_codegen_set_cache_size_bytes, which also only
takes an unsigned, so this change also adds
hinlto_codegen_set_cache_size_megabytes.
Reviewed by: mehdi_amini, tejohnson, steven_wu
Differential Revision: https://reviews.llvm.org/D52023
llvm-svn: 342233
See rL342148
This probably only shows up in BUILD_SHARED_LIBS=ON builds
which might explain how it crept in.
Differential Revision: https://reviews.llvm.org/D52054
llvm-svn: 342180
add a tool to generate symbol remapping files.
Summary:
The new tool llvm-cxxmap builds a symbol mapping table from a file containing
a description of partial equivalences to apply to mangled names and files
containing old and new symbol tables.
Reviewers: davidxl
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D51470
llvm-svn: 342168
Summary:
The snippet-generation part goes to the SnippetGenerator class.
This will allow benchmarking arbitrary code (see PR38437).
Reviewers: gchatelet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D51979
llvm-svn: 342117
r342003 added support for emitting FPO data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB
file. However, that is not the end of the story. FPO can end
up in two different destinations in a PDB, each corresponding to
a different FPO data source.
The case handled by r342003 involves copying data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the
"New FPO" stream in the PDB, which is then referred to by the
DBI stream. The case handled by this patch involves copying
records from the .debug$F section of an object file to the "FPO"
stream (or perhaps more aptly, the "Old FPO" stream) in the PDB
file, which is also referred to by the DBI stream.
The formats are largely similar, and the difference is mostly
only visible in masm generated object files, such as some of the
low-level CRT object files like memcpy. MASM doesn't appear to
support writing the DEBUG_S_FRAMEDATA subsection, and instead
just writes these records to the .debug$F section.
Although clang-cl does not emit a .debug$F section ever, lld still
needs to support it so we have good debugging for CRT functions.
Differential Revision: https://reviews.llvm.org/D51958
llvm-svn: 342080
Submitted on behalf of Armando Montanez (amontanez@google.com).
Objects with unused program headers copied by objcopy would always have
nonzero values for program header offset and program header entry size.
While technically valid, this atypical behavior triggers warnings in some
tools. This change sets the two fields to zero when the program header is
unused, better fitting the general expectations for unused program header
data.
Section headers behaved somewhat similarly (though only with the entry size),
and are fixed in this revision as well.
Differential Revision: https://reviews.llvm.org/D51961
llvm-svn: 342065
Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.
Reviewer: dblaikie, JDevlieghere, vleschuk
Differential revision: https://reviews.llvm.org/D51081
llvm-svn: 342048
Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:
0: no register - Used when there are no variables.
1: SP / standard - Variables are stored relative to the standard SP
for the ISA.
2: FP - Variables are addressed relative to the ISA frame
pointer, i.e. EBP on x86. If realignment is required, parameters
use this. If a dynamic alloca is used, locals will be EBP relative.
3: Alternative - Variables are stored relative to some alternative
third callee-saved register. This is required to address highly
aligned locals when there are dynamic stack adjustments. In this
case, both the incoming SP saved in the standard FP and the current
SP are at some dynamic offset from the locals. LLVM uses ESI in
this case, MSVC uses EBX.
Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.
Subscribers: hiraditya
Differential Revision: https://reviews.llvm.org/D51894
llvm-svn: 341999
Summary:
This patch removes the storing of accumulated floating point data
within the llvm-mca library.
This patch splits-up the two quantities: cycles and number of resource units.
By splitting-up these two quantities, we delay the calculation of "cycles per resource unit"
until that value is read, reducing the chance of accumulating floating point error.
I considered using the APFloat, but after measuring performance, for a large (many iteration)
sample, I decided to go with this faster solution.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: llvm-commits, javed.absar, tschuett, gbedwell
Differential Revision: https://reviews.llvm.org/D51903
llvm-svn: 341980
Summary:
In this change, we implement a `BlockPrinter` which orders records in a
Block that's been indexed by the `BlockIndexer`. This is used in the
`llvm-xray fdr-dump` tool which ties together the various types and
utilities we've been working on, to allow for inspection of XRay FDR
mode traces both with and without verification.
This change is the final step of the refactoring of D50441.
Reviewers: mboerger, eizan
Subscribers: mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51846
llvm-svn: 341887
Some asm has double spaces between operands, the deserializer was keeping these empty split pieces, causing assertions later on:
'ADC16mi RDI i_0x1x i_0x0x i_0x1x'
llvm-svn: 341799
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.
llvm-svn: 341729
Before this patch, analyzeContext called getCanonicalDIEOffset(), for
which the result depends on the timings of the setCanonicalDIEOffset()
calls in the cloneLambda. This can lead to slightly different output
between runs due to threading.
To prevent this from happening, we now record the output debug info size
after importing the modules (before any concurrent processing takes
place). This value, named the ModulesEndOffset is used to compare the
canonical DIE offset against. If the value is greater than this offset,
the canonical DIE offset has been updated during cloning, and should
therefore not be considered for pruning.
Differential revision: https://reviews.llvm.org/D51443
llvm-svn: 341649
Third Attempt:
- Alignment issues resolved.
- zlib::isAvailable() detected.
- ArrayRef misuse fixed.
Usage:
llvm-objcopy --compress-debug-sections=zlib foo.o
llvm-objcopy --compress-debug-sections=zlib-gnu foo.o
In both cases the debug section contents is compressed with zlib. In the GNU
style case the header is the "ZLIB" magic string followed by the uint64 big-
endian decompressed size. In the non-GNU mode the header is the
Elf(32|64)_Chdr.
Decompression support is coming soon.
Differential Revision: https://reviews.llvm.org/D49678
llvm-svn: 341635
Second Attempt. Alignment issues resolved. zlib::isAvailable() detected.
Usage:
llvm-objcopy --compress-debug-sections=zlib foo.o
llvm-objcopy --compress-debug-sections=zlib-gnu foo.o
In both cases the debug section contents is compressed with zlib. In the GNU
style case the header is the "ZLIB" magic string followed by the uint64 big-
endian decompressed size. In the non-GNU mode the header is the
Elf(32|64)_Chdr.
Decompression support is coming soon.
Differential Revision: https://reviews.llvm.org/D49678
llvm-svn: 341607
It caused ambiguity between llvm:🆑:Optional and llvm::Optional, which
has been fixed by dropping `using namespace cl;` in favor of explicit
cl:: qualified names.
llvm-svn: 341586
`using namespace cl` makes llvm:🆑:Optional (in Support/CommandLine.h) visible which will cause ambiguity when unqualified `Optional` is looked up (can also refer to llvm::Optional).
cl:: is used much more than `using namespace cl`, so let's not use the latter.
Also append \n to the argument of cl::ParseCommandLineOptions
llvm-svn: 341584
MRI scripts have two comment chars, * and ;, but only the latter was
supported before.
Also allow leading spaces before comment chars (and before any command
string), and allow comments after a command.
Differential Revision: https://reviews.llvm.org/D51338
llvm-svn: 341571
Keeping the compile units in memory is expensive. For the single
threaded case we allocate them in the analyze part and deallocate them
again once we've finished cloning. This poses a problem in the single
threaded case where we did all the analysis first followed by all the
cloning. This meant we had all the link context in memory right after
analyzing finished.
This patch changes the way we order work in the single threaded case.
Instead of doing all the analysis and cloning in serial, we now
interleave the two so we can deallocate the memory as soon as a file is
processed. The result is binary identical and peak memory usage went
down from 13.43GB to 5.73GB for a debug build of trunk clang.
Differential revision: https://reviews.llvm.org/D51618
llvm-svn: 341568
The way DIA SDK works is that when you request a symbol, it
gets assigned an internal identifier that is unique for the
life of the session. You can then use this identifier to
get back the same symbol, with all of the same internal state
that it had before, even if you "destroyed" the original
copy of the object you had.
This didn't work properly in our native implementation, and
if you destroyed an object for a particular symbol, then
requested the same symbol again, it would get assigned a new
ID and you'd get a fresh copy of the object. In order to fix
this some refactoring had to happen to properly reuse cached
objects. Some unittests are added to verify that symbol
reuse is taking place, making use of the new unittest input
feature.
llvm-svn: 341503
Summary:
Allow strip to be called on multiple input files, which is interpreted as stripping N files in place. Using multiple input files is incompatible with -o.
To allow this, create a `DriverConfig` struct which just wraps a list of `CopyConfigs`. objcopy will only ever have a single `CopyConfig`, but strip will have N (where N >= 1) CopyConfigs.
Reviewers: alexshap, jakehehrlich
Reviewed By: alexshap, jakehehrlich
Subscribers: MaskRay, jakehehrlich, llvm-commits
Differential Revision: https://reviews.llvm.org/D51660
llvm-svn: 341464
Summary:
Fixes the error "Link field value 0 in section .rela.plt is invalid" when copying/stripping certain binaries. Minimal repro:
```
$ cat /tmp/a.c
int main() { return 0; }
$ clang -static /tmp/a.c -o /tmp/a
$ llvm-strip /tmp/a -o /tmp/b
llvm-strip: error: Link field value 0 in section .rela.plt is invalid.
```
Reviewers: jakehehrlich, alexshap
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51493
llvm-svn: 341419
Also reverts follow-up commits r341343 and r341344.
The primary commit continues to break some build bots even after the
fixes in r341343 for UBSan issues:
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/5823
It is also failing for me locally (linux, x86-64).
llvm-svn: 341360
Usage:
llvm-objcopy --compress-debug-sections=zlib foo.o
llvm-objcopy --compress-debug-sections=zlib-gnu foo.o
In both cases the debug section contents is compressed with zlib. In the GNU
style case the header is the "ZLIB" magic string followed by the uint64 big-
endian decompressed size. In the non-GNU mode the header is the
Elf(32|64)_Chdr.
Decompression support is coming soon.
Differential Revision: https://reviews.llvm.org/D49678
llvm-svn: 341342
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.
llvm-svn: 341323
Following D50807, and heading towards D50664, this intermediary change does the following:
1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level.
Differential Revision: https://reviews.llvm.org/D51499
llvm-svn: 341228
When using -g and -dsym, llvm-objdump opens the dsym file and keeps the
MachOObjectFile alive, while the memory buffer that the MachOObjectFile
was based on gets destroyed.
Differential Revision: https://reviews.llvm.org/D51365
llvm-svn: 341209
forward declarations.
Especially with template instantiations, there are legitimate reasons
why for declarations might be emitted into a DW_TAG_module skeleton /
forward-declaration sub-tree, that are not forward declarations in the
sense of that there is a more complete definition over in a .pcm file.
The example in the testcase is a constant DW_TAG_member of a
DW_TAG_class template instatiation.
rdar://problem/43623196
llvm-svn: 341123
Summary: Add a new type for named metadata nodes. Use this to implement iterators and accessors for NamedMDNodes and extend the echo test to use them to copy module-level debug information.
Reviewers: whitequark, deadalnix, aprantl, dexonsmith
Reviewed By: whitequark
Subscribers: Wallbraker, JDevlieghere, llvm-commits, harlanhaskins
Differential Revision: https://reviews.llvm.org/D47179
llvm-svn: 341085
This patch introduces the following changes to the DispatchStatistics view:
* DispatchStatistics now reports the number of dispatched opcodes instead of
the number of dispatched instructions.
* The "Dynamic Dispatch Stall Cycles" table now also reports the percentage of
stall cycles against the total simulated cycles.
This change allows users to easily compare dispatch group sizes with the
processor DispatchWidth.
Before this change, it was difficult to correlate the two numbers, since
DispatchStatistics view reported numbers of instructions (instead of opcodes).
DispatchWidth defines the maximum size of a dispatch group in terms of number of
micro opcodes.
The other change introduced by this patch is related to how DispatchStage
generates "instruction dispatch" events.
In particular:
* There can be multiple dispatch events associated with a same instruction
* Each dispatch event now encapsulates the number of dispatched micro opcodes.
The number of micro opcodes declared by an instruction may exceed the processor
DispatchWidth. Therefore, we cannot assume that instructions are always fully
dispatched in a single cycle.
DispatchStage knows already how to handle instructions declaring a number of
opcodes bigger that DispatchWidth. However, DispatchStage always emitted a
single instruction dispatch event (during the first simulated dispatch cycle)
for instructions dispatched.
With this patch, DispatchStage now correctly notifies multiple dispatch events
for instructions that cannot be dispatched in a single cycle.
A few views had to be modified. Views can no longer assume that there can only
be one dispatch event per instruction.
Tests (and docs) have been updated.
Differential Revision: https://reviews.llvm.org/D51430
llvm-svn: 341055
That resulted in the check-llvm-* targets not being avaliable
in the QtCreator-configured build directories.
Moreover, that was a clearly non-NFC change, and i can't find any review
for it.
This reverts commit rL340435.
llvm-svn: 341045
The restoreDateOnFile() method used to preserve dates uses sys::fs::openFileForWrite(). That method defaults to opening files with CD_CreateAlways, which truncates the output file if it exists. Use CD_OpenExisting instead to open it and *not* truncate it, which also has the side benefit of erroring if the file does not exist (it should always exist, because we just wrote it out).
Also, fix the test case to make sure the output is a valid output file, and not empty. The extra test assertions are enough to catch this regression.
llvm-svn: 340996
This patch adds two new fields to the perf report generated by the SummaryView.
Fields are now logically organized into two small groups; only the second group
contains throughput indicators.
Example:
```
Iterations: 100
Instructions: 300
Total Cycles: 414
Total uOps: 700
Dispatch Width: 4
uOps Per Cycle: 1.69
IPC: 0.72
Block RThroughput: 4.0
```
This patch also updates the docs for llvm-mca.
Due to the nature of this change, several tests in the tools/llvm-mca directory
were affected, and had to be updated using script `update_mca_test_checks.py`.
llvm-svn: 340946
The addObjectFile method adds the given object file to the JIT session, making
its code available for execution.
Support for the -extra-object flag is added to lli when operating in
-jit-kind=orc-lazy mode to support testing of this feature.
llvm-svn: 340870
This patch also uses colors to highlight problematic wait-time entries.
A problematic entry is an entry with an high wait time that tends to match (or
exceed) the size of the scheduler's buffer.
Color RED is used if an instruction had to wait an average number of cycles
which is bigger than (or equal to) the size of the underlying scheduler's
buffer.
Color YELLOW is used if the time (in cycles) spend waiting for the
operands or pipeline resources is bigger than half the size of the underlying
scheduler's buffer.
Color MAGENTA is used if an instruction does not consume buffer resources
according to the scheduling model.
llvm-svn: 340825
This patch removes the MSBuild warnings about options that
clang-cl ignores. It also adds several additional fields to
the LLVM Configuration options page. The first is that it
adds support for LLD! To give the user flexibility though,
we don't want to force LLD to always-on, and if we're not
forcing LLD then we might as well not force clang-cl either.
So we add options that can enable or disable lld, clang-cl,
or any combination of the two. Whenever one is disabled,
it falls back to the Microsoft equivalent.
Additionally, for each of clang-cl and lld-link, we add a new
configuration setting that allows Additional Options to be
passed for that specific tool only. This is similar to the
C/C++ > Command Line > Additional Options entry box, but
it serves the use case where a user switches back and forth
between the toolsets in their vcxproj, but where cl.exe
won't accept some options that clang-cl will. In this case
you can pass those options in the clang-cl additional options
and whenever clang-cl is disabled (or the other toolset is
selected entirely), those options won't get passed at all.
llvm-svn: 340780
Normally we force Unix line endings in the repository, but since these are Windows files which are consumed by Microsoft tools that we don't have the source of, we should probably err on the side of caution and force CRLF.
llvm-svn: 340776
Summary:
This patch introduces llvm-mca as a library. The driver (llvm-mca.cpp), views, and stats, are not part of the library.
Those are separate components that are not required for the functioning of llvm-mca.
The directory has been organized as follows:
All library source files now reside in:
- `lib/HardwareUnits/` - All subclasses of HardwareUnit (these represent the simulated hardware components of a backend).
(LSUnit does not inherit from HardwareUnit, but Scheduler does which uses LSUnit).
- `lib/Stages/` - All subclasses of the pipeline stages.
- `lib/` - This is the root of the library and contains library code that does not fit into the Stages or HardwareUnit subdirs.
All library header files now reside in the `include` directory and mimic the same layout as the `lib` directory mentioned above.
In the (near) future we would like to move the library (include and lib) contents from tools and into the core of llvm somewhere.
That change would allow various analysis and optimization passes to make use of MCA functionality for things like cost modeling.
I left all of the non-library code just where it has always been, in the root of the llvm-mca directory.
The include directives for the non-library source file have been updated to refer to the llvm-mca library headers.
I updated the llvm-mca/CMakeLists.txt file to include the library headers, but I made the non-library code
explicitly reference the library's 'include' directory. Once we eventually (hopefully) migrate the MCA library
components into llvm the include directives used by the non-library source files will be updated to point to the
proper location in llvm.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D50929
llvm-svn: 340755
Before this patch, the SchedulerStatistics only printed the maximum number of
buffer entries consumed in each scheduler's queue at a given point of the
simulation.
This patch restructures the reported table, and adds an extra field named
"Average number of used buffer entries" to it.
This patch also uses different colors to help identifying bottlenecks caused by
high scheduler's buffer pressure.
llvm-svn: 340746
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.
All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.
llvm-svn: 340701
Choosing to revert the change and do it again, hopefully preserving the history
of the changes by using svn copy instead of simply creating a new file from the
contents within Scheduler.
llvm-svn: 340661
This (partially) fixes a regression introduced by
https://reviews.llvm.org/D43945 / r327399, which parallelized
DwarfLinker. This patch avoids parsing and allocating the memory for
all input DIEs up front and instead only allocates them in the
concurrent loop in the AnalyzeLambda. At the end of the loop the
memory from the LinkContext is cleared again.
This reduces the peak memory needed to link the debug info of a
non-modular build of the Swift compiler by >3GB.
rdar://problem/43444464
Differential Revision: https://reviews.llvm.org/D51078
llvm-svn: 340650
When used in cross-DSO mode, CFI will generate calls to special functions rather than trap instructions. For example, instead of generating
if (!InlinedFastCheck(f))
abort();
call *f
CFI generates
if (!InlinedFastCheck(f))
__cfi_slowpath(CallSiteTypeId, f);
call *f
This patch teaches cfi-verify to recognize calls to __cfi_slowpath and abort and treat them as trap functions.
In addition to normal symbols, we also parse the dynamic relocations to handle cross-DSO calls in libraries.
We also extend cfi-verify to recognize other patterns that occur using cross-DSO. For example, some indirect calls are not guarded by a branch to a trap but instead follow a call to __cfi_slowpath. For example:
if (!InlinedFastCheck(f))
call *f
else {
__cfi_slowpath(CallSiteTypeId, f);
call *f
}
In this case, the second call to f is not marked as protected by the current code. We thus recognize if indirect calls directly follow a call to a function that will trap on CFI violations and treat them as protected.
We also ignore indirect calls in the PLT, since on AArch64 each entry contains an indirect call that should not be protected by CFI, and these are labeled incorrectly when debug information is not present.
Differential Revision: https://reviews.llvm.org/D49383
llvm-svn: 340612
Per LLVM's CommandGuide, llvm-profdata show -text is supposed to produce
textual output that can be passed as input to further llvm-profdata
invocations. This previously didn't work for two reasons:
1) -text was not sufficient to enable the machine-readable text format output;
instead, -text was effectively ignored if -counts was not also specified. (With
this patch, -counts is instead ignored if -text is specified, because the
machine-readable text format always includes counts.)
2) When the input data was an IR-level profile, the :ir marker was missing from
the output, resulting in a text format output that would not be usable as
profiling data due to function hash mismatches.
Differential Revision: https://reviews.llvm.org/D51188
llvm-svn: 340592
Thanks to @waltl for reporting this issue.
I have also added an assert to check for invalid null strategy objects, and I
have reworded a couple of code comments in Scheduler.h
llvm-svn: 340545
With this patch, users can now customize the pipeline selection strategy for
scheduler resources. The resource selection strategy can be defined at processor
resource granularity. This enables the definition of different strategies for
different hardware schedulers.
To override the strategy associated with a processor resource, users can call
method ResourceManager::setCustomStrategy(), and pass a 'ResourceStrategy'
object in input.
Class ResourceStrategy is an abstract class which declares virtual method
`ResourceStrategy::select()`. Method select() is meant to implement the actual
strategy; it is responsible for picking the next best resource from a set of
available pipeline resources. Custom strategy must simply override that method.
By default, processor resources are associated with instances of
'DefaultResourceStrategy'. A 'DefaultResourceStrategy' internally implements a
simple round-robin selector. For more details, please refer to the code comments
in Scheduler.h.
llvm-svn: 340536
Both DWARFDebugLine and DWARFDebugAddr used the same callback mechanism
for handling recoverable errors. They both implemented similar warn() function
to be used as such callbacks.
In this revision we get rid of code duplication and move this warn() function
to DWARFContext as DWARFContext::dumpWarning().
Reviewers: lhames, jhenderson, aprantl, probinson, dblaikie, JDevlieghere
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D51033
llvm-svn: 340528
The format is the same as in ELF: a sequence of ULEB128-encoded
symbol indexes.
Differential Revision: https://reviews.llvm.org/D51047
llvm-svn: 340499
There are several places where we use CMAKE_CONFIGURATION_TYPES to determine if we are using an IDE generator and in turn decide not to generate some of the convenience targets (like all the install-* and check-llvm-* targets). This decision is made because IDEs don't always deal well with the thousands of targets LLVM can generate.
This approach does not work for Visual Studio 15's new CMake integration. Because VS15 uses a Ninja generator, it isn't a multi-configuration build, and generating all these extra targets mucks up the UI and adds little value.
With this change we still don't generate these targets by default for Visual Studio and Xcode generators, and LLVM_ENABLE_IDE becomes a switch that can be enabled on the VS15 CMake builds, to improve the IDE experience.
llvm-svn: 340435
Summary: This is to be consistent with lld behavior since rLLD340364.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: steven_wu, eraman, mehdi_amini, inglorion, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51060
llvm-svn: 340380
The constructor of Scheduler now accepts a SchedulerStrategy object, which is
used internally by method Scheduler::select() to drive the instruction selection
process.
The goal of this patch is to enable the definition of custom selection
strategies while reusing the same algorithms implemented by class Scheduler.
The motivation is that, on some targets, the default strategy may not well
approximate the selection logic in the hardware schedulers.
This patch also adds the ability to pass a ResourceManager object to the
constructor of Scheduler. This gives a bit more flexibility to the design, and
potentially it allows to expose processor resources to SchedulerStrategy
objects.
Differential Revision: https://reviews.llvm.org/D51051
llvm-svn: 340314
The goal of this patch is to simplify the Scheduler's interface in preparation
for D50929.
Some methods in the Scheduler's interface should not be exposed to external
users, since their presence makes it hard to both understand, and extend the
Scheduler's interface.
This patch removes the following two methods from the public Scheduler's API:
- reclaimSimulatedResources()
- updatePendingQueue()
Their logic has been migrated to a new method named 'cycleEvent()'.
Methods 'updateIssuedSet()' and 'promoteToReadySet()' still exist. However,
they are now private members of class Scheduler.
This simplifies the interaction with the Scheduler from the ExecuteStage.
llvm-svn: 340273
Summary: Before, llvm-strip accepted a second argument but it would just be ignored.
Reviewers: alexshap, jhenderson, paulsemel
Reviewed By: alexshap
Subscribers: jakehehrlich, rupprecht, llvm-commits
Differential Revision: https://reviews.llvm.org/D51004
llvm-svn: 340229
The LSUnit is now a HardwareUnit, and it is owned by the mca::Context.
Derived classes can now implement a different consistency model by overriding
method `LSUnit::isReady()`.
This patch also slightly refactors the Scheduler interface in the attempt to
simplifying the interaction between ExecuteStage and the underlying Scheduler.
llvm-svn: 340176
This patch significantly improves performance of the YAML serializer by
optimizing `YAML::isNumeric` function. This function is called on the
most strings and is highly inefficient for two reasons:
* It uses `Regex`, which is parsed and compiled each time this
function is called
* It uses multiple passes which are not necessary
This patch introduces stateful ad hoc YAML number parser which does not
rely on `Regex`. It also fixes YAML number format inconsistency: current
implementation supports C-stile octal number format (`01234567`) which
was present in YAML 1.0 specialization (http://yaml.org/spec/1.0/),
[Section 2.4. Tags, Example 2.19] but was deprecated and is no longer
present in latest YAML 1.2 specification
(http://yaml.org/spec/1.2/spec.html), see [Section 10.3.2. Tag
Resolution]. Since the rest of the rest of the implementation does not
support other deprecated YAML 1.0 numeric features such as sexagecimal
numbers, commas as delimiters it is treated as inconsistency and not
longer supported. This patch also adds unit tests to ensure the validity
of proposed implementation.
This performance bottleneck was identified while profiling Clangd's
global-symbol-builder tool with my colleague @ilya-biryukov. The
substantial part of the runtime was spent during a single-thread Reduce
phase, which concludes with YAML serialization of collected symbol
collection. Regex matching was accountable for approximately 45% of the
whole runtime (which involves sharded Map phase), now it is reduced to
18% (which is spent in `clang::clangd::CanonicalIncludes` and can be
also optimized because all used regexes are in fact either suffix
matches or exact matches).
`llvm-yaml-numeric-parser-fuzzer` was used to ensure the validity of the
proposed regex replacement. Fuzzing for ~60 hours using 10 threads did
not expose any bugs.
Benchmarking `global-symbol-builder` (using `hyperfine --warmup 2
--min-runs 5 'command 1' 'command 2'`) tool by processing a reasonable
amount of code (26 source files matched by
`clang-tools-extra/clangd/*.cpp` with all transitive includes) confirmed
our understanding of the performance bottleneck nature as it speeds up
the command by the factor of 1.6x:
| Command | Mean [s] | Min…Max [s] |
| this patch (D50839) | 84.7 ± 0.6 | 83.3…84.7 |
| master (rL339849) | 133.1 ± 0.8 | 132.4…134.6 |
Using smaller samples (e.g. by collecting symbols from
`clang-tools-extra/clangd/AST.cpp` only) yields even better performance
improvement, which is expected because Map phase takes less time
compared to Reduce and is 2.05x faster and therefore would significantly
improve the performance of standalone YAML serializations.
| Command | Mean [ms] | Min…Max [ms] |
| this patch (D50839) | 3702.2 ± 48.7 | 3635.1…3752.3 |
| master (rL339849) | 7607.6 ± 109.5 | 7533.3…7796.4 |
Reviewed by: zturner, ilya-biryukov
Differential revision: https://reviews.llvm.org/D50839
llvm-svn: 340154
Added DIFlags in LLVMDIBuilderCreateBasicType to add optional DWARF
attributes, such as DW_AT_endianity.
Patch by Chirag Patel.
Differential Revision: https://reviews.llvm.org/D50832
llvm-svn: 340146
Summary:
Port GNU Objcopy -G/--keep-global-symbol(s).
This is slightly different than the already-implemented --globalize-symbol, which marks a symbol as global when copying. When --keep-global-symbol (alias -G) is used, *only* those symbols marked will stay global, and all other globals are demoted to local. (Also note that it doesn't *promote* a symbol to global). Additionally, there is a pluralized version of the flag --keep-global-symbols, which effectively applies --keep-global-symbol for every non-comment in a file.
Reviewers: jakehehrlich, jhenderson, alexshap
Reviewed By: jhenderson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50589
llvm-svn: 340105
VSO was a little close to VDSO (an acronym on Linux for Virtual Dynamic Shared
Object) for comfort. It also risks giving the impression that instances of this
class could be shared between ExecutionSessions, which they can not.
JITDylib seems moderately less confusing, while still hinting at how this
class is intended to be used, i.e. as a JIT-compiled stand-in for a dynamic
library (code that would have been a dynamic library if you had wanted to
compile it ahead of time).
llvm-svn: 340084
Summary:
The -I (--input-target) and -B (--binary-architecture) flags exist but are currently silently ignored. This adds support for -I binary for architectures i386, x86-64 (and alias i386:x86-64), arm, aarch64, sparc, and ppc (powerpc:common64). This is largely based on D41687.
This is done by implementing an additional subclass of Reader, BinaryReader, which works by interpreting the input file as contents for .data field, sets up a synthetic header, and adds additional sections/symbols (e.g. _binary__tmp_data_txt_start).
Reviewers: jakehehrlich, alexshap, jhenderson, javed.absar
Reviewed By: jhenderson
Subscribers: jyknight, nemanjai, kbarton, fedor.sergeev, jrtc27, kristof.beyls, paulsemel, llvm-commits
Differential Revision: https://reviews.llvm.org/D50343
llvm-svn: 340070
class Scheduler should not know anything of hardware event listeners and
hardware stall events (HWStallEvent). HWStallEvent objects should only be
constructed by pipeline stages to notify listeners of hardware events.
No functional change intended.
llvm-svn: 340036
This patch changes how instruction execution is orchestrated by the Pipeline.
In particular, this patch makes it more explicit how instructions transition
through the various pipeline stages during execution.
The main goal is to simplify both the stage API and the Pipeline execution. At
the same time, this patch fixes some design issues which are currently latent,
but that are likely to cause problems in future if people start defining custom
pipelines.
The new design assumes that each pipeline stage knows the "next-in-sequence".
The Stage API has gained three new methods:
- isAvailable(IR)
- checkNextStage(IR)
- moveToTheNextStage(IR).
An instruction IR can be executed by a Stage if method `Stage::isAvailable(IR)`
returns true.
Instructions can move to next stages using method moveToTheNextStage(IR).
An instruction cannot be moved to the next stage if method checkNextStage(IR)
(called on the current stage) returns false.
Stages are now responsible for moving instructions to the next stage in sequence
if necessary.
Instructions are allowed to transition through multiple stages during a single
cycle (as long as stages are available, and as long as all the calls to
`checkNextStage(IR)` returns true).
Methods `Stage::preExecute()` and `Stage::postExecute()` have now become
redundant, and those are removed by this patch.
Method Pipeline::runCycle() is now simpler, and it correctly visits stages
on every begin/end of cycle.
Other changes:
- DispatchStage no longer requires a reference to the Scheduler.
- ExecuteStage no longer needs to directly interact with the
RetireControlUnit. Instead, executed instructions are now directly moved to the
next stage (i.e. the retire stage).
- RetireStage gained an execute method. This allowed us to remove the
dependency with the RCU in ExecuteStage.
- FecthStage now updates the "program counter" during cycleBegin() (i.e.
before we start executing new instructions).
- We no longer need Stage::Status to be returned by method execute(). It has
been dropped in favor of a more lightweight llvm::Error.
Overally, I measured a ~11% performance gain w.r.t. the previous design. I also
think that the Stage interface is probably easier to read now. That being said,
code comments have to be improved, and I plan to do it in a follow-up patch.
Differential revision: https://reviews.llvm.org/D50849
llvm-svn: 339923
The main difference is that now `cycleStart()` and `cycleEnd()` return an
llvm::Error.
This patch implements a few minor style changes, and adds missing 'const' to
some methods.
llvm-svn: 339885
This allows to set custom Info field value for SHT_GROUP sections.
It is useful to allow this because we would be able to replace at least one binary
object committed in LLD and replace it with the yaml2obj based test.
Differential revision: https://reviews.llvm.org/D50776
llvm-svn: 339772
This patch fixes a regression introduced at revision 338702.
A processor resource mask was incorrectly implicitly truncated to an unsigned
quantity. Later on, the truncated mask was used to initialize an element of a
vector of processor resource descriptors.
On targets with more than 32 processor resources, some elements of the vector
are left uninitialized. As a consequence, this bug might have eventually caused
a crash due to null dereference in the Scheduler.
This patch fixes PR38575, and adds a test for it.
llvm-svn: 339768
Currently, it is possible to use yaml2obj for producing SHT_GROUP sections
of type GRP_COMDAT. For LLD test case I need to produce an object with
a broken (different from GRP_COMDAT) type.
The patch teaches tool to do such things.
Differential revision: https://reviews.llvm.org/D50761
llvm-svn: 339764
Summary:
Add an overload to sys::fs::setLastModificationAndAccessTime that allows setting last access and modification times separately. This will allow tools to use this API when they want to preserve both the access and modification times from an input file, which may be different.
Also note that both the POSIX (futimens/futimes) and Windows (SetFileTime) APIs take the two timestamps in the order of (1) access (2) modification time, so this renames the method to "setLastAccessAndModificationTime" to make it clear which timestamp is which.
For existing callers, the 1-arg overload just sets both timestamps to the same thing.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50521
llvm-svn: 339628
Summary:
This patch introduces error handling to propagate the errors from llvm-mca library classes (or what will become library classes) up to the driver. This patch also introduces an enum to make clearer the intention of the return value for Stage::execute.
This supports PR38101.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: llvm-commits, tschuett, gbedwell
Differential Revision: https://reviews.llvm.org/D50561
llvm-svn: 339594
LLVM triple normalization is handling "unknown" and empty components
differently; for example given "x86_64-unknown-linux-gnu" and
"x86_64-linux-gnu" which should be equivalent, triple normalization
returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's
config.sub returns "x86_64-unknown-linux-gnu" for both
"x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the
triple normalization to behave the same way, replacing empty triple
components with "unknown".
This addresses PR37129.
Differential Revision: https://reviews.llvm.org/D50219
llvm-svn: 339294
We do need to map /Zi to /Z7 explicitly for msbuild as explained in this file,
but since /Zi is passed by default and since things transparently work fine
with it mapped to /Z7, we shouldn't produce effectively inactionable noise for
it.
Also don't warn on /X since clang-cl supports that (since r326357; the risk of
duplicating a bunch of clang-cl driver logic here).
https://reviews.llvm.org/D50398
llvm-svn: 339169
Summary:
Hello!
This commit adds a LLVM-C target that is always built on MSVC. A big fat warning, this is my first cmake code ever so there is a fair bit of I-have-no-idea-what-I'm-doing going on here. Which is also why I placed it outside of llvm-shlib as I was afraid of breaking things of other people. Secondly llvm-shlib builds a LLVM.so which exports all symbols and then does a thin library that points to it, but on Windows we do not build a LLVM.dll so that would have complicated the code more.
The patch includes a python script that calls dumpbin.exe to get all of the symbols from the built libraries. It then grabs all the symbols starting with LLVM and generates the export file from those. The export file is then used to create the library just like the LLVM-C that is built on darwin.
Improvements that I need help with, to follow up this review.
- Get cmake to make sure that dumpbin.exe is on the path and wire the full path to the script.
- Use LLVM-C.dll when building llvm-c-test so we can verify that the symbols are exported.
- Bundle the LLVM-C.dll with the windows installer.
Why do this? I'm building a language frontend which is self-hosting, and on windows because of various tooling issues we have a problem of consuming the LLVM*.lib directly on windows. Me and the users of my projects using LLVM would be greatly helped by having LLVM-C.dll built and shipped by the Windows installer. Not only does LLVM takes forever to build, you have to run a extra python script in order to get the final DLL.
Any comments, thoughts or help is greatly appreciated.
Cheers, Jakob.
Patch by: Wallbraker (Jakob Bornecrantz)
Reviewers: compnerd, beanz, hans, smeenai
Reviewed By: beanz
Subscribers: xbolva00, bhelyer, Memnarch, rnk, fedor.sergeev, chapuni, smeenai, john.brawn, deadalnix, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D35077
llvm-svn: 339151
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).
Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.
This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.
This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).
Reviewers: probinson, dblaikie, JDevlieghere
Subscribers: aprantl, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D49493
llvm-svn: 339122
I was trying to add a test case for LLD and found that it
is impossible to set sh_entsize via yaml.
The patch implements the missing part.
Differential revision: https://reviews.llvm.org/D50235
llvm-svn: 339113
This patch is a follow-up to r338702.
We don't need to use a map to model the wait/ready/issued sets. It is much more
efficient to use a vector instead.
This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained
after r338702).
llvm-svn: 338883
Summary:
This change removes the ad-hoc implementation used by llvm-xray's
`convert` subcommand to generate JSON encoded catapult (AKA Chrome
Trace Viewer) trace output, to instead use the JSON encoder now in the
Support library.
Reviewers: kpw, zturner, eizan
Reviewed By: kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50129
llvm-svn: 338834
Summary:
With Mach-O, there is a flag requirement discrepancy between working with
universal binaries and thin binaries. Many flags that don't require the `-macho`
flag (for example `-private-headers` and `-disassemble`) fail to work on
universal binaries unless `-macho` is given. When this happens, the error
message is unhelpful, stating:
The file was not recognized as a valid object file.
Which can lead to confusion.
This change allows generic flags to be used on universal binaries with and
without the `-macho` flag. This means flags that can be used for thin files can
be used consistently with fat files too.
To do this, the universal binary support within `ParseInputMachO()` is extracted
into a new function. This new function is called directly from `DumpInput()`
when the input binary is universal. Additionally the `-arch` flag validation in
`ParseInputMachO()` was extracted to be reused.
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: keith, llvm-commits
Differential Revision: https://reviews.llvm.org/D48702
llvm-svn: 338792
Summary:
This option is no longer needed since r300496 added symbol
versioning by default
Reviewers: sylvestre.ledru, beanz, mgorny
Reviewed By: mgorny
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49835
llvm-svn: 338751
Corrected and simplified the help text.
It was clearly too difficult to maintain before (see e.g. @227296) making it
simpler and more consistent it should help people keep it up to date.
Differential Revision: https://reviews.llvm.org/D48577
llvm-svn: 338703
We don't need to use a map to store ResourceState objects. The number of
processor resources is known statically from the scheduling model. We can
therefore use a vector, and reserve a slot for each processor resource that we
want to simulate.
Every time the ResourceManager queries the ResourceState vector, the index to
the vector of ResourceState objects can be easily computed from the processor
resource mask.
This drastically reduces the time complexity of method ResourceManager::use() and
method ResourceManager::release(). This patch gives an average speedup of 12%.
llvm-svn: 338702
This is useful for understanding how our demangler processes
back references and for investigating issues related to
back references. But it's a feature only useful for debugging
the demangling process itself, so I'm marking it hidden.
llvm-svn: 338609
Summary:
Add support for --rename-section flags from gnu objcopy.
Not all flags appear to have an effect for ELF objects, but allowing them would allow easier drop-in replacement. Other unrecognized flags are rejected.
This was only tested by comparing flags printed by "readelf -e <.o>" against the output of gnu vs llvm objcopy, it hasn't been tested to be valid beyond that.
Reviewers: jakehehrlich, alexshap
Subscribers: llvm-commits, paulsemel, alexshap
Differential Revision: https://reviews.llvm.org/D49870
llvm-svn: 338582
The functions `lookForDIEsToKeep` and `keepDIEAndDependencies` can have
some very deep recursion. This tackles part of this problem by removing
the recursion from `lookForDIEsToKeep` by turning it into a worklist.
The difficulty in doing so is the computation of incompleteness, which
depends on the incompleteness of its children. To compute this, we
insert "continuation markers" into the worklist. This informs the work
loop to (re)compute the incompleteness property of the DIE associated
with it (i.e. the parent of the previously processed DIE).
This patch should generate byte-identical output. Unfortunately it also
has some impact of performance, regressing by about 4% when processing
clang on my machine.
Differential revision: https://reviews.llvm.org/D48899
llvm-svn: 338536