When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`)
that's used to make names unique. This list should be reset when the object
writer is reset, because otherwise reuse of the object writer can result in
freed symbols being accessed. With some added output, this becomes clear when
using `llc` in `--run-twice` mode:
```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj
DefineSymbol::WeakDefaults
- .weak.foo.default
- .weak.bar.default
DefineSymbol::WeakDefaults
- .weak.foo.default
- áÑJij⌂ p§┼Ø┐☺
- .debug_macinfo.dw
- .weak.bar.default
```
This does not seem to leak into the output object file though, so I couldn't
come up with a test. I added one that just does `--run-twice` (and verified
that it does access freed memory), which should result in detecting the
invalid memory accesses when running under ASAN.
Observed in a Julia PR where we started using weak symbols:
https://github.com/JuliaLang/julia/pull/45649
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D129840
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to make adaption anyway.
Summary:
Introduce NeverAlign fragment type.
The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.
In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.
The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.
LLVM: https://reviews.llvm.org/D97982
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D31361547
* Refactor compression namespaces across the project, making way for a possible
introduction of alternatives to zlib compression.
Changes are as follows:
* Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.
Reviewed By: MaskRay, leonardchan, phosek
Differential Revision: https://reviews.llvm.org/D128953
Finish adding support for the remaining binary named operators in expression context: XOR, SHL, and SHR.
Differential Revision: https://reviews.llvm.org/D129299
Currently we use the `.llvm.offloading` section to store device-side
objects inside the host, creating a fat binary. The contents of these
sections is currently determined by the name of the section while it
should ideally be determined by its type. This patch adds the new
`SHT_LLVM_OFFLOADING` section type to the ELF section types. Which
should make it easier to identify this specific data format.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129052
ml.exe and ml64.exe support the use of anonymous labels, with @B and @F referring to the previous and next anonymous label respectively.
We add similar support to llvm-ml, with the exception that we are unable to emit an error message for an @F expression not followed by a @@ label.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D128944
`mov x0, 1024u` is permitted in binutils but rejected by the integrated
assembler. Support the case. This is especially important when using the C
pre-processor with the assembler: some shared code between C and assembler may
use lower-cased suffices.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D128871
We are going to change alignment for DWARF sections. This patch is to
fix functionality issue if the alignment is not the same with
DefaultSectionAlign defined in XCOFFObjectWriter.cpp.
Currently no test for this patch as for now alignment for DWARF sections
and other sections are always the same. This patch will be tested when
patch changing DWARF section is merged in.
Reviewed By: Esme
Differential Revision: https://reviews.llvm.org/D116092
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
This is already possible for e.g. `cstring_literals`, but the entry for zerofill was unnamed.
rdar://90336380
Differential Revision: https://reviews.llvm.org/D128654
DXContainer files resemble traditional object files in that they are
comprised of parts which resemble sections. Adding DXContainer as an
object file format in the MC layer will allow emitting DXContainer
objects through the normal object emission pipeline.
Differential Revision: https://reviews.llvm.org/D127165
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
The callsite identifier used in pseudo probe encoding and decoding is consisted of a function name and the callsite probe id. For encoding, i.e., `MCPseudoProbeInlineTree`, the function name is callee function name. However for decoding, i.e., `MCDecodedPseudoProbeInlineTree`, the caller function name is used actually. This results in multiple callees that are inlined at the same callsite, likely via indirect call promotion, sharing the same decoded inline frame. While it is not a problem for profile generation, it confuses probe re-encoding in Bolt.
In Bolt, we decode pseudo probes first and build `MCDecodedPseudoProbeInlineTree`. The decoded tree is used for final re-encoding. Here comes the problem. Two inlinees from the same callsite share the same decoded inline frame. During re-encoding, the frame name (whatever inlinee comes first) will be used and encoded in the bolted binary. This will cause wrong inline contexts in the profile generated on the bolted binary.
The fix is a no-op to pre-bolt profile generation. Some of the bolt tests are not yet upstreamed, thus I'm not adding a bolt test here.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D126434
Upon invalid alignment value, still set a default valid alignment value to avoid
hitting later asserts.
Fix#55273
Differential Revision: https://reviews.llvm.org/D125688
It's a fairly common issue that the generating code incorrectly marks
instructions as narrow or wide; check that the instruction lengths
add up to the expected value, and error out if it doesn't. This allows
catching code generation bugs.
Also check that prologs and epilogs are properly terminated, to
catch other code generation issues.
Differential Revision: https://reviews.llvm.org/D125647
This includes .seh_* directives for generating it from assembly.
It is designed fairly similarly to the ARM64 handling.
For .seh_handler directives, such as
".seh_handler __C_specific_handler, @except" (which is supported
on x86_64 and aarch64 so far), the "@except" bit doesn't work in
ARM assembly, as '@' is used as a comment character (on all current
platforms).
Allow using '%' instead of '@' for this purpose. This convention
is used by GAS in similar contexts already,
e.g. [1]:
Note on targets where the @ character is the start of a comment
(eg ARM) then another character is used instead. For example the
ARM port uses the % character.
In practice, this unfortunately means that all such .seh_handler
directives will need ifdefs for ARM.
Contrary to ARM64, on ARM, it's quite common that we can't evaluate
e.g. the function length at this point, due to instructions whose
length is finalized later. (Also, inline jump tables end with
a ".p2align 1".)
If unable to to evaluate the function length immediately, emit
it as an MCExpr instead. If we'd implement splitting the unwind
info for a function (which isn't implemented for ARM64 yet either),
we wouldn't know whether we need to split it though.
Avoid calling getFrameIndexOffset() on an unset
FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the
preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once
MSVC exception handling is fully implemented, those changes
can be reverted.)
[1] https://sourceware.org/binutils/docs/as/Section.html#Section
Differential Revision: https://reviews.llvm.org/D125645
For ARM SEH, the epilogs will need a little more associated data than
just the plain list of opcodes.
This is a preparatory refactoring for D125645.
Differential Revision: https://reviews.llvm.org/D125879
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.
For X86, the new interface allows to fix a couple of issues:
* Correctly adjust the value of PC-relative operands.
* Set operand size to zero when the operand is specified implicitly.
Differential Revision: https://reviews.llvm.org/D126101
There's an inconsistency regarding the epilogs of packed ARM64
unwind info with homed parameters; according to the documentation
(and according to common sense), the epilog wouldn't have a series
of nop instructions matching the stp x0-x7 in the prolog - however
in practice, RtlVirtualUnwind still seems to behave as if the epilog
does have the mirrored nops from the prolog.
In practice, MSVC doesn't seem to produce packed unwind info with
homed parameters, which might be why this inconsistency hasn't
been noticed.
Thus, to play it safe, avoid creating such packed unwind info with
homed parameters. (LLVM's current behaviour matches the current
runtime behaviour of RtlVirtualUnwind, but if it later is bug fixed
to match the documentation, such unwind information would be
incorrect.)
See https://github.com/llvm/llvm-project/issues/54879 for further
discussion on the matter.
Differential Revision: https://reviews.llvm.org/D125876
This allows sharing opcodes between prolog and epilog even when there
is more than one epilog.
I didn't make any handcrafted special MC level testcases for this (yet
at least), but it does seem to have the expected effect on two existing
CodeGen level testcases.
Differential Revision: https://reviews.llvm.org/D125619
The "packed epilog" form only implies that the epilog is located
exactly at the end of the function (so the location of the epilog
is implicit from the epilog opcodes), but it doesn't have to share
opcodes with the prolog - as long as the total number of opcode
bytes and the offset to the epilog fit within the bitfields.
This avoids writing a 4 byte epilog scope in many cases. (I haven't
measured how much this shrinks actual xdata sections in practice
though.)
Differential Revision: https://reviews.llvm.org/D125536
EXTERN PROC isn't really well documented in MSVC, so after poking around
it seems as if it's just a regular extern symbol.
Interestingly enough, under MSVC the following is allowed:
extern foo:proc
mov eax, foo
MSVC will output:
mov eax, 0
while llvm-ml will currently output:
mov eax, dword ptr [foo]
(since foo is an extern)
Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.
Reviewed By: epastor
Differential Revision: https://reviews.llvm.org/D125582
operator== and operator!= were added in
1308bb99e0 / D87369, but this existing
codepath wasn't updated to use them.
Also fix the indentation of the enclosed liens.
Differential Revision: https://reviews.llvm.org/D125368
The __llvm_addrsig section is a section that the linker needs for safe icf.
This was not yet implemented for MachO - this is the implementation.
It has been tested with a safe deduplication implementation inside lld.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D123751
For the AIX linker, under default options, global or weak symbols which
have no visibility bits set to zero (i.e. no visibility, similar to ELF
default) are only exported if specified on an export list provided to
the linker. So AIX has an additional visibility style called
"exported" which indicates to the linker that the symbol should
be explicitly globally exported.
This change maps "dllexport" in the LLVM IR to correspond to XCOFF
exported as we feel this best models the intended semantic (discussion
on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853)
and allows us to enable writing this visibility for the AIX target
in the assembly path.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D123951
This is a follow on to the revert of D84265 to add an error if we'd need
to write a non-zero visibility type in the xcoff object file. We can't
currently do that because we lack the auxilary header to interpret the
bits in XCOFF32. This is important because visibility is being enabled
in the assembly writing path, and without this error the visibility
could be silently ignored.
Differential Revision: https://reviews.llvm.org/D124392
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:
$ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
$ ptxas debug-file-loc.s
ptxas debug-file-loc.s, line 42; fatal : Parsing error near
'"foo.h"': syntax error
Added a new field to MCAsmInfo to control default value of
UseDwarfDirectory. This value is used if -dwarf-directory command line
option is not specified.
Differential Revision: https://reviews.llvm.org/D121299
This API will be used in D121876, to get finalized string data for
.debug_line_str.
Reviewed By: dblaikie, rafauler
Differential Revision: https://reviews.llvm.org/D124052
The patch adds SPIRV-specific MC layer implementation, SPIRV object
file support and SPIRVInstPrinter.
Differential Revision: https://reviews.llvm.org/D116462
Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
ptxas fails to parse such syntax:
mov.u64 %rd1, ($str);
fatal : Parsing error near '$str': syntax error
A new MCAsmInfo option was added because InParens parameter of
MCExpr::print is not sufficient to disable parens
completely. MCExpr::print resets it to false for a recursive call in
case of unary or binary expressions.
Targets that require parens around identifiers that start with '$'
should always pass MCAsmInfo to MCExpr::print.
Therefore 'operator<<(raw_ostream &, MCExpr&)' should be avoided
because it calls MCExpr::print with nullptr MAI.
Differential Revision: https://reviews.llvm.org/D123702
ptxas fails to parse such syntax:
mov.u64 %rd1, ($str);
fatal : Parsing error near '$str': syntax error
A new MCAsmInfo option was added because InParens parameter of
MCExpr::print is not sufficient to disable parens
completely. MCExpr::print resets it to false for a recursive call in
case of unary or binary expressions.
Differential Revision: https://reviews.llvm.org/D123702
`.symver foo, foo@ver` creates the MCSymbolELF `foo@ver` whose almost all
attributes (including st_size) should inherit from `foo` (GNU as behavior).
a041ef1bd8 added st_size propagation which works
for many cases but fails for the following one:
```
.set __GLIBC_2_12_sys_errlist, _sys_errlist_internal
.type __GLIBC_2_12_sys_errlist,@object
.size __GLIBC_2_12_sys_errlist, 1080
.symver __GLIBC_2_12_sys_errlist, sys_errlist@GLIBC_2.12
...
_sys_errlist_internal:
.size _sys_errlist_internal, 1072
```
`sys_errlist@GLIBC_2.12`'s st_size is 1072 (incorrect), which does not match
`__GLIBC_2_12_sys_errlist`'s st_size: 1080.
The problem is that `Base` is (the final) `_sys_errlist_internal` while we want
to respect (the intermediate) `__GLIBC_2_12_sys_errlist`'s st_size.
Fix this by following the MCSymbolRefExpr assignment chain and finding
the closest non-null `getSize()`, which covers most needs. Notably MCBinaryExpr
is not handled, but it is rare enough to matter.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D123283
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:
* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`
As part of this patch also:
* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.
Differential Revision: https://reviews.llvm.org/D123100
On targets which don't allow "@" in unquoted identifiers, make sure we
don't emit them; otherwise, we can't parse our own output.
Differential Revision: https://reviews.llvm.org/D122516
DXIL is wrapped in a container format defined by the DirectX 11
specification. Codebases differ in calling this format either DXBC or
DXILContainer.
Since eventually we want to add support for DXBC as a target
architecture and the format is used by DXBC and DXIL, I've termed it
DXContainer here.
Most of the changes in this patch are just adding cases to switch
statements to address warnings.
Reviewed By: pete
Differential Revision: https://reviews.llvm.org/D122062
STB_GNU_UNIQUE should be treated in a way similar to STB_GLOBAL.
This fixes an "Invalid Binding" failure in an LLVM_ENABLE_ASSERTIONS=on build
for source files like glibc elf/tst-unique1mod1.c .
This bug has been benign so far because (a) Clang does not produce
%gnu_unique_object by itself (b) a non-assertion build likely picks the
STB_GLOBAL code path anyway.
Complete pseudo probes decoding can result in large memory usage. In practice only a small porting of the decoded probes are used in profile generation. I'm changing the full decoding mode to be decoding for profiled functions only, though we still do a full scan of the .pseudoprobe section due to a missing table-of-content but we don't have to build the in-memory data structure for functions not sampled.
To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. This is easy to read and maintain.
I also have to change the previous representation of unsymbolized context from probe-based stack to address-based stack since the profiled functions are unknown yet by the time of virtual unwinding. The address-based stack will be converted to probe-based stack after virtual unwinding and on-demand probe decoding.
I'm seeing 20GB memory is saved for one of our internal large service.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D121643
This is the first patch to enable the XCOFF64 object writer.
Currently only fileHeader and sectionHeaders are supported.
Reviewed By: jhenderson, DiggerLin
Differential Revision: https://reviews.llvm.org/D120861
With a sufficiently large output buffer, the only failure is Z_MEM_ERROR.
Check it and call the noreturn report_bad_alloc_error if applicable.
resize_for_overwrite may call report_bad_alloc_error as well.
Now that there is no other error type, we can replace the return type with void
and simplify call sites.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D121512
This change continues to lay the ground work for supporting extended
const expressions in the linker.
The included test covers object file reading and writing and the YAML
representation.
Differential Revision: https://reviews.llvm.org/D121349
The opt-out from rL236267 (2015) is untested and seems no longer needed (or not
needed when rL236267 was committed): there is nothing special with uncompressed
alignment. This brings us in line with GCC which compresses .debug_frame .
Checked that -g -fno-asynchronous-unwind-tables + objcopy
--decompress-debug-sections output is identical to -g
-fno-asynchronous-unwind-tables -gz + objcopy --decompress-debug-sections
output.
As requested in D107955 <https://reviews.llvm.org/D107955>, this patch
splits off the `MC` and `CodeGen` parts and adds a testcase.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D120318
This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang.
This code was originally authored by JF Bastien.
Differential Revision: https://reviews.llvm.org/D118046
Large COFF section names are moved into the string table and the
section header field is the offset into the string table encoded in
ASCII for offset smaller than 7 digits and in base64 for larger
offsets.
The operation of taking the string table offsets is done in a few
places in the codebase, so it is helpful to move this operation into
`BinaryFormat` so that it can be shared everywhere it's done.
So this patch takes the implementation of this operation from
`llvm/lib/MC/WinCOFFObjectWriter.cpp` and moves it into `BinaryFormat`.
Reviewed By: jhenderson, rnk
Differential Revision: https://reviews.llvm.org/D118793
Separate MCRegisterInfo::regsOverlap out from
TargetRegisterInfo::regsOverlap. This is useful in the AMDGPU AsmParser
where we only have access to MCRegisterInfo.
Differential Revision: https://reviews.llvm.org/D119533
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h
Preprocessed lines to build llvm on my setup:
after: 1068185081
before: 1068324320
So no compile time benefit to expect, but we still get the looser coupling
between files which is great.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
For PGO on AIX, when we switch to the linux-style PGO variable access
(via _start and _stop labels), we need the compiler to generate a .ref
assembly for each of the three csects:
- __llvm_prf_data[RW]
- __llvm_prf_names[RO]
- __llvm_prf_vnds[RW]
We insert the .ref inside the __llvm_prf_cnts[RW] csect so that if it's
live then the 3 csects are live.
For example, for a testcase with at least one function definition, when
compiled with -fprofile-generate we should generate:
.csect __llvm_prf_cnts[RW],3
.ref __llvm_prf_data[RW] <<============ needs to be inserted
.ref __llvm_prf_names[RO] <<===========
the __llvm_prf_vnds is not always present, so we reference it only when
it's present.
Reviewed By: sfertile, daltenty
Differential Revision: https://reviews.llvm.org/D116607
The namespace llvm::swift is causing errors to pop up in the apple/llvm-project build when cherry-picking 4ce1f3d47c into apple/llvm-project
Differential Review: https://reviews.llvm.org/D118716
Add support for Swift reflection metadata to dsymutil.
This patch adds support for copying Swift reflection metadata (__swift5_.* sections) from .o files to into the symbol-rich binary in the output .dSYM. The functionality is automatically enabled only if a .o file has reflection metadata sections and the binary doesn't. When copying dsymutil moves the section from the __TEXT segment to the __DWARF segment.
rdar://76973336
Differential Revision: https://reviews.llvm.org/D115007
This extra stray space after tab can be traced back to when printing
of this directive was added originally in
4f01b783a3. The same commit added
inconsistent printing of space after the ELF .type directive too,
which was fixed later in
77fe07a93a.
(This is kind of NFC, but it does alter the output, so it's not
strictly non-functional in that sense.)
Differential Revision: https://reviews.llvm.org/D118401