The Headers.CountersDelta field is an uint64_t, not a pointer,
so just cast to uint32_t to truncate it.
Differential Revision: https://reviews.llvm.org/D107619
Change `CountersPtr` in `__profd_` to a label difference, which is a link-time
constant. On ELF, when linking a shared object, this requires that `__profc_` is
either private or linkonce/linkonce_odr hidden. On COFF, we need D104564 so that
`.quad a-b` (64-bit label difference) can lower to a 32-bit PC-relative relocation.
```
# ELF: R_X86_64_PC64 (PC-relative)
.quad .L__profc_foo-.L__profd_foo
# Mach-O: a pair of 8-byte X86_64_RELOC_UNSIGNED and X86_64_RELOC_SUBTRACTOR
.quad l___profc_foo-l___profd_foo
# COFF: we actually use IMAGE_REL_AMD64_REL32/IMAGE_REL_ARM64_REL32 so
# the high 32-bit value is zero even if .L__profc_foo < .L__profd_foo
# As compensation, we truncate CountersDelta in the header so that
# __llvm_profile_merge_from_buffer and llvm-profdata reader keep working.
.quad .L__profc_foo-.L__profd_foo
```
(Note: link.exe sorts `.lprfc` before `.lprfd` even if the object writer
has `.lprfd` before `.lprfc`, so we cannot work around by reordering
`.lprfc` and `.lprfd`.)
With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
`ld -pie` linked clang is 1.74% smaller due to fewer R_X86_64_RELATIVE relocations.
```
% readelf -r pie | awk '$3~/R.*/{s[$3]++} END {for (k in s) print k, s[k]}'
R_X86_64_JUMP_SLO 331
R_X86_64_TPOFF64 2
R_X86_64_RELATIVE 476059 # was: 607712
R_X86_64_64 2616
R_X86_64_GLOB_DAT 31
```
The absolute function address (used by llvm-profdata to collect indirect call
targets) can be converted to relative as well, but is not done in this patch.
Differential Revision: https://reviews.llvm.org/D104556
When writing out a profile, avoid allocating a page on the stack for the
purpose of writing out zeroes, as some embedded environments do not have
enough stack space to accomodate this.
Instead, use a small, fixed-size zero buffer that can be written
repeatedly.
For a synthetic file with >100,000 functions, I did not measure a
significant difference in profile write times. We are removing a
page-length zero-fill `memset()` in favor of several smaller buffered
`fwrite()` calls: in practice, I am not sure there is much of a
difference. The performance impact is only expected to affect the
continuous sync mode (%c) -- zero padding is less than 8 bytes in all
other cases.
rdar://57810014
Differential Revision: https://reviews.llvm.org/D71323
This header fragment is useful on its own for any consumer that wants
to use custom instruction profile runtime with the LLVM instrumentation.
The concrete use case is in Fuchsia's kernel where we want to use
instruction profile instrumentation, but we cannot use the compiler-rt
runtime because it's not designed for use in the kernel environment.
This change allows installing this header as part of compiler-rt.
Differential Revision: https://reviews.llvm.org/D64532
This header fragment is useful on its own for any consumer that wants
to use custom instruction profile runtime with the LLVM instrumentation.
The concrete use case is in Fuchsia's kernel where we want to use
instruction profile instrumentation, but we cannot use the compiler-rt
runtime because it's not designed for use in the kernel environment.
This change allows installing this header as part of compiler-rt.
Differential Revision: https://reviews.llvm.org/D64532
VLAs in C appear to not work on Windows, so use COMPILER_RT_ALLOCA:
C:\b\slave\sanitizer-windows\llvm-project\compiler-rt\lib\profile\InstrProfilingWriter.c(264): error C2057: expected constant expression
C:\b\slave\sanitizer-windows\llvm-project\compiler-rt\lib\profile\InstrProfilingWriter.c(264): error C2466: cannot allocate an array of constant size 0
C:\b\slave\sanitizer-windows\llvm-project\compiler-rt\lib\profile\InstrProfilingWriter.c(264): error C2133: 'Zeroes': unknown size
Add support for continuously syncing profile counter updates to a file.
The motivation for this is that programs do not always exit cleanly. On
iOS, for example, programs are usually killed via a signal from the OS.
Running atexit() handlers after catching a signal is unreliable, so some
method for progressively writing out profile data is necessary.
The approach taken here is to mmap() the `__llvm_prf_cnts` section onto
a raw profile. To do this, the linker must page-align the counter and
data sections, and the runtime must ensure that counters are mapped to a
page-aligned offset within a raw profile.
Continuous mode is (for the moment) incompatible with the online merging
mode. This limitation is lifted in https://reviews.llvm.org/D69586.
Continuous mode is also (for the moment) incompatible with value
profiling, as I'm not sure whether there is interest in this and the
implementation may be tricky.
As I have not been able to test extensively on non-Darwin platforms,
only Darwin support is included for the moment. However, continuous mode
may "just work" without modification on Linux and some UNIX-likes. AIUI
the default value for the GNU linker's `--section-alignment` flag is set
to the page size on many systems. This appears to be true for LLD as
well, as its `no_nmagic` option is on by default. Continuous mode will
not "just work" on Fuchsia or Windows, as it's not possible to mmap() a
section on these platforms. There is a proposal to add a layer of
indirection to the profile instrumentation to support these platforms.
rdar://54210980
Differential Revision: https://reviews.llvm.org/D68351
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This includes a few nice bits of refactoring (e.g splitting out the
exclusive locking code into a common utility).
Hopefully the Windows support is fixed now.
Patch by Rainer Orth!
Differential Revision: https://reviews.llvm.org/D40944
llvm-svn: 320731
This includes a few nice bits of refactoring (e.g splitting out the
exclusive locking code into a common utility).
Patch by Rainer Orth!
Differential Revision: https://reviews.llvm.org/D40944
llvm-svn: 320726
Introduces a 'owner' struct to include the overridable write
method and the write context in C.
This allows easy introdution of new member API to help reduce
profile merge time in the follow up patch.
llvm-svn: 306432
This is part-3 of the effort to eliminate dependency on
libc allocator in instr profiler runtime. With this change,
the profile dumper is completely free of malloc/calloc.
Value profile instr API implementation is the only remaining
piece with calloc dependency.
llvm-svn: 269576
With this change, dynamic memory allocation is only used
for testing purpose. This change is one of the many steps to
make instrument profiler dynamic allocation free.
llvm-svn: 269453
This reverts commit r268840, as it breaks Thumb2 self-hosting. There is something
unstable in the profiling for Thumb2 that needs to be sorted out before we continue
implementing these changes to the profiler. See PR27667.
llvm-svn: 268864
Compiler-rt miscalculates the number of entries in the __llvm_prf_data section
on i386 Darwin. This results in a number of test failures (which we started
catching after r261344).
The fix we attempted earlier is insufficient (r261683). It caused some tests to
start passing again, but that hid the fact that we drop some data entries.
This patch should fix the real problem. It fixes the way we compute DataSize by
taking into account the way the Darwin linker lays out __llvm_prf_data.
Differential Revision: http://reviews.llvm.org/D17623
llvm-svn: 261957
Extract the buffered filer writer code used by value profile
writer and turn it into common/sharable buffered fileIO
interfaces. Added a test case for the buffered file writer and
rewrite the VP dumping using the new APIs.
llvm-svn: 256604
The profile reader no longer depends on this field to be updated and point
to owning func's vp data. The VP data also no longer needs to be allocated
in a contiguous memory space.
Differential Revision: http://reviews.llvm.org/D15258
llvm-svn: 256543
(patch suggested by silvas)
With this patch, the IO information is wrapped in struct
ProfDataIOVec, and interface of writerCallback takes a vector
of IOVec and a pointer to writer context pointer.
Differential Revision: http://reviews.llvm.org/D14859
llvm-svn: 253764
Value profile enumerator change to match LLVM code
ProfData new member field name change to match LLVM code
ProfData member type change to match LLVM code
Do not use lower case for types that are internal to implementation (not exposed to APIs)
There is no functional change. This is a preparation patch to enable more code sharing
in follow up patches
Differential Revision: http://reviews.llvm.org/D14841
llvm-svn: 253700