Currently, DWARFLinker receives kind of accel tables as predefined sets:
```
Apple, ///< .apple_names, .apple_namespaces, .apple_types, .apple_objc.
Dwarf, ///< DWARF v5 .debug_names.
Default, ///< Dwarf for DWARF5 or later, Apple otherwise.
Pub, ///< .debug_pubnames, .debug_pubtypes
```
This patch removes implicit sets of tables(Default, Dwarf) and allows to ask for several sets:
```
Apple, ///< .apple_names, .apple_namespaces, .apple_types, .apple_objc.
Pub, ///< .debug_pubnames, .debug_pubtypes
DebugNames ///< .debug_names.
```
It allows seamlessness adding more accel tables in the future: .gdb_index, .debug_cu_index...
Doing things that way, DWARFLinker will be independent of consumers' requirements.
f.e. dsymutil and llvm-dwarfutil may have different variants for Default set
(so, instead of implementing these differencies inside DWARFLinker it could be
implemented in the corresponding module).
Differential Revision: https://reviews.llvm.org/D132371
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
There's a data race on the UninterestingChunks set. The code seems to
be operating on the assumption that all the tasks completed, so ensure
the unused results do complete. This started showing up about 50% of
the time when running operands-skip-parallel.ll after the recent
switch to use DenseSet; previously it failed much less frequently with
std::set.
We should introduce a mechanism to early terminate unused
results. Alternatively, I've been thinking about ways to to make the
reduction order smarter. I frequently have tests that take multiple
minutes to compile and hit the failure. It may be helpful to see which
chunks took the least time and prefer those over just taking the first
result.
As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp convertions with more
than 128 bits. The expanded nodes are referring from the IR generated by
`compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`,
and etc.
Corner cases:
1. For fp16: as there is no related builtins added in compliler-rt. So I
mainly utilized the fp32 <-> fp16 lib calls to implement.
2. For fp80: as this pass is soft fp emulation and no fp80 instructions can
help in this problem. I recommend users to deprecate this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at top/end of the function.
3. For bf16: as clang FE currently doesn't support bf16 algorithm operations
(convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for
now.
4. For unsigned FPToI: since both default hardware behaviors and libgcc are
ignoring "returns 0 for negative input" spec. This pass follows this old way
to ignore unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M
The end-to-end tests are uploaded at https://reviews.llvm.org/D138261
Reviewed By: LuoYuanke, mgehre-amd
Differential Revision: https://reviews.llvm.org/D137241
We need to flatten the SampleFDO profile in profile supplementation
because the InstrFDO profile does not have inlined callsite counters.
Without flattening profile, FDO optimizations are not stable:
we will not supplement the second generation profile when the modified
functions are all inlined.
This patch fixes this issue: we will flatten the profile for functions
that appears in FDO profile.
Note that we only need to find the hot/warm functions in SampleFDO
profile, so we will not perform a full flatten. We will use
a DFS traversal to compute the accumulated entry count and max bodycount.
This is much cheaper than full flattening.
Differential Revision: https://reviews.llvm.org/D138893
The exports trie used to be pointed by the information in LC_DYLD_INFO,
but when chained fixups are present, the exports trie is pointed by
LC_DYLD_EXPORTS_TRIE instead.
Modify the Object library to give access to the information pointed by
each of the load commands, and to fallback from one into the other when
the exports are requested.
Modify ObjectYAML to support dumping the export trie when pointed by
LC_DYLD_EXPORTS_TRIE and to parse the existence of a export trie also
when the load command is present.
This is a split of D134250 with improvements on top.
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D134571