We were only including MachineInstr.h for DebugLoc.h. This exposes an implicit include dependency in BTFDebug.h where I've had to add the MachineInstr.h include.
Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration.
This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.
Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
struct ipv6_key_t {
unsigned pid;
unsigned __int128 saddr;
unsigned short lport;
};
clang will generate IR type
%struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.
But it is common practice for a tracing program to be
first compiled with target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing program often refers to kernel internal
data structures which needs to be compiled with non-bpf target.
But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.
For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
%struct.ipv6_key_t = type { i32, i128, i16 }
Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above is 32 bytes,
which leads incorrect code later.
The x86_64 does not have this issue as it does not have
"i128:128" in its layout spec as it does permits i128 to
be alignmented at 8 bytes at stack. Its IR type looks like
%struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
The fix here is add i128 support in layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.
Differential Revision: https://reviews.llvm.org/D76587
Currently, BPF does not support dynamic static allocation.
For a program like below:
extern void bar(int *);
void foo(int n) {
int a[n];
bar(a);
}
The current error message looks like:
unimplemented operand
UNREACHABLE executed at /.../llvm/lib/Target/BPF/BPFISelLowering.cpp:199!
Let us make error message explicit so it will be clear to the user
what is the problem. With this patch, the error message looks like:
fatal error: error in backend: Unsupported dynamic stack allocation
...
Differential Revision: https://reviews.llvm.org/D74521
Currently, isTruncateFree() and isZExtFree() callbacks return false
as they are not implemented in BPF backend. This may cause suboptimal
code generation. For example, if the load in the context of zero extension
has more than one use, the pattern zextload{i8,i16,i32} will
not be generated. Rather, the load will be matched first and
then the result is zero extended.
For example, in the test together with this commit, we have
I1: %0 = load i32, i32* %data_end1, align 4, !tbaa !2
I2: %conv = zext i32 %0 to i64
...
I3: %2 = load i32, i32* %data, align 4, !tbaa !7
I4: %conv2 = zext i32 %2 to i64
...
I5: %4 = trunc i64 %sub.ptr.lhs.cast to i32
I6: %conv13 = sub i32 %4, %2
...
The I1 and I2 will match to one zextloadi32 DAG node, where SUBREG_TO_REG is
used to convert a 32bit register to 64bit one. During code generation,
SUBREG_TO_REG is a noop.
The %2 in I3 is used in both I4 and I6. If isTruncateFree() is false,
the current implementation will generate a SLL_ri and SRL_ri
for the zext part during lowering.
This patch implement isTruncateFree() in the BPF backend, so for the
above example, I3 and I4 will generate a zextloadi32 DAG node with
SUBREG_TO_REG is generated during lowering to Machine IR.
isZExtFree() is also implemented as it should help code gen as well.
This patch also enables the change in https://reviews.llvm.org/D73985
since it won't kick in generates MOV_32_64 machine instruction.
Differential Revision: https://reviews.llvm.org/D74101
Summary:
Add a new method (tryParseRegister) that attempts to parse a register specification.
MASM allows the use of IFDEF <register>, as well as IFDEF <symbol>. To accommodate this, we make it possible to check whether a register specification can be parsed at the current location, without failing the entire parse if it can't.
Reviewers: thakis
Reviewed By: thakis
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73486
The compiler may transform the following code
ctx = ctx + reloc_offset
... (*(u32 *)ctx) & 0x8000 ...
to
ctx = ctx + reloc_offset
... (*(u8 *)(ctx + 1)) & 0x80 ...
where reloc_offset will be replaced with a constant during
AsmPrinter phase.
The above transformed code will be rejected the kernel verifier
as it does not allow
*(type *)((ctx + non_zero_offset1) + non_zero_offset2)
style access pattern.
It is hard at SelectionDag phase to identify whether a load
is related to context or not. Sometime, interprocedure analysis
may be needed. So let us simply prevent such optimization
from happening.
Differential Revision: https://reviews.llvm.org/D73997
Linux commit
1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
typedef struct {
int a;
} __t;
#pragma clang attribute pop
int test(__t *arg) { return arg->a; }
The current approach to use struct/union type in the relocation record will
result in an anonymous struct, which make later type matching difficult
in bpf loader. In fact, current BPF backend will fail the above program
with assertion:
clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
Assertion `TypeName.size()' failed.
clang will change to use the type of the base of the member access
which will preserve the typedef modifier for the
preserve_{struct,union}_access_index intrinsics in the above example.
Here we adjust BPF backend to accept that the debuginfo
type metadata may be 'typedef' and handle them properly.
Differential Revision: https://reviews.llvm.org/D73902
Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code.
Reviewers: courbet
Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73785
The recommended optimization level for BPF programs
is O2 since (1). BPF is running inside the kernel and
linux kernel won't work at -O0 level, and (2). Verifier
is not able to handle O0 code properly, e.g., potential
large stack size and a lot of spills.
But we should keep -O0 at least compiling.
This patch fixed a bug in BPFMISimplifyPatchable phase
where with -O0, a segmentation fault will happen for a
simple program like:
int test(int a, int b) { return a + b; }
A test case is added to capture such a case.
Differential Revision: https://reviews.llvm.org/D73681
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.
A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.
This patch reduces the number of public symbols in libLLVM.so by about
25%. This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so
One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.
Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278
Reviewers: chandlerc, beanz, mgorny, rnk, hans
Reviewed By: rnk, hans
Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D54439
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.
If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
Previously extern function is added as BTF_KIND_VAR. This does not work
well with existing BTF infrastructure as function expected to use
BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO.
This patch added extern function to BTF_KIND_FUNC. The two bits 0:1
of btf_type.info are used to indicate what kind of function it is:
0: static
1: global
2: extern
Differential Revision: https://reviews.llvm.org/D71638
printInst prints a branch/call instruction as `b offset` (there are many
variants on various targets) instead of `b address`.
It is a convention to use address instead of offset in most external
symbolizers/disassemblers. This difference makes `llvm-objdump -d`
output unsatisfactory.
Add `uint64_t Address` to printInst(), so that it can pass the argument to
printInstruction(). `raw_ostream &OS` is moved to the last to be
consistent with other print* methods.
The next step is to pass `Address` to printInstruction() (generated by
tablegen from the instruction set description). We can gradually migrate
targets to print addresses instead of offsets.
In any case, downstream projects which don't know `Address` can pass 0 as
the argument.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D72172
Previous btf field relocation is always at assignment like
r1 = 4
which is converted from an ld_imm64 instruction.
This patch did an optimization such that relocation
instruction might be load/store/shift. Specically, the
following insns may also have relocation, except BPF_MOV:
LDB, LDH, LDW, LDD, STB, STH, STW, STD,
LDB32, LDH32, LDW32, STB32, STH32, STW32,
SLL, SRL, SRA
To accomplish this, a few BPF target specific
codegen only instructions are invented. They
are generated at backend BPF SimplifyPatchable phase,
which is at early llc phase when SSA form is available.
The new codegen only instructions will be converted to
real proper instructions at the codegen and BTF emission stage.
Note that, as revealed by a few tests, this optimization might
be actual generating more relocations:
Scenario 1:
if (...) {
... __builtin_preserve_field_info(arg->b2, 0) ...
} else {
... __builtin_preserve_field_info(arg->b2, 0) ...
}
Compiler could do CSE to only have one relocation. But if both
of the above is translated into codegen internal instructions,
the compiler will not be able to do that.
Scenario 2:
offset = ... __builtin_preserve_field_info(arg->b2, 0) ...
...
... offset ...
... offset ...
... offset ...
For whatever reason, the compiler might be temporarily do copy
propagation of the righthand of "offset" assignment like
... __builtin_preserve_field_info(arg->b2, 0) ...
... __builtin_preserve_field_info(arg->b2, 0) ...
and CSE will be able to deduplicate later.
But if these intrinsics are converted to BPF pseudo instructions,
they will not be able to get deduplicated.
I do not expect we have big instruction count difference.
It may actually reduce instruction count since now relocation
is in deeper insn dependency chain.
For example, for test offset-reloc-fieldinfo-2.ll, this patch
generates 7 instead of 6 relocations for non-alu32 mode, but it
actually reduced instruction count from 29 to 26.
Differential Revision: https://reviews.llvm.org/D71790
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
object file size.
- Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
Currently for extern variables with section attribute, those
BTF_KIND_VARs will not be placed in any DataSec. This is
inconvenient as any other generated BTF_KIND_VAR belongs to
one DataSec. This patch put these extern variables into
".extern" section so bpf loader can have a consistent
processing mechanism for all data sections and variables.
extern variable usage in BPF is different from traditional
pure user space application. Recent discussion in linux bpf
mailing list has two use cases where debug info types are
required to use extern variables:
- extern types are required to have a suitable interface
in libbpf (bpf loader) to provide kernel config parameters
to bpf programs.
https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t
- extern types are required so kernel bpf verifier can
verify program which uses external functions more precisely.
This will make later link with actual external function no
need to reverify.
https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed
This patch added bpf support to consume such info into BTF,
which can then be used by bpf loader. Function processFuncPrototypes()
only adds extern function definitions into BTF. The functions
with actual definition have been added to BTF in some other places.
Differential Revision: https://reviews.llvm.org/D70697
Generate types for global variables with "weak" attribute.
Keep allocation scope the same for both weak and non-weak
globals as ELF symbol table can determine whether a global
symbol is weak or not.
Differential Revision: https://reviews.llvm.org/D71162
Currently, BPF backend creates some global variables with name like
<type_name>:<reloc_type>:<patch_imm>$<access_str>
to carry certain information to BPF backend.
With direct clang compilation, the following code in
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
is triggered and the above globals are emitted to the ELF file.
(clang enabled this as opt flag -faddrsig is on by default.)
if (TM.Options.EmitAddrsig) {
// Emit address-significance attributes for all globals.
OutStreamer->EmitAddrsig();
for (const GlobalValue &GV : M.global_values())
if (!GV.use_empty() && !GV.isThreadLocal() &&
!GV.hasDLLImportStorageClass() && !GV.getName().startswith("llvm.") &&
!GV.hasAtLeastLocalUnnamedAddr())
OutStreamer->EmitAddrsigSym(getSymbol(&GV));
}
...
10162: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:0:2048$0:117
10163: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:0:2112$0:126:0
10164: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:1:8$0:31:6
...
While in llc, those globals are not emited since EmitAddrsig
default option is false for llc. The llc flag "-addrsig" can be used to
enable the above code.
This patch added "llvm." prefix to these internal globals so that
they can be ignored in the above codes and possible other
places.
Differential Revision: https://reviews.llvm.org/D70703
Commit a0841dfe85 ("[BPF] Fix a bug in peephole optimization")
fixed a bug in peephole optimization. Recursion is introduced
to handle COPY and PHI instructions.
Unfortunately, multiple PHI instructions may form a cycle
and this will cause infinite recursion, eventual segfault.
For Commit a0841dfe85, I indeed tried a few loops to ensure
that I won't see the recursion, but I did not try with
complex control flows, which, as demonstrated with the test case
in this patch, may introduce PHI cycles.
This patch fixed the issue by introducing a set to remember
visited PHI instructions. This way, cycles can be properly
detected and handled.
Differential Revision: https://reviews.llvm.org/D70586
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
One of current peephole optimiations is to remove SLL/SRL if
the sub register has been zero extended. This phase has two bugs
and one limitations.
First, for the physical subregister used in pseudo insn COPY
like below, it permits incorrect optimization.
%0:gpr32 = COPY $w0
...
%4:gpr = MOV_32_64 %0:gpr32
%5:gpr = SLL_ri %4:gpr(tied-def 0), 32
%6:gpr = SRA_ri %5:gpr(tied-def 0), 32
The $w0 could be from the return value of a previous function call
and its upper 32-bit value might contain some non-zero values.
The same applies to function arguments.
Second, the current code may permits removing SLL/SRA like below:
%0:gpr32 = COPY $w0
%1:gpr32 = COPY %0:gpr32
...
%4:gpr = MOV_32_64 %1:gpr32
%5:gpr = SLL_ri %4:gpr(tied-def 0), 32
%6:gpr = SRA_ri %5:gpr(tied-def 0), 32
The reason is that it did not follow def-use chain to skip all
intermediate 32bit-to-32bit COPY instructions.
The current implementation is also very conservative for PHI
instructions. If any PHI insn component is another PHI or COPY insn,
it will just permit SLL/SRA.
This patch fixed the issue as follows:
- During def/use chain traversal, if any physical register is read,
SLL/SRA will be preserved as these physical registers are mostly
from function return values or current function arguments.
- Recursively visit all COPY and PHI instructions.
Enable to generate BTF_KIND_VARs for non-static
default-section globals which is not allowed previously.
Modified the existing test case to accommodate the new change.
Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and
VAR_GLOBAL_EXTERNAL.
Differential Revision: https://reviews.llvm.org/D70145
Commit fff2721286 ("[BPF] Fix CO-RE bugs with bitfields")
fixed CO-RE handling bitfield issues. But the implementation
introduced a use after free bug. The "Base" of the intrinsic
might be freed so later on accessing the Type of "Base"
might access the freed memory. The failed test case,
CodeGen/BPF/CORE/offset-reloc-middle-chain.ll
is exactly used to test such a case.
Similarly to previous attempt to remember Metadata etc,
remember "Base" pointee Alignment in advance to avoid
such use after free bug.
bitfield handling is not robust with current implementation.
I have seen two issues as described below.
Issue 1:
struct s {
long long f1;
char f2;
char b1:1;
} *p;
The current approach will generate an access bit size
56 (from b1 to the end of structure) which will be
rejected as it is not power of 2.
Issue 2:
struct s {
char f1;
char b1:3;
char b2:5;
char b3:6:
char b4:2;
char f2;
};
The LLVM will group 4 bitfields together with 2 bytes. But
loading 2 bytes is not correct as it violates alignment
requirement. Note that sometimes, LLVM breaks a large
bitfield groups into multiple groups, but not in this case.
To resolve the above two issues, this patch takes a
different approach. The alignment for the structure is used
to construct the offset of the bitfield access. The bitfield
incurred memory access is an aligned memory access with alignment/size
equal to the alignment of the structure.
This also simplified the code.
This may not be the optimal memory access in terms of memory access
width. But this should be okay since extracting the bitfield value
will have the same amount of work regardless of what kind of
memory access width.
Differential Revision: https://reviews.llvm.org/D69837
During deriving proper bitfield access FIELD_BYTE_SIZE,
function Member->getStorageOffsetInBits() is used to
get llvm IR type storage offset in bits so that
the byte size can permit aligned loads/stores with previously
derived FIELD_BYTE_OFFSET.
The function should only be used with bitfield members and it will
assert if ASSERT is turned on during cmake build.
Constant *getStorageOffsetInBits() const {
assert(getTag() == dwarf::DW_TAG_member && isBitField());
if (auto *C = cast_or_null<ConstantAsMetadata>(getExtraData()))
return C->getValue();
return nullptr;
}
This patch fixed the issue by using Member->isBitField()
directly and a test case is added to cover this missing case.
This issue is discovered when running Andrii's linux kernel CO-RE
tests.
Differential Revision: https://reviews.llvm.org/D69761
Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that
with -mattr=+alu32 CO-RE has a segfault in BPF MISimplifyPatchable
pass.
The pattern will be transformed by MISimplifyPatchable
pass looks like below:
r5 = ld_imm64 @"b:0:0$0:0"
r2 = ldw r5, 0
... r2 ... // use r2
The pass will remove the intermediate 'ldw' instruction
and replacing all r2 with r5 likes below:
r5 = ld_imm64 @"b:0:0$0:0"
... r5 ... // use r5
Later, the ld_imm64 insn will be replaced with
r5 = <patched immediate>
for field relocation purpose.
With -mattr=+alu32, the input code may become
r5 = ld_imm64 @"b:0:0$0:0"
w2 = ldw32 r5, 0
... w2 ... // use w2
Replacing "w2" with "r5" is incorrect and will
trigger compiler internal errors.
To fix the problem, if the register class of ldw* dest
register is sub_32, we just replace the original ldw*
register with:
w2 = w5
Directly replacing all uses of w2 with in-place
constructed w5 for the use operand seems not working in all cases.
The latest kernel will have -mattr=+alu32 on by default,
so added this flag to all CORE tests.
Tested with latest kernel bpf-next branch as well with this patch.
Differential Revision: https://reviews.llvm.org/D69438
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
Currently, for indirect call, the assembly code printed out as
callx <imm>
This is not right, it should be
callx <reg>
Fixed the issue with proper format.
Differential Revision: https://reviews.llvm.org/D69229
llvm-svn: 375386
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68993
llvm-svn: 375084
Currently, BPF backend is doing truncation elimination. If one truncation
is performed on a value defined by narrow loads, then it could be redundant
given BPF loads zero extend the destination register implicitly.
When the definition of the truncated value is a merging value (PHI node)
that could come from different code paths, then checks need to be done on
all possible code paths.
Above described optimization was introduced as r306685, however it doesn't
work when there is back-edge, for example when loop is used inside BPF
code.
For example for the following code, a zero-extended value should be stored
into b[i], but the "and reg, 0xffff" is wrongly eliminated which then
generates corrupted data.
void cal1(unsigned short *a, unsigned long *b, unsigned int k)
{
unsigned short e;
e = *a;
for (unsigned int i = 0; i < k; i++) {
b[i] = e;
e = ~e;
}
}
The reason is r306685 was trying to do the PHI node checks inside isel
DAG2DAG phase, and the checks are done on MachineInstr. This is actually
wrong, because MachineInstr is being built during isel phase and the
associated information is not completed yet. A quick search shows none
target other than BPF is access MachineInstr info during isel phase.
For an PHI node, when you reached it during isel phase, it may have all
predecessors linked, but not successors. It seems successors are linked to
PHI node only when doing SelectionDAGISel::FinishBasicBlock and this
happens later than PreprocessISelDAG hook.
Previously, BPF program doesn't allow loop, there is probably the reason
why this bug was not exposed.
This patch therefore fixes the bug by the following approach:
- The existing truncation elimination code and the associated
"load_to_vreg_" records are removed.
- Instead, implement truncation elimination using MachineSSA pass, this
is where all information are built, and keep the pass together with other
similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move
elimination logic is updated accordingly.
- Unit testcase included + no compilation errors for kernel BPF selftest.
Patch Review
===
Patch was sent to and reviewed by BPF community at:
https://lore.kernel.org/bpf
Reported-by: David Beckett <david.beckett@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 375007