LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
We will insert a new operand which is identical to the Dest for complex
FMUL with a mask. https://godbolt.org/z/eTEdnYv3q
Complex FMA and FMUL with maskz don't have this problem.
Reviewed By: LuoYuanke, skan
Differential Revision: https://reviews.llvm.org/D130638
Summary:
Introduce NeverAlign fragment type.
The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.
In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.
The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.
LLVM: https://reviews.llvm.org/D97982
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D31361547
In MASM, if a QWORD symbol is passed to a jmp or call instruction in
64-bit mode or a DWORD or WORD symbol is passed in 32-bit mode, then
MSVC's assembler recognizes that as an indirect call. Additionally, if
the operand is qualified as a ptr, then that should also be an indirect
call.
Furthermore, in 64-bit mode, such operands are implicitly rip-relative
(in fact, MSVC's assembler ml64.exe does not allow explicitly specifying
rip as a base register.)
To keep this patch managable, this patch does not include:
* error messages for wrong operand types (e.g. passing a QWORD in 32-bit
mode)
* resolving indirect calls if the symbol is declared after it's first
use (llvm-ml currently only runs a single pass).
* imlementing the extern keyword (required to resolve
https://crbug.com/762167.)
This patch is likely missing a bunch of edge cases, so please do point
them out in the review.
Reviewed By: epastor, hans, MaskRay
Committed By: epastor (on behalf of ayzhao)
Differential Revision: https://reviews.llvm.org/D124413
-Rename Mode*Bit to Is*Bit to match X86Subtarget.
-Rename FeatureLAHFSAHF to FeatureLAFHSAFH64 to match X86Subtarget.
-Use consistent capitalization
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D121975
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This reverts commit a954558e87.
Thanks Yuanfang's help. I think I found the root cause of the buildbot
fail.
The failed test has both Memory and Immediate X86Operand. All data of
different operand kinds share the same memory space by a union
definition. So it has chance we get the wrong result if we don't check
the operand kind.
It's probably it happen to be the correct value in my local environment
so that I can't reproduce the fail.
Differential Revision: https://reviews.llvm.org/D116090
D115225 tried to roll back the effects on symbols of MS inline asm
introduced by D113096. But the combination of the conditions cannot
match all the changes. As a result, there are still fails after the
patch.
This patch fixes the problem by checking the exact conditions for MS
global variables, i.e., variable (by FrontendSize != 0) + non rip/eip
(by DefaultBaseReg == 0), so that we can fully roll back for D113096.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D116090
D113096 solved the "undefined reference to xxx" issue by adding
constraint *m for the global var. But it has strong side effect due to
the symbol in the assembly being replaced with constraint variable.
This leads to some lowering fails. https://godbolt.org/z/h3nWoerPe
This patch fix the problem by use the constraint *m as place holder
rather than real constraint. It has negligible effect for the existing
code generation.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D115225
Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidth.
Helps prevent future scheduler model mismatches like those that were only addressed in D44687.
Differential Revision: https://reviews.llvm.org/D113302
Constraint `*m` should be used when the address of a variable is passed
as a value. And the constraint is missing for MS inline assembly when sth
is written to the address of the variable.
The missing would cause FE delete the definition of the static varible,
and then result in "undefined reference to xxx" issue.
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D113096
This patch is to order the AVX instructions ahead of AVX512 instructions
in the matching table so that the AVX instructions can be matched first.
Thanks Craig and Shengchen for the idea.
Differential Revision: https://reviews.llvm.org/D111538
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.
There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.
For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.
This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.
Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.
Differential Revision: https://reviews.llvm.org/D45961
ML64.EXE applies implicit RIP-relative addressing only to memory references that include a named-variable reference.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D105372
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
Handle "short" in a case-insensitive fashion in MASM.
Required to correctly parse z_Windows_NT-586_asm.asm from the OpenMP runtime.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D104195
Did not correctly handle "jecxz short <address>".
Discovered while working on LLVM-ML; shows up in z_Windows_NT-586_asm.asm from the OpenMP runtime
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104194
`X86AsmParser::ParseIntelExpression` has a while loop. In the body,
calls to MCAsmLexer::UnLex can force a reallocation in the MCAsmLexer's
`CurToken` SmallVector, invalidating saved references to
`MCAsmLexer::getTok()`.
`const MCAsmToken &Tok` is such a saved reference, and this moves it
from outside the while loop to inside the body, fixing a
use-after-realloc.
`Tok` will still be reused across calls to `Lex()`, each of which
effectively destroys and constructs the pointed-to token. I'm a bit
skeptical of this usage pattern, but it seems broadly used in the
X86AsmParser (and others) so I'm leaving it alone (for now).
Somehow this bug was exposed by https://reviews.llvm.org/D94739,
resulting in test failures in dot-operator related tests in
llvm/test/tools/llvm-ml. I suspect the exposure path is related to
optimizer changes from splitting up the grow operation, but I haven't
dug all the way in. Regardless, there are already tests in tree that
cover this; they might fail consistently if we added ASan
instrumentation to SmallVector.
Differential Revision: https://reviews.llvm.org/D95112
X86 allows for the "addr32" and "addr16" address size override prefixes.
Also, these and the segment override prefixes should be recognized as
valid prefixes.
Differential Revision: https://reviews.llvm.org/D94726
For MASM syntax, the prefixes are not enclosed in braces.
The assembly code should like:
"evex vcvtps2pd xmm0, xmm1"
Differential Revision: https://reviews.llvm.org/D90441