When CurrentLoadedOffset is less than TotalSize, current code will
trigger unsigned overflow and will not return an "allocation failed"
indicator.
Google ref: b/248613299
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D135192
GCC behavior regarding defining __SOFTFP__ when (implicitly) specifying
-mfloat-abi=softfp:
- compile without (implicit) FP: define __SOFTFP__
- compile with (implicit) FP: don't define __SOFTFP__
Currently Clang doesn't define __SOFTFP__ when softfp is specified, either with
or without FP. This patch brings Clang in line with GCC behavior.
This was raised by itaig1 over on Github:
https://github.com/llvm/llvm-project/issues/55755
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D135680
This follows the path that AArch64 SVE has taken. Doing this via a function attribute set in the frontend is basically a workaround for the fact that several analyzes which need the information (i.e. known bits, lvi, scev) can't easily use TTI without significant amounts of plumbing changes.
This patch hard codes "v" numbers, and directly follows the SVE precedent as a result. In a follow up, I hope to drive this from RISCVISAInfo.h/cpp instead, but the MinVLen number being returned from that interface seemed to always be 0 (which is wrong), and I haven't figured out what's going wrong there.
Differential Revision: https://reviews.llvm.org/D135894
VisbleModuleSet::setVisible takes a callback, to inform of modules
being made (transitively) visible. However, we were calling it as
'Vis(M)' from a recursive lambda, where 'M' is a capture of
setVisible's M, module parameter. Thus we can invoke the callback
multiple times, passing the same value to it each time.
Everywhere else in the lambda, we refer to V.M of the lambda's
Visiting parameter. We should be doing so for the callback. Thus
we'll pass the outermost module on the outermost recursive call, and
as we descend the imports, we'll pass each import to the callback.
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D135958
Reference: https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html
k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.
m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.
ZB: An address that is held in a general-purpose register. The offset
is zero.
ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.
Note:
The INLINEASM SDNode flags in below tests are updated because the new
introduced enum `Constraint_k` is added before `Constraint_m`.
llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
llvm/test/CodeGen/X86/callbr-asm-kill.mir
This patch passes `ninja check-all` on a X86 machine with all official
targets and the LoongArch target enabled.
Differential Revision: https://reviews.llvm.org/D134638
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.
The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
is added to the definition of the function. Along with the existing
always_inline intrinsic, this will give a compile time error if the
function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
the target feature and gives an error if the builtin is found in a
function without the required features, similar to arm_sve.h.
The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.
Differential Revision: https://reviews.llvm.org/D132034
Prune the search space -- If we know offset(LastFileIDLookup) < SearchOffset, we
can prune the initial binary-search range from [0, end) to [LastFileIDlookup, end).
It reduces the binary search scan by ~30%.
SemaExpr.cpp: 1393437 -> 1035426
FindTarget.cpp: 1275930 -> 920087
Linux kernel:
getFileIDLocal: 2.45% -> 2.15%
Differential Revision: https://reviews.llvm.org/D135132
Similar to getFileIDLocal patch, but for the version for load module.
Test with clangd (building AST with preamble), FileID scans in binary
search is reduced:
SemaExpr.cpp: 142K -> 137K (-3%)
FindTarget.cpp: 368K -> 343K (-6%)
Differential Revision: https://reviews.llvm.org/D135258
In HIP a library is usually compiled with default target ID e.g. gfx906 so that
it can be used in all GPU configurations. The bitcode is saved in bundled
bitcode with gfx906 in entry ID.
In runtime compilation, a HIP program is compiled with a target ID matching
the GPU configuration, e.g. gfx906:xnack-. This program needs to link with
a library bundled bitcode with target ID gfx906.
For example:
clang --offload-arch=gfx906 -o lib.o lib.hip
clang --offload-arch=gfx906:xnack- program.hip lib.o
This common use case requires that clang-offlod-bundler to be able to extract
entry with compatible target ID, e.g. extracting an gfx906 entry when requesting
gfx906:xnack-.
Currently clang-offload-bundler only allow extracting entry with exact match
of target ID. This patch relaxes that so that it can extract entries with compatible
target ID.
Reviewed by: Artem Belevich, Saiyedul Islam
Differential Revision: https://reviews.llvm.org/D134546
(& remove PSPTargetInfo because it's unused - it had the wrong ctor in
it anyway, so wouldn't've been able to be instantiated - must've
happened due to bitrot over the years)
Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.
Differential Revision: https://reviews.llvm.org/D135220
isBeforeInTranslationUnit compares SourceLocations across FileIDs by
mapping them onto a common ancestor file, following include/expansion edges.
It is possible to get a tie in the common ancestor, because multiple
"chunks" of a macro arg will expand to the same macro param token in the body:
#define ID(X) X
#define TWO 2
ID(1 TWO)
Here two FileIDs both expand into `X` in ID's expansion:
- one containing `1` and spelled on line 3
- one containing `2` and spelled by the macro expansion of TWO
isBeforeInTranslationUnit breaks this tie by comparing the two FileIDs:
the one "on the left" is always created first and is numerically smaller.
This seems correct so far.
Prior to this patch it also takes a shortcut (unclear if intentionally).
Instead of comparing the two FileIDs that directly expand to the same location,
it compares the original FileIDs being compared. These may not be the
same if there are multiple macro expansions in between.
This *almost* always yields the right answer, because macro expansion
yields "trees" of FileIDs allocated in a contiguous range: when comparing tree A
to tree B, it doesn't matter what representative you pick.
However, the splitting of >> tokens is modeled as macro expansion (as if
the first '>' was a macro that expands to a '>' spelled a scratch buffer).
This splitting occurs retroactively when parsing, so the FileID allocated is
larger than expected if it were a real macro expansion performed during lexing.
As a result, macro tree A can be on the left of tree B, and yet contain
a token-split FileID whose numeric value is *greator* than those in B.
In this case the tiebreak gives the wrong answer.
Concretely:
#define ID(X) X
template <typename> class S{};
ID(
ID(S<S<int>> x);
int y;
)
Given Greater = (typeloc of S<int>).getEndLoc();
Y = (decl of y).getLocation();
isBeforeInTranslationUnit(Greater, Y) should return true, but returns false.
Here the common FileID of (Greater, Y) is the body of the outer ID
expansion, and they both expand to X within it.
With the current tiebreak rules, we compare the FileID of Greater (a split)
to the FileID of Y (a macro arg expansion into X of the outer ID).
The former is larger because the token split occurred relatively late.
This patch fixes the issue by removing the shortcut. It tracks the immediate
FileIDs used to reach the common file, and uses these IDs to break ties.
In the example, we now compare the macro arg expansion of the inner ID()
to the macro arg expansion of Y, and find that it is smaller.
This requires some changes to the InBeforeInTUCacheEntry (sic).
We store a little more data so it's probably slightly slower.
It was difficult to resist more invasive changes:
- performance: the sizing is very suspicious, and once the cache "fills up"
we're thrashing a single entry
- API: the class seems to be needlessly complicated
However I tried to avoid mixing these with subtle behavior changes, and
will send a followup instead.
Differential Revision: https://reviews.llvm.org/D134685
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped, this patch attempts to bring it line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
atributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enabled/disables the specific feature, for
backward compatibility with previous releases.
To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.
Differential Revision: https://reviews.llvm.org/D133848
This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.
This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.
Differential Revision: https://reviews.llvm.org/D133885
k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.
m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.
ZB: An address that is held in a general-purpose register. The offset
is zero.
ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.
Differential Revision: https://reviews.llvm.org/D134638
A given function is compatible with all previous arch versions.
To avoid compering values of the attribute this logic adds all predecessor
architecture values.
Reviewed By: dmgreen, DavidSpickett
Differential Revision: https://reviews.llvm.org/D134353
This patch adds support for constraints `f`, `l`, `I`, `K` according
to [1]. The remain constraints (`k`, `m`, `ZB`, `ZC`) will be added
later as they are a little more complex than the others.
f: A floating-point register (if available).
l: A signed 16-bit constant.
I: A signed 12-bit constant (for arithmetic instructions).
K: An unsigned 12-bit constant (for logic instructions).
For now, no need to support register alias (e.g. `$a0`) in llvm as
clang will correctly decode the usage of register name aliases into
their official names. And AFAIK, the not yet upstreamed `rustc` for
LoongArch will always use official register names (e.g. `$r4`).
[1] https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html
Differential Revision: https://reviews.llvm.org/D134157
Clang with debug builds will crash when run with empty target feature input.
And the warning message is a little bit confusing. This patch adds an empty
check and a new diagnostic to illustrate where goes wrong.
Reviewed By: MaskRay, aaron.ballman
Differential Revision: https://reviews.llvm.org/D133563
`__builtin_arm_crc*` requires the target feature crc which is available on armv8
and above. Calling the fuctions for armv7 leads to a SelectionDAG crash.
```
% clang -c --target=armv7-unknown-linux-gnueabi -c a.c
fatal error: error in backend: Cannot select: intrinsic %llvm.arm.crc32b
PLEASE submit a bug report to ...
```
Add `TARGET_BUILTIN` and define required features for these builtins to
report an error in `CodeGenFunction::checkTargetFeatures`. The problem is quite widespread.
I will add `TARGET_BUILTIN` for more builtins later.
Fix https://github.com/llvm/llvm-project/issues/57802
Differential Revision: https://reviews.llvm.org/D134127
This is first part for support cbuffer/tbuffer.
The format for cbuffer/tbuffer is
BufferType [Name] [: register(b#)] { VariableDeclaration [: packoffset(c#.xyzw)]; ... };
More details at https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-constants
New keyword 'cbuffer' and 'tbuffer' are added.
New AST node HLSLBufferDecl is added.
Build AST for simple cbuffer/tbuffer without attribute support.
The special thing is variables declared inside cbuffer is exposed into global scope.
So isTransparentContext should return true for HLSLBuffer.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129883
Fix __builtin_assume_aligned incorrect type descriptor
example from @rsmith
struct A { int n; };
struct B { int n; };
struct C : A, B {};
void *f(C *c) {
// Incorrectly returns `c` rather than the address of the B base class.
return __builtin_assume_aligned((B*)c, 8);
}
Differential Revision: https://reviews.llvm.org/D133583
Reuse most of RISCV's implementation with several exceptions:
1. Assign signext/zeroext attribute to args passed in stack.
On RISCV, integer scalars passed in registers have signext/zeroext
when promoted, but are anyext if passed on the stack. This is defined
in early RISCV ABI specification. But after this change [1], integers
should also be signext/zeroext if passed on the stack. So I think
RISCV's ABI lowering should be updated [2].
While in LoongArch ABI spec, we can see that integer scalars narrower
than GRLEN bits are zero/sign-extended no matter passed in registers
or on the stack.
2. Zero-width bit fields are ignored.
This matches GCC's behavior but it hasn't been documented in ABI sepc.
See https://gcc.gnu.org/r12-8294.
3. `char` is signed by default.
There is another difference worth mentioning is that `char` is signed
by default on LoongArch while it is unsigned on RISCV.
This patch also adds `_BitInt` type support to LoongArch and handle it
in LoongArchABIInfo::classifyArgumentType.
[1] cec39a064e
[2] https://github.com/llvm/llvm-project/issues/57261
Differential Revision: https://reviews.llvm.org/D132285
Some HLSL functionality is gated on the target shader model version.
Enabling the use of availability markup allows us to diagnose
availability issues easily in the frontend.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D134067
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead. Leave the few call sites that
use a locally defined `array_lengthof` that are meant to test previous bugs
with NTTPs in clang analyzer and SemaTemplate.
Differential Revision: https://reviews.llvm.org/D133520
Make the DataLayout string always hold a vector alignment of 8 bytes,
regardless of the vector ABI. This makes the datalayout depend only on the
target triple which is the general expectation (in assertions).
On older architectures where vectors use the natural alignment (16 bytes),
the front end will maintain the same behavior and produce an overalignment
compared to the datalayout.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D131158
The backend now has a 32bit feature as part of the recent mtune
patch. We can now use that make our rv32-only builtin error checking
work the same way as rv64-only errors.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D132192
`report_fatal_error` is overloaded on `StringRef` and `Twine &`, therefore passing a `std::string` argument leads to ambiguity as it is convertible to either type.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D132745
This addresses an accidental change in behavior from
41667a8b9b to get the bots back to green.
However, I think there's an issue with LLDB assuming it's valid to
enable support for keywords in language modes that don't support the
keyword (as other parts of Clang are not expecting to be able to do
that).
This should fix (and others):
https://lab.llvm.org/buildbot/#/builders/68/builds/38374
The patch diagnoses an identifier as a future keyword if it exists in a
future language mode, such as:
int restrict;
in C modes earlier than C99. We now give a warning to the user that
such an identifier is a future keyword. Handles keywords from C as well
as C++.
Differential Revision: https://reviews.llvm.org/D131683