Because ventus riscv is designed specially for OpenCL language, we originally add or remove some language features mainly for serving OpenCL, but we now need to add customized `printf` function which is expected to be written in C, so we need also to add support for C language features in current ventus
Signed-off-by: zhoujingya <jing.zhou@terapines.com>
D130887 uses a dummy empty section `sanmd_covered` (with the SHF_GNU_RETAIN flag on
ELF) to prevent `undefined symbol: __start_sanmd_covered` if all `sanmd_covered`
are discarded by `ld --gc-sections` (in `-z start-stop-gc` mode).
The dummy `sanmd_covered` does not have the SHF_LINK_ORDER flag, so mixing it
with SHF_LINK_ORDER `sanmd_covered` causes an issue to GNU ld<2.36
(https://sourceware.org/bugzilla/show_bug.cgi?id=26256).
Similar to D98903 for SanitizerCoverage, let's make encapsulation symbols
undefined weak[1]. This additionally avoids size cost due to the dummy section and
symbol.
[1]: https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D139276
We support AMX-FP16 isa in https://reviews.llvm.org/D135941 now.
The old intrinsic interface need to manually write tile registers.
So we support its new intrinsic interface to let it be able to do register allocation.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D138987
After D137316 implements the intrinsics of the first crc check instruction
and related diagnosis, this patch implements the intrinsics of all remaining
crc check instructions.
Differential Revision: https://reviews.llvm.org/D138418
This aligns the behaviour with that of disabling optimisations for the
translation unit entirely. Not merging the traps allows us to keep
separate debug information for each, improving the debugging experience
when finding the cause for a ubsan trap.
Differential Revision: https://reviews.llvm.org/D137714
This was triggered by some code in picolibc. The minimal version looks
like this:
double infinity(void) {
return 5;
}
extern long double infinityl() __attribute__((__alias__("infinity")));
These two declarations have a different type (not because of the 'long
double', which is also 'double' in IR, but because infinityl has
variadic parameters). This led to a crash in the bitcast which assumed
address space 0.
Differential Revision: https://reviews.llvm.org/D138681
This used to be required, but the difference between asserts/!asserts
builds no longer exists for %clang_cc1 (only for %clang), so they pass
just fine without this flag.
`-fapprox-func` should be disabled by `-fp-model={strict|precise}`,
as well as other fast-math flags. See the last changes in
`clang/test/Driver/fp-model.c`.
Probably this route (`case options::OPT_ffp_model_EQ`) was forgot
to update in D106191 and D114564. There is no appropriate reason not
to disable the flag.
This commit also updates other regression tests, which are not directly
related to this bug, for consistency with other fast-math flags.
Differential Revision: https://reviews.llvm.org/D138109
Summary:
AIX library functions frexpl(), ldexpl(), and modfl() are for 128-bit IBM long double, i.e. __ibm128. Other *l() functions, e.g., acosl(), are for 64-bit long double. The AIX Clang compiler currently maps builtin functions __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to frexpl(), ldexpl(), and modfl() in 64-bit long double mode which results in seg-faults or incorrect return values. This patch changes to map __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to double version lib functions frexp(), ldexp() and modf() in 64-bit long double mode.
Reviewed by: hubert.reinterpretcast, daltenty
Differential Revision: https://reviews.llvm.org/D137986
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
-ffp-model=strict -ffp-model=fast will still enable strict exception
handling behavior, therefore clang still emits constrained FP operations
in IR.
-ffp-model=fast -ffp-model=strict emits two warnings: one for strict
overriding fast, the other for strict overriding strict, which is
confusing.
Reviewed By: zahiraam
Differential Revision: https://reviews.llvm.org/D137618
This avoids depending on int->float or double->float conversion.
Improving codegen with #pragma STDC FENV_ACCESS ON.
Really we should improve constant folding somewhere, but this was
a cheap and easy improvement.
Fixes PR59052.
A scalar which exceeds 4 bytes should be returned via a stack slot,
on an AVRTiny device.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D138125
The conditions for which Clang emits the `unsafe-fp-math` function
attribute has been modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators `"unsafe-fp-math"="true"` enable floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators performing contractions when that
is not expected.
However the change is inaccurate and incomplete because it allows
`unsafe-fp-math` to be set also when only in-statement contraction is
allowed.
Consider the following example
```
float foo(float a, float b, float c) {
float tmp = a * b;
return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
-O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.
The optimized IR is:
```
define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
%mul = fmul reassoc nsz arcp afn float %b, %a
%add = fadd reassoc nsz arcp afn float %mul, %c
ret float %add
}
attributes #0 = {
[...]
"no-signed-zeros-fp-math"="true"
"no-trapping-math"="true"
[...]
"unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.
In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this for in-statement only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag and
`llvm.fmuladd` is being used to represent contractions opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.
Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly for consistency reason the initialization of `UnsafeFPMath` of
in `TargetOptions` for the backend code generation should take into
account the contraction mode as well.
Reviewed By: zahiraam
Differential Revision: https://reviews.llvm.org/D136786
This reverts commit bf8381a8bc.
There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
Add codegen for llvm cos and sin elementwise builtins
The sin and cos elementwise builtins are necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered
when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.
Reviewed By: craig.topper, fhahn
Differential Revision: https://reviews.llvm.org/D135011
On targets where ptrdiff_t is smaller than long, clang crashes when emitting
synthesized getters/setters that call objc_[gs]etProperty. Explicitly emit a
zext/trunc of the ivar offset value (which is defined to long) to ptrdiff_t,
which objc_[gs]etProperty takes.
Add a test using the AVR target, where ptrdiff_t is smaller than long. Test
failed previously and passes now.
Differential Revision: https://reviews.llvm.org/D112049
Reverted in 98fa95492f.
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.
Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D132226
D133848 added support for the GCC format of target("..") attributes. The
supported formats to match gcc are:
// "arch=<arch>" - parsed to features as per -march=..
// "cpu=<cpu>" - parsed to features as per -mcpu=.., with CPU set to <cpu>
// "tune=<cpu>" - TuneCPU set to <cpu>
// "+feature", "+nofeature" - Add (or remove) feature.
We also support the existing formats, previously accepted by clang, for
compatibility with the existing code and intrinsics code:
// "feature", "no-feature" - Add (or remove) feature.
The clang formats would accept and use internal feature names
("fullfp16"/"neon"/"sve") as opposed to the user facing names
("fp16"/"simd"/"sve"). Usually they use the same names, but can be
different for cases like fp, fullfp16 and mte (among others).
This patch makes the clang format also except the user facing names, by
parsing the features through getArchExtFeature. There is a fallback if
the name is not recognized (like "fullfp16"), where we add the existing
string which should then be checked later for consistency. This allows
the internal names to be used as before, so long as they are recognized
as internal names. (Note that we currently don't have an implementation
of isValidFeatureName. The backend will currently give an error like
"'-sid' is not a recognized feature for this target (ignoring feature)."
This should be improved in a later patch once an implementation of
isValidFeatureName in clang is present).
Differential Revision: https://reviews.llvm.org/D137617
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.
Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D132226
Updated the RUN line in several test cases to use the new PM syntax
opt -passes=<pipeline>
instead of the deprecated syntax
opt -pass1 -pass2
This was not a complete cleanup in clang/test. But just a swipe using
some simple search-and-replace. Mainly for RUN lines involving
-mem2reg, -instnamer and -early-cse.