The second argument must be a constant, otherwise instruction selection
will fail. always_inline is not enough for isel to always fold
everything away at -O0.
Sadly the overloading turned this into a big macro mess. Fixes PR33212.
llvm-svn: 304205
The maximum alignment for ARM NEON data types should be 64-bits as specified
in ARM procedure call standard document Sec. A.2 Notes.
This patch fixes it from its current larger natural default values, except
for Android (so as not to break existing ABI).
Reviewed by: Stephen Hines, Renato Golin.
Differential Revision: https://reviews.llvm.org/D33205
llvm-svn: 304201
Amongst other, this will help LTO to correctly handle/honor files
compiled with O0, helping debugging failures.
It also seems in line with how we handle other options, like how
-fnoinline adds the appropriate attribute as well.
Differential Revision: https://reviews.llvm.org/D28404
llvm-svn: 304127
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the Clang side of the addition of six intrinsics for two new machine instructions (vpopcntd and vpopcntq).
It also includes the addition of the new feature set.
Differential Revision: https://reviews.llvm.org/D33170
llvm-svn: 303857
It is clean when I build boostrap and run make checkall on my machine, I guess
it could be I only build bootstrap with assert, while the buildbots may build
without asserts, which could cause the difference.
llvm-svn: 303786
Summary:
This change allows us to add arg1 logging support to functions through
the special case list provided through -fxray-always-instrument=. This
is useful for adding arg1 logging to functions that are either in
headers that users don't have control over (i.e. cannot change the
source) or would rather not do.
It only takes effect when the pattern is matched through the "fun:"
special case, as a category. As in:
fun:*pattern=arg1
Reviewers: pelikan, rnk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D33392
llvm-svn: 303719
This test was failing on our fork of clang because it was not capturing
[[ARG]] in the N32 case. Therefore it used the value from the last function
which does not always have to be the same. All other cases were already
capturing ARG so this appears to be an oversight.
The test now uses -enable-var-scope to prevent such errors in the future.
Reviewers: sdardis, atanasyan
Patch by: Alexander Richardson
Differential Revision: https://reviews.llvm.org/D32425
llvm-svn: 303619
This patch adds support for the `micromips` and `nomicromips` attributes
for MIPS targets.
Differential revision: https://reviews.llvm.org/D33363
llvm-svn: 303546
Re-commit r303463 now that LLVM is fixed and adjust some lit tests.
llvm::TargetLibraryInfo needs to know the size of wchar_t to work on
functions like `wcslen`. This patch changes clang to always emit the
wchar_size module flag (it would only do so for ARM previously).
This also adds an `assert()` to ensure the LLVM defaults based on the
target triple are in sync with clang.
Differential Revision: https://reviews.llvm.org/D32982
llvm-svn: 303478
Let's revert this for now (and with it the assert()) to get the bots
back to green until I have LLVM synced up properly.
This reverts commit r303463.
llvm-svn: 303474
llvm::TargetLibraryInfo needs to know the size of wchar_t to work on
functions like `wcslen`. This patch changes clang to always emit the
wchar_size module flag (it would only do so for ARM previously).
This also adds an `assert()` to ensure the LLVM defaults based on the
target triple are in sync with clang.
Differential Revision: https://reviews.llvm.org/D32982
llvm-svn: 303463
Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.
Differential Revision: https://reviews.llvm.org/D32248
llvm-svn: 303370
Summary:
Clang changes to remove this option and replace with a parameter
always set in the context of a ThinLTO distributed backend.
Depends on D33133.
Reviewers: pcc
Subscribers: mehdi_amini, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D33134
llvm-svn: 302940
Avoid using report_fatal_error, because it will ask the user to file a
bug. If the user attempts to disable SSE on x86_64 and them use floating
point, that's a bug in their code, not a bug in the compiler.
This is just a start. There are other ways to crash the backend in this
configuration, but they should be updated to follow this pattern.
Differential Revision: https://reviews.llvm.org/D27522
llvm-svn: 302835
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than
16/64 bytes, depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
with minor changes (use regexp instead of the hardcoded values) to the test.
llvm-svn: 302670
Sanitizer instrumentation generally needs to be marked with !nosanitize,
but we're not doing this properly for ubsan's overflow checks.
r213291 has more information about why this is needed.
llvm-svn: 302598
This feature is subtly broken when the linker is gold 2.26 or
earlier. See the following bug for details:
https://sourceware.org/bugzilla/show_bug.cgi?id=19002
Since the decision needs to be made at compilation time, we can not
test the linker version. The flag is off by default on ELF targets,
and on otherwise.
llvm-svn: 302591
Reverting
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than 16/64
bytes, depending on the ABI.
as it broke clang-with-lto-ubuntu builder.
llvm-svn: 302555
Modified MipsABIInfo::classifyArgumentType so that it now coerces aggregate
structures only if the size of said aggregate is less than 16/64 bytes,
depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
llvm-svn: 302547
Summary:
We define the `__xray_customeevent` builtin that gets translated to
IR calls to the correct intrinsic. The default implementation of this is
a no-op function. The codegen side of this follows the following logic:
- When `-fxray-instrument` is not provided in the driver, we elide all
calls to `__xray_customevent`.
- When `-fxray-instrument` is enabled and a function is marked as "never
instrumented", we elide all calls to `__xray_customevent` in that
function; if either marked as "always instrumented" or subject to
threshold-based instrumentation, we emit a call to the
`llvm.xray.customevent` intrinsic from LLVM for each
`__xray_customevent` occurrence in the function.
This change depends on D27503 (to land in LLVM first).
Reviewers: echristo, rsmith
Subscribers: mehdi_amini, pelikan, lrl, cfe-commits
Differential Revision: https://reviews.llvm.org/D30018
llvm-svn: 302492
This patch adds support for the the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Differential Revision: https://reviews.llvm.org/D32770
llvm-svn: 302418
It turns out there are some sort-of-but-not-quite empty structs that break all
the rules. For example:
struct SuperEmpty { int arr[0]; };
struct SortOfEmpty { struct SuperEmpty e; };
Both of these have sizeof == 0, even in C++ mode, for GCC compatibility. The
first one also doesn't occupy a register when passed by value in GNU C++ mode,
unlike everything else.
On Darwin, we want to ignore the lot (and especially don't want to try to use
an i0 as we were).
llvm-svn: 302313
This avoids problems on code like this:
char buf[16];
__asm {
movups xmm0, [buf]
mov [buf], eax
}
The frontend size in this case (1) is wrong, and the register makes the
instruction matching unambiguous. There are also enough bytes available
that we shouldn't complain to the user that they are potentially using
an incorrectly sized instruction to access the variable.
Supersedes D32636 and D26586 and fixes PR28266
llvm-svn: 302179
Implemented the remaining integer data processing intrinsics from
the ARM ACLE v2.1 spec, such as parallel arithemtic and DSP style
multiplications.
Differential Revision: https://reviews.llvm.org/D32282
llvm-svn: 302131
Currently, ubsan emits overflow checks for arithmetic that is known to
be safe at compile-time, e.g:
1 + 1 => CheckedAdd(1, 1)
This leads to breakage when using the __builtin_prefetch intrinsic. LLVM
expects the arguments to @llvm.prefetch to be constant integers, and
when ubsan inserts unnecessary checks on the operands to the intrinsic,
this contract is broken, leading to verifier failures (see PR32874).
Instead of special-casing __builtin_prefetch for ubsan, this patch fixes
the underlying problem, i.e that clang currently emits unnecessary
overflow checks.
Testing: I ran the check-clang and check-ubsan targets with a stage2,
ubsan-enabled build of clang. I added a regression test for PR32874, and
some extra checking to make sure we don't regress runtime checking for
unsafe arithmetic. The existing ubsan-promoted-arithmetic.cpp test also
provides coverage for this change.
llvm-svn: 301988
The previous algorithm processed one character at a time, which is very
painful on a modern CPU. Replace it with xxHash64, which both already
exists in the codebase and is fairly fast.
Patch from Scott Smith!
Differential Revision: https://reviews.llvm.org/D32509
llvm-svn: 301487
It's possible to determine the alignment of an alloca at compile-time.
Use this information to skip emitting some runtime alignment checks.
Testing: check-clang, check-ubsan.
This significantly reduces the amount of alignment checks we emit when
compiling X86ISelLowering.cpp. Here are the numbers from patched/unpatched
clangs based on r301361.
------------------------------------------
| Setup | # of alignment checks |
------------------------------------------
| unpatched, -O0 | 47195 |
| patched, -O0 | 30876 | (-34.6%)
------------------------------------------
llvm-svn: 301377
This change restores pre-r301225 behavior, where linker GC compatible global
instrumentation was used on COFF targets disregarding -f(no-)data-sections and/or
/Gw flags.
This instrumentation puts each global in a COMDAT with an ASan descriptor for that global.
It effectively enables -fdata-sections, but limits it to ASan-instrumented globals.
llvm-svn: 301374
Since Split DWARF needs to name the actual .dwo file that is generated,
it can't be known at the time the llvm::Module is produced as it may be
merged with other Modules before the object is generated and that object
may be generated with any name.
By passing the Split DWARF file name when LLVM is producing object code
the .dwo file name in the object file can match correctly.
The support for Split DWARF for implicit modules remains the same -
using metadata to store the dwo name and dwo id so that potentially
multiple skeleton CUs referring to different dwo files can be generated
from one llvm::Module.
llvm-svn: 301063
This restores the behavior prior to D31167 where the code-gen default was
FPC_On which mapped to FPOpFusion::Standard. After merging the FE
state (on/off) and the code-gen state (on/fast/off), the default became off to
match the front-end.
In other words, the front-end controls when to fuse along the language
standards and the backend shouldn't override this by splitting fused
intrinsics as FPOpFusion::Strict would imply.
Differential Revision: https://reviews.llvm.org/D32301
llvm-svn: 300858
LLVM has changed the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.
https://bugs.llvm.org/show_bug.cgi?id=32382
rdar://problem/31205000
llvm-svn: 300523
MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
LLVM companion patch: D31767.
Differential Revision: https://reviews.llvm.org/D31766
llvm-svn: 300326
For OpenCL, the private address space qualifier is 0 in AST. Before this change, 0 address space qualifier
is always mapped to target address space 0. As now target private address space is specified by
alloca address space in data layout, address space qualifier 0 needs to be mapped to alloca addr space specified by the data layout.
This change has no impact on targets whose alloca addr space is 0.
With contributions from Matt Arsenault, Tony Tye and Wen-Heng (Jack) Chung
Differential Revision: https://reviews.llvm.org/D31404
llvm-svn: 299965
This re-lands r299875.
I introduced a bug in Clang code responsible for replacing K&R, no
prototype declarations with a real function definition with a prototype.
The bug was here:
// Collect any return attributes from the call.
- if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex))
- newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(),
- oldAttrs.getRetAttributes()));
+ newAttrs.push_back(oldAttrs.getRetAttributes());
Previously getRetAttributes() carried AttributeList::ReturnIndex in its
AttributeList. Now that we return the AttributeSetNode* directly, it no
longer carries that index, and we call this overload with a single node:
AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>)
That aborted with an assertion on x86_32 targets. I added an explicit
triple to the test and added CHECKs to help find issues like this in the
future sooner.
llvm-svn: 299899
Previously __cfi_check was created in LTO optimization pipeline, which
means LLD has no way of knowing about the existence of this symbol
without rescanning the LTO output object. As a result, LLD fails to
export __cfi_check, even when given --export-dynamic-symbol flag.
llvm-svn: 299806
It's used by MS headers in VS 2017 without including intrin.h, so we
can't implement it in the header anymore.
Differential Revision: https://reviews.llvm.org/D31736
llvm-svn: 299782
This adds the new pragma and the first variant, contract(on/off/fast).
The pragma has the same block scope rules as STDC FP_CONTRACT, i.e. it can be
placed at the beginning of a compound statement or at file scope.
Similarly to STDC FP_CONTRACT there is no need to use attributes. First an
annotate token is inserted with the parsed details of the pragma. Then the
annotate token is parsed in the proper contexts and the Sema is updated with
the corresponding FPOptions using the shared ActOn function with STDC
FP_CONTRACT.
After this the FPOptions from the Sema is propagated into the AST expression
nodes. There is no change here.
I was going to add a 'default' option besides 'on/off/fast' similar to STDC
FP_CONTRACT but then decided against it. I think that we'd have to make option
uppercase then to avoid using 'default' the keyword. Also because of the
scoped activation of pragma I am not sure there is really a need a for this.
Differential Revision: https://reviews.llvm.org/D31276
llvm-svn: 299470
With this, FMF(contract) becomes an alternative way to express the request to
contract.
These are currently only propagated for FMul, FAdd and FSub. The rest will be
added as more FMFs are hooked up for this.
This is toward fixing PR25721.
Differential Revision: https://reviews.llvm.org/D31168
llvm-svn: 299469
MS assembly syntax provide us with the 'EVEN' directive as a synonymous to at&t '.even'.
This patch include the (small, simple) changes need to allow it.
llvm-side:
https://reviews.llvm.org/D27417
Differential Revision: https://reviews.llvm.org/D27418
llvm-svn: 299454
This patch is a part two of two reviews, one for the clang and the other for LLVM.
In this patch, I covered the clang side, by introducing the intrinsic to the front end.
This is done by creating a generic replacement.
Differential Revision: https://reviews.llvm.org/D31394a
llvm-svn: 299431
Summary:
Use PreCodeGenModuleHook to invoke the correct writer when emitting LLVM
IR, returning false to skip codegen from within thinBackend.
Reviewers: pcc, mehdi_amini
Subscribers: Prazek, cfe-commits
Differential Revision: https://reviews.llvm.org/D31534
llvm-svn: 299274
Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.
Fixes PR32411.
llvm-svn: 299233
Reasoning behind this change was allowing the function to accept all values
from range [-128, 255] since all of them can be encoded in an 8bit wide
value.
This differs from the prior state where only range [-128, 127] was accepted,
where values were assumed to be signed, whereas now the actual
interpretation of the immediate is deferred to the consumer as required.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D31082
llvm-svn: 299229
Sigh, another follow-on fix needed for test in r299152 causing bot
failures. We also need the target-cpu for the ThinLTO BE clang
invocation.
llvm-svn: 299178
Third and hopefully final fix to test for r299152 that is causing bot
failures: make sure the target triple specified for the ThinLTO backend
clang invocations as well.
llvm-svn: 299176
Summary:
This involved refactoring out pieces of
EmitAssemblyHelper::CreateTargetMachine for use in runThinLTOBackend.
Subsumes D31114.
Reviewers: mehdi_amini, pcc
Subscribers: Prazek, cfe-commits
Differential Revision: https://reviews.llvm.org/D31508
llvm-svn: 299152
Summary:
The refactoring introduced a regression in the flag processing for
-fxray-instruction-threshold which causes it to not get passed properly.
This change should restore the previous behaviour.
Reviewers: rnk, pelikan
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D31491
llvm-svn: 299126
GCC has the alloc_align attribute, which is similar to assume_aligned, except the attribute's parameter is the index of the integer parameter that needs aligning to.
Differential Revision: https://reviews.llvm.org/D29599
llvm-svn: 299117
Summary:
The -fxray-always-instrument= and -fxray-never-instrument= flags take
filenames that are used to imbue the XRay instrumentation attributes
using a whitelist mechanism (similar to the sanitizer special cases
list). We use the same syntax and semantics as the sanitizer blacklists
files in the implementation.
As implemented, we respect the attributes that are already defined in
the source file (i.e. those that have the
[[clang::xray_{always,never}_instrument]] attributes) before applying
the always/never instrument lists.
Reviewers: rsmith, chandlerc
Subscribers: jfb, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D30388
llvm-svn: 299041
attributes.
These patches don't work because we can't currently access the parameter
information in a reliable way when building attributes. I thought this
would be relatively straightforward to fix, but it seems not to be the
case. Fixing this will requrie a substantial re-plumbing of machinery to
allow attributes to be handled in this location, and several other fixes
to the attribute machinery should probably be made at the same time. All
of this will make the patch .... substantially more complicated.
Reverting for now as there are active miscompiles caused by the current
version.
llvm-svn: 298695
Summary:
Clang companion patch to LLVM patch D31027, which adds support
for emitting minimized bitcode file for use in the thin link step.
Add a cc1 option -fthin-link-bitcode=<file> to trigger this behavior.
Depends on D31027.
Reviewers: mehdi_amini, pcc
Subscribers: cfe-commits, Prazek
Differential Revision: https://reviews.llvm.org/D31050
llvm-svn: 298639
It seems MS headers have started using __readgsqword, and since it's
used in a header that doesn't include intrin.h, we can't implement it as
an inline function anymore.
That was already the case for __readfsdword, which Saleem added support
for in r220859. This patch reuses that codegen to implement all of
__read[fg]s{byte,word,dword,qword}.
Differential Revision: https://reviews.llvm.org/D31248
llvm-svn: 298538
explaining why we have to ignore errors here even though in other parts
of codegen we can be more strict with builtins.
Also add a test case based on the code in a TSan test that found this
issue.
llvm-svn: 298494
declarations and calls instead of just definitions, and then teach it to
*not* attach such attributes even if the source code contains them.
This follows the design direction discussed on cfe-dev here:
http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html
The idea is that for C standard library builtins, even if the library
vendor chooses to annotate their routines with __attribute__((nonnull)),
we will ignore those attributes which pertain to pointer arguments that
have an associated size. This allows the widespread (and seemingly
reasonable) pattern of calling these routines with a null pointer and
a zero size. I have only done this for the library builtins currently
recognized by Clang, but we can now trivially add to this set. This will
be controllable with -fno-builtin if anyone should care to do so.
Note that this does *not* change the AST. As a consequence, warnings,
static analysis, and source code rewriting are not impacted.
This isn't even a regression on any platform as neither Clang nor LLVM
have ever put 'nonnull' onto these arguments for declarations. All this
patch does is enable it on other declarations while preventing us from
ever accidentally enabling it on these libc functions due to a library
vendor.
It will also allow any other libraries using this annotation to gain
optimizations based on the annotation even when only a declaration is
visible.
llvm-svn: 298491
The alias was only ever used on darwin and had some issues there,
and isn't used in practice much. Also fixes a problem with -mno-altivec
not turning off -maltivec.
Also add a diagnostic for faltivec/fno-altivec that directs users to use
maltivec options and include the altivec.h file explicitly.
llvm-svn: 298449
Summary:
Because SamplePGO passes will be invoked twice in ThinLTO build: once at compile phase, the other at backend. We want to make sure the IR at the 2nd phase matches the hot part in pro
file, thus we do not want to inline hot callsites in the first phase.
Reviewers: tejohnson, eraman
Reviewed By: tejohnson
Subscribers: mehdi_amini, cfe-commits, Prazek
Differential Revision: https://reviews.llvm.org/D31202
llvm-svn: 298429
This patch introduces X86AsmParser with the ability to handle the aforementioned ops within compound "MS" arithmetical expressions.
Currently - only supported as a stand alone Operand, e.g.:
"TYPE X"
now allowed :
"4 + TYPE X * 128"
LLVM side: https://reviews.llvm.org/D31173
Differential Revision: https://reviews.llvm.org/D31174
llvm-svn: 298426
x86 has undef SSE/AVX intrinsics that should represent a bogus register operand.
This is not the same as LLVM's undef value which can take on multiple bit patterns.
There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.
Differential Revision: https://reviews.llvm.org/D30834
llvm-svn: 297588
Removes immediate range checks for these instructions, since they have GPR
rt as their input operand.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D30693
llvm-svn: 297485
This patch honors the unaligned type qualifier (currently available through he
keyword __unaligned and -fms-extensions) in CodeGen. In the current form the
patch affects declarations and expressions. It does not affect fields of
classes.
Differential Revision: https://reviews.llvm.org/D30166
llvm-svn: 297276
This test broke with an LLVM instcombine patch (r297166).
I changed the RUN line to only run -mem2reg (to save time checking this large chunk of tests)
and updated the checks using the script attached to D17999:
https://reviews.llvm.org/D17999
The goal is to make this test immune to optimizer changes. If there's something in these
tests that was checking for an IR optimization, that should be tested in LLVM, not Clang.
llvm-svn: 297189