Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355685
Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355585
Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355522
This patch enables the following
1) AMD family 17h "znver2" tune flag (-march, -mcpu).
2) ISAs that are enabled for "znver2" architecture.
3) For the time being, it uses the znver1 scheduler model.
4) Tests are updated.
5) This patch is the clang counterpart to D58343
Reviewers: craig.topper
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58344
llvm-svn: 354899
This adds ACLE-defined macros to test for code being compiled in the ROPI and
RWPI position-independence modes.
Differential revision: https://reviews.llvm.org/D23610
llvm-svn: 354265
Summary:
The predefined macro `_ARCH_PWR6X` is associated with GCC's
`-mcpu=power6x` option, which enables generation of P6 "raw mode"
instructions such as `mftgpr`.
Later POWER processors build upon the "architected mode", not the raw
one. `_ARCH_PWR6X` should not be defined for these later processors.
Fixes PR#40236.
Reviewers: echristo, hfinkel, kbarton, nemanjai, wschmidt
Reviewed By: hfinkel
Subscribers: jsji, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58128
llvm-svn: 353975
The rationale of this change is to fix _Unwind_Word / _Unwind_SWord
definitions for MIPS N32 ABI. This ABI uses 32-bit pointers,
but _Unwind_Word and _Unwind_SWord types are eight bytes long.
# The __attribute__((__mode__(__unwind_word__))) is added to the type
definitions. It makes them equal to the corresponding definitions used
by GCC and allows to override types using `getUnwindWordWidth` function.
# The `getUnwindWordWidth` virtual function override in the `MipsTargetInfo`
class and provides correct type size values.
Differential revision: https://reviews.llvm.org/D58165
llvm-svn: 353965
This patch simply teach BPF driver about the new CPU "v3" introduced in
LLVM backend.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 353479
Summary:
Added ability to generate correct debug info data about the variable
address class. Currently, for all the locals and globals the default
values are used, ADDR_local_space(6) for locals and ADDR_global_space(5)
for globals. The values are taken from the table in
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf.
We need to emit correct data for address classes of, at least, shared
and constant globals. Currently, all these variables are treated by
the cuda-gdb debugger as the variables in the global address space
and, thus, it require manual data type casting.
Reviewers: echristo, probinson
Subscribers: jholewinski, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D57162
llvm-svn: 353204
rC352620 caused regressions because it copied floating point format from
aux target.
floating point format decides whether extended long double is supported.
It is x86_fp80 on x86 but IEEE double on amdgcn.
Document usage of long doubel type in HIP programming guide
https://github.com/ROCm-Developer-Tools/HIP/pull/890
Differential Revision: https://reviews.llvm.org/D57527
llvm-svn: 352801
In 64 bit MSVC environment size_t is defined as unsigned long long.
In single source language like HIP, data layout should be consistent
in device and host compilation, therefore copy data layout controlling
fields from Aux target for AMDGPU target.
Differential Revision: https://reviews.llvm.org/D56318
llvm-svn: 352620
As Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129543.html
There are problems exposing the _Float16 type on architectures that
haven't defined the ABI/ISel for the type yet, so we're temporarily
disabling the type and making it opt-in.
Differential Revision: https://reviews.llvm.org/D57188
Change-Id: I5db7366dedf1deb9485adb8948b1deb7e612a736
llvm-svn: 352221
This adds a `__wasi__` macro for the wasi OS, similar to `__linux__` etc. for
other OS's.
Differential Revision: https://reviews.llvm.org/D57155
llvm-svn: 352105
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
a stray single '\r' from one file. These are the last line ending issues
I can find in the files containing parts of LLVM's file headers.
llvm-svn: 351634
Replace multiple comparisons of getOS() value with FreeBSD, NetBSD,
OpenBSD and DragonFly with matching isOS*BSD() methods. This should
improve the consistency of coding style without changing the behavior.
Direct getOS() comparisons were left whenever used in switch or switch-
like context.
Differential Revision: https://reviews.llvm.org/D55916
llvm-svn: 349752
The Darwin targets use `int64_t` and `uint64_t` to define the `int_least64_t`
and `int_fast64_t` types. The underlying type is actually a `long long`. Match
the types to allow the printf specifiers to work properly and have the compiler
vended macros match the implementation on the target.
llvm-svn: 348939
Summary:
The patch is to add the VSX register support for inline assembly. After this
patch, we can use VSX register in inline assembly clobber list without error.
Reviewed By: jsji, nemanjai
Differential Revision: https://reviews.llvm.org/D55192
llvm-svn: 348572
Support the Swift calling convention on Windows ARM and AArch64. Both
of these conform to the AAPCS, AAPCS64 calling convention, and LLVM has
been adjusted to account for the register usage. Ensure that the
frontend passes this into the backend. This allows the swift runtime to
be built for Windows.
llvm-svn: 348454
This patch addresses a compilation error with clang when
running in Haiku being unable to compile code using
float128 (throws compilation error such as 'float128 is
not supported on this target').
Patch by kallisti5 (Alexander von Gluck IV)
Differential Revision: https://reviews.llvm.org/D54901
llvm-svn: 348368
As of rev. 268898, clang supports __float128 on SystemZ. This seems to
have been in error. GCC has never supported __float128 on SystemZ,
since the "long double" type on the platform is already IEEE-128. (GCC
only supports __float128 on platforms where "long double" is some other
data type.)
For compatibility reasons this patch removes __float128 on SystemZ
again. The test case is updated accordingly.
llvm-svn: 348247
This adds Hurd toolchain support to Clang's driver in addition
to handling translating the triple from Hurd-compatible form to
the actual triple registered in LLVM.
(Phabricator was stripping the empty files from the patch so I
manually created them)
Patch by sthibaul (Samuel Thibault)
Differential Revision: https://reviews.llvm.org/D54379
llvm-svn: 347833
This is skylake-avx512 with the addition of avx512vnni ISA.
Patch by Jianping Chen
Differential Revision: https://reviews.llvm.org/D54792
llvm-svn: 347682
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
of only 'break'.
We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
the outer case.
I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.
Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu
Differential Revision: https://reviews.llvm.org/D53950
llvm-svn: 345882
This silences a -Wimplicit-fallthrough warning from clang. GCC does not
appear to warn when the case body ends in a switch.
This is a somewhat surprising but intended fallthrough that I pulled out
from my mechanical patch. The code intends to handle 'Yi' and related
constraints as the 'x' constraint.
llvm-svn: 345873
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.
Differential revision: https://reviews.llvm.org/D53547
llvm-svn: 345637
Generate the FP16FML intrinsics into arm_neon.h (AArch64 only for now).
Add two new type modifiers to NeonEmitter to handle the new prototypes.
Define __ARM_FEATURE_FP16FML when +fp16fml is enabled and guard the
intrinsics with the macro in arm_neon.h.
Based on a patch by Gao Yiling.
Differential Revision: https://reviews.llvm.org/D53633
llvm-svn: 345344
Similar to how ICC handles CPU-Dispatch on Windows, this patch uses the
resolver function directly to forward the call to the proper function.
This is not nearly as efficient as IFuncs of course, but is still quite
useful for large functions specifically developed for certain
processors.
This is unfortunately still limited to x86, since it depends on
__builtin_cpu_supports and __builtin_cpu_is, which are x86 builtins.
The naming for the resolver/forwarding function for cpu-dispatch was
taken from ICC's implementation, which uses the unmodified name for this
(no mangling additions). This is possible, since cpu-dispatch uses '.A'
for the 'default' version.
In 'target' multiversioning, this function keeps the '.resolver'
extension in order to keep the default function keeping the default
mangling.
Change-Id: I4731555a39be26c7ad59a2d8fda6fa1a50f73284
Differential Revision: https://reviews.llvm.org/D53586
llvm-svn: 345298
I'm unsure if KNL has this feature, but the backend never thought it did, only clang did. The predefined-arch-macros test lost the check for __RTM__ on KNL when it was removed Skylake CPUs in r344117.
I think we want to drop it from KNL for consistency with Skylake anyway regardless of how we got here.
llvm-svn: 344978
The `GNUABIN32` environment in a target triple implies using the N32
ABI. This patch adds support for this environment and switches on N32
ABI if necessary.
Patch by Patch by YunQiang Su.
Differential revision: https://reviews.llvm.org/D51464
llvm-svn: 344570
Summary:
gcc defines macros such as __code_model_small_ based on the user passed command line flag -mcmodel. clang accepts a flag with the same name and similar effects, but does not generate any macro that the user can use. This cl narrows the gap between gcc and clang behaviour.
However, achieving full compatibility with gcc is not trivial: The set of valid values for mcmodel in gcc and clang are not equal. Also, gcc defines different macros for different architectures. In this cl, we only tackle an easy part of the problem and define the macro only for x64 architecture. When the user does not specify a mcmodel, the macro for small code model is produced, as is the case with gcc.
Reviewers: compnerd, MaskRay
Reviewed By: MaskRay
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D52920
llvm-svn: 344000
These intrinsics exist in icc. They can be found on the Intel Intrinsics Guide website.
All the backend support is in place to pattern match a load+bswap or a bswap+store pattern to the MOVBE instructions. So we just need to get the frontend to emit the correct IR. The pointer arguments in icc are declared as void so I had to jump through a packed struct to forcing a specific alignment on the load/store. Same trick we use in the unaligned vector load/store intrinsics
Differential Revision: https://reviews.llvm.org/D52586
llvm-svn: 343343
My previous change (rL340911) set the two features for architectures
>= 6, which wrongly includes v6m. Now set to >= 6 and not Cortex-M.
Differential Revision: https://reviews.llvm.org/D52644
llvm-svn: 343309
This patch allows targetting Armv8.5-A from Clang. Most of the
implementation is in TargetParser, so this is mostly just adding tests.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52491
llvm-svn: 343111
Windows uses `unsigned short` for `wint_t`. Correct the type definition as
vended by the compiler. This type is defined in corecrt.h and is
unconditionally typedef'ed. cl does not have an equivalent to `__WINT_TYPE__`
which is why this was never detected.
llvm-svn: 342557
The instruction set first appeared with Westmere, but not all processors
in that and the next few generations have the instructions. According to
Wikipedia[1], the first generation in which all SKUs have AES
instructions are Skylake and Goldmont. I can't find any Skylake,
Kabylake, Kabylake-R or Cannon Lake currently listed at
https://ark.intel.com that says "Intel® AES New Instructions" "No".
This matches GCC commit
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01940.html
[1] https://en.wikipedia.org/wiki/AES_instruction_set
Patch By: thiagomacieira
Differential Revision: https://reviews.llvm.org/D51510
llvm-svn: 341862
ARM_FEATURE_DSP is already set for targets with the +dsp feature. In
the backend, this target feature is also used to represent the
availability of the of the instructions that the ACLE guard through
the __ARM_FEATURE_SIMD32 macro. We don't have any cores that
implement one and not the other, so set this macro for cores later
than V6 or for Cortex-M cores that the target parser, or user, reports
that the 'dsp' instructions are supported.
Differential Revision: https://reviews.llvm.org/D51093
llvm-svn: 340911
As reported on http://lists.llvm.org/pipermail/cfe-dev/2018-August/058760.html,
this broke i386-freebsd11 due to its lack of atomic 64 bit primitives.
While that's not really this commit's fault, let's revert back to the old
behaviour until this can be fixed. This means generating cmpxchg8b etc for i386
and i486 which don't technically support those, but that's been the behaviour
for a long time, so a little longer probably doesn't hurt that much.
> Adjust MaxAtomicInlineWidth for i386/i486 targets.
>
> This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6.
> Currently, all MaxAtomicInlineWidth of x86-32 targets are set to 64. However,
> i386 doesn't support any cmpxchg related instructions. i486 only supports cmpxchg.
> So in this patch MaxAtomicInlineWidth is reset as follows:
> For i386, the MaxAtomicInlineWidth should be 0 because no cmpxchg is supported.
> For i486, the MaxAtomicInlineWidth should be 32 because it supports cmpxchg.
> For others 32 bits x86 cpu, the MaxAtomicInlineWidth should be 64 because of cmpxchg8b.
>
> Differential Revision: https://reviews.llvm.org/D42154
llvm-svn: 340666
subtarget features for indirect calls and indirect branches.
This is in preparation for enabling *only* the call retpolines when
using speculative load hardening.
I've continued to use subtarget features for now as they continue to
seem the best fit given the lack of other retpoline like constructs so
far.
The LLVM side is pretty simple. I'd like to eventually get rid of the
old feature, but not sure what backwards compatibility issues that will
cause.
This does remove the "implies" from requesting an external thunk. This
always seemed somewhat questionable and is now clearly not desirable --
you specify a thunk the same way no matter which set of things are
getting retpolines.
I really want to keep this nicely isolated from end users and just an
LLVM implementation detail, so I've moved the `-mretpoline` flag in
Clang to no longer rely on a specific subtarget feature by that name and
instead to be directly handled. In some ways this is simpler, but in
order to preserve existing behavior I've had to add some fallback code
so that users who relied on merely passing -mretpoline-external-thunk
continue to get the same behavior. We should eventually remove this
I suspect (we have never tested that it works!) but I've not done that
in this patch.
Differential Revision: https://reviews.llvm.org/D51150
llvm-svn: 340515
Set __mips_fpr to 0 if o32 ABI is used with either -mfpxx
or none of -mfp32, -mfpxx, -mfp64 being specified.
Introduce additional checks:
-mfpxx is only to be used in conjunction with the o32 ABI.
report an error when incompatible options are provided.
Formerly no errors were raised when combining n32/n64 ABIs
with -mfp32 and -mfpxx.
There are other cases when __mips_fpr should be set to 0
that are not covered, ex. using o32 on a mips64 cpu
which is valid but not supported in the backend as of yet.
Differential Revision: https://reviews.llvm.org/D50557
llvm-svn: 340391
Fast FMAF is not a sufficient condition to enable denormals.
Before VI, enabling denormals caused F32 instructions to
run at F64 speeds.
llvm-svn: 339278
The way address space declarations for builtins currently work
is nearly useless. The code assumes the address spaces used for
builtins is a confusingly named "target address space" from user
code using __attribute__((address_space(N))) that matches
the builtin declaration. There's no way to use this to declare
a builtin that returns a language specific address space.
The terminology used is highly cofusing since it has nothing
to do with the the address space selected by the target to use
for a language address space.
This feature is essentially unused as-is. AMDGPU and NVPTX
are the only in-tree targets attempting to use this. The AMDGPU
builtins certainly do not behave as intended (i.e. all of the
builtins returning pointers can never compile because the numbered
address space never matches the expected named address space).
The NVPTX builtins are missing tests for some, and the others
seem to rely on an implicit addrspacecast.
Change the used address space for builtins based on a target
hook to allow using a language address space for a builtin.
This allows the same builtin declaration to be used for multiple
languages with similarly purposed address spaces (e.g. the same
AMDGPU builtin can be used in OpenCL and CUDA even though the
constant address spaces are arbitarily different).
This breaks the possibility of using arbitrary numbered
address spaces alongside the named address spaces for builtins.
If this is an issue we probably need to introduce another builtin
declaration character to distinguish language address spaces from
so-called "target address spaces".
llvm-svn: 338707
This adds tests for Armv8.4-A, and also some v8.2 and v8.3 tests that were
missing.
Differential Revision: https://reviews.llvm.org/D50068
llvm-svn: 338525
Summary: Microsoft's C++ object model for ARM64 is the same as that for X86_64.
For example, small structs with non-trivial copy constructors or virtual
function tables are passed indirectly. Currently, they are passed in registers
when compiled with clang.
Reviewers: rnk, mstorsjo, TomTan, haripul, javed.absar
Reviewed By: rnk, mstorsjo
Subscribers: kristof.beyls, chrib, llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D49770
llvm-svn: 338076
Changing it to unsigned long (which is 32-bit on wasm32) makes it the same
type as wasm64 (where unsigned long is 64-bit), which would eliminate the most
common cause for mangled names being different between wasm32 and wasm64. For
example, export lists containing symbol names could now often be the same
between wasm32 and wasm64.
Differential Revision: https://reviews.llvm.org/D40526
llvm-svn: 337783
As documented here: https://software.intel.com/en-us/node/682969 and
https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning
is an ICC feature that provides for function multiversioning.
This feature is implemented with two attributes: First, cpu_specific,
which specifies the individual function versions. Second, cpu_dispatch,
which specifies the location of the resolver function and the list of
resolvable functions.
This is valuable since it provides a mechanism where the resolver's TU
can be specified in one location, and the individual implementions
each in their own translation units.
The goal of this patch is to be source-compatible with ICC, so this
implementation diverges from the ICC implementation in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to
properly dispatch functions. This is is a valuable performance benefit
over the ICC implementation. A future patch will be provided to enable
this feature on Windows, but it will obviously more closely fit ICC's
implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify
the feature list of the host processor. This patch uses the cpu_supports
functionality in order to better align with 'target' multiversioning.
1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function
marked cpu_dispatch be an empty definition. This patch supports that as well,
however declarations are also permitted, since the linker will solve the
issue of multiple emissions.
Differential Revision: https://reviews.llvm.org/D47474
llvm-svn: 337552
The SPIR target currently allows for half precision floating point types to be
emitted using the LLVM intrinsic functions which convert half types to floats
and doubles. However, this is illegal in SPIR as the only intrinsic allowed by
SPIR is memcpy, as per section 3 of the SPIR specification. Currently this is
leading to an assert being hit in the Clang CodeGen when attempting to emit a
constant or literal _Float16 type in a comparison operation on a SPIR or SPIR64
target. This assert stems from the CodeGen attempting to emit a constant half
value as an integer because the backend has specified that it is using these
half conversion intrinsics (which represents half as i16). This patch prevents
SPIR targets from using these intrinsics by overloading the responsible target
info method, marks SPIR targets as having a legal half type and provides
additional regression testing for the _Float16 type on SPIR targets.
Patch by: Stephen McGroarty
Differential Revision: https://reviews.llvm.org/D48188
llvm-svn: 335111
Diasble the use of the type __float128 for PPC machines older
than Power9.
The use of -mfloat128 for PPC machine older than Power9 will result
in an error.
Differential Revision: https://reviews.llvm.org/D48088
llvm-svn: 334613
Adding __attribute__((aligned(32))) to __m256 breaks the implementation
of _mm256_loadu_ps on Windows. On Windows, alignment attributes have
higher precedence than packing attributes.
We also might want to carefully consider the consequences of changing
our vector typedefs, since many users copy them and invent their own
new, non-Intel specific vector type names.
llvm-svn: 333958
This fixes two major problems:
- We were not capping vector alignment as desired on 32-bit ARM.
- We were using different alignments based on the AVX settings on
Intel, so we did not have a consistent ABI.
This is an ABI break, but we think we can get away with it because
vectors tend to be used mostly in inline code (which is why not having
a consistent ABI has not proven disastrous on Intel).
Intel's AVX types are specified as having 32-byte / 64-byte alignment,
so align them explicitly instead of relying on the base ABI rule.
Note that this sort of attribute is stripped from template arguments
in template substitution, so there's a possibility that code templated
over vectors will produce inadequately-aligned objects. The right
long-term solution for this is for alignment attributes to be
interpreted as true qualifiers and thus preserved in the canonical type.
llvm-svn: 333791
An intrinsic for an old instruction, as described in the Intel SDM.
Reviewers: craig.topper, rnk
Reviewed By: craig.topper, rnk
Differential Revision: https://reviews.llvm.org/D47142
llvm-svn: 333256
in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html.
The -mibt feature flag is being removed, and the -fcf-protection
option now also defines a CET macro and causes errors when used
on non-X86 targets, while X86 targets no longer check for -mibt
and -mshstk to determine if -fcf-protection is supported. -mshstk
is now used only to determine availability of shadow stack intrinsics.
Comes with an LLVM patch (D46882).
Patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D46881
llvm-svn: 332704
When looking at lib/Basic/Targets/OSTargets.h, I noticed that _REENTRANT is defined
unconditionally on Solaris, unlike all other targets and what either Studio cc (only define
it with -mt) or gcc (only define it with -pthread) do.
This patch follows that lead.
Differential Revision: https://reviews.llvm.org/D41241
llvm-svn: 332343
The option enables use of 32-bit pointers for accessing
const/local/shared memory. The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46148
llvm-svn: 331938