Commit Graph

6216 Commits

Author SHA1 Message Date
Sander de Smalen e51c1d06a9 [SveEmitter] Add builtins for svtbl2
Reviewers: david-arm, efriedma, c-rhodes

Reviewed By: c-rhodes

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81462
2020-06-17 09:41:38 +01:00
Jun Ma 4a1776979f [CodeGen][TLS] Set TLS Model for __tls_guard as well.
Differential Revision: https://reviews.llvm.org/D81543
2020-06-17 08:31:13 +08:00
Luke Geeson 10b6567f49 [AArch64]: BFloat MatMul Intrinsics&CodeGen
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman

Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij

Reviewed By: miyuki, stuij

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80752

Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
2020-06-16 15:23:30 +01:00
Luke Geeson 508a4764c0 [AArch64]: BFloat Load/Store Intrinsics&CodeGen
This patch upstreams support for ld / st variants of BFloat intrinsics
in from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Luke Cheeseman

Reviewers: fpetrogalli, SjoerdMeijer, sdesmalen, t.p.northover, stuij

Reviewed By: stuij

Subscribers: arsenm, pratlucas, simon_tatham, labrinea, kristof.beyls,
hiraditya, danielkiss, cfe-commits, llvm-commits, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80716

Change-Id: I22e1dca2a8a9ec25d1e4f4b200cb50ea493d2575
2020-06-16 15:23:30 +01:00
Francesco Petrogalli 017969de76 [llvm][SveEmitter] SVE ACLE for quadword permute intrinsics.
Summary:
The following intrinsics have been added, guarded by the macro
`__ARM_FEATURE_SVE_MATMUL_FP64`:

* svtrn1q[_*]
* svtrn2q[_*]
* svuzp1q[_*]
* svuzp2q[_*]
* svzip1q[_*]
* svzip2q[_*]

Supported types:

* svint[8|16|32|64]_t
* svuint[8|16|32|64]_t
* svfloat[16|32|64]_t

TODO: add support for svbfloat16_t

Reviewers: efriedma, sdesmalen, kmclaughlin, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80851
2020-06-15 16:52:36 +00:00
Jeff Mott 8799ebbc1f [clang] Fix or emit diagnostic for checked arithmetic builtins with
_ExtInt types

- Fix computed size for _ExtInt types passed to checked arithmetic
  builtins.
- Emit diagnostic when signed _ExtInt larger than 128-bits is passed
    to __builtin_mul_overflow.
- Change Sema checks for builtins to accept placeholder types.

Differential Revision: https://reviews.llvm.org/D81420
2020-06-15 06:51:54 -07:00
Sander de Smalen 91a4a592ed [SveEmitter] Add SVE tuple types and builtins for svundef.
This patch adds new SVE types to Clang that describe tuples of SVE
vectors. For example `svint32x2_t` which maps to the twice-as-wide
vector `<vscale x 8 x i32>`. Similarly, `svint32x3_t` will map to
`<vscale x 12 x i32>`.

It also adds builtins to return an `undef` vector for a given
SVE type.

Reviewers: c-rhodes, david-arm, ctetreau, efriedma, rengolin

Reviewed By: c-rhodes

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81459
2020-06-15 07:36:01 +01:00
John McCall 7fac1acc61 Set the LLVM FP optimization flags conservatively.
Functions can have local pragmas that override the global settings.
We set the flags eagerly based on global settings, but if we emit
an expression under the influence of a pragma, we clear the
appropriate flags from the function.

In order to avoid doing a ton of redundant work whenever we emit
an FP expression, configure the IRBuilder to default to global
settings, and only reconfigure it when we see an FP expression
that's not using the global settings.

Patch by Michele Scandale!

https://reviews.llvm.org/D80462
2020-06-11 18:16:41 -04:00
Marco Elver 52cae05e08 [ASan][Test] Fix expected strings for globals test
The expected strings would previously not catch bugs when redzones were
added when they were not actually expected. Fix by adding "global "
before the type.
2020-06-10 20:26:24 +02:00
Marco Elver 0f04f10453 [ASan][Test] Split out global alias test
Some platforms do not support aliases. Split the test, and pass explicit
triple to avoid test failure.

Reviewed By: thakis

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81591
2020-06-10 19:58:50 +02:00
Marco Elver d3f89314ff [KernelAddressSanitizer] Make globals constructors compatible with kernel [v2]
[ v1 was reverted by c6ec352a6b due to
  modpost failing; v2 fixes this. More info:
  https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ]

This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:

* Disable generation of constructors that rely on linker features such
  as dead-global elimination.

* Only instrument globals *not* in explicit sections. The kernel uses
  sections for special globals, which we should not touch.

* Do not instrument globals that are prefixed with "__" nor that are
  aliased by a symbol that is prefixed with "__". For example, modpost
  relies on specially named aliases to find globals and checks their
  contents. Unfortunately modpost relies on size stored as ELF debug info
  and any padding of globals currently causes the debug info to cause size
  reported to be *with* redzone which throws modpost off.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493

Tested:
* With 'clang/test/CodeGen/asan-globals.cpp'.

* With test_kasan.ko, we can see:

  	BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]

* allyesconfig, allmodconfig (x86_64)

Reviewed By: glider

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81390
2020-06-10 15:08:42 +02:00
Sander de Smalen a8fbbf8fe2 [SveEmitter] NFC: Add missing ACLE tests
These ACLE tests were missing in previous patches:
- D79357: [SveEmitter] Add builtins for svdup and svindex
- D78747: [SveEmitter] Add builtins for compares and ReverseCompare flag.
- D76238: [SveEmitter] Implement builtins for contiguous loads/stores
2020-06-10 08:29:34 +01:00
Craig Topper 641d5ac4d1 [X86] Assign a feature to tremont, goldmont, goldmont-plus, icelake-client, and icelake for target multiversioning priority.
Without this these CPUs all caused the compiler to assert when
used for multiversioning.
2020-06-09 16:39:41 -07:00
Thomas Lively b7d369280b [WebAssembly] Implement prototype SIMD rounding instructions
Summary:
As specified in https://github.com/WebAssembly/simd/pull/232. These
instructions are implemented as LLVM intrinsics for now rather than
normal ISel patterns to make these instructions opt-in. Once the
instructions are merged to the spec proposal, the intrinsics will be
replaced with proper ISel patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81222
2020-06-09 10:14:14 -07:00
Arthur Eubanks ce7d3e1c55 Reland (again) D80966 [codeview] Put !heapallocsite on calls to operator new
Check that getDebugInfo() is not null, as in the first revision, before
calling getDebugInfo()->addHeapAllocSiteMetadata().
Else would cause a crash with a new expression in a default arg.

---

Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Differential Revision: https://reviews.llvm.org/D80966
2020-06-09 09:27:32 -07:00
Florian Hahn 3323a628ec [Matrix] Add __builtin_matrix_transpose to Clang.
This patch add __builtin_matrix_transpose to Clang, as described in
clang/docs/MatrixTypes.rst.

Reviewers: rjmccall, jfb, rsmith, Bigcheese

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72778
2020-06-09 10:14:37 +01:00
Arthur Eubanks a92ce3b706 Revert "Reland D80966 [codeview] Put !heapallocsite on calls to operator new"
This reverts commit b6e143aa54.

Causes https://bugs.chromium.org/p/chromium/issues/detail?id=1092370#c5.
Will investigate and reland (again).
2020-06-08 12:49:41 -07:00
Arthur Eubanks c07339c675 Move *San module passes later in the NPM pipeline
Summary:
This fixes pr33372.cpp under the new pass manager.

ASan adds padding to globals. For example, it will change a {i32, i32, i32} to a {{i32, i32, i32}, [52 x i8]}. However, when loading from the {i32, i32, i32}, InstCombine may (after various optimizations) end up loading 16 bytes instead of 12, likely because it thinks the [52 x i8] padding is ok to load from. But ASan checks that padding should not be loaded from.

Ultimately this is an issue of *San passes wanting to be run after all optimizations. This change moves the module passes right next to the corresponding function passes.

Also remove comment that's no longer relevant, this is the last ASan/MSan/TSan failure under the NPM (hopefully...).

As mentioned in https://reviews.llvm.org/rG1285e8bcac2c54ddd924ffb813b2b187467ac2a6, NPM doesn't support LTO + sanitizers, so modified some tests that test for that.

Reviewers: leonardchan, vitalybuka

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81323
2020-06-08 12:08:49 -07:00
Fangrui Song fc935fc35b Reland D80979 [clang] Implement VectorType logic not operator
With a fix to use -triple %itanium_abi_triple

Differential Revision: https://reviews.llvm.org/D80979
2020-06-08 09:32:30 -07:00
Nico Weber abca3b7b2c Revert "[clang] Implement VectorType logic not operator."
This reverts commit a0de3335ed.
Breaks check-clang on Windows, see e.g.
https://reviews.llvm.org/D80979#2078750 (but fails on all
other Windows bots too).
2020-06-08 06:45:21 -04:00
Marco Elver c6ec352a6b Revert "[KernelAddressSanitizer] Make globals constructors compatible with kernel"
This reverts commit 866ee2353f.

Building the kernel results in modpost failures due to modpost relying
on debug info and inspecting kernel modules' globals:
https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783
2020-06-08 10:34:03 +02:00
Jun Ma a0de3335ed [clang] Implement VectorType logic not operator.
Differential Revision: https://reviews.llvm.org/D80979
2020-06-08 08:41:01 +08:00
Fangrui Song b6e143aa54 Reland D80966 [codeview] Put !heapallocsite on calls to operator new
With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo`
instead of `getDebugInfo()`,
to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test`

See CodeGenModule::CodeGenModule, `EmitGcovArcs || EmitGcovNotes` can
set `clang::CodeGen::CodeGenModule::DebugInfo`.

---

Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Differential Revision: https://reviews.llvm.org/D80966
2020-06-07 13:35:20 -07:00
Ties Stuij 5945e9799e [clang][BFloat] Add reinterpret cast intrinsics
Summary:
This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties is specified in the Arm C language
extension specification:

https://developer.arm.com/docs/ihi0055/d/procedure-call-standard-for-the-arm-64-bit-architecture

Subscribers: kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79869

The following people contributed to this patch:

- Luke Cheeseman
- Alexandros Lamprineas
- Luke Geeson
- Ties Stuij
2020-06-07 14:32:37 +01:00
Florian Hahn 4affc444b4 [Matrix] Implement * binary operator for MatrixType.
This patch implements the * binary operator for values of
MatrixType. It adds support for matrix * matrix, scalar * matrix and
matrix * scalar.

For the matrix, matrix case, the number of columns of the first operand
must match the number of rows of the second. For the scalar,matrix variants,
the element type of the matrix must match the scalar type.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76794
2020-06-07 11:11:27 +01:00
Fangrui Song e664d0543f [gcov] Improve tests and lower the minimum supported version to gcov 3.4
global-ctor.ll no longer checks what it intended to check
(@_GLOBAL__sub_I_global-ctor.ll needs a !dbg to work).
Rewrite it.

gcov 3.4 and gcov 4.2 use the same format, thus we can lower the version
requirement to 3.4
2020-06-06 23:11:32 -07:00
Douglas Yung 059ba74bb6 Revert "[codeview] Put !heapallocsite on calls to operator new"
This reverts commit 672ed53860.

This commit is hitting an assertion failure across multiple bots in the test:
Profile-<arch> :: instrprof-gcov-multithread_fork.test

Failing bots include:
http://lab.llvm.org:8011/builders/llvm-avr-linux/builds/2205
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/8967
http://lab.llvm.org:8011/builders/clang-cmake-armv7-full/builds/10789
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/27750
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/16751
2020-06-06 23:30:46 +00:00
Fangrui Song cdd683b516 [gcov] Support big-endian .gcno and simplify version handling in .gcda 2020-06-06 11:01:47 -07:00
Jonas Paulsson 515bfc66ea [SystemZ] Implement -fstack-clash-protection
Probing of allocated stack space is now done when this option is passed. The
purpose is to protect against the stack clash attack (see
https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt).

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D78717
2020-06-06 18:38:36 +02:00
Marco Elver 97a670958c [ASan][Test] Fix globals test on 32-bit architectures
Buildbot reports failures on e.g. armv7 and thumbv7. Fix the test by
expecting either i32 or i64 for the size-argument.
2020-06-06 11:23:16 +02:00
Marco Elver 2dd83a9230 [ASan][Test] Fix globals test for Mach-O
Summary: Use a portable section name, as for the test's purpose any name will do.

Reviewers: nickdesaulniers, thakis

Reviewed By: thakis

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81306
2020-06-05 23:08:11 +02:00
Reid Kleckner 672ed53860 [codeview] Put !heapallocsite on calls to operator new
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D80966
2020-06-05 12:52:38 -07:00
Marco Elver 866ee2353f [KernelAddressSanitizer] Make globals constructors compatible with kernel
Summary:
This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:

- Disable generation of constructors that rely on linker features such
  as dead-global elimination.

- Only emit constructors for globals *not* in explicit sections. The
  kernel uses sections for special globals, which we should not touch.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493

Tested:
1. With 'clang/test/CodeGen/asan-globals.cpp'.
2. With test_kasan.ko, we can see:

  	BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]

Reviewers: glider, andreyknvl

Reviewed By: glider

Subscribers: cfe-commits, nickdesaulniers, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D80805
2020-06-05 20:20:46 +02:00
Ties Stuij 8b137a4306 [clang][BFloat] Add create/set/get/dup intrinsics
Summary:
This patch is part of a series that adds support for the Bfloat16 extension of
the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Luke Geeson
- Ties Stuij
- Mikhail Maltsev

Reviewers: t.p.northover, sdesmalen, fpetrogalli, LukeGeeson, stuij, labrinea

Reviewed By: labrinea

Subscribers: miyuki, dmgreen, labrinea, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79710
2020-06-05 14:35:10 +01:00
Ties Stuij a6fcf5ca03 [clang][BFloat] add NEON emitter for bfloat
Summary:
This patch adds the bfloat16_t struct typedefs (e.g. bfloat16x8x2_t) to
arm_neon.h

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
- Luke Cheeseman
- Simon Tatham
- Ties Stuij

Reviewers: t.p.northover, fpetrogalli, sdesmalen, az, LukeGeeson

Reviewed By: fpetrogalli

Subscribers: SjoerdMeijer, LukeGeeson, pbarrio, mgorny, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79708
2020-06-05 14:11:51 +01:00
Ties Stuij 1e44731833 [ARM] Add poly64_t on AArch32.
Summary:
The poly64 types are guarded with ifdefs for AArch64 only. This is wrong. This
was also incorrectly documented in the ACLE spec, but this has been rectified in
the latest release. See paragraph 13.1.2 "Vector data types":

https://developer.arm.com/docs/101028/latest

This patch was written by Alexandros Lamprineas.

Reviewers: ostannard, sdesmalen, fpetrogalli, labrinea, t.p.northover, LukeGeeson

Reviewed By: ostannard

Subscribers: pbarrio, LukeGeeson, kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79711
2020-06-05 13:04:21 +01:00
Kadir Cetinkaya c31d213463
[clang][test] Put output into temp directory
To unbreak builds that happen on a read-only directory
2020-06-05 13:02:32 +02:00
Ties Stuij ecd682bbf5 [ARM] Add __bf16 as new Bfloat16 C Type
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.

This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.

The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij

Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli

Reviewed By: SjoerdMeijer

Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76077
2020-06-05 10:32:43 +01:00
Fangrui Song e5158b5273 [Driver] Migrate some -f/-fno options to use OptInFFlag and OptOutFFlag 2020-06-04 19:33:14 -07:00
Francesco Petrogalli 36b8af11d3 [SveEmitter] Add SVE ACLE for svld1ro.
Reviewers: sdesmalen, efriedma

Subscribers: tschuett, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80740
2020-06-03 14:44:07 +00:00
Andrew Wock 15a1780a10 [PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins
This is a re-revert with a corrected test.

This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.

Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.

Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
2020-06-03 09:45:27 -04:00
Lucas Prates 8beaba13b8 [Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.

This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.

Reviewers: t.p.northover, ostannard, pcc, efriedma

Reviewed By: ostannard, efriedma

Subscribers: echristo, plotfi, nickdesaulniers, efriedma, kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79721
2020-06-03 11:39:27 +01:00
Wei Mi 7a6c89427c [SampleFDO] Add use-sample-profile function attribute.
When sampleFDO is enabled, people may expect they can use
-fno-profile-sample-use to opt-out using sample profile for a certain file.
That could be either for debugging purpose or for performance tuning purpose.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.

The inconsistency may even introduce profile unused warning because if the
target is not compiled with explicit debug information flag, the function
in file A won't have its debug information enabled (debug information will
be enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger  profile unused
warning.

We add a new attribute use-sample-profile to control whether a function
will use its sample profile no matter for its outline or inline copies.
That will make the behavior of -fno-profile-sample-use consistent.

Differential Revision: https://reviews.llvm.org/D79959
2020-06-02 17:23:17 -07:00
Sriraman Tallam e0bca46b08 Options for Basic Block Sections, enabled in D68063 and D73674.
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.

+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.

Differential Revision: https://reviews.llvm.org/D68049
2020-06-02 00:23:32 -07:00
John McCall 8a8d703be0 Fix how cc1 command line options are mapped into FP options.
Canonicalize on storing FP options in LangOptions instead of
redundantly in CodeGenOptions.  Incorporate -ffast-math directly
into the values of those LangOptions rather than considering it
separately when building FPOptions.  Build IR attributes from
those options rather than a mix of sources.

We should really simplify the driver/cc1 interaction here and have
the driver pass down options that cc1 directly honors.  That can
happen in a follow-up, though.

Patch by Michele Scandale!
https://reviews.llvm.org/D80315
2020-06-01 22:00:30 -04:00
Florian Hahn 8f3f88d2f5 [Matrix] Implement matrix index expressions ([][]).
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).

It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create a incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx)

Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76791
2020-06-01 20:08:49 +01:00
Sanjay Patel c0303e5391 [CodeGen] remove instnamer dependency from test file; NFC
This file was originally added without instnamer at:
rL283716 / fe2b9b4fbf

But that was reverted and the test file reappeared with instnamer at:
rL285688 / 62f516f590

I'm not seeing any difference locally from checking nameless values,
so trying to remove a layering violation and see if that can
survive the build bots.
2020-06-01 10:21:17 -04:00
Djordje Todorovic 40a3fcb05c [DebugInfo][CallSites] Remove decl subprograms from 'retainedTypes:'
After the D70350, the retainedTypes: isn't being used for the purpose
of call site debug info for extern calls, so it is safe to delete it
from IR representation.
We are also adding a test to ensure the subprogram isn't stored within
the retainedTypes: from corresponding DICompileUnit.

Differential Revision: https://reviews.llvm.org/D80369
2020-06-01 09:10:05 +02:00
Florian Hahn 6f6e91d193 [Matrix] Implement + and - operators for MatrixType.
This patch implements the + and - binary operators for values of
MatrixType. It adds support for matrix +/- matrix, scalar +/- matrix and
matrix +/- scalar.

For the matrix, matrix case, the types must initially be structurally
equivalent. For the scalar,matrix variants, the element type of the
matrix must match the scalar type.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76793
2020-05-29 20:42:22 +01:00
Arthur Eubanks 1285e8bcac Run Coverage pass before other *San passes under new pass manager, round 2
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.

This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.

This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).

Currently sanitizers + LTO don't work together under the new pass
manager, so I removed tests that checked that this combination works for
sancov.

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80692
2020-05-28 17:04:47 -07:00
Nemanja Ivanovic f9e94eb868 [Clang] Enable _Complex __float128
When I added __float128 a while ago, I neglected to add support for the complex
variant of the type. This patch just adds that.

Differential revision: https://reviews.llvm.org/D80533
2020-05-28 06:55:49 -05:00
Marco Elver 69935d86ae [Clang][Sanitizers] Expect test failure on {arm,thumb}v7
Summary:
Versions of LLVM built on {arm,thumb}v7 appear to have differently
configured pass managers, which causes restrictions on which sanitizers
we may use.

As such, expect failure of the recently added "sanitize-coverage.c" test
on these architectures until we can investigate armv7's restrictions.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46117

Reviewers: vitalybuka, glider

Reviewed By: glider

Subscribers: glider, kristof.beyls, danielkiss, cfe-commits, vvereschaka

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80668
2020-05-28 11:33:32 +02:00
Eric Christopher 97a133f157 Temporarily Revert "[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts"
as it's causing crashes on code generation and https://bugs.llvm.org/show_bug.cgi?id=46084

This reverts commit 98cad555e2.
2020-05-26 18:51:00 -07:00
Marco Elver 14de6e29b1 [Clang][Driver] Add Bounds and Thread to SupportsCoverage list
Summary:
This permits combining -fsanitize-coverage with -fsanitize=bounds or
-fsanitize=thread. Note that, GCC already supports combining these.

Tested:
- Add Clang end-to-end test checking IR is generated for both combinations
of sanitizers.
- Several previously failing TSAN tests now pass.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45831

Reviewers: vitalybuka

Reviewed By: vitalybuka

Subscribers: #sanitizers, dvyukov, nickdesaulniers, cfe-commits

Tags: #clang, #sanitizers

Differential Revision: https://reviews.llvm.org/D79628
2020-05-26 13:36:21 -07:00
Adrian Prantl b59b3640bc Debug Info: Mark os_log helper functions as artificial
The os_log helper functions are linkonce_odr and supposed to be
uniqued across TUs, so attachine a DW_AT_decl_line on it is highly
misleading. By setting the function decl to implicit, CGDebugInfo
properly marks the functions as artificial and uses a default file /
line 0 location for the function.

rdar://problem/63450824

Differential Revision: https://reviews.llvm.org/D80463
2020-05-26 09:08:27 -07:00
Lucas Prates 98cad555e2 [Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.

This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.

Reviewers: t.p.northover, ostannard, pcc

Reviewed By: ostannard

Subscribers: kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79721
2020-05-26 10:09:35 +01:00
Fangrui Song 9d55e4ee13 Make explicit -fno-semantic-interposition (in -fpic mode) infer dso_local
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.

This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).

Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
2020-05-25 20:48:18 -07:00
Craig Topper 1b02db52b7 [X86] Update some av512 shift intrinsics to use "unsigned int" parameter instead of int to match Intel documentation
There are 65 that take a scalar shift amount. Intel documentation shows 60 of them taking unsigned int. There are 5 versions of srli_epi16 that use int, the 512-bit maskz and 128/256 mask/maskz.

Fixes PR45931

Differential Revision: https://reviews.llvm.org/D80251
2020-05-22 20:12:57 -07:00
Nemanja Ivanovic aede24ecaa [PowerPC] Treat 'Z' inline asm constraint as a true memory constraint
We currently emit incorrect codegen for this constraint because we set it as a
constraint that allows registers. This will cause the value to be copied to the
stack and that address to be passed as the address. This is not what we want.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=42762

Differential revision: https://reviews.llvm.org/D77542
2020-05-22 07:59:21 -05:00
Craig Topper 4cd696f92f [X86] Allow avx512vp2intersect to be used with __builtin_cpu_supports.
compiler-rt and trunk libgcc support this now.
2020-05-21 21:54:54 -07:00
Zequan Wu e36076ee3a [clang] Add nomerge function attribute to clang
Differential Revision: https://reviews.llvm.org/D79121
2020-05-21 17:07:39 -07:00
Zequan Wu b0a0f01bc1 Revert "Add nomerge function attribute to clang"
This reverts commit 307e853954.
2020-05-21 16:13:18 -07:00
Zequan Wu 307e853954 Add nomerge function attribute to clang 2020-05-21 15:28:27 -07:00
Yaxun (Sam) Liu 361e4f14e3 Fix debug info for NoDebug attr
NoDebug attr does not totally eliminate debug info about a function when
inlining is enabled. This is inconsistent with when inlining is disabled.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D79967
2020-05-21 09:02:56 -04:00
Melanie Blower 827be690dc [clang] FastMathFlags.allowContract should be initialized only from FPFeatures.allowFPContractAcrossStatement
Summary: Fix bug introduced in D72841 adding support for pragma float_control

Reviewers: rjmccall, Anastasia

Differential Revision: https://reviews.llvm.org/D79903
2020-05-20 06:19:10 -07:00
Eli Friedman 62f3ef2b53 [CGCall] Annotate references with "align" attribute.
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.

See also D80072.

Differential Revision: https://reviews.llvm.org/D80166
2020-05-19 20:21:30 -07:00
jasonliu 7f5d91d3ff [clang][AIX] Implement ABIInfo and TargetCodeGenInfo for AIX
Summary:
Created AIXABIInfo and AIXTargetCodeGenInfo for AIX ABI.

Reviewed By: Xiangling_L, ZarkoCA

Differential Revision: https://reviews.llvm.org/D79035
2020-05-19 15:00:48 +00:00
Xiang1 Zhang bcc0c894f3 Add cet.h for writing CET-enabled assembly code
Summary:
Add x86 feature with IBT and/or SHSTK bits to ELF program property if they  are enabled. Otherwise, contents in this header file are unused.
This file is mainly design for assembly source code which want to enable CET

Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith

Reviewed By: LuoYuanke

Subscribers: cfe-commits, mgorny

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79617
2020-05-19 14:03:17 +08:00
Xiang1 Zhang 62a9eca859 Test asm-cet.S fail for window clang
This reverts commit e7e84ff24a.
2020-05-19 13:18:05 +08:00
Xiang1 Zhang e7e84ff24a Add cet.h for writing CET-enabled assembly code
Summary:
Add x86 feature with IBT and/or SHSTK bits to ELF program property if they  are enabled. Otherwise, contents in this header file are unused.
This file is mainly design for assembly source code which want to enable CET

Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith

Reviewed By: LuoYuanke

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D79617
2020-05-19 10:37:46 +08:00
Francesco Petrogalli b593bfd4d8 [clang][SveEmitter] SVE builtins for `svusdot` and `svsudot` ACLE.
Summary:
Intrinsics, guarded by `__ARM_FEATURE_SVE_MATMUL_INT8`:

* svusdot[_s32]
* svusdot[_n_s32]
* svusdot_lane[_s32]
* svsudot[_s32]
* svsudot[_n_s32]
* svsudot_lane[_s32]

Reviewers: sdesmalen, efriedma, david-arm, rengolin

Subscribers: tschuett, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79877
2020-05-18 23:07:23 +00:00
Fangrui Song 82904401e3 Map -O to -O1 instead of -O2
rL82131 changed -O from -O1 to -O2, because -O1 was not different from
-O2 at that time.

GCC treats -O as -O1 and there is now work to make -O1 meaningful.
We can change -O back to -O1 again.

Reviewed By: echristo, dexonsmith, arphaman

Differential Revision: https://reviews.llvm.org/D79916
2020-05-18 15:53:41 -07:00
Francesco Petrogalli e2cc12e412 [SveEmitter] Builtins for SVE matrix multiply `mmla`.
Summary:
Guarded by __ARM_FEATURE_SVE_MATMUL_INT8:

* svmmla_u32
* svmmla_s32
* svusmmla_s32

Guarded by __ARM_FEATURE_SVE_MATMUL_FP32:

* svmmla_f32

Guarded by __ARM_FEATURE_SVE_MATMUL_FP64:

* svmmla_f64

Reviewers: sdesmalen, kmclaughlin, efriedma, rengolin

Subscribers: tschuett, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79639
2020-05-18 22:02:19 +00:00
Jim Lin 7ee479a760 [RISCV] Fix passing two floating-point values in complex separately by two GPRs on RV64
Summary:
This patch fixed the error of counting the remaining FPRs. Complex floating-point
values should be passed by two FPRs for the hard-float ABI. If no two FPRs are
available, it should be passed via a 64-bit GPR (fp+fp). `ArgFPRsLeft` is only
decreased one while the type is complex floating-point. It causes two floating-point
values in the complex are passed separately by two GPRs.

Reviewers: asb, luismarques, lenary

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79770
2020-05-18 13:13:22 +08:00
Hubert Tong 3f5fc73a9d [test][ARM][CMSE] Use clang_cc1 in arm_cmse.h tests
Summary:
The `arm_cmse.h` header includes standard headers, but some tests that
include this header explicitly specify a target. The standard headers
found via the standard include paths need not be compatible with the
explicitly-specified target from the tests. In order to avoid test
failures caused by such incompatibility, this patch uses `%clang_cc1`,
which doesn't pick up the host system headers.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D79693
2020-05-15 17:34:00 -04:00
Nikita Popov f89f7da999 [IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.

In the future, this attribute may be replaced with data layout
properties.

Differential Revision: https://reviews.llvm.org/D78862
2020-05-15 19:41:07 +02:00
Yonghong Song 072cde03aa [Clang][BPF] implement __builtin_btf_type_id() builtin function
Such a builtin function is mostly useful to preserve btf type id
for non-global data. For example,
   extern void foo(..., void *data, int size);
   int test(...) {
     struct t { int a; int b; int c; } d;
     d.a = ...; d.b = ...; d.c = ...;
     foo(..., &d, sizeof(d));
   }

The function "foo" in the above only see raw data and does not
know what type of the data is. In certain cases, e.g., logging,
the additional type information will help pretty print.

This patch implemented a BPF specific builtin
  u32 btf_type_id = __builtin_btf_type_id(param, flag)
which will return a btf type id for the "param".
flag == 0 will indicate a BTF local relocation,
which means btf type_id only adjusted when bpf program BTF changes.
flag == 1 will indicate a BTF remote relocation,
which means btf type_id is adjusted against linux kernel or
future other entities.

Differential Revision: https://reviews.llvm.org/D74668
2020-05-15 09:44:54 -07:00
Melanie Blower 7b8e306560 [clang] Fix bug in #pragma float_control(push/pop)
Summary: #pragma float_control(pop) was failing to restore the expected
floating point settings because the settings were not correctly preserved
at #pragma float_control(push).
2020-05-14 05:58:11 -07:00
Huihui Zhang fd842d3626 [CodeGen][NFC] Fix test/CodeGen/pr45476.cpp to specify target triple.
Summary:
Use explicit target triple to match more accurately the output for libcall
or native atomic.

Similar to D74847, without explicit target triple, this test will fail for ARM.

This patch update test pr45476.cpp to check for both native atomic and libcall.

Reviewers: efriedma, ekatz, rjmccall, rsmith, luismarques

Reviewed By: efriedma

Subscribers: kristof.beyls, jfb, cfe-commits, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D79914
2020-05-13 18:04:14 -07:00
Alina Sbirlea bd541b217f [NewPassManager] Add assertions when getting statefull cached analysis.
Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893
2020-05-13 12:38:38 -07:00
Joel E. Denny a1fd188223 [FileCheck] Support comment directives
Sometimes you want to disable a FileCheck directive without removing
it entirely, or you want to write comments that mention a directive by
name.  The `COM:` directive makes it easy to do this.  For example,
you might have:

```
; X32: pinsrd_1:
; X32:    pinsrd $1, 4(%esp), %xmm0

; COM: FIXME: X64 isn't working correctly yet for this part of codegen, but
; COM: X64 will have something similar to X32:
; COM:
; COM:   X64: pinsrd_1:
; COM:   X64:    pinsrd $1, %edi, %xmm0
```

Without this patch, you need to use some combination of rewording and
directive syntax mangling to prevent FileCheck from recognizing the
commented occurrences of `X32:` and `X64:` above as directives.
Moreover, FileCheck diagnostics have been proposed that might complain
about the occurrences of `X64` that don't have the trailing `:`
because they look like directive typos:

  <http://lists.llvm.org/pipermail/llvm-dev/2020-April/140610.html>

I think dodging all these problems can prove tedious for test authors,
and directive syntax mangling already makes the purpose of existing
test code unclear.  `COM:` can avoid all these problems.

This patch also updates the small set of existing tests that define
`COM` as a check prefix:

- clang/test/CodeGen/default-address-space.c
- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
- clang/test/Driver/hip-device-libs.hip
- llvm/test/Assembler/drop-debug-info-nonzero-alloca.ll

I think lit should support `COM:` as well.  Perhaps `clang -verify`
should too.

Reviewed By: jhenderson, thopre

Differential Revision: https://reviews.llvm.org/D79276
2020-05-13 11:29:48 -04:00
Thomas Lively 3d49d1cfa7 [WebAssembly] Implement pseudo-min/max SIMD instructions
Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79742
2020-05-12 09:39:01 -07:00
Fangrui Song 25a95f49b0 [gcov][test] Fix clang test 2020-05-12 09:21:19 -07:00
Melanie Blower 7f2db99350 [PATCH] #pragma float_control should be permitted in namespace scope.
Summary: Erroneous error diagnostic observed in VS2017 <numeric> header
Also correction to propagate usesFPIntrin from template func to instantiation.

Reviewers: rjmccall, erichkeane (no feedback received)

Differential Revision: https://reviews.llvm.org/D79631
2020-05-12 06:10:19 -07:00
Sander de Smalen d6936be2ef [SveEmitter] Add builtins for svdup and svindex
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79357
2020-05-12 11:02:32 +01:00
Joel E. Denny d0e7fd6b62 Revert "[FileCheck] Support comment directives"
This reverts commit 9a9a5f9893 to try to
fix a bot:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23489
2020-05-11 19:41:22 -04:00
Joel E. Denny 9a9a5f9893 [FileCheck] Support comment directives
Sometimes you want to disable a FileCheck directive without removing
it entirely, or you want to write comments that mention a directive by
name.  The `COM:` directive makes it easy to do this.  For example,
you might have:

```
; X32: pinsrd_1:
; X32:    pinsrd $1, 4(%esp), %xmm0

; COM: FIXME: X64 isn't working correctly yet for this part of codegen, but
; COM: X64 will have something similar to X32:
; COM:
; COM:   X64: pinsrd_1:
; COM:   X64:    pinsrd $1, %edi, %xmm0
```

Without this patch, you need to use some combination of rewording and
directive syntax mangling to prevent FileCheck from recognizing the
commented occurrences of `X32:` and `X64:` above as directives.
Moreover, FileCheck diagnostics have been proposed that might complain
about the occurrences of `X64` that don't have the trailing `:`
because they look like directive typos:

  <http://lists.llvm.org/pipermail/llvm-dev/2020-April/140610.html>

I think dodging all these problems can prove tedious for test authors,
and directive syntax mangling already makes the purpose of existing
test code unclear.  `COM:` can avoid all these problems.

This patch also updates the small set of existing tests that define
`COM` as a check prefix:

- clang/test/CodeGen/default-address-space.c
- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
- clang/test/Driver/hip-device-libs.hip
- llvm/test/Assembler/drop-debug-info-nonzero-alloca.ll

I think lit should support `COM:` as well.  Perhaps `clang -verify`
should too.

Reviewed By: jhenderson, thopre

Differential Revision: https://reviews.llvm.org/D79276
2020-05-11 14:53:48 -04:00
Florian Hahn 1065869195 [Matrix] Add matrix type to Clang.
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.

The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.

The operators and builtins described in the draft spec will will be added in
follow-up patches.

Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72281
2020-05-11 18:55:45 +01:00
Thomas Lively 8e3e56f2a3 [WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic
Summary:

Although using `__builtin_shufflevector` and the `shufflevector`
instruction works fine, they are not opaque to the optimizer. As a
result, DAGCombine can potentially reduce the number of shuffles and
change the shuffle masks. This is unexpected behavior for users of the
WebAssembly SIMD intrinsics who have crafted their shuffles to
optimize the code generated by engines. This patch solves the problem
by adding a new shuffle intrinsic that is opaque to the optimizers in
line with the decision of the WebAssembly SIMD contributors at
https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In
the future we may implement custom DAG combines to properly optimize
shuffles and replace this solution.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D66983
2020-05-11 10:01:55 -07:00
Kamlesh Kumar 9aee35bcc9 [Clang] Fix the incorrect return type of atomic_is_lock_free
Fixing the return type of atomic_is_lock_free as per
https://en.cppreference.com/w/c/atomic/atomic_is_lock_free

Differential Revision: https://reviews.llvm.org/D79504
2020-05-11 10:48:35 -04:00
Sander de Smalen 4cad97595f [SveEmitter] Add builtins for svmovlb and svmovlt
These builtins are expanded in CGBuiltin to use intrinsics
for (signed/unsigned) shift left long top/bottom.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79579
2020-05-11 09:41:58 +01:00
Fangrui Song 25544ce2df [gcov] Default coverage version to '407*' and delete CC1 option -coverage-cfg-checksum
Defaulting to -Xclang -coverage-version='407*' makes .gcno/.gcda
compatible with gcov [4.7,8)

In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum.
We can infer the information from the version.

With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7.
We don't need other -Xclang -coverage* options.
There may be a mismatching version warning, though.

(Note, GCC r173147 "split checksum into cfg checksum and line checksum"
 made gcov 4.7 incompatible with previous versions.)
2020-05-10 16:14:07 -07:00
Fangrui Song 13a633b438 [gcov] Delete CC1 option -coverage-no-function-names-in-data
rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION
(this might be part of the reasons the header says
"We emit files in a corrupt version of GCOV's "gcda" file format").

rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data
to work around the issue. (However, the description is wrong.
libgcov never writes function names, even before GCC 4.2).

In reality, the linker command line has to look like:

clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data

Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7
either produce wrong results (for one gcov-4.9 program, I see "No executable lines")
or segfault (gcov-7).
(gcov-8 uses an incompatible format.)

This patch deletes -coverage-no-function-names-in-data and the related
function names support from libclang_rt.profile
2020-05-10 12:37:44 -07:00
Sanjay Patel d02b3aba37 [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC
This test would break with the proposed change to IR canonicalization
in D79171.

The test tried to do the right thing by only using -mem2reg with opt,
but it was using -O3 before that step, so the opt part was meaningless.
2020-05-10 11:25:37 -04:00
Sanjay Patel bcc5ed7b24 [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC
This test would break with the proposed change to IR canonicalization
in D79171. The raw unoptimized IR from clang is massive, so I've
replaced -instcombine with -mem2reg to make it more manageable,
but still be unlikely to break with unrelated changed to optimization.
2020-05-10 11:19:43 -04:00
Thomas Lively ebb69b8baf [clang][WebAssembly] Only expose wait and notify builtins with atomics
Summary:
Since the underlying wait and notify instructions are only available
when the atomics feature is enabled, it only makes sense to expose
their builtin functions when atomics are enabled.

Reviewers: aheejin, sunfish

Subscribers: dschuff, sbc100, jgravelle-google, jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79534
2020-05-08 13:54:29 -07:00
Sriraman Tallam e8147ad822 Uniuqe Names for Internal Linkage Symbols.
This is a standalone patch and this would help Propeller do a better job of code
layout as it can accurately attribute the profiles to the right internal linkage
function.

This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the
right internal function. Currently, if there is more than one internal symbol
foo, their profiles are aggregated by SampledFDO.

This patch adds a new clang option, -funique-internal-funcnames, to generate
unique names for functions with internal linkage. This patch appends the md5
hash of the module name to the function symbol as a best effort to generate a
unique name for symbols with internal linkage.

Differential Revision: https://reviews.llvm.org/D73307
2020-05-07 18:18:37 -07:00
Sander de Smalen 96a581d0f0 [SveEmitter] Add builtins for SVE2 svtbx (extended table lookup)
This patch adds builtins for:
- svtbx
2020-05-07 16:15:57 +01:00
Sander de Smalen e46043bba7 [SveEmitter] Add builtins for SVE2 Optional extensions (AES, SHA3, SM4, BITPERM)
This patch adds various builtins under their corresponding feature macros:

Defined under __ARM_FEATURE_SVE2_AES:
- svaesd
- svaese
- svaesimc
- svaesmc
- svpmullb_pair
- svpmullt_pair

Defined under __ARM_FEATURE_SVE2_SHA3:
- svrax1

Defined under __ARM_FEATURE_SVE2_SM4:
- svsm4e
- svsm4ekey

Defined under __ARM_FEATURE_SVE2_BITPERM:
- svbdep
- svbext
- svbgrp
2020-05-07 16:15:57 +01:00
Sander de Smalen f22cdc3cc3 [SveEmitter] Add builtins for SVE2 Character match instructions
This patch adds builtins for:
- svmatch
- svnmatch
2020-05-07 16:15:57 +01:00
Sander de Smalen ae652241bd [SveEmitter] Add builtins for SVE2 Vector histogram count instructions
This patch adds builtins for:
- svhistcnt
- svhistseg
2020-05-07 16:15:57 +01:00
Sander de Smalen fa0371f4fd [SveEmitter] Add builtins for SVE2 Floating-point integer binary logarithm instructions
This patch adds builtins for:
- svlogb
2020-05-07 16:15:57 +01:00
Sander de Smalen 086722c18e [SveEmitter] Add builtins for SVE2 Floating-point widening multiply-accumulate
This patch adds builtins for:
- svmlalb, svmlalb_lane
- svmlalt, svmlalt_lane
- svmlslb, svmlslb_lane
- svmlslt, svmlslt_lane
2020-05-07 16:15:57 +01:00
Sander de Smalen e76256e7c1 [SveEmitter] Add builtins for SVE2 Complex integer dot product
This patch adds builtins for:
- svcdot, svcdot_lane
2020-05-07 16:09:31 +01:00
Sander de Smalen 867bfae93f [SveEmitter] Add builtins for SVE2 Widening complex integer arithmetic
This patch adds builtins for:
- svaddlbt
- svqdmlalbt
- svqdmlslbt
- svsublbt
- svsubltb
2020-05-07 16:09:31 +01:00
Sander de Smalen f525820755 [SveEmitter] Add builtins for SVE2 Narrowing DSP operations
This patch adds builtins for:
- svaddhnb
- svaddhnt
- svqrshrnb
- svqrshrnt
- svqrshrunb
- svqrshrunt
- svqshrnb
- svqshrnt
- svqshrunb
- svqshrunt
- svqxtnb
- svqxtnt
- svqxtunb
- svqxtunt
- svraddhnb
- svraddhnt
- svrshrnb
- svrshrnt
- svrsubhnb
- svrsubhnt
- svshrnb
- svshrnt
- svsubhnb
- svsubhnt
2020-05-07 16:09:31 +01:00
Sander de Smalen b0b658e7fc [SveEmitter] Add builtins for SVE2 Widening DSP operations
This patch adds builtins for:
- svabalb
- svabalt
- svabdlb
- svabdlt
- svaddlb
- svaddlt
- svaddwb
- svaddwt
- svmlalb, svmlalb_lane
- svmlalt, svmlalt_lane
- svmlslb, svmlslb_lane
- svmlslt, svmlslt_lane
- svmullb, svmullb_lane
- svmullt, svmullt_lane
- svqdmlalb, svqdmlalb_lane
- svqdmlalt, svqdmlalt_lane
- svqdmlslb, svqdmlslb_lane
- svqdmlslt, svqdmlslt_lane
- svqdmullb, svqdmullb_lane
- svqdmullt, svqdmullt_lane
- svshllb
- svshllt
- svsublb
- svsublt
- svsubwb
- svsubwt
2020-05-07 16:09:31 +01:00
Sander de Smalen ce7f50c2ce [SveEmitter] Add builtins for SVE2 Uniform complex integer arithmetic
This patch adds builtins for:
- svcadd
- svqcadd
- svcmla
- svcmla_lane
- svqrdcmlah
- svqrdcmlah_lane
2020-05-07 16:09:31 +01:00
Sander de Smalen 5e9bc21eea [SveEmitter] Add builtins for SVE2 Multiplication by indexed elements
This patch adds builtins for:
- svmla_lane
- svmls_lane
- svmul_lane
2020-05-07 15:21:37 +01:00
Sander de Smalen 60615cfb43 [SveEmitter] Add builtins for SVE2 Large integer arithmetic
This patch adds builtins for:
- svadclb
- svadclt
- svsbclb
- svsbclt
2020-05-07 15:21:37 +01:00
Sander de Smalen 36aab0c055 [SveEmitter] Add builtins for SVE2 Bitwise ternary logical instructions
This patch adds builtins for:
- svbcax
- svbsl
- svbsl1n
- svbsl2n
- sveor3
- svnbsl
- svxar
2020-05-07 15:21:37 +01:00
Sander de Smalen b0348af108 [SveEmitter] Add builtins for SVE2 widening pairwise arithmetic
This patch adds builtins for:
- svadalp
2020-05-07 15:21:37 +01:00
Sander de Smalen 7ff05002d0 [SveEmitter] Add builtins for SVE2 Non-widening pairwise arithmetic
This patch adds builtins for:
- svaddp
- svmaxnmp
- svmaxp
- svminnmp
- svminp
2020-05-07 15:21:37 +01:00
Sander de Smalen 0d22076531 [SveEmitter] Add builtins for SVE2 uniform DSP operations
This patch adds builtins for:
- svqdmulh, svqdmulh_lane
- svqrdmlah, svqrdmlah_lane
- svqrdmlsh, svqrdmlsh_lane
- svqrdmulh, svqrdmulh_lane
2020-05-07 13:31:46 +01:00
Sander de Smalen 5fa0eeec6e [SveEmitter] Add more SVE2 builtins for shift operations
This patch adds builtins for:
- svqshlu
- svrshr
- svrsra
- svsli
- svsra
- svsri
2020-05-07 13:31:46 +01:00
Sander de Smalen dc2986f9dc [SveEmitter] Add builtins for SVE2 saturating shift left and addition
This patch adds builtins for:
- svqrshl
- svqshl
- svsqadd
- svuqadd
2020-05-07 13:31:46 +01:00
Sander de Smalen b32d14c30e [SveEmitter] Add builtins for SVE2 uniform DSP operations
This patch adds builtins for:
- svqadd, svhadd, svrhadd
- svqsub, svhsub, svqusbr, svhsubr
- svqabs
- svqneg
- svrecpe
- svrsqrte
2020-05-07 13:31:46 +01:00
Sander de Smalen 35de496550 [SveEmitter] Add builtins for svqdecp and svqincp
This patch adds builtins for saturating increment/decrement by svcntp,
in scalar and vector forms.
2020-05-07 13:31:46 +01:00
Sander de Smalen cac06263a4 [SveEmitter] Add builtins for svinsr 2020-05-07 13:31:46 +01:00
Sander de Smalen 4f94e1a9f7 [SveEmitter] Add builtins for svasrd (zeroing/undef predication)
This patch adds builtins for arithmetic shift right (round towards zero)
instructions for zeroing (_z) and undef (_x) predication.
2020-05-07 12:28:18 +01:00
Sander de Smalen dbc6a07bcc [SveEmitter] Add builtins for address calculations.
This patch adds builtins for:
- svadrb, svadrh, svadrw, svadrd
2020-05-07 12:28:18 +01:00
Sander de Smalen 827c8b06d3 [SveEmitter] Add builtins for svcntp 2020-05-07 12:28:18 +01:00
Sander de Smalen ac894a5181 [SveEmitter] Add builtins for FFR manipulation
This patch adds builtins for:
- svrdffr, svrdffr_z
- svsetffr
- svwrffr
2020-05-07 12:28:18 +01:00
Sander de Smalen 91cb13f90d [SveEmitter] Add builtins for svqadd, svqsub and svdot
This patch adds builtins for saturating add/sub instructions:
- svqadd, svqadd_n
- svqsub, svqsub_n

and builtins for dot product instructions:
- svdot, svdot_lane
2020-05-07 12:28:18 +01:00
Sander de Smalen 3cb8b4c193 [SveEmitter] Add builtins for SVE2 Polynomial arithmetic
This patch adds builtins for:
- sveorbt
- sveortb
- svpmul
- svpmullb, svpmullb_pair
- svpmullt, svpmullt_pair

The svpmullb and svpmullt builtins are expressed using the svpmullb_pair
and svpmullt_pair LLVM IR intrinsics, respectively.

Reviewers: SjoerdMeijer, efriedma, rengolin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79480
2020-05-07 11:53:04 +01:00
Craig Topper 16c800b8b7 [X86] Remove support for Y0 constraint as an alias for Yz in inline assembly.
Neither gcc or icc support this. Split out from D79472. I want
to remove more, but it looks like icc does support some things
gcc doesn't and I need to double check our internal test suites.
2020-05-06 14:58:53 -07:00
Artem Belevich 314f99e7d4 [CUDA] Enable existing builtins for PTX7.0 as well.
Differential Revision: https://reviews.llvm.org/D79515
2020-05-06 14:24:21 -07:00
Melanie Blower e5578013b1 When pragma FENV_ACCESS is ignored do not modify Sema.CurFPFeatures
Bug reported by @uabelho against reviews.llvm.org/D72841

Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D79510
2020-05-06 13:18:59 -07:00
Melanie Blower c355bec749 Add support for #pragma clang fp reassociate(on|off)
Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D78827
2020-05-06 08:05:44 -07:00
Erich Keane 8a1c999c9b Implement _ExtInt ABI for all ABIs in Clang, enable type for ABIs
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.

Additionally, this finds an issue with integer-promotion passing for a
few platforms when using _ExtInt of < int, so this also corrects that
resulting in signext/zeroext being on a params of those types in some
platforms.

Differential Revisions: https://reviews.llvm.org/D79118
2020-05-06 06:52:18 -07:00
Craig Topper 0fac1c1912 [X86] Allow Yz inline assembly constraint to choose ymm0 or zmm0 when avx/avx512 are enabled and type is 256 or 512 bits
gcc supports selecting ymm0/zmm0 for the Yz constraint when used with 256 or 512 bit vector types.

Fixes PR45806

Differential Revision: https://reviews.llvm.org/D79448
2020-05-05 21:12:30 -07:00
Artem Belevich 844096b996 [CUDA] Make NVVM builtins available with CUDA-11/PTX6.5
Differential Revision: https://reviews.llvm.org/D79449
2020-05-05 15:43:32 -07:00
Sander de Smalen 5ba329059f [SveEmitter] Add builtins for svreinterpret
The reinterpret builtins are generated separately because they
need the cross product of all types, 121 functions in total,
which is inconvenient to specify in the arm_sve.td file.

Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78756
2020-05-05 13:04:44 +01:00
Sander de Smalen aed6bd6f42 Reland D78750: [SveEmitter] Add builtins for svdupq and svdupq_lane
Edit: Changed a few CHECK lines into CHECK-DAG lines.

This reverts commit 90f3f62cb0.
2020-05-05 10:42:11 +01:00
Sander de Smalen 90f3f62cb0 Revert "[SveEmitter] Add builtins for svdupq and svdupq_lane"
It seems this patch broke some buildbots, so reverting until I
have had a chance to investigate.

This reverts commit 6b90a6887d.
2020-05-04 21:31:55 +01:00
Sander de Smalen 6b90a6887d [SveEmitter] Add builtins for svdupq and svdupq_lane
* svdupq builtins that duplicate scalars to every quadword of a vector
  are defined using builtins for svld1rq (load and replicate quadword).
* svdupq builtins that duplicate boolean values to fill a predicate vector
  are defined using `svcmpne`.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78750
2020-05-04 20:38:47 +01:00
Sander de Smalen 54fa46aa0a [SveEmitter] Add builtins for Int & FP reductions
This patch adds integer builtins for:
- svaddv, svandv, sveorv,
  svmaxv, svminv, svorv.

And FP builtins for:
- svadda, svaddv, svmaxv, svmaxnmv,
  svminv, svminnmv
2020-05-04 19:50:16 +01:00
Melanie Blower 7cbb495ab4 Fix LABEL match for test case for D72841 #pragma float_control 2020-05-04 07:27:40 -07:00
Melanie Blower f5360d4bb3 Reapply "Add support for #pragma float_control" with buildbot fixes
Add support for #pragma float_control

Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841

This reverts commit fce82c0ed3.
2020-05-04 05:51:25 -07:00
Thomas Lively e0f52842c8 [WebAssembly] Renumber SIMD opcodes
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79224
2020-05-01 17:20:49 -07:00
Sander de Smalen 334931f54b [SveEmitter] Add builtins for shifts.
This patch adds builtins for:
- svasrd
- svlsl
- svlsr
2020-05-01 22:27:24 +01:00
Melanie Blower fce82c0ed3 Revert "Reapply "Add support for #pragma float_control" with improvements to"
This reverts commit 69aacaf699.
2020-05-01 10:31:09 -07:00
Melanie Blower 69aacaf699 Reapply "Add support for #pragma float_control" with improvements to
test cases
Add support for #pragma float_control

Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841

This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
2020-05-01 10:03:30 -07:00
Sander de Smalen 1a720d49dc [SveEmitter] Add builtins for various FP operations
Unary:
- svexpa, svtmad, svtsmul, svtssel,
  svscale, svrecpe, svrecps, svrsqrte,
  svrsqrts,

Binary:
- svabd, svadd, svdiv, svdivr,
  svmin, svmax, svminnm, svmaxnm,
  svmul, svmulx, svsub, svsubr,
  svmul_lane

Complex:
- svcadd, svcmla
2020-05-01 17:37:43 +01:00
Melanie Blower 85dc033cac Revert "Add support for #pragma float_control"
This reverts commit 4f1e9a17e9.
due to fail on buildbot, sorry for the noise
2020-05-01 06:36:58 -07:00
Melanie Blower 4f1e9a17e9 Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841
2020-05-01 06:14:24 -07:00
Nikita Popov afc287e0ab Fix clang test after D76886 2020-04-30 23:42:38 +02:00
Aaron Smith 4eabd00612 [Windows SEH] Fix abnormal-exits in _try
Summary:
Per Windows SEH Spec, except _leave, all other early exits of a _try (goto/return/continue/break) are considered abnormal exits.  In those cases, the first parameter passes to its _finally funclet should be TRUE to indicate an abnormal-termination.

One way to implement abnormal exits in _try is to invoke Windows runtime _local_unwind() (MSVC approach) that will invoke _dtor funclet where abnormal-termination flag is always TRUE when calling _finally.  Obviously this approach is less optimal and is complicated to implement in Clang.

Clang today has a NormalCleanupDestSlot mechanism to dispatch multiple exits at the end of _try.  Since  _leave (or try-end fall-through) is always Indexed with 0 in that NormalCleanupDestSlot,  this fix takes the advantage of that mechanism and just passes NormalCleanupDest ID as 1st Arg to _finally.

Reviewers: rnk, eli.friedman, JosephTremoulet, asmith, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77936
2020-04-30 09:38:19 -07:00
Erich Keane b5a4deec6a [NFC] Split ext-int calling convention tests into their own file.
I'm currently auditing all of the calling convention implications of
_ExtInt for all platforms, so splitting them up into their own test will
make this a much easier task to organize.
2020-04-29 12:20:21 -07:00
Erich Keane 5a1d9c0f5a Fix x86/x86_64 calling convention for _ExtInt
After speaking with Craig Topper about some recent defects, he pointed
out that _ExtInts should be passed indirectly if larger than the largest
int register, and like ints when smaller than that.  This patch
implements that.

Note that this changed the way vaargs worked quite a bit, but they still
work.

Differential Revision: https://reviews.llvm.org/D78785
2020-04-29 11:04:25 -07:00
Sander de Smalen a4dac6d4e0 [SveEmitter] Add builtins for svmov_b and svnot_b.
These are custom expanded in CGBuiltin:

  svmov_b_z(pg, op) <=> svand_b_z(pg, op, op)
  svnot_b_z(pg, op) <=> sveor_b_z(pg, op, pg)

Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79039
2020-04-29 13:33:18 +01:00
Sander de Smalen 42a56bf63f [SveEmitter] Add builtins for gather prefetches
Patch by Andrzej Warzynski

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78677
2020-04-29 11:52:49 +01:00
Momchil Velikov 102b4105e3 [CMSE] Clear padding bits of struct/unions/fp16 passed by value
When passing a value of a struct/union type from secure to non-secure
state (that is returning from a CMSE entry function or passing an
argument to CMSE-non-secure call), there is a potential sensitive
information leak via the padding bits in the structure. It is not
possible in the general case to ensure those bits are cleared by using
Standard C/C++.

This patch makes the compiler emit code to clear such padding
bits. Since type information is lost in LLVM IR, the code generation
is done by Clang.

For each interesting record type, we build a bitmask, in which all the
bits, corresponding to user declared members, are set. Values of
record types are returned by coercing them to an integer. After the
coercion, the coerced value is masked (with bitwise AND) and then
returned by the function. In a similar manner, values of record types
are passed as arguments by coercing them to an array of integers, and
the coerced values themselves are masked.

For union types, we effectively clear only bits, which aren't part of
any member, since we don't know which is the currently active one.
The compiler will issue a warning, whenever a union is passed to
non-secure state.

Values of half-precision floating-point types are passed in the least
significant bits of a 32-bit register (GPR or FPR) with the most
significant bits unspecified. Since this is also a potential leak of
sensitive information, this patch also clears those unspecified bits.

Differential Revision: https://reviews.llvm.org/D76369
2020-04-28 17:05:58 +01:00
Sander de Smalen 43d1d52ad2 [SveEmitter] Add builtins for logical and predicate operations.
This patch adds builtins for logical ops:
- svand, svbic, sveor, svorr, svcnot, svnot

and builtins for predicate operations:
- svand_b_z, svbic_b_z, sveor_b_z, svnand_b_z, svnor_b_z, svorn_b_z, svorr_b_z
- svbrka_b_z, svbrkb_b_z, svbrkpa_b_z, svbrkpb_b_z, svbrkn_b_z
- svpfirst_b
- svpnext
- svptest_any
- svptest_first
- svptest_last
2020-04-28 16:37:17 +01:00
Sander de Smalen 476ba8127b [SveEmitter] Add builtins for zero/sign extension and bit/byte reversal.
This patch adds builtins for predicated unary builtins
svext[bhw] and svrev[bhw] and svrbit.
2020-04-28 14:06:51 +01:00
Sander de Smalen c57720125f [SveEmitter] Add builtins for bitcount operations
This patch adds builtins for svcls, svclz and svcnt.

For merging (_m), zeroing (_z) and don't-care (_x) predication.
2020-04-28 13:53:54 +01:00
Sander de Smalen 6f588c6ef3 [SveEmitter] Add builtins for permutations and selection
This patch adds builtins for:
- svlasta and svlastb
- svclasta and svclastb
- svunpkhi and svunpklo
- svuzp1 and svuzp2
- svzip1 and svzip2
- svrev
- svsel
- svcompact
- svsplice
- svtbl
2020-04-28 13:43:11 +01:00
Sander de Smalen e1932ffbd9 [SveEmitter] Add builtins for ternary ops (fmla, fmad, etc)
This patch adds builtins for:
- svmad, svmla, svmls, svmsb
  svnmad, svnmla, svnmls, svnmsb
  svmla_lane, svmls_lane

These builtins come in several flavours:
- Merge into first source vector (`_m`)
- False lanes are undef (`_x`)
- False lanes are zeroed (`_z`)

And can also have `_n` to indicate the last operand is a scalar.

For example:

  svint32_t svmla[_n_s32]_z(svbool_t pg, svint32_t op1, svint32_t op2, int32_t op3)

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78960
2020-04-28 10:59:38 +01:00
Sander de Smalen e4872d7f08 [SveEmitter] Add builtins for svlen
The svlen builtins return the number of elements in a vector
and are implemented using `llvm.vscale`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78755
2020-04-27 21:27:32 +01:00
Sander de Smalen 03f419f3eb [SveEmitter] IsInsertOp1SVALL and builtins for svqdec[bhwd] and svqinc[bhwd]
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.

This patch adds the flag IsInsertOp1SVALL to insert SV_ALL as the
second operand.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78401
2020-04-27 11:45:10 +01:00
Sander de Smalen 3817ca7dbf [SveEmitter] Add IsAppendSVALL and builtins for svptrue and svcnt[bhwd]
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.

This patch adds the flag IsAppendSVALL to append SV_ALL as the final
operand.

Reviewers: SjoerdMeijer, efriedma, rovka, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77597
2020-04-26 12:44:26 +01:00
Ayke van Laethem ceba881aea
[AVR][NFC] Move preprocessor tests to Preprocessor directory
These tests were placed in the CodeGen directory while they really
should have been placed in the Preprocessor directory.

Differential Revision: https://reviews.llvm.org/D78163
2020-04-26 01:29:25 +02:00
Craig Topper 0ed5b0d517 [X86] Don't use types when getting the intrinsic declaration for x86_avx512_mask_vcvtph2ps_512.
This intrinsic isn't overloaded so we should query with types.
Doing so causes the backend to miss the intrinsic and not codegen it.
This eventually leads to a linker error.
2020-04-24 11:01:22 -07:00
Luke Geeson 7da1905125 [AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
  Multiplication

Note: these extensions are optional in the 8.6a architecture and so have
to be enabled by default

No additional IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, miyuki

Reviewed By: miyuki

Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77872
2020-04-24 15:54:06 +01:00
Luke Geeson 832cd74913 [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)

No IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin

Reviewed By: kmclaughlin

Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77871
2020-04-24 15:54:06 +01:00
Sander de Smalen 0ddb2034c1 [SveEmitter] Add builtins for compares and ReverseCompare flag.
The IsReverseCompare flag tells CGBuiltin to swap the operands,
so that a LT/LE intrinsics can be expressed in terms of GE/GT
intrinsics.

This patch also adds builtins for the wide-variants of the compares.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78747
2020-04-24 14:33:47 +01:00
Sander de Smalen 823e2a670a [SveEmitter] Add builtins for contiguous prefetches
This patch also adds the enum `sv_prfop` for the prefetch operation specifier
and checks to ensure the passed enum values are valid.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78674
2020-04-24 11:35:59 +01:00
Sander de Smalen db7997472b [SveEmitter] Add builtins for svld1rq
Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78748
2020-04-24 11:10:28 +01:00
Sander de Smalen c84e1305c4 [SveEmitter] Add builtins for scatter stores
D77735 only added scatters for the non-temporal variants.

Reviewers: SjoerdMeijer, efriedma, andwar

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78751
2020-04-24 10:57:43 +01:00
Sander de Smalen 7003a1da37 [SveEmitter] Use llvm.aarch64.sve.ld1/st1 for contiguous load/store builtins
This patch changes the codegen of the builtins for contiguous loads
to map onto the SVE specific IR intrinsics llvm.aarch64.sve.ld1/st1.

Reviewers: SjoerdMeijer, efriedma, kmclaughlin, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78673
2020-04-23 15:15:41 +01:00
Sander de Smalen a5e0389b2a [AArch64] Define ACLE FP conversion intrinsics with more specific predicate.
This patch changes the FP conversion intrinsics to take a predicate
that matches the number of lanes for the vector with the widest element
type as opposed to using <vscale x 16 x i1>.

For example:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)```
now uses <vscale x 4 x i1> instead of <vscale x 16 x i1>

And similar for:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)```
where the predicate now matches the wider type, so <vscale x 2 x i1>.

Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78402
2020-04-23 10:53:23 +01:00
Sander de Smalen 002164461b [SveEmitter] Add builtins for FP conversions
This adds the flag IsOverloadCvt which tells CGBulitin to use
the result type and the type of the last operand as the
overloaded types for the LLVM IR intrinsic.

This also adds the flag IsFPConvert, which is needed to avoid
converting the predicate of the operation from svbool_t to
a predicate with fewer lanes, as the LLVM IR intrinsics use
the <vscale x 16 x i1> as the predicate.

Reviewers: SjoerdMeijer, efriedma

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78239
2020-04-23 10:49:06 +01:00
Sander de Smalen 2d1baf606a [SveEmitter] Add builtins for svwhilerw/svwhilewr
This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use
the result predicate type and the first pointer type as the
overloaded types for the LLVM IR intrinsic.

Reviewers: SjoerdMeijer, efriedma

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78238
2020-04-22 21:49:18 +01:00
Sander de Smalen 1559485e60 [SveEmitter] Add builtins for svwhile
This also adds the IsOverloadWhile flag which tells CGBuiltin to use
both the default type (predicate) and the type of the second operand
(scalar) as the overloaded types for the LLMV IR intrinsic.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77595
2020-04-22 21:47:47 +01:00
David Green eecba95067 [ARM] Replace arm vendor with none. NFC 2020-04-22 18:19:35 +01:00
Sander de Smalen 662cbaf647 [SveEmitter] Add IsOverloadNone flag and builtins for svpfalse and svcnt[bhwd]_pat
Add the IsOverloadNone flag to tell CGBuiltin that it does not have
an overloaded type. This is used for e.g. svpfalse which does
not take any arguments and always returns a svbool_t.

This patch also adds builtins for svcntb_pat, svcnth_pat, svcntw_pat
and svcntd_pat, as those don't require custom codegen.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77596
2020-04-22 16:42:08 +01:00
Sander de Smalen 41d52662d5 [SveEmitter] Add support for _n form builtins
The ACLE has builtins that take a scalar value that is to be expanded
into a vector by the operation. While the ISA may have an instruction
that takes an immediate or a scalar to represent this, the LLVM IR
intrinsic may not, so Clang will have to splat the scalar value.

This patch also adds the _n forms for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77594
2020-04-22 14:23:54 +01:00
Pavel Iliin 4eca1c06a4 [AArch64][FIX] f16 indexed patterns encoding restrictions. 2020-04-22 14:11:28 +01:00
Andrzej Warzynski 72f565899d [SveEmitter] Implement builtins for gathers/scatters
This patch adds builtins for:
  * regular, first-faulting and non-temporal gather loads
  * regular and non-temporal scatter stores

Differential Revision: https://reviews.llvm.org/D77735
2020-04-22 13:21:39 +01:00
Justin Hibbits 4ca2cad947 [PowerPC] Add clang -msvr4-struct-return for 32-bit ELF
Summary:

Change the default ABI to be compatible with GCC.  For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4.  This affects FreeBSD, NetBSD, OpenBSD.  There is no change for
32-bit Linux, where Clang continues to return all structs in memory.

Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc.  These options are only for PPC32; reject them on PPC64 and
other targets.  The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.

To actually return a struct in registers, coerce it to an integer of the
same size.  LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.

Fixes PR#40736

Patch by George Koehler!

Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
2020-04-21 20:17:25 -05:00
Pavel Iliin be881e2831 [AArch64] FMLA/FMLS patterns improvement.
FMLA/FMLS f16 indexed patterns added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45467
Removed redundant v2f32 vector_extract indexed pattern since
Instruction Selection is able to match v4f32 instead.
2020-04-21 18:23:21 +01:00
Sander de Smalen 06c980df46 [SveEmitter] Implement zeroing of false lanes
This implements zeroing of false lanes for binary operations,
where instead of merging into the first operand vector (_m)
a `select` is placed on the first input vector. This approach
easily translates to the use of the `zeroing movprfx` instruction.

This patch also adds builtins for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77593
2020-04-20 17:02:48 +01:00
Sander de Smalen 9986b3de26 [SveEmitter] Explicitly merge with zero/undef
Builtins that have the merge type MergeAnyExp or MergeZeroExp,
merge into a 'undef' or 'zero' vector respectively, which enables the
_x and _z behaviour for unary operations.

This patch also adds builtins for svabs and svneg.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77591
2020-04-20 16:26:20 +01:00
Sander de Smalen fc64539749 [SveEmitter] Add immediate checks for lanes and complex imms
Adds another bunch of of intrinsics that take immediates with
varying ranges based, some being a complex rotation immediate
which are a set of allowed immediates rather than a range.

    svmla_lane:   lane immediate ranging 0..(128/(1*sizeinbits(elt)) - 1)
    svcmla_lane:  lane immediate ranging 0..(128/(2*sizeinbits(elt)) - 1)
    svdot_lane:   lane immediate ranging 0..(128/(4*sizeinbits(elt)) - 1)
    svcadd:       complex rotate immediate [90, 270]
    svcmla:
    svcmla_lane:  complex rotate immediate [0, 90, 180, 270]

Reviewers: efriedma, SjoerdMeijer, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76680
2020-04-20 15:10:54 +01:00
Sander de Smalen 515020c091 [SveEmitter] Add more immediate operand checks.
This patch adds a number of intrinsics that take immediates with
varying ranges based on the element size one of the operands.

    svext:   immediate ranging 0 to (2048/sizeinbits(elt) - 1)
    svasrd:  immediate ranging 1..sizeinbits(elt)
    svqshlu: immediate ranging 1..sizeinbits(elt)/2
    ftmad:   immediate ranging 0..(sizeinbits(elt) - 1)

Reviewers: efriedma, SjoerdMeijer, rovka, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76679
2020-04-20 14:41:58 +01:00
Erich Keane 5f0903e9be Reland Implement _ExtInt as an extended int type specifier.
I fixed the LLDB issue, so re-applying the patch.

This reverts commit a4b88c0449.
2020-04-17 10:45:48 -07:00
Sterling Augustine a4b88c0449 Revert "Implement _ExtInt as an extended int type specifier."
This reverts commit 61ba1481e2.

I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
2020-04-17 10:29:40 -07:00
Erich Keane 61ba1481e2 Implement _ExtInt as an extended int type specifier.
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.

However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns.  Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.

We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch.  We are
proposing the syntax _ExtInt(N).  We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance.  An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14.  We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).

[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf

Differential Revision: https://reviews.llvm.org/D73967
2020-04-17 07:10:57 -07:00
bd1976llvm 86478d3de9 [MC][ELF] Put explicit section name symbols into entry size compatible sections
Ensure that symbols explicitly* assigned a section name are placed into
a section with a compatible entry size.

This is done by creating multiple sections with the same name** if
incompatible symbols are explicitly given the name of an incompatible
section, whilst:

  - Avoiding using uniqued sections where possible (for readability and
    to maximize compatibly with assemblers).

  - Creating as few SHF_MERGE sections as possible (for efficiency).

Given that each symbol is assigned to a section in a single pass, we
must decide which section each symbol is assigned to without seeing the
properties of all symbols. A stable and easy to understand assignment is
desirable. The following rules facilitate this: The "generic" section
for a given section name will be mergeable if the name is a mergeable
"default" section name (such as .debug_str), a mergeable "implicit"
section name (such as .rodata.str2.2), or MC has already created a
mergeable "generic" section for the given section name (e.g. in response
to a section directive in inline assembly). Otherwise, the "generic"
section for a given name is non-mergeable; and, non-mergeable symbols
are assigned to the "generic" section, while mergeable symbols are
assigned to uniqued sections.

Terminology:
"default" sections are those always created by MC initially, e.g. .text
or .debug_str.

"implicit" sections are those created normally by MC in response to the
symbols that it encounters, i.e. in the absence of an explicit section
name assignment on the symbol, e.g. a function foo might be placed into
a .text.foo section.

"generic" sections are those that are referred to when a unique section
ID is not supplied, e.g. if there are multiple unique .bob sections then
".quad .bob" will reference the generic .bob section. Typically, the
generic section is just the first section of a given name to be created.
Default sections are always generic.

* Typically, section names might be explicitly assigned in source code
using a language extension e.g. a section attribute: _attribute_
((section ("section-name"))) -
https://clang.llvm.org/docs/AttributeReference.html

** I refer to such sections as unique/uniqued sections. In assembly the
", unique," assembly syntax is used to express such sections.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43457.

See https://reviews.llvm.org/D68101 for previous discussions leading to
this patch.

Some minor fixes were required to LLVM's tests, for tests had been using
the old behavior - which allowed for explicitly assigning globals with
incompatible entry sizes to a section.

This fix relies on the ",unique ," assembly feature. This feature is not
available until bintuils version 2.35
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the
integrated assembler is not being used then we avoid using this feature
for compatibility and instead try to place mergeable symbols into
non-mergeable sections or issue an error otherwise.

Differential Revision: https://reviews.llvm.org/D72194
2020-04-16 19:12:49 +00:00
George Burgess IV 94908088a8 [CodeGen] fix inline builtin-related breakage from D78162
In cases where we have multiple decls of an inline builtin, we may need
to go hunting for the one with a definition when setting function
attributes.

An additional test-case was provided on
https://github.com/ClangBuiltLinux/linux/issues/979
2020-04-16 11:54:10 -07:00
Georgii Rymar 65a2de7e6c [FileCheck] - Fix the false positive when -implicit-check-not is used with an unknown -check-prefix.
Imagine we have the following invocation:

`FileCheck -check-prefix=UNKNOWN-PREFIX -implicit-check-not=something`

When the check prefix does not exist it does not fail.
This patch fixes the issue.

Differential revision: https://reviews.llvm.org/D78024
2020-04-16 15:00:50 +03:00
Ehud Katz 03a9526fe5 [CGExprAgg] Fix infinite loop in `findPeephole`
Simplify the function using IgnoreParenNoopCasts.

Fix PR45476

Differential Revision: https://reviews.llvm.org/D78098
2020-04-16 13:26:23 +03:00
Ayke van Laethem 215dc2e203
[AVR] Use the correct address space for non-prototyped function calls
Some function declarations like this:

    void foo();

do not have a type declaration, for that you'd use:

    void foo(void);

Clang internally bitcasts the variadic function declaration to a
function pointer, but doesn't use the correct address space on AVR. This
commit fixes that.

This fix is necessary to let Clang compile compiler-rt for AVR.

Differential Revision: https://reviews.llvm.org/D78125
2020-04-15 23:44:51 +02:00
George Burgess IV 2dd17ff081 [CodeGen] only add nobuiltin to inline builtins if we'll emit them
There are some inline builtin definitions that we can't emit
(isTriviallyRecursive & callers go into why). Marking these
nobuiltin is only useful if we actually emit the body, so don't mark
these as such unless we _do_ plan on emitting that.

This suboptimality was encountered in Linux (see some discussion on
D71082, and https://github.com/ClangBuiltLinux/linux/issues/979).

Differential Revision: https://reviews.llvm.org/D78162
2020-04-15 11:05:22 -07:00
Teresa Johnson 33ffb62e23 Allow disabling of vectorization using internal options
Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and
-interleave-loops do not have much practical effect. This is because
they are used to initialize the corresponding flags in the pass
managers, and those flags are then unconditionally overwritten when
compiling via clang or via LTO from the linkers. The only exception was
-vectorize-loops via opt because of some special hackery there.

While vectorization could still be disabled when compiling via clang,
using -fno-[slp-]vectorize, this meant that there was no way to disable
it when compiling in LTO mode via the linkers. This only affected
ThinLTO, since for regular LTO vectorization is done during the compile
step for scalability reasons. For ThinLTO it is invoked in the LTO
backends. See also the discussion on PR45434.

This patch makes it so the internal options can actually be used to
disable these optimizations. Ultimately, the best long term solution is
to mark the loops with metadata (similar to the approach used to fix
-fno-unroll-loops in D77058), but this enables a shorter term
workaround, and actually makes these internal options useful.

I constant propagated the initial values of these internal flags into
the pass manager flags (for some reasons vectorize-loops and
interleave-loops were initialized to true, while vectorize-slp was
initialized to false). As mentioned above, they are overwritten
unconditionally so this doesn't have any real impact, and these initial
values aren't particularly meaningful.

I then changed the passes to check the internl values and return without
performing the associated optimization when false (I changed the default
of -vectorize-slp to true so the options behave similarly). I was able
to remove the hackery in opt used to get -vectorize-loops=false to work,
as well as a special option there used to disable SLP vectorization.

Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is slp vectorized when it is enabled, and check that
instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.

The loop vectorization (and associated interleaving) testing I moved to
a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will
actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.

Reviewers: fhahn, wmi

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77989
2020-04-14 18:09:10 -07:00
Ayke van Laethem fe06e231ff
[AVR] Define __ELF__
This symbol is defined in avr-gcc. Because AVR normally uses the ELF
format, define the symbol unconditionally.

This patch is needed to get Clang to compile compiler-rt.

Differential Revision: https://reviews.llvm.org/D78117
2020-04-15 00:22:53 +02:00
Jon Roelofs 38b39c34ab [clang] Add missing FileCheck colons 2020-04-14 12:32:48 -06:00
Sander de Smalen c8a5b30bac [SveEmitter] Add range checks for immediates and predicate patterns.
Summary:
This patch adds a mechanism to easily add range checks for a builtin's
immediate operands. This patch is tested with the qdech intrinsic, which takes
both an enum for the predicate pattern, as well as an immediate for the
multiplier.

Reviewers: efriedma, SjoerdMeijer, rovka

Reviewed By: efriedma, SjoerdMeijer

Subscribers: mgorny, tschuett, mgrang, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76678
2020-04-14 16:49:32 +01:00
Sander de Smalen 17a68c61a9 [SveEmitter] Implement builtins for contiguous loads/stores
This adds builtins for all contiguous loads/stores, including
non-temporal, first-faulting and non-faulting.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76238
2020-04-14 15:24:57 +01:00
Ayke van Laethem cfc002714a
[AVR] Support aliases in non-zero address space
This fixes code like the following on AVR:

void foo(void) {
}
void bar(void) __attribute__((alias("foo")));

Code like this is present in compiler-rt, which I'm trying to build.

Differential Revision: https://reviews.llvm.org/D76182
2020-04-14 00:42:19 +02:00
Eli Friedman 89e0662dee Make IRBuilder automatically set alignment on load/store/alloca.
This is equivalent in terms of LLVM IR semantics, but we want to
transition away from using MaybeAlign to represent the alignment of
these instructions.

Differential Revision: https://reviews.llvm.org/D77984
2020-04-13 13:43:14 -07:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Kevin P. Neal 7f38812d5b [FPEnv][AArch64] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that.

Neon is part of this patch, so ARM is affected as well.

Differential Revision: https://reviews.llvm.org/D77074
2020-04-10 13:02:00 -04:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
WangTianQing a3dc949000 [X86] Add TSXLDTRK instructions.
Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77205
2020-04-09 13:17:29 +08:00
Kevin P. Neal 9f1c35d8b1 Revert "[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins"
The new test case causes bot failures.

This reverts commit ba87430cad.
2020-04-03 15:47:19 -04:00
Kevin P. Neal d7a0516ddc Fix typo in test.
Differential Revision: https://reviews.llvm.org/D76949
2020-04-03 15:23:49 -04:00
Andrew Wock ba87430cad [PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins
This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.

Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.

Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
2020-04-03 14:59:33 -04:00
Craig Topper be0a4fef6e [X86] Add -flax-vector-conversions=none to more of the clang CodeGen tests
Thankfully no issues found.
2020-04-02 20:39:18 -07:00
WangTianQing d08fadd662 [X86] Add SERIALIZE instruction.
Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77193
2020-04-02 16:19:23 +08:00
Ian Levesque bb3111cbaf [clang][xray] Add xray attributes to functions without decls too
Summary: This allows instrumenting things like global initializers

Reviewers: dberris, MaskRay, smeenai

Subscribers: cfe-commits, johnislarry

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77191
2020-04-01 00:02:39 -04:00
Anna Thomas 58a05675da Revert "[InlineFunction] Handle return attributes on call within inlined body"
This reverts commit 28518d9ae3.
There is a failure in MsgPackReader.cpp when built with clang. It
complains about "signext and zeroext" are incompatible. Investigating
offline if it is infact a UB in the MsgPackReader code.
2020-03-31 16:16:34 -04:00
Anna Thomas 28518d9ae3 [InlineFunction] Handle return attributes on call within inlined body
Consider a callee function that has a call (C) within it which feeds
into the return.  When we inline that callee into a callsite that has
return attributes, we can backward propagate those attributes to the
call (C) within that inlined callee body.

This is safe to do so only if we can guarantee transfer of execution to
successor in the window of instructions between return value (i.e. the
call C) and the return instruction.

See added test cases.

Reviewed-By: reames, jdoerfert

Differential Revision: https://reviews.llvm.org/D76140
2020-03-31 14:35:40 -04:00
Yonghong Song ced0d1f42b [BPF] support 128bit int explicitly in layout spec
Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
  struct ipv6_key_t {
    unsigned pid;
    unsigned __int128 saddr;
    unsigned short lport;
  };
clang will generate IR type
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.

But it is common practice for a tracing program to be
first compiled with target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing program often refers to kernel internal
data structures which needs to be compiled with non-bpf target.

But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.

For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
  %struct.ipv6_key_t = type { i32, i128, i16 }

Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above is 32 bytes,
which leads incorrect code later.

The x86_64 does not have this issue as it does not have
"i128:128" in its layout spec as it does permits i128 to
be alignmented at 8 bytes at stack. Its IR type looks like
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }

The fix here is add i128 support in layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.

Differential Revision: https://reviews.llvm.org/D76587
2020-03-28 11:46:29 -07:00
Mikhail Maltsev bd722ef63f [ARM,CDE] Improve CDE intrinsics testing
Summary:
This patch:
* adds tests for vreinterpret intinsics in big-endian mode
* adds C++ runs to the CDE+MVE header compatibility test

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76927
2020-03-27 16:05:18 +00:00
Mikael Holmen 7d482e9213 Fix TBAA for unsigned fixed-point types
Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.

Patch by: materi

Reviewers: ebevhan, leonardchan

Reviewed By: leonardchan

Subscribers: kosarev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76856
2020-03-27 10:35:24 +01:00
Sid Manning b0da094983 [Hexagon] Add support for Linux/Musl ABI (part 2)
A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638
2020-03-26 17:19:46 -05:00
Sam McCall 13dc21e841 [AST] Make thinlto testcase robust to 159a9f7e76
Ultimately it relies on the output of __PRETTY_FUNCTION__ which isn't reliable.
2020-03-26 12:47:39 +01:00
Sam McCall 38798d0306 Revert "[AST] Fix thinlto testcase missed in 159a9f7e76307734bcdcae3357640e42e0733194"
This reverts commit 4bd1d55884.
Cure is worse than the disease: "> >" is still expected in most configs.
Working on fixing the fuchsia builder.
2020-03-26 12:38:33 +01:00
Sam McCall 4bd1d55884 [AST] Fix thinlto testcase missed in 159a9f7e76 2020-03-26 10:28:54 +01:00
Mikhail Maltsev bb4da94e5b [ARM,CDE] Implement predicated Q-register CDE intrinsics
Summary:
This patch implements the following CDE intrinsics:

  T __arm_vcx1q_m(int coproc, T inactive, uint32_t imm, mve_pred_t p);
  T __arm_vcx2q_m(int coproc, T inactive, U n, uint32_t imm, mve_pred_t p);
  T __arm_vcx3q_m(int coproc, T inactive, U n, V m, uint32_t imm, mve_pred_t p);

  T __arm_vcx1qa_m(int coproc, T acc, uint32_t imm, mve_pred_t p);
  T __arm_vcx2qa_m(int coproc, T acc, U n, uint32_t imm, mve_pred_t p);
  T __arm_vcx3qa_m(int coproc, T acc, U n, V m, uint32_t imm, mve_pred_t p);

The intrinsics are not part of the released ACLE spec, but internally at
Arm we have reached consensus to add them to the next ACLE release.

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76610
2020-03-25 17:08:19 +00:00
Simon Tatham 8f1651ccea [ARM,MVE] Add missing tests for vqdmlash intrinsics.
Summary:
These were accidentally left out of D76123. I added tests for the
other three instructions in this small cross-product family (vqdmlah,
vqrdmlah, vqrdmlash) but missed this one.

Reviewers: miyuki

Reviewed By: miyuki

Subscribers: kristof.beyls, dmgreen, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76714
2020-03-25 09:46:16 +00:00
Erik Pilkington de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Sam McCall 0b59982134
[CodeGen] Fix test attr-noreturn.c when run from my home directory 2020-03-24 13:59:16 +01:00
Momchil Velikov 080d046c91 [ARM][CMSE] Implement CMSE attributes
This patch adds CMSE attributes `cmse_nonsecure_call` and
`cmse_nonsecure_entry`.  As usual, specification is available here:
https://developer.arm.com/docs/ecm0359818/latest

Patch by Javed Absar, Bradley Smith, David Green, Momchil Velikov,
possibly others.

Differential Revision: https://reviews.llvm.org/D71129
2020-03-24 10:21:26 +00:00
Momchil Velikov 6081ccf4a3 Apply function attributes through array declarators
There's inconsistency in handling array types between the
`distributeFunctionTypeAttrXXX` functions and the
`FunctionTypeUnwrapper` in `SemaType.cpp`.

This patch lets `FunctionTypeUnwrapper` apply function type attributes
through array types.

Differential Revision: https://reviews.llvm.org/D75109
2020-03-23 11:03:13 +00:00
Thomas Lively de6cd3e836 [WebAssembly] Add SIMD integer abs builtins
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76538
2020-03-21 00:21:24 -07:00
Adrian Prantl ceae47143b Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 16:27:50 -07:00
Adrian Prantl bde15de3ca Revert "Allow remapping the sysroot with -fdebug-prefix-map."
This reverts commit 6725c4836a.
2020-03-20 16:27:23 -07:00
Adrian Prantl 6725c4836a Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 15:52:39 -07:00
Simon Tatham 1adfa4c991 [ARM,MVE] Add ACLE intrinsics for the vaddv/vaddlv family.
Summary:
I've implemented them as target-specific IR intrinsics rather than
using `@llvm.experimental.vector.reduce.add`, on the grounds that the
'experimental' intrinsic doesn't currently have much code generation
benefit, and my replacements encapsulate the sign- or zero-extension
so that you don't expose the illegal MVE vector type (`<4 x i64>`) in
IR.

The machine instructions come in two versions: with and without an
input accumulator. My new IR intrinsics, like the 'experimental' one,
don't take an accumulator parameter: we represent that by just adding
on the input value using an ordinary i32 or i64 add. So if you write
the `vaddvaq` C-language intrinsic with an input accumulator of zero,
it can be optimised to VADDV, and conversely, if you write something
like `x += vaddvq(y)` then that can be combined into VADDVA.

Most of this is achieved in isel lowering, by converting these IR
intrinsics into the existing `ARMISD::VADDV` family of custom SDNode
types. For the difficult case (64-bit accumulators), isel lowering
already implements the optimization of folding an addition into a
VADDLV to make a VADDLVA; so once we've made a VADDLV, our job is
already done, except that I had to introduce a parallel set of ARMISD
nodes for the //predicated// forms of VADDLV.

For the simpler VADDV, we handle the predicated form by just leaving
the IR intrinsic alone and matching it in an ordinary dag pattern.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76491
2020-03-20 15:42:33 +00:00
Simon Tatham 45a9945b9e [ARM,MVE] Add ACLE intrinsics for the vminv/vmaxv family.
Summary:
I've implemented these as target-specific IR intrinsics, because
they're not //quite// enough like @llvm.experimental.vector.reduce.min
(which doesn't take the extra scalar parameter). Also this keeps the
predicated and unpredicated versions looking similar, and the
floating-point minnm/maxnm versions fold into the same schema.

We had a couple of min/max reductions already implemented, from the
initial pathfinding exercise in D67158. Those were done by having
separate IR intrinsic names for the signed and unsigned integer
versions; as part of this commit, I've changed them to use a flag
parameter indicating signedness, which is how we ended up deciding
that the rest of the MVE intrinsics family ought to work. So now
hopefully the ewhole lot is consistent.

In the new llc test, the output code from the `v8f16` test functions
looks quite unpleasant, but most of it is PCS lowering (you can't pass
a `half` directly in or out of a function). In other circumstances,
where you do something else with your `half` in the same function, it
doesn't look nearly as nasty.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76490
2020-03-20 15:42:33 +00:00
Mikhail Maltsev 6ae3eff8ba [ARM,CDE] Implement CDE vreinterpret intrinsics
Summary:
This patch implements the following CDE intrinsics:

  int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
  uint16x8_t __arm_vreinterpretq_u16_u8 (uint8x16_t in);
  int16x8_t __arm_vreinterpretq_s16_u8 (uint8x16_t in);
  uint32x4_t __arm_vreinterpretq_u32_u8 (uint8x16_t in);
  int32x4_t __arm_vreinterpretq_s32_u8 (uint8x16_t in);
  uint64x2_t __arm_vreinterpretq_u64_u8 (uint8x16_t in);
  int64x2_t __arm_vreinterpretq_s64_u8 (uint8x16_t in);
  float16x8_t __arm_vreinterpretq_f16_u8 (uint8x16_t in);
  float32x4_t __arm_vreinterpretq_f32_u8 (uint8x16_t in);

These intrinsics are header-only because they reuse the existing
MVE vreinterpret clang built-ins.

This set is slightly different from the published specification
(see https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf):
it includes

  int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);

which was unintentionally ommitted from the spec, and
does not include

  float64x2_t __arm_vreinterpretq_f64_u8 (uint8x16_t in);

The float64x2_t type requires additional implementation
effort, and we are not including it yet.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76300
2020-03-20 14:01:57 +00:00
Mikhail Maltsev 969034b860 [ARM,CDE] Implement CDE unpredicated Q-register intrinsics
Summary:
This patch implements the following intrinsics:

  uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
  T __arm_vcx1qa(int coproc, T acc, uint32_t imm);
  T __arm_vcx2q(int coproc, T n, uint32_t imm);
  uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm);
  T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm);
  T __arm_vcx3q(int coproc, T n, U m, uint32_t imm);
  uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm);
  T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm);

Most of them are polymorphic. Furthermore, some intrinsics are
polymorphic by 2 or 3 parameter types, such polymorphism is not
supported by the existing MVE/CDE tablegen backends, also we don't
really want to have a combinatorial explosion caused by 1000 different
combinations of 3 vector types. Because of this some intrinsics are
implemented as macros involving a cast of the polymorphic arguments to
uint8x16_t.

The IR intrinsics are even more restricted in terms of types: all MVE
vectors are cast to v16i8.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76299
2020-03-20 14:01:56 +00:00
Mikhail Maltsev d22e661712 [ARM,CDE] Implement CDE S and D-register intrinsics
Summary:
This patch implements the following ACLE intrinsics:

  uint32_t __arm_vcx1_u32(int coproc, uint32_t imm);
  uint32_t __arm_vcx1a_u32(int coproc, uint32_t acc, uint32_t imm);
  uint32_t __arm_vcx2_u32(int coproc, uint32_t n, uint32_t imm);
  uint32_t __arm_vcx2a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t imm);
  uint32_t __arm_vcx3_u32(int coproc, uint32_t n, uint32_t m, uint32_t imm);
  uint32_t __arm_vcx3a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t m, uint32_t imm);

  uint64_t __arm_vcx1d_u64(int coproc, uint32_t imm);
  uint64_t __arm_vcx1da_u64(int coproc, uint64_t acc, uint32_t imm);
  uint64_t __arm_vcx2d_u64(int coproc, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx2da_u64(int coproc, uint64_t acc, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx3d_u64(int coproc, uint64_t n, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx3da_u64(int coproc, uint64_t acc, uint64_t n, uint64_t m, uint32_t imm);

Since the semantics of CDE instructions is opaque to the compiler, the
ACLE intrinsics require dedicated LLVM IR intrinsics. The 64-bit and
32-bit variants share the same IR intrinsic.

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76298
2020-03-20 14:01:53 +00:00
Mikhail Maltsev 7a85e3585e [ARM,CDE] Implement GPR CDE intrinsics
Summary:
This change implements ACLE CDE intrinsics that translate to
instructions working with general-purpose registers.

The specification is available at
https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf

Each ACLE intrinsic gets a corresponding LLVM IR intrinsic (because
they have distinct function prototypes). Dual-register operands are
represented as pairs of i32 values. Because of this the instruction
selection for these intrinsics cannot be represented as TableGen
patterns and requires custom C++ code.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76296
2020-03-20 14:01:51 +00:00
Shiva Chen fc3752665f [RISCV] Passing small data limitation value to RISCV backend
Passing small data limit to RISCVELFTargetObjectFile by module flag,
So the backend can set small data section threshold by the value.
The data will be put into the small data section if the data smaller than
the threshold.

Differential Revision: https://reviews.llvm.org/D57497
2020-03-20 11:03:51 +08:00
Thomas Lively a3f974f3c3 [WebAssembly] SIMD bitmask intrinsics and builtin functions
Summary:
These experimental new instructions are proposed in
https://github.com/WebAssembly/simd/pull/201.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76397
2020-03-19 17:15:37 -07:00
Djordje Todorovic d9b9621009 Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with the D76164.
2020-03-19 13:57:30 +01:00
Lucas Prates d4ad386ee1 [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics
Summary:
The range checks performed for the vqrdmulh_lane and vqrdmulh_lane Neon
intrinsics were incorrectly using their return type as the base type for
the range check performed on their 'lane' argument.

This patch updates those intrisics to use the type of the proper reference
argument to perform the range checks.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74766
2020-03-19 12:08:12 +00:00
Lucas Prates f56550cf7f [ARM] Enabling range checks on Neon intrinsics' lane arguments
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics where translated to `__builtin__shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.

This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard

Reviewed By: ostannard

Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74619
2020-03-19 12:07:23 +00:00
Lucas Prates 7bf23563f4 Revert "[ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions"
This reverts commit 62ab15ffa3.

Multiple commits were unintentionally squashed into this one. Reverting
so each of them can be pushed properly.
2020-03-19 12:01:13 +00:00
Lucas Prates 62ab15ffa3 [ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74616
2020-03-19 11:52:41 +00:00
Sander de Smalen 981f0802b3 [SVE] Generate overloaded functions for ACLE intrinsics.
The SVE ACLE allows using a short-form for the intrinsics, e.g.
the following two declarations generate the same code:

  svuint32_t svld1(svbool_t, uint32_t const *);
  svuint32_t svld1_u32(svbool_t, uint32_t const *);

using the attribute:
  __clang_arm_builtin_alias

so that any call to svld1(svbool_t, uint32_t const *) will
map to __builtin_sve_svld1_u32.

Reviewers: SjoerdMeijer, miyuki, efriedma, simon_tatham, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75861
2020-03-19 09:36:23 +00:00
Richard Smith f18233dad4 Fix -fsanitize=array-bound to treat T[0] union members as flexible array
members regardless of whether they're the last member of the union.
2020-03-18 15:47:24 -07:00
Simon Tatham e13d153c1b [ARM,MVE] Add intrinsics for the VQDMLAD family.
Summary:
This is another set of instructions too complicated to be sensibly
expressed in IR by anything short of a target-specific intrinsic.
Given input vectors a,b, the instruction generates intermediate values
2*(a[0]*b[0]+a[1]+b[1]), 2*(a[2]*b[2]+a[3]+b[3]), etc; takes the high
half of each double-width values, and overwrites half the lanes in the
output vector c, which you therefore have to provide the input value
of. Optionally you can swap the elements of b so that the are things
like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when
taking the high half; and optionally you can take the difference
rather than sum of the two products. Finally, saturation is applied
when converting back to a single-width vector lane.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76359
2020-03-18 17:11:22 +00:00
Simon Tatham 928776de92 [ARM,MVE] Add intrinsics for the VQDMLAH family.
Summary:
These are complicated integer multiply+add instructions with extra
saturation, taking the high half of a double-width product, and
optional rounding. There's no sensible way to represent that in
standard IR, so I've converted the clang builtins directly to
target-specific intrinsics.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76123
2020-03-18 10:55:04 +00:00
Simon Tatham 28c5d97bee [ARM,MVE] Add intrinsics and isel for MVE integer VMLA.
Summary:
These instructions compute multiply+add in integers, with one of the
operands being a splat of a scalar. (VMLA and VMLAS differ in whether
the splat operand is a multiplier or the addend.)

I've represented these in IR using existing standard IR operations for
the unpredicated forms. The predicated forms are done with target-
specific intrinsics, as usual.

When operating on n-bit vector lanes, only the bottom n bits of the
i32 scalar operand are used. So we have to tell that to isel lowering,
to allow it to remove a pointless sign- or zero-extension instruction
on that input register. That's done in `PerformIntrinsicCombine`, but
first I had to enable `PerformIntrinsicCombine` for MVE targets
(previously all the intrinsics it handled were for NEON), and make it
a method of `ARMTargetLowering` so that it can get at
`SimplifyDemandedBits`.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76122
2020-03-18 10:55:04 +00:00
Jon Chesterfield cc691f3384 Disable loader-uninitialized tests on Windows 2020-03-17 23:33:12 +00:00
Jon Chesterfield 1d19b15395 Fix arm build broken by D74361 by dropping align from filecheck pattern 2020-03-17 22:15:19 +00:00