Commit Graph

7691 Commits

Author SHA1 Message Date
Yuanfang Chen da6187f566 [Clang] followup D128745, add a missing ClangABICompat check 2022-08-16 18:40:00 -07:00
Yonghong Song d9198f64d9 [Clang][BPF]: Force sign/zero extension for return values in caller
Currently bpf supports calling kernel functions (x86_64, arm64, etc.)
in bpf programs. Tejun discovered a problem where the x86_64 func
return value (a unsigned char type) is stored in 8-bit subregister %al
and the other 56-bits in %rax might be garbage. But based on current
bpf ABI, the bpf program assumes the whole %rax holds the correct value
as the callee is supposed to do necessary sign/zero extension.
This mismatch between bpf and x86_64 caused the incorrect results.

To resolve this problem, this patch forced caller to do needed
sign/zero extension for 8/16-bit return values as well. Note that
32-bit return values already had sign/zero extension even without
this patch.

For example, for the test case attached to this patch:

  $  cat t.c
  _Bool bar_bool(void);
  unsigned char bar_char(void);
  short bar_short(void);
  int bar_int(void);
  int foo_bool(void) {
        if (bar_bool() != 1) return 0; else return 1;
  }
  int foo_char(void) {
        if (bar_char() != 10) return 0; else return 1;
  }
  int foo_short(void) {
        if (bar_short() != 10) return 0; else return 1;
  }
  int foo_int(void) {
        if (bar_int() != 10) return 0; else return 1;
  }

Without this patch, generated call insns in IR looks like:
    %call = call zeroext i1 @bar_bool()
    %call = call zeroext i8 @bar_char()
    %call = call signext i16 @bar_short()
    %call = call i32 @bar_int()
So it is assumed that zero extension has been done for return values of
bar_bool()and bar_char(). Sign extension has been done for the return
value of bar_short(). The return value of bar_int() does not have any
assumption so caller needs to do necessary shifting to get correct
32bit values.

With this patch, generated call insns in IR looks like:
    %call = call i1 @bar_bool()
    %call = call i8 @bar_char()
    %call = call i16 @bar_short()
    %call = call i32 @bar_int()
There are no assumptions for return values of the above four function calls,
so necessary shifting is necessary for all of them.

The following is the objdump file difference for function foo_char().
Without this patch:
  0000000000000010 <foo_char>:
       2:       85 10 00 00 ff ff ff ff call -1
       3:       bf 01 00 00 00 00 00 00 r1 = r0
       4:       b7 00 00 00 01 00 00 00 r0 = 1
       5:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       6:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000038 <LBB1_2>:
       7:       95 00 00 00 00 00 00 00 exit

With this patch:
  0000000000000018 <foo_char>:
       3:       85 10 00 00 ff ff ff ff call -1
       4:       bf 01 00 00 00 00 00 00 r1 = r0
       5:       57 01 00 00 ff 00 00 00 r1 &= 255
       6:       b7 00 00 00 01 00 00 00 r0 = 1
       7:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       8:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000048 <LBB1_2>:
       9:       95 00 00 00 00 00 00 00 exit
The zero extension of the return 'char' value is done here.

Differential Revision: https://reviews.llvm.org/D131598
2022-08-16 16:08:01 -07:00
Saleem Abdulrasool 585f62be1a CodeGen: correct handling of debug info generation for aliases
When aliasing a static array, the aliasee is going to be a GEP which
points to the value.  We should strip pointer casts before forming the
reference.  This was occluded by the use of opaque pointers.

This problem has existed since the introduction of the debug info
generation for aliases in b1ea0191a4.  The
test case would assert due to the invalid cast with or without
`-no-opaque-pointers` at that revision.

Fixes: #57179
2022-08-16 21:27:05 +00:00
Lei Huang 7d8ae9f755 [NFC][PowerPC] Add missing NOCOMPAT checks for builtins-ppc-xlcompat.c
Followup patch to address request from https://reviews.llvm.org/D124093

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D131622
2022-08-16 13:56:33 -05:00
Yuanfang Chen 6afcc4a459 [c++] implements DR692, DR1395 and tentatively DR1432, about partial ordering of variadic template partial specialization or function template
DR692 handles two cases: pack expansion (for class/var template) and function parameter pack. The former needs DR1432 as a fix, and the latter needs DR1395 as a fix. However, DR1432 has not yet made a wording change. so I made a tentative fix for DR1432 with the same spirit as DR1395.

Reviewed By: aaron.ballman, erichkeane, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128745
2022-08-14 14:37:40 -07:00
Usman Nadeem 405ad84793 Update hwasan test to fix failure on older Android API versions.
In Android API < 30 there is no HWAsan instrumentation support for globals
so the test fails if API < 30 or if the target triple does not specify the API version.

Add -triple=aarch64-linux-android31 to enable global instrumentation. This is the
same triple as is used in the RUN line for -fsanitize=memtag-globals.

Differential Revision: https://reviews.llvm.org/D131806

Change-Id: I300703bd126b10e3c52505e23c78c5a48acb0309
2022-08-12 16:30:08 -07:00
Alex Bradbury d17de5479c [clang][RISCV][test] Add test that shows incorrect ABI lowering
As reported in <https://github.com/llvm/llvm-project/issues/57084>,
under hard float ABIs there are issues with lowering structs that
inherit from other structs.

See <https://reviews.llvm.org/D131677> for a fix.
2022-08-11 18:51:37 +01:00
Florian Hahn ef110a491f
[Builtins] Do not claim most libfuncs are readnone with trapping math.
At the moment, Clang only considers errno when deciding if a builtin
is const. This ignores the fact that some library functions may raise
floating point exceptions, which may modify global state, e.g. when
updating FP status registers.

To model the fact that some library functions/builtins may raise
floating point exceptions, this patch adds a new 'g' modifier for
builtins. If a builtin is marked with 'g', it cannot be considered
const, unless FP exceptions are ignored.

So far I've not added CHECK lines for all calls in math-libcalls.c. I'll
do that once we agree on the overall direction.

A consequence seems to be that we fail to select some of the constrained
math builtins now, but I am not entirely sure what's going on there.

Reviewed By: john.brawn

Differential Revision: https://reviews.llvm.org/D129231
2022-08-11 12:29:01 +01:00
Aaron Ballman af01f717c4 Default implicit function pointer conversions diagnostic to be an error
Implicitly converting between incompatible function pointers in C is
currently a default-on warning (it is an error in C++). However, this
is very poor security posture. A mismatch in parameters or return
types, or a mismatch in calling conventions, etc can lead to
exploitable security vulnerabilities. Rather than allow this unsafe
practice with a warning, this patch strengthens the warning to be an
error (while still allowing users the ability to disable the error or
the warning entirely to ease migration). Users should either ensure the
signatures are correctly compatible or they should use an explicit cast
if they believe that's more reasonable.

Differential Revision: https://reviews.llvm.org/D131351
2022-08-10 13:54:17 -04:00
David Truby 286d59ef6f [clang][AArch64][SVE] Add unary +/- operators for SVE types
This patch enables the unary promotion and negation operators on
SVE types.

Differential Revision: https://reviews.llvm.org/D130984
2022-08-10 10:32:43 +00:00
Freddy Ye e4888a37d3 [X86][BF16] Enable __bf16 for x86 targets.
X86 psABI has updated to support __bf16 type, the ABI of which is the
same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D130964
2022-08-10 09:00:47 +08:00
Shawn Zhong 82afc9b169 Fix -Wbitfield-constant-conversion on 1-bit signed bitfield
A one-bit signed bit-field can only hold the values 0 and -1; this
corrects the diagnostic behavior accordingly.

Fixes #53253
Differential Revision: https://reviews.llvm.org/D131255
2022-08-09 11:43:50 -04:00
Ariel Burton f53f2f232f Extend ptr32 support to be applied on typedef
Earlier, if the QualType was sugared, then we would error out
as it was not a pointer type, for example,

typedef int *int_star;

int_star __ptr32 p;

Now, if ptr32 is given we apply it if the raw Canonical Type
(i.e., the desugared type) is a PointerType, instead of only
checking whether the sugared type is a pointer type.

As before, we still disallow ptr32 usage if the pointer is used
as a pointer to a member.

Differential Revision: https://reviews.llvm.org/D130123
2022-08-09 11:08:52 -04:00
Jack Kirk 3e0e5568a6 [CUDA] Fixed sm version constrain for __bmma_m8n8k128_mma_and_popc_b1.
As stated in
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma:
".and operation in single-bit wmma requires sm_80 or higher."

tra@: Fixed a bug in builtins-nvptx-mma.py test generator and regenerated the tests.

Differential Revision: https://reviews.llvm.org/D131265
2022-08-05 12:14:06 -07:00
Ellis Hoag 6f4c3c0f64 [InstrProf][attempt 2] Add new format for -fprofile-list=
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.

Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.

This was originally landed as https://reviews.llvm.org/D130808 but was
reverted due to a Windows test failure.

Differential Revision: https://reviews.llvm.org/D131195
2022-08-04 17:12:56 -07:00
Zakk Chen 010f329803 [RISCV][Clang] Support policy function for all vector segment load.
We will switch all UndefValue to PoisonValue in follow up patches.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126750
2022-08-04 17:47:24 +00:00
David Green 8c30f4a5ab [AArch64] Always allow the __bf16 type
We would like to make the ACLE NEON and SVE intrinsics more useable by
gating them on the target, not by ifdef preprocessor macros. In order to
do this the types they use need to be available. This patches makes
__bf16 always available under AArch64 not just when the bf16
architecture feature is present. This bringing it in-line with GCC. In
subsequent patches the NEON bfloat16x8_t and SVE svbfloat16_t types
(along with bfloat16_t used in arm_sve.h) will be made unconditional
too.

The operations valid on the types are still very limited. They can be
used as a storage type, but the intrinsics used for convertions are
still behind an ifdef guard in arm_neon.h/arm_bf16.h.

Differential Revision: https://reviews.llvm.org/D130973
2022-08-04 18:35:27 +01:00
Nico Weber 0eb7d86f58 Revert "[InstrProf] Add new format for -fprofile-list="
This reverts commit b692312ca4.
Breaks tests on Windows, see https://reviews.llvm.org/D130808#3699952
2022-08-04 13:04:59 -04:00
Ellis Hoag b692312ca4 [InstrProf] Add new format for -fprofile-list=
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.

Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D130808
2022-08-04 08:49:43 -07:00
Ellis Hoag 12e78ff881 [InstrProf] Add the skipprofile attribute
As discussed in [0], this diff adds the `skipprofile` attribute to
prevent the function from being profiled while allowing profiled
functions to be inlined into it. The `noprofile` attribute remains
unchanged.

The `noprofile` attribute is used for functions where it is
dangerous to add instrumentation to while the `skipprofile` attribute is
used to reduce code size or performance overhead.

[0] https://discourse.llvm.org/t/why-does-the-noprofile-attribute-restrict-inlining/64108

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D130807
2022-08-04 08:45:27 -07:00
Matt Jacobson c8b2f3f51b [ObjC] type method metadata `_imp`, messenger routine at callsite with program address space
On targets with non-default program address space (e.g., Harvard
architectures), clang crashes when emitting Objective-C method metadata,
because the address of the method IMP cannot be bitcast to i8*. It similarly
crashes at messenger callsite with a failed bitcast.

Define the _imp field instead as i8 addrspace(1)* (or whatever the target's
program address space is). And in getMessageSendInfo(), create signatureType by
specifying the program address space.

Add a regression test using the AVR target. Test failed previously and passes
now. Checked codegen of the test for x86_64-apple-darwin19.6.0 and saw no
difference, as expected.

Reviewed By: rjmccall, dylanmckay

Differential Revision: https://reviews.llvm.org/D112113
2022-08-04 05:40:32 -04:00
Phoebe Wang 6f867f9102 [X86] Support ``-mindirect-branch-cs-prefix`` for call and jmp to indirect thunk
This is to address feature request from https://github.com/ClangBuiltLinux/linux/issues/1665

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D130754
2022-08-04 15:12:15 +08:00
Jonas Paulsson 84831bdfed [SystemZ] Make 128 bit integers be aligned to 8 bytes.
The SystemZ ABI says that 128 bit integers should be aligned to only 8 bytes.

Reviewed By: Ulrich Weigand, Nikita Popov

Differential Revision: https://reviews.llvm.org/D130900
2022-08-03 15:39:54 +02:00
Yuanfang Chen 92c1bc6158 [CodeGen][inlineasm] assume the flag output of inline asm is boolean value
GCC inline asm document says that
"... the general rule is that the output variable must be a scalar
integer, and the value is boolean."

Commit e5c37958f9 lowers flag output of
inline asm on X86 with setcc, hence it is guaranteed that the flag
is of boolean value. Clang does not support ARM inline asm flag output
yet so nothing need to be worried about ARM.

See "Flag Output" section at
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands

Fixes https://github.com/llvm/llvm-project/issues/56568

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129954
2022-08-02 11:49:01 -07:00
Zakk Chen bb99d4b11d [RISCV][Clang] Support policy functions for Vector Mask Instructions.
We will switch all UndefValue to PoisonValue in follow up patches.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126749
2022-08-02 17:27:57 +00:00
Zakk Chen dffdca85ec [RISCV][Clang] Support policy functions for Vector Reduction
Instructions.

We will switch all UndefValue to PoisonValue in follow up patches.

Thanks for Kito to help on verification with their interanl testsuite.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126748
2022-08-02 17:27:56 +00:00
Zakk Chen 9caf2cc05c [RISCV][Clang] Support policy functions for Vector Comparison
Instructions.

We will switch all UndefValue to PoisonValue in follow up patches.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126746
2022-08-02 17:27:56 +00:00
Zakk Chen 7eddeb9e99 [RISCV][Clang] Support policy functions for vmerge, vfmerge and
vcompress.

We will switch all UndefValue to PoisonValue in follow up patches.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126745
2022-08-02 17:27:55 +00:00
Zakk Chen b1b22b4a85 [RISCV][Clang] Support policy functions for vneg, vnot, vncvt, vwcvt,
vwcvtu, vfabs and vfneg.

We will switch all UndefValue to PoisonValue in follow up patches.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126744
2022-08-02 17:27:55 +00:00
Zakk Chen 8e51917b39 [RISCV][Clang] Add tests for all supported policy functions. (NFC)
In order to make the review easier, I split a lot of tests from
https://reviews.llvm.org/D126742

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D126743
2022-08-01 17:42:43 +00:00
Zakk Chen 71fd66161d [RISCV][Clang] Support RVV policy functions.
1. Add policy functions support and tests for vadd, vmv, vfmv and all load
   instructions except segment load. I didn't add all combination of policy
   functions in test because it seem not to make sense.
2. Rename HasUnMaskedOverloaded to SupportOverloading.
3. vmv.s.x for ta policy could not have overloaded API.
4. This patch does not support all operations, I will have other follow-up
   patches support all.

[RFC] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/137

Reviewed By: kito-cheng, fakepaper56, fakepaper56

Differential Revision: https://reviews.llvm.org/D126742
2022-08-01 17:32:08 +00:00
Gabriel Ravier 5674a3c880 Fixed a number of typos
I went over the output of the following mess of a command:

(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
 parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
 grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
 grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Differential Revision: https://reviews.llvm.org/D130827
2022-08-01 13:13:18 -04:00
skc7 09c4121123 Revert "Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values""
This reverts commit 4e1fe96.

Reverting this commit and fix the tests that caused failures due to
a35c64c.
2022-07-29 19:07:07 +00:00
Amy Kwan 4e1fe968c9 Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values"
This reverts commit a35c64ce23.

Reverting this commit as it causes various failures on LE and BE PPC bots.
2022-07-29 13:28:48 -05:00
Florian Hahn fbe022f189
[Libcalls] Add tests with maytrap & non-errno for math libcalls. 2022-07-29 13:45:34 +01:00
skc7 a35c64ce23 [Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values
Add the ability to put __attribute__((maybe_undef)) on function arguments.
Clang codegen introduces a freeze instruction on the argument.

Differential Revision: https://reviews.llvm.org/D130224
2022-07-29 02:27:26 +00:00
David Green 3b09e532ee [ARM] Remove duplicate fp16 intrinsics
These vdup and vmov float16 intrinsics are being defined in both the
general section and then again in fp16 under a !aarch64 flag. The
vdup_lane intrinsics were being defined in both aarch64 and !aarch64
sections, so have been commoned.  They are defined as macros, so do not
give duplicate warnings, but removing the duplicates shouldn't alter the
available intrinsics.
2022-07-28 14:26:17 +01:00
Matheus Izvekov 15f3cd6bfc
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as a 'struct'/'class' tags and name spacifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an inovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) Exposed a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypeDefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-27 11:10:54 +02:00
Kai Luo 1cbaf681b0 [clang][AIX] Add option to control quadword lock free atomics ABI on AIX
We are supporting quadword lock free atomics on AIX. For the situation that users on AIX are using a libatomic that is lock-based for quadword types, we can't enable quadword lock free atomics by default on AIX in case user's new code and existing code accessing the same shared atomic quadword variable, we can't guarentee atomicity. So we need an option to enable quadword lock free atomics on AIX, thus we can build a quadword lock-free libatomic(also for advanced users considering atomic performance critical) for users to make the transition smooth.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D127189
2022-07-27 01:56:25 +00:00
Fangrui Song de1b5c9145 [AArch64] Simplify BTI/PAC-RET module flags
These module flags use the Min merge behavior with a default value of
zero, so we don't need to emit them if zero.

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D130145
2022-07-26 09:48:36 -07:00
Sanjay Patel bfb9b8e075 [Passes] add a tail-call-elim pass near the end of the opt pipeline
We call tail-call-elim near the beginning of the pipeline,
but that is too early to annotate calls that get added later.

In the motivating case from issue #47852, the missing 'tail'
on memset leads to sub-optimal codegen.

I experimented with removing the early instance of
tail-call-elim instead of just adding another pass, but that
appears to be slightly worse for compile-time:
+0.15% vs. +0.08% time.
"tailcall" shows adding the pass; "tailcall2" shows moving
the pass to later, then adding the original early pass back
(so 1596886802 is functionally equivalent to 180b0439dc ):
https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright

Note that there was an effort to split the tail call functionality
into 2 passes - that could help reduce compile-time if we find
that this change costs more in compile-time than expected based
on the preliminary testing:
D60031

Differential Revision: https://reviews.llvm.org/D130374
2022-07-25 15:25:47 -04:00
Aaron Ballman 7068aa9841 Strengthen -Wint-conversion to default to an error
Clang has traditionally allowed C programs to implicitly convert
integers to pointers and pointers to integers, despite it not being
valid to do so except under special circumstances (like converting the
integer 0, which is the null pointer constant, to a pointer). In C89,
this would result in undefined behavior per 3.3.4, and in C99 this rule
was strengthened to be a constraint violation instead. Constraint
violations are most often handled as an error.

This patch changes the warning to default to an error in all C modes
(it is already an error in C++). This gives us better security posture
by calling out potential programmer mistakes in code but still allows
users who need this behavior to use -Wno-error=int-conversion to retain
the warning behavior, or -Wno-int-conversion to silence the diagnostic
entirely.

Differential Revision: https://reviews.llvm.org/D129881
2022-07-22 15:24:54 -04:00
Benjamin Kramer 35b80c448b Don't write to source directory in test 2022-07-22 11:14:26 +02:00
Iain Sandoe afda39a566 re-land [C++20][Modules] Build module static initializers per P1874R1.
The re-land fixes module map module dependencies seen on Greendragon, but
not in the clang test suite.

---

Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.

Intended result:

For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:

 - wraps the contained initializations in a control variable to ensure that
   the inits only happen once, even if a module is imported many times by
   imports of the main unit.

 - calls module initializers for imported modules first.  Note that the
   order of module import is not significant, and therefore neither is the
   order of imported module initializers.

 - We then call initializers for the Global Module Fragment (if present)
 - We then call initializers for the current module.
 - We then call initializers for the Private Module Fragment (if present)

For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:

 - Calls the initializers for any imported modules first.
 - Then proceeds as normal with remaining inits.

For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.

Implementation:

 - We provide the module pointer in the AST Context so that CodeGen can act
   on it and its sub-modules.

 - We need to account for module build lines like this:
  ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
  ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`

 - in order to do this, we add to ParseAST to set the module pointer in
   the ASTContext, once we establish that this is a module build and we
   know the module pointer. To be able to do this, we make the query for
   current module public in Sema.

 - In CodeGen, we determine if the current build requires a CXX20-style module
   init and, if so, we defer any module initializers during the "Eagerly
   Emitted" phase.

 - We then walk the module initializers at the end of the TU but before
   emitting deferred inits (which adds any hidden and static ones, fixing
   https://github.com/llvm/llvm-project/issues/51873 ).

 - We then proceed to emit the deferred inits and continue to emit the CXX
   init function.

Differential Revision: https://reviews.llvm.org/D126189
2022-07-22 08:38:07 +01:00
David Sherwood ceb6c23b70 [NFC][LoopVectorize] Explicitly disable tail-folding on some SVE tests
This patch is in preparation for enabling vectorisation with tail-folding
by default for SVE targets. Once we do that many existing tests will
break that depend upon having normal unpredicated vector loops. For
all such tests I have added the flag:

  -prefer-predicate-over-epilogue=scalar-epilogue

Differential Revision: https://reviews.llvm.org/D129137
2022-07-21 15:23:00 +01:00
Qiu Chaofan 708084ec37 [PowerPC] Support x86 compatible intrinsics on AIX
These headers used to be guarded only on PowerPC64 Linux or FreeBSD, but
they can also be enabled for AIX OS target since it's big-endian ready.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D129461
2022-07-21 16:33:41 +08:00
Chen Zheng ecdeabef38 enable P10 vector builtins test on AIX 64 bit; NFC
Verify that P10 vector builtins with type `vector signed __int128`
and `vector unsigned __int128` work well on AIX 64 bit.
2022-07-21 03:51:30 -04:00
Arthur Eubanks 7e77d31af7 [test] Remove unnecessary -verify-machineinstrs=0
Issue #38784 seems to be fixed and removing these doesn't cause any issues.
2022-07-20 10:55:54 -07:00
Nicolai Hähnle 1ddc51d89d Inliner: don't mark call sites as 'nounwind' if that would be redundant
When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind so as not to lose
information during inlining.

If H itself is nounwind (which often happens when H is an intrinsic), we
no longer mark the callsite explicitly as nounwind. Previously, there
were cases where the inlined call-site of H differs from a pre-existing
call-site of H in F *only* in the explicitly added nounwind attribute,
thus preventing common subexpression elimination.

v2:
- just check CI->doesNotThrow

v3 (resubmit after revert at 3443788087):
- update Clang tests

Differential Revision: https://reviews.llvm.org/D129860
2022-07-20 14:17:23 +02:00
Nicolai Hähnle 7af2818a99 Update some more tests with update_cc_test_checks.py 2022-07-20 13:27:18 +02:00