Note that because we don't usually touch the MMX registers anyway, all -mno-mmx needs to do is tweak the x86-32 calling convention a little for vectors that look like MMX vectors, and prevent the definition of __MMX__.
clang doesn't actually stop the user from using MMX inline asm operands or MMX builtins in -mno-mmx mode; as a QOI issue, it would be nice to diagnose, but I doubt it really matters much.
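A minimal sketch of the kind of type this affects (names are illustrative, not from the original change):
/* An 8-byte vector; on x86-32 a value like this can look like an MMX
   vector to the calling convention, which is what -mno-mmx tweaks. */
typedef int v2si __attribute__((vector_size(8)));
v2si passthrough(v2si v) { return v; }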
<rdar://problem/9694837>
llvm-svn: 134770
arithmetic on a VLA as 'nsw', per discussion with djg, and
implement pointer arithmetic (other than array accesses) and
pointer subtraction for VLA types.
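A minimal sketch of the newly handled operations (illustrative only):
void f(int n, int (*p)[n]) {
  int (*q)[n] = p + 1;  /* pointer arithmetic on a VLA type */
  long d = q - p;       /* pointer subtraction on a VLA type */
  (void)d;
}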
llvm-svn: 133855
llvm-gcc treats an extern declaration with weak_import
(or a follow-up extern declaration) as an actual
definition. Make clang follow this behavior.
// rdar://9538608
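A minimal sketch of the declaration pattern in question (assumed shape):
extern int x __attribute__((weak_import));
extern int x;  /* follow-up extern declaration */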
llvm-svn: 133450
ConvertType on InitListExprs as they are being converted. This is
needed for a forthcoming patch, and improves the IR generated anyway
(see additional type names in testcases).
This patch also converts a bunch of std::vector's in CGObjCMac to use
C arrays. There are a ton more that should be converted as well.
llvm-svn: 133413
separate aggregate temporary and then memcpy it over to the
destination. This fixes a regression I introduced with r133235, where
the compound literal on the RHS of an assignment makes use of the
structure on the LHS of the assignment.
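A minimal sketch of the aliasing pattern in question (an assumed shape, not the original testcase):
struct S { int a, b; };
void swap_fields(struct S *s) {
  /* The compound literal reads *s, so the RHS must be evaluated into
     a temporary before being copied into *s. */
  *s = (struct S){ s->b, s->a };
}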
I'm deeply suspicious of AggExprEmitter::VisitBinAssign()'s
optimization where it emits the RHS of an aggregate assignment
directly into the LHS lvalue without checking whether there is any
aliasing between the LHS/RHS. However, I'm not in a position to
revisit this now.
Big thanks to Eli for finding the regression!
llvm-svn: 133261
This patch tries relatively hard to avoid creating an extra copy where possible (see test3 in the included testcase), but in some cases the copy is unavoidable (like test2 in the included testcase).
rdar://9483886
llvm-svn: 132957
generator will give it something sufficient. This is important because
the mid-level optimizer doesn't know what alignment is required otherwise.
llvm-svn: 131879
(__m128){ p[0], p[1], p[2], p[3] }
which produces really bad code. This could be done in instcombine, but it's
probably better to do it in the front-end instead.
<rdar://problem/9424836>
llvm-svn: 131237
Adjacent bit fields are packed into the same 1-, 2-, or
4-byte allocation unit if the integral types are the same
size. // rdar://8823265.
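A minimal sketch of the packing rule (field names are illustrative):
struct BF {
  unsigned int  a : 4;  /* starts a 4-byte allocation unit */
  int           b : 4;  /* same-size integral type: packed with 'a' */
  unsigned char c : 4;  /* different size: starts a new unit */
};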
llvm-svn: 130851
it down. We were effectively compiling the testcase into:
void test14(int x) {
  switch (x) {
  case 11: break;
  case 42: test14(97); // fallthrough
  default: test14(42); break;
  }
}
which is not the same thing at all. This fixes a miscompilation of
MallocBench/gs seen on the clang-x86_64-linux-fnt buildbot.
llvm-svn: 129679
are trivial. This exposes opportunities earlier, and allows fastisel
to do good things with these at -O0.
This addresses rdar://9289468 - clang doesn't fold memset_chk at -O0
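A minimal sketch, assuming the _FORTIFY_SOURCE-style expansion; with a trivially known size the checked call can be folded even at -O0:
void zero16(char *buf) {
  /* what memset(buf, 0, 16) expands to under fortification */
  __builtin___memset_chk(buf, 0, 16, __builtin_object_size(buf, 0));
}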
llvm-svn: 129651
by making the isCheapEnoughToEvaluateUnconditionally predicate handle anything that folds to a constant. In particular, we now fold enums.
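A minimal sketch of a newly folded case (illustrative):
enum E { A = 1, B = 2 };
int pick(int c) { return c ? A : B; }  /* both arms fold to constants */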
llvm-svn: 129649
AAPCS+VFP), similar to fastcall / stdcall / whatevercall seen on x86.
In particular, all library functions should always be AAPCS regardless of floating point ABI used.
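A minimal sketch, assuming this is the ARM-only 'pcs' attribute; it pins one function to AAPCS regardless of the prevailing floating point ABI:
float sum(float a, float b) __attribute__((pcs("aapcs")));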
llvm-svn: 129534
the array alignment to the array access.
- This is more or less the best we can do without having alignment present in
the type system, but is a long way from truly matching how GCC handles this.
llvm-svn: 128691
platform implies default visibility. To achieve these, refactor our
lookup of explicit visibility so that we search for both an explicit
VisibilityAttr and an appropriate AvailabilityAttr, favoring the
VisibilityAttr if it is present.
llvm-svn: 128336
clobber with the 'y' constraint. Otherwise, we get the wrong return type and an
assert, because it created a '<1 x i64>' vector type instead of the x86_mmx
type.
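A minimal sketch of an inline-asm operand in the 'y' (MMX) constraint class; the operand must get the x86_mmx type rather than '<1 x i64>' (illustrative):
typedef long long m64 __attribute__((vector_size(8)));
m64 mmx_move(m64 x) {
  m64 r;
  __asm__ ("movq %1, %0" : "=y"(r) : "y"(x));
  return r;
}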
llvm-svn: 127185
they are known to be exact multiples of the width of the char type. Add a
test case to CodeGen/union.c that would have caught the problem with the
previous attempt. No change in functionality intended.
llvm-svn: 126628
live case of a switch statement when switching on a constant. This is terribly
limited, but enough to handle the trivial example included. Before we would
emit:
define void @test1(i32 %i) nounwind {
entry:
%i.addr = alloca i32, align 4
store i32 %i, i32* %i.addr, align 4
switch i32 1, label %sw.epilog [
i32 1, label %sw.bb
]
sw.bb: ; preds = %entry
%tmp = load i32* %i.addr, align 4
%inc = add nsw i32 %tmp, 1
store i32 %inc, i32* %i.addr, align 4
br label %sw.epilog
sw.epilog: ; preds = %sw.bb, %entry
switch i32 0, label %sw.epilog3 [
i32 1, label %sw.bb1
]
sw.bb1: ; preds = %sw.epilog
%tmp2 = load i32* %i.addr, align 4
%add = add nsw i32 %tmp2, 2
store i32 %add, i32* %i.addr, align 4
br label %sw.epilog3
sw.epilog3: ; preds = %sw.bb1, %sw.epilog
ret void
}
now we emit:
define void @test1(i32 %i) nounwind {
entry:
%i.addr = alloca i32, align 4
store i32 %i, i32* %i.addr, align 4
%tmp = load i32* %i.addr, align 4
%inc = add nsw i32 %tmp, 1
store i32 %inc, i32* %i.addr, align 4
ret void
}
This improves -O0 compile time (less IR to generate and shove through the code
generator) and the clever linux kernel people found a way to fail to build if we
don't do this optimization. This step isn't enough to handle the kernel case
though.
llvm-svn: 126597
When the mismatch is due to a larger input operand that is
a constant, truncate it down to the size of the output. This
allows us to accept some cases in the linux kernel and elsewhere.
Pedantically speaking, we generate different code than GCC, though
I can't imagine how it would matter:
Clang:
movb $-1, %al
frob %al
GCC:
movl $255, %eax
frob %al
llvm-svn: 126148
designators: allowing codegen when the element initializer is a
constant or something else without a side effect. This unblocks
enough to let process.c in the linux kernel build, PR9257.
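A minimal sketch, assuming the GNU array-range designator form with a constant element initializer:
int prio_map[16] = { [0 ... 15] = -1 };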
llvm-svn: 126056
without defining them. This should be an error, but I'm paranoid about
"uses" that end up not actually requiring a definition. I'll revisit later.
Also, teach IR generation to not set internal linkage on variable
declarations, just for safety's sake. Doing so produces an invalid module
if the variable is not ultimately defined.
Also, fix several places in the test suite where we were using internal
functions without definitions.
llvm-svn: 126016
- BlockDeclRefExprs always store VarDecls
- BDREs no longer store copy expressions
- BlockDecls now store a list of captured variables, information about
how they're captured, and a copy expression if necessary
With that in hand, change IR generation to use the captures data in
blocks instead of walking the block independently.
Additionally, optimize block layout by emitting fields in descending
alignment order, with a heuristic for filling in words when alignment
of the end of the block header is insufficient for the most aligned
field.
llvm-svn: 125005
compatible, not having the same type.
Fix rdar://8183908, in which compatible vector types weren't initialized properly, leading to a crash.
llvm-svn: 124637
delete the block we began emitting into if it had no predecessors. We never
want to do this, because there are several valid cases during statement
emission where an existing block has no known predecessors but will acquire
some later. The case in my test case doesn't inherently fall into this
category, because we could safely emit the case-range code before the statement
body, but there are examples with labels that can't be fallen into
that would also demonstrate this bug.
rdar://problem/8837067
llvm-svn: 123303
Fix the width and alignment of the bool type on Darwin to be 32 bits
while keeping it 8 everywhere else.
Change the definition of va_list to default to the SVR4 ABI one
and let the Darwin subtarget override this.
Both changes submitted by Nathan Whitehorn and reviewed
by Rafael Espindola.
llvm-svn: 122956
in asm statements:
register int foo asm("rdi");
asm("..." : ... "r" (foo) ...
We also only accept these variables if the constraint in the asm statement is "r".
This fixes most of PR3933.
llvm-svn: 122643
16 bits in size. Implement this by splitting WChar into two enums, like we have
for char. This fixes a miscompilation of XULRunner, PR8856.
llvm-svn: 122558
not actually frequently used, because ImpCastExprToType only creates a node
if the types differ. So explicitly create an ICE in the lvalue-to-rvalue
conversion code in DefaultFunctionArrayLvalueConversion() as well as several
other new places, and consistently deal with the consequences throughout the
compiler.
In addition, introduce a new cast kind for loading an ObjCProperty l-value,
and make sure we emit those nodes whenever an ObjCProperty l-value appears
that's not on the LHS of an assignment operator.
This breaks a couple of rewriter tests, which I've x-failed until future
development occurs on the rewriter.
Ted Kremenek kindly contributed the analyzer workarounds in this patch.
llvm-svn: 120890
when an initializer is variable (I handled the constant case in a previous
patch). This has three pieces:
1. Enhance AggValueSlot to have a 'isZeroed' bit to tell CGExprAgg that
the memory being stored into has previously been memset to zero.
2. Teach CGExprAgg to not emit stores of zero to isZeroed memory.
3. Teach CodeGenFunction::EmitAggExpr to scan initializers to determine
whether it is profitable to emit a memset + individual stores vs.
stores for everything.
The heuristic used is that a global has to be more than 16 bytes and
has to be 3/4 zero to be a candidate for this xform. The two testcases
are illustrative of the scenarios this catches. We now codegen test9 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 400, i32 4, i1 false)
%.array = getelementptr inbounds [100 x i32]* %Arr, i32 0, i32 0
%tmp = load i32* %X.addr, align 4
store i32 %tmp, i32* %.array
and test10 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 392, i32 8, i1 false)
%tmp = getelementptr inbounds %struct.b* %S, i32 0, i32 0
%tmp1 = getelementptr inbounds %struct.a* %tmp, i32 0, i32 0
%tmp2 = load i32* %X.addr, align 4
store i32 %tmp2, i32* %tmp1, align 4
%tmp5 = getelementptr inbounds %struct.b* %S, i32 0, i32 3
%tmp10 = getelementptr inbounds %struct.a* %tmp5, i32 0, i32 4
%tmp11 = load i32* %X.addr, align 4
store i32 %tmp11, i32* %tmp10, align 4
Previously we produced 99 stores of zero for test9 and also tons for test10.
This xform should substantially speed up -O0 builds when it kicks in as well
as reducing code size and optimizer heartburn on insane cases. This resolves
PR279.
llvm-svn: 120692
a global is larger than 32 bytes and has fewer than 6 non-zero values in the
initializer. Previously we'd turn something like this:
char test8(int X) {
  char str[10000] = "abc";
  ...
}
into a 10K global variable which we then memcpy'd from. Now we generate:
%str = alloca [10000 x i8], align 16
%tmp = getelementptr inbounds [10000 x i8]* %str, i64 0, i64 0
call void @llvm.memset.p0i8.i64(i8* %tmp, i8 0, i64 10000, i32 16, i1 false)
store i8 97, i8* %tmp, align 16
%0 = getelementptr [10000 x i8]* %str, i64 0, i64 1
store i8 98, i8* %0, align 1
%1 = getelementptr [10000 x i8]* %str, i64 0, i64 2
store i8 99, i8* %1, align 2
Which is much smaller in space and also likely faster.
This is part of PR279
llvm-svn: 120645
I mistakenly thought that this was checking for vector name mangling, but
it is not. Since we're no longer wrapping Neon vectors in structs, this
test can just return a vector directly. There are already other tests for
that, so just to make this interesting, change the test to return a struct
of two vectors.
llvm-svn: 119434
assignment to volatiles in C. This in effect reverts some of mjs's
work in and around r72572. Basically, the C++ standard is quite
clear, except that it lies about volatile behavior approximating
C's, whereas the C standard is almost actively misleading.
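A minimal sketch of the C semantics now implemented (illustrative):
volatile int v;
int f(int x) {
  /* The value of the assignment is the stored value; the volatile
     object is not re-read to produce the result. */
  return (v = x);
}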
llvm-svn: 119344
Return the result of a complex assignment with the original values,
not by performing a load from the l-value; this is the correct
semantics in C, although not in C++.
llvm-svn: 119037
- tags with C linkage should ignore visibility=hidden
- functions and variables with explicit visibility attributes should
ignore the linkage of their types
Either of these should be sufficient to fix PR8457.
Also, FileCheck-ize a test case.
llvm-svn: 117351
completes support for C1X anonymous struct/union init features:
* Indexed anonymous member initializers should not be expanded. Doing so makes
little sense and would cause unresolvable semantic ambiguity in valid code
(regression introduced by r69153).
* Subobject initialization of (possibly nested) anonymous members are now
referred to with paths relative to the naming record context, eliminating the
synthesis of incorrect implicit InitListExprs that caused CodeGen to assert.
* Field lookup was missing a null check in IdentifierInfo comparison which
caused lookup for a known (already resolved) field to match the first unnamed
data member it encountered leading to silent miscompilation.
* Subobject paths are no longer built using the general purpose
Sema::BuildAnonymousStructUnionMemberPath(). If any corner cases crop up, we
will now assert earlier in Sema instead of passing invalid InitListExprs
through to CodeGen.
Fixes PR6955, from Alp Toker!
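A minimal sketch of the initialization pattern this covers (illustrative):
struct S {
  int a;
  struct { int b, c; };  /* anonymous struct member */
};
struct S s = { .b = 1, .c = 2 };  /* names resolve through the anonymous member */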
llvm-svn: 116098
- Therefore, we can lower out the NEON wrapper structs and pass the vectors
directly. This makes a huge difference in the cleanliness of the IR after
optimization.
- I will trust, but verify, via future ABITest testing (for APCS-GNU, at
least).
llvm-svn: 114618
LHS and when the conditional expression is an array. Since
it will be decayed, the saved expression must be the
decayed expression. This is necessary to preserve the semantics
of this extension (and prevent an IRGen crash, which expects
an array to always be decayed). I am sure there will be other
cases in C++ (aggregate conditionals, for example) where saving of the
expression must happen after some transformation on the conditional
expression has happened.
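A minimal sketch of the extension with an array operand (illustrative):
char buf[16];
char *choose(char *fallback) {
  return buf ?: fallback;  /* 'buf' is decayed once and the decayed value is saved and reused */
}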
Doug, please review. Fixes // rdar://8446940
llvm-svn: 114296
This mainly prevents failures and/or crashes when multiple processes try to read/write the same PCH file. (rdar://8392711&8294781); suggestion & review by Daniel!
llvm-svn: 114187
with a non-default-stack-ABI-alignment (of 16).
- This fixes the ABI convention, but breaks codegen since we now have
underaligned arguments. Marginal improvement overall though, and will be
fixed in the next commit.
llvm-svn: 114113
This reverts commit 113814.
This patch was never intended to stay in the repository. If you are reading this
from the future, we apologize for the noise.
llvm-svn: 113990
This test created a statements.ll file until about a month ago. Some buildbots
still have this file in their source dir. This is the easiest way to remove the
file on all bots. Then I'll revert.
llvm-svn: 113814
initializers, so the result of the evaluation doesn't leak through
inconsistently. Also, don't evaluate references to variables with
initializers with side-effects.
llvm-svn: 113128
Make CGT defer to the ABI on all member pointer types.
This requires giving CGT a handle to the ABI.
It's way easier to make that work if we avoid lazily creating the ABI.
Make it so.
llvm-svn: 111786
__overflow_handler entrypoint that David Chisnall made up.
Calling __overflow_handler is not part of the contract of
-ftrapv provided by GCC, and should never have been checked
in in the first place.
According to:
http://permalink.gmane.org/gmane.comp.compilers.clang.devel/8699
David is using this for some arbitrary-precision integer stuff
or something, which is not an appropriate thing to implement on
this.
llvm-svn: 110490
as soon as we properly codegen the simple vector operations, remove the
unnecessary built-ins/intrinsics from clang and llvm. Also add tests for the new
built-ins
llvm-svn: 110096
a switch or goto somewhere in the function. Indirect gotos trigger the
jump-checker regardless, because the conditions there are slightly more
elaborate and it's too marginal a case to be worth optimizing.
Turns off the jump-checker in a lot of cases in C++. rdar://problem/7702918
llvm-svn: 109962
have a "coerce to" type which often matches the default lowering of Clang
type to LLVM IR type, but the coerce case can be handled by making them
not be the same.
This simplifies things and fixes issues where X86-64 abi lowering would
return coerce after making preferred types exactly match up. This caused
us to compile:
typedef float v4f32 __attribute__((__vector_size__(16)));
v4f32 foo(v4f32 X) {
return X+X;
}
into this code at -O0:
define <4 x float> @foo(<4 x float> %X.coerce) nounwind {
entry:
%retval = alloca <4 x float>, align 16 ; <<4 x float>*> [#uses=2]
%coerce = alloca <4 x float>, align 16 ; <<4 x float>*> [#uses=2]
%X.addr = alloca <4 x float>, align 16 ; <<4 x float>*> [#uses=3]
store <4 x float> %X.coerce, <4 x float>* %coerce
%X = load <4 x float>* %coerce ; <<4 x float>> [#uses=1]
store <4 x float> %X, <4 x float>* %X.addr
%tmp = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1]
%tmp1 = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1]
%add = fadd <4 x float> %tmp, %tmp1 ; <<4 x float>> [#uses=1]
store <4 x float> %add, <4 x float>* %retval
%0 = load <4 x float>* %retval ; <<4 x float>> [#uses=1]
ret <4 x float> %0
}
Now we get:
define <4 x float> @foo(<4 x float> %X) nounwind {
entry:
%X.addr = alloca <4 x float>, align 16 ; <<4 x float>*> [#uses=3]
store <4 x float> %X, <4 x float>* %X.addr
%tmp = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1]
%tmp1 = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1]
%add = fadd <4 x float> %tmp, %tmp1 ; <<4 x float>> [#uses=1]
ret <4 x float> %add
}
This implements rdar://8248065
llvm-svn: 109733
Before we'd compile the example into something like:
%coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
%1 = bitcast <4 x float>* %coerce.dive2 to <2 x double>* ; <<2 x double>*> [#uses=1]
%2 = load <2 x double>* %1, align 1 ; <<2 x double>> [#uses=1]
ret <2 x double> %2
Now we produce:
%coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
%0 = load <4 x float>* %coerce.dive2, align 1 ; <<4 x float>> [#uses=1]
ret <4 x float> %0
llvm-svn: 109732
possible. This improves the example to pass <4 x float> instead of
<2 x double> but we still get awful code, and still don't get the
return value right.
llvm-svn: 109700
their call expressions synthetically have the "deduced" types based on their
first argument. We only insert conversions in the AST for arguments whose
values require conversion to match the value type expected. This keeps PR7600
closed by maintaining the return type, but avoids assertions due to unexpected
implicit casts making the type unsigned (test case added from Daniel).
The magic is moved into the codegen for the atomic builtin which inserts the
casts as needed at the IR level to raise the type to an integer suitable for
the LLVM intrinsic. This shouldn't cause any real change in functionality, but
now we can make the builtin be more truly polymorphic.
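A minimal sketch of the polymorphism (illustrative): the type is deduced from the first (pointer) argument.
int       add_int(int *p)      { return __sync_fetch_and_add(p, 1); }
long long add_ll(long long *p) { return __sync_fetch_and_add(p, 1); }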
llvm-svn: 108638
- This issue here is that /usr/include/Blocks.h wants to define some of the
block runtime globals as weak, depending on the target. This doesn't work in
Clang because we aren't using the AST decl for these globals.
- The fix is a pretty gross hack which just watches all the decls for the
specific blocks globals we need to know about; if we see one we use it,
otherwise we use the hand coded type.
In time, I would like to clean this up by changing IRgen to ask Sema/AST for
the decl, which would then be lazily loaded from the builtin table if
necessary. This could be used in a whole host of places in IRgen and would
get rid of a lot of grotty hand coding of LLVM IR; however, we need some
extra Sema support for this as well as support for builtin global variables.
llvm-svn: 108482
strip cv-qualifiers from the expression's type when the language calls
for it: in C, that's all the time, while C++ only does it for
non-class types.
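A minimal sketch of the C rule (illustrative):
const int f(void);
int g(void) { return f(); }  /* in C the call f() has type 'int', not 'const int' */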
Centralized the computation of the call expression type in
QualType::getCallResultType() and some helper functions in other nodes
(FunctionDecl, ObjCMethodDecl, FunctionType), and updated all relevant
callers of getResultType() to getCallResultType().
Fixes PR7598 and PR7463, along with a bunch of getResultType() call
sites that weren't stripping references off the result type (nothing
stripped cv-qualifiers properly before this change).
llvm-svn: 108234
at -O0. The only change from the previous patch is that we don't try
to generate virtual method thunks for an available_externally
function.
llvm-svn: 108230
-O0, since we won't be using the definitions for anything anyway. For
lib/System/Path.o when built in Debug+Asserts mode, this leads to a 4%
improvement in compile time (and suppresses 440 function bodies).
<rdar://problem/7987644>
llvm-svn: 108156
was not producing a memcpy with the right address
spaces because of two places in it doing casts of
the arguments to i8, one of which didn't
preserve the address space.
There is also an optimizer bug here.
llvm-svn: 107842
alloca for an argument. Make sure the argument gets the proper
decl alignment, which may be different than the type alignment.
This fixes PR7567
llvm-svn: 107627
r107173, "fix PR7519: after thrashing around and remembering how all this stuff"
r107216, "fix PR7523, which was caused by the ABI code calling ConvertType instead"
This includes a fix to make ConvertTypeForMem handle the "recursive" case, and call
it as such when lowering function types which have an indirect result.
llvm-svn: 107310
would trigger an extra method call).
- While in the area, I also changed Clang to not emit an unnecessary load from
'x' in cases like 'y = (x = 1)'.
llvm-svn: 107210
works, the fix is quite simple: just make sure to call ConvertTypeRecursive
when the function type being lowered is in the midst of ConvertType.
llvm-svn: 107173
load/store nonsense in the epilog. For example, for:
int foo(int X) {
int A[100];
return A[X];
}
we used to generate:
%arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
%tmp1 = load i32* %arrayidx ; <i32> [#uses=1]
store i32 %tmp1, i32* %retval
%0 = load i32* %retval ; <i32> [#uses=1]
ret i32 %0
}
which codegen'd to this code:
_foo: ## @foo
## BB#0: ## %entry
subq $408, %rsp ## imm = 0x198
movl %edi, 400(%rsp)
movl 400(%rsp), %edi
movslq %edi, %rax
movl (%rsp,%rax,4), %edi
movl %edi, 404(%rsp)
movl 404(%rsp), %eax
addq $408, %rsp ## imm = 0x198
ret
Now we generate:
%arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
%tmp1 = load i32* %arrayidx ; <i32> [#uses=1]
ret i32 %tmp1
}
and:
_foo: ## @foo
## BB#0: ## %entry
subq $408, %rsp ## imm = 0x198
movl %edi, 404(%rsp)
movl 404(%rsp), %edi
movslq %edi, %rax
movl (%rsp,%rax,4), %eax
addq $408, %rsp ## imm = 0x198
ret
This actually does matter, cutting out 2000 lines of IR from CGStmt.ll
for example.
Another interesting effect is that altivec.h functions which are dead
now get dce'd by the inliner. Hence all the changes to
builtins-ppc-altivec.c to ensure the calls aren't dead.
llvm-svn: 106970
As part of this, pull together trapv handling into the same enum.
This also adds support for NSW multiplies.
This also makes PCH disagreement on overflow behavior silent, since it
really doesn't matter except for warnings and codegen (no macros get
defined etc).
llvm-svn: 106956
but whose operand isn't a float: specifically, __real__ and __imag__. Instead
of filtering these out, just implement them.
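A minimal sketch, assuming the complex-integer case (an operand that isn't a float):
_Complex int ci;
void set(void) {
  __real__ ci = 1;
  __imag__ ci = 2;
}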
Fixes <rdar://problem/7958272>.
llvm-svn: 103307
they're unreachable. This matters because (if they're POD, or if this is C)
the scope containing the variable might be reachable even if the variable
isn't. Fixes PR7044.
llvm-svn: 103052
input and output types when the smaller value isn't mentioned in the
asm string. Extend this support from integers to also allowing
fp values to be mismatched (if not mentioned in the asm string).
llvm-svn: 102188
- This fixes the last known ABI issues with ARM/APCS.
- I've run the first 1k ABITests with '--no-unsigned --no-vector --no-complex'
on {armv6, armv7} x {-mno-thumb, -mthumb}, and the first 10k tests for armv7
-mthumb, for both function return types and single argument calls. These all
pass now (they failed horribly before without --no-bitfield).
llvm-svn: 102070
- This fixes some pedantic bugs with packed structures, as well as major problems with -fno-bitfield-type-align.
- Fixes PR5591, PR5567, and all known -fno-bitfield-type-align issues.
- Review appreciated.
llvm-svn: 102045
This mirrors Dan's patch for llvm-gcc in r97989, and
fixes the miscompilation in PR6525. There is some contention
over whether this is the right thing to do, but it is the
conservative answer and demonstrably fixes a miscompilation.
llvm-svn: 101877
struct may cause it to shrink more than one byte. Before
my recent changes we compiled the new test into:
%0 = type { [6 x i8] }
@x = global %0 { [6 x i8] undef }, align 2 ; <%0*> [#uses=0]
which is obviously bogus. Now we compile it into:
%0 = type <{ i32, i8, i8 }>
@x = global %0 zeroinitializer, align 2 ; <%0*> [#uses=0]
where only the last byte is tail padding.
llvm-svn: 101536
- Sadly, this doesn't seem to give any .ll size win so far. It is possible to make this routine significantly smarter & avoid various shifting, masking, and zext/sext, but I'm not really convinced it is worth it. It is tricky, and this is really instcombine's job.
- No intended functionality change; the test case is just to increase coverage & serves as a demo file, it worked before this commit.
The new fixes from r101222 are:
1. The shift to the target position needs to occur after the value is extended to the correct size. This broke Clang bootstrap, among other things no doubt.
2. Swap the order of arguments to OR, to get a tad more constant folding.
llvm-svn: 101339
Stop multiplying the constant by 8 accordingly in the header and change
the intrinsic definition for what types we expect.
Add to existing palignr test to check that we're emitting the correct things.
llvm-svn: 101332
- Sadly, this doesn't seem to give any .ll size win so far. It is possible to make this routine significantly smarter & avoid various shifting, masking, and zext/sext, but I'm not really convinced it is worth it. It is tricky, and this is really instcombine's job.
- No intended functionality change; the test case is just to increase coverage & serves as a demo file, it worked before this commit.
llvm-svn: 101222
elements with explicit zero values instead of with tail padding.
On an example like this:
struct foo { int a; int b; };
struct foo fooarray[] = {
{1, 2},
{4},
};
We now lay this out as:
@fooarray = global [2 x %struct.foo] [%struct.foo { i32 1, i32 2 }, %struct.foo { i32 4, i32 0 }]
instead of as:
@fooarray = global %0 <{ %struct.foo { i32 1, i32 2 }, %1 { i32 4, [4 x i8] zeroinitializer } }>
This preserves both the struct type of the second element and the array type of the entire thing.
llvm-svn: 101155
trailing fields may not be represented in initializer lists; they
are handled as padding, and those fields *must* be zero
initialized.
llvm-svn: 101067
This introduces FunctionType::ExtInfo to hold the calling convention and the
noreturn attribute. The next patch will extend it to include the regparm
attribute and fix the bug.
llvm-svn: 99920
gcc, and the common expectation seems to be that they are unused. If and when
someone cares we can add them back with well-documented semantics.
llvm-svn: 99522
expression computation in the wrong bit-width, and end up generating a totally
bogus array reference (_g0+8589934546).
- This showed up on Prolangs/cdecl.
llvm-svn: 99042
Clang's support for weakref is now better than llvm-gcc's :-)
We don't introduce a new symbol, and we correctly mark undefined references weak only if there is no
definition or regular undefined reference in the same file.
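A minimal sketch of the attribute (names are illustrative):
/* A weak reference to 'real_impl'; no new symbol is introduced for it. */
static void helper(void) __attribute__((weakref("real_impl")));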
llvm-svn: 97733
a common source of oddities and, in theory, removes some redundant ABI
computations. Also fixes a miscompile I introduced yesterday by refactoring
some code and causing a slightly different code path to be taken that
didn't perform *parameter* type canonicalization, just normal type
canonicalization; this in turn caused a bit of ABI code to misfire because
it was looking for 'double' or 'float' but received 'const float'.
llvm-svn: 97030
of the block descriptor field. This field is the ObjC style @encode
signature of the implementation function, and was to this point
conditionally provided in the block literal data structure. That
provisional support is removed.
Additionally, eliminate unused enumerations for the block literal flags field.
The first shipping ABI unconditionally set (1<<29) but this bit is unused
by the runtime, so the second ABI will unconditionally have (1<<30) set so
that the runtime can in fact distinguish whether the additional data is
present or not.
llvm-svn: 96989
- This fixes many many more places than the test case, but my feeling is we need to audit alignment systematically so I'm not inclined to try hard to test the individual fixes in this patch. If this bothers you, patches welcome!
PR6240.
llvm-svn: 95648
follows (as conservatively as possible) gcc's current behavior: attributes
written on return types that don't apply there are applied to the function
instead, etc. Only parse CC attributes as type attributes, not as decl attributes;
don't accept noreturn as a decl attribute on ValueDecls, either (it still
needs to apply to other decls, like blocks). Consistently consume CC/noreturn
information throughout codegen; enforce this by removing their default values
in CodeGenTypes::getFunctionInfo().
llvm-svn: 95436
WHAT!?!
It turns out that Type::isPromotableIntegerType() was not considering
enumeration types to be promotable, so we would never do the
promotion despite having properly computed the promotion type when the
enum was defined. Various operations on values of enum type just
"worked" because we could still compute the integer rank of an enum
type; the oddity, however, is that operations such as "add an enum and
an unsigned" would often have an enum result type (!). The bug
actually showed up as a spurious -Wformat diagnostic
(<rdar://problem/7595366>), but in theory it could cause miscompiles.
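A minimal sketch of the oddity described above (illustrative):
enum Color { Red, Green, Blue };
int f(unsigned u) {
  /* Pre-fix, 'Green + u' could end up typed 'enum Color' instead of
     being promoted like its underlying integer type. */
  return Green + u;
}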
In this commit:
- Enum types with a promotion type of "int" or "unsigned int" are
promotable.
- Tweaked the computation of promotable types for enums
- For all of the ABIs, treat enum types the same way as their
underlying types (*not* their promotion types) for argument passing
and return values
- Extend the ABI tester with support for enumeration types
llvm-svn: 95117
by setting the section of the generated global. This is an
optimization done by the code generator, and the code being
removed didn't handle the case when the string contained an
embedded nul (which the code generator does correctly
handle). This is rdar://7589850
llvm-svn: 95003
the ABI spec, this turns out to simplify the code. We still have some annoying
code which mismatches the spec with regard to empty structures.
llvm-svn: 94796
only takes a boolean second argument now. Update tests accordingly.
Currently the builtin still accepts the full range for compatibility.
llvm-svn: 91983
- This is designed to make it obvious that %clang_cc1 is a "test variable"
which is substituted. It is '%clang_cc1' instead of '%clang -cc1' because it
can be useful to redefine what gets run as 'clang -cc1' (for example, to set
a default target).
llvm-svn: 91446