Commit Graph

18675 Commits

Author SHA1 Message Date
Jyotsna Verma 50ca6dd8a7 Hexagon: Use multiclass for absolute addressing mode stores.
llvm-svn: 174412
2013-02-05 18:15:34 +00:00
Jakob Stoklund Olesen eb1084ee54 Add a test case for PR14750.
This was fixed by r174402.

llvm-svn: 174405
2013-02-05 18:04:15 +00:00
Derek Schuff 90aa1d8abe [MC] Bundle alignment: Invalidate relaxed fragments
Currently, when a fragment is relaxed, its size is modified, but its
offset is not (it gets laid out as a side effect of checking whether
it needs relaxation), then all subsequent fragments are invalidated
because their offsets need to change. When bundling is enabled,
relaxed fragments need to get laid out again, because the increase in
size may push it over a bundle boundary. So instead of only
invalidating subsequent fragments, also invalidate the fragment that
gets relaxed, which causes it to get laid out again.

This patch also fixes some trailing whitespace and fixes the
bundling-related debug output of MCFragments.

llvm-svn: 174401
2013-02-05 17:55:27 +00:00
Tom Stellard 7d41161a2d R600: Add tests for instruction predicates
llvm-svn: 174393
2013-02-05 17:09:13 +00:00
Tom Stellard 2e5e7a5bef R600: Emit function name in the AsmPrinter
Emitting the function name allows us to check for it in the FileCheck
tests so we can make sure FileCheck is checking the output of the
correct function.

llvm-svn: 174392
2013-02-05 17:09:11 +00:00
Jyotsna Verma 6f635b5488 Hexagon: Add V4 compare instructions. Enable relationship mapping
for the existing instructions.

llvm-svn: 174389
2013-02-05 16:42:24 +00:00
NAKAMURA Takumi 7ec43d9b37 Formatting.
llvm-svn: 174380
2013-02-05 15:32:16 +00:00
NAKAMURA Takumi 6635fe56d3 llvm/test/Transforms/LoopVectorize/X86/vector_ptr_load_store.ll: "-debug" requires +Asserts.
llvm-svn: 174379
2013-02-05 15:32:10 +00:00
Arnold Schwaighofer 22174f5d5a Loop Vectorizer: Handle pointer stores/loads in getWidestType()
In the loop vectorizer cost model, we used to ignore stores/loads of a pointer
type when computing the widest type within a loop. This meant that if we had
only stores/loads of pointers in a loop we would return a widest type of 8bits
(instead of 32 or 64 bit) and therefore a vector factor that was too big.

Now, if we see a consecutive store/load of pointers we use the size of a pointer
(from data layout).

This problem occured in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced
test case is the first test in vector_ptr_load_store.ll).

radar://13139343

llvm-svn: 174377
2013-02-05 15:08:02 +00:00
NAKAMURA Takumi 3753b28cd2 Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load,"
It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.

llvm-svn: 174374
2013-02-05 14:44:16 +00:00
Logan Chien 4b724429b8 Link .ARM.exidx with corresponding text section.
The sh_link in the ELF section header of .ARM.exidx should
be filled with the section index of the corresponding text
section.

llvm-svn: 174372
2013-02-05 14:18:59 +00:00
Arnold Schwaighofer a804bbee9b ARM cost model: Cost for scalar integer casts and floating point conversions
Also adds some costs for vector integer float conversions.

llvm-svn: 174371
2013-02-05 14:05:55 +00:00
Jack Carter 428a06cc75 This patch that sets the Mips ELF header flag for
MicroMips architectures. 

Contributer: Zoran Jovanovic
 
llvm-svn: 174360
2013-02-05 09:30:03 +00:00
Jack Carter 9c1a027fe8 This patch that sets the EmitAlias flag in td files
and enables the instruction printer to print aliased 
instructions. 

Due to usage of RegisterOperands a change in common 
code (utils/TableGen/AsmWriterEmitter.cpp) is required 
to get the correct register value if it is a RegisterOperand.

Contributer: Vladimir Medic
 
llvm-svn: 174358
2013-02-05 08:32:10 +00:00
Eric Christopher 6a421a944d Add support for testing the output of the abbrev table for the
skeleton CU as part of the DWARF5 split dwarf proposal.

llvm-svn: 174351
2013-02-05 07:32:00 +00:00
Eric Christopher 7a2cdf798b Add support for emitting a stub DW_AT_GNU_dwo_id as part of the
DWARF5 split dwarf proposal.

llvm-svn: 174350
2013-02-05 07:31:55 +00:00
Michael Gottesman e2376cdf71 Add code to GlobalVariable.h so that global variables marked as
externally_initialized return false for hasDefiniteInitializer and
hasUniqueInitializer.

rdar://12580965.

llvm-svn: 174345
2013-02-05 06:53:26 +00:00
Owen Anderson a47fdbb032 When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174343
2013-02-05 06:25:30 +00:00
Michael Gottesman 27e7ef326a Added LLVM Asm/Bitcode Reader/Writer support for new IR keyword externally_initialized.
llvm-svn: 174340
2013-02-05 05:57:38 +00:00
Manman Ren 86b1d868ba [Stack Alignment] emit warning instead of a hard error
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and assumes the low bits are zeros. A warning should be good enough when
we are not sure if the source code assumes the low bits are zeros.

rdar://13127907

llvm-svn: 174336
2013-02-04 23:45:08 +00:00
Jyotsna Verma 7ab68fbd1d Hexagon: Add V4 combine instructions and some more Def Pats for V2.
llvm-svn: 174331
2013-02-04 15:52:56 +00:00
Benjamin Kramer c35d526489 Disable a couple more vector splat optimizations on PPC.
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.

llvm-svn: 174330
2013-02-04 15:52:32 +00:00
Benjamin Kramer 2c9da989c2 X86: Open up some opportunities for constant folding by postponing shift lowering.
Fixes PR15141.

llvm-svn: 174327
2013-02-04 15:19:33 +00:00
Benjamin Kramer 548ffa274a SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp

The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.

Fix some trivially foldable X86 tests too.

llvm-svn: 174325
2013-02-04 15:19:18 +00:00
Tim Northover 37b131f607 Update debugging test for change in expected metadata.
llvm-svn: 174321
2013-02-04 12:15:00 +00:00
David Blaikie 2811f8ac28 [DebugInfo] remove more node indirection (this time from the subprogram's variable lists)
llvm-svn: 174305
2013-02-04 05:56:36 +00:00
Arnold Schwaighofer 98f1012f9b ARM cost model: Penalize insertelement into D subregisters
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).

radar://13096933

llvm-svn: 174300
2013-02-04 02:52:05 +00:00
David Blaikie 33111dfea0 Remove the (apparently) unnecessary debug info metadata indirection.
The main lists of debug info metadata attached to the compile_unit had an extra
layer of metadata nodes they went through for no apparent reason. This patch
removes that (& still passes just as much of the GDB 7.5 test suite). If anyone
can show evidence as to why these extra metadata nodes are there I'm open to
reverting this patch & documenting why they're there.

llvm-svn: 174266
2013-02-02 05:56:24 +00:00
Reed Kotler f8933f83f0 Start static relocation implementation for mips16.
This checkin makes hello world work. 

llvm-svn: 174264
2013-02-02 04:07:35 +00:00
Manman Ren 053e4ff008 Removing ssp and uwtable from the testcase
llvm-svn: 174259
2013-02-02 01:34:38 +00:00
Shuxin Yang cadd8a068e rdar://13126763
Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression
"x + x*x" into "x * 3.0".

llvm-svn: 174239
2013-02-02 00:22:03 +00:00
Manman Ren e697d3cd2e [Dwarf] avoid emitting multiple AT_const_value for static memebers.
Testing case is reduced from MultiSource/BenchMarks/Prolangs-C++/deriv1.

rdar://problem/13071590

llvm-svn: 174235
2013-02-01 23:54:37 +00:00
Bill Schmidt 52742c25ae LLVM enablement for some older PowerPC CPUs
llvm-svn: 174230
2013-02-01 22:59:51 +00:00
Dan Gohman 9ee4bc1abc Add a testcase for some past-the-end address subtleties.
llvm-svn: 174210
2013-02-01 19:37:52 +00:00
David Sehr 8114a7a651 Two changes relevant to LEA and x32:
1) allows the use of RIP-relative addressing in 32-bit LEA instructions under
   x86-64 (ILP32 and LP64)
2) separates the size of address registers in 64-bit LEA instructions from
   control by ILP32/LP64.

llvm-svn: 174208
2013-02-01 19:28:09 +00:00
Jyotsna Verma 10f5c2db4e Hexagon: Test case to confirm generation of indexed loads with zero offset.
llvm-svn: 174196
2013-02-01 16:40:06 +00:00
Benjamin Kramer c05aa958b1 InstSimplify: stripAndComputeConstantOffsets can be called with vectors of pointers too.
Prepare it for vectors of pointers and handle simple cases. We don't handle
complicated cases because accumulateConstantOffset bails on pointer vectors.
Fixes selfhost on i386.

llvm-svn: 174179
2013-02-01 15:21:10 +00:00
Tim Northover e3d4236402 Add explicit triples to AArch64 tests
Only Linux is supported at the moment, and other platforms quickly fault. As a
result these tests would fail on non-Linux hosts. It may be worth making the
tests more generic again as more platforms are supported.

llvm-svn: 174170
2013-02-01 11:40:47 +00:00
Nadav Rotem 4349f6963e Revert r174152. The shift amount may overflow and in that case this transformation is illegal.
llvm-svn: 174156
2013-02-01 07:59:33 +00:00
Nadav Rotem 1d584029ae Optimize shift lefts of a constant by a value plus constant into a single shift.
llvm-svn: 174152
2013-02-01 06:45:40 +00:00
Dan Gohman b3e2d3a638 Rewrite instsimplify's handling if icmp on pointer values to remove the
remaining use of AliasAnalysis concepts such as isIdentifiedObject to
prove pointer inequality.

@external_compare in test/Transforms/InstSimplify/compare.ll shows a simple
case where a noalias argument can be equal to a global variable address, and
while AliasAnalysis can get away with saying that these pointers don't alias,
instsimplify cannot say that they are not equal.

llvm-svn: 174122
2013-02-01 00:11:13 +00:00
Dan Gohman 995d40e1e2 An alloca can be equal to an argument. It can't *alias* an alloca, but it could
be equal, since there's nothing preventing a caller from correctly predicting
the stack location of an alloca.

llvm-svn: 174119
2013-01-31 23:49:33 +00:00
Bill Wendling 1c7cc8ae90 Remove the AttrBuilder form of the Attribute::get creators.
The AttrBuilder is for building a collection of attributes. The Attribute object
holds only one attribute. So it's not really useful for the Attribute object to
have a creator which takes an AttrBuilder.

This has two fallouts:

1. The AttrBuilder no longer holds its internal attributes in a bit-mask form.
2. The attributes are now ordered alphabetically (hence why the tests have changed).

llvm-svn: 174110
2013-01-31 23:16:25 +00:00
Tom Stellard 4926921bd4 R600: Fold clamp, neg, abs
Patch by: Vincent Lejeune

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174099
2013-01-31 22:11:54 +00:00
Manman Ren aec2ce7db4 Linker: correctly link in dbg.declare
This is a re-worked version of r174048.
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate 
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}

With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28

Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.

rdar://problem/13089880

llvm-svn: 174093
2013-01-31 21:19:18 +00:00
Lang Hames dd47804394 When lowering memcpys to loads and stores, make sure we don't promote alignments
past the natural stack alignment.

llvm-svn: 174085
2013-01-31 20:23:43 +00:00
Derek Schuff b76ec3bb5e [MC] bundle alignment: prevent padding instructions from crossing bundle boundaries
llvm-svn: 174067
2013-01-31 17:00:03 +00:00
Tim Northover e0e3aefdd3 Add AArch64 as an experimental target.
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.

This initial commit should have support for:
    + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
      (except the late addition CRC instructions).
    + CodeGen features required for C++03 and C99.
    + Compilation for the "small" memory model: code+static data <
      4GB.
    + Absolute and position-independent code.
    + GNU-style (i.e. "__thread") TLS.
    + Debugging information.

The principal omission, currently, is performance tuning.

This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.

Further reviews would be gratefully received.

llvm-svn: 174054
2013-01-31 12:12:40 +00:00
Pekka Jaaskelainen 995a3e731d Made the min-trip-count-switch test X86-specific to avoid
breakage with builds without X86-support.

llvm-svn: 174052
2013-01-31 10:33:22 +00:00
Alexey Samsonov 5234a8ed9f Revert r173946. This breaks compilation of googletest with Clang
llvm-svn: 174048
2013-01-31 08:02:11 +00:00
Michael Gottesman 41e4ac4224 Filecheckized 2x tests in SimplifyCFG and removed their date prefix to fit with current llvm style for test names.
llvm-svn: 174011
2013-01-31 01:04:23 +00:00
Eric Christopher 4e3e94c13d Check and allow floating point registers to select the size of the
register for inline asm. This conforms to how gcc allows for effective
casting of inputs into gprs (fprs is already handled).

llvm-svn: 174008
2013-01-31 00:50:46 +00:00
Eli Bendersky 6c84b90b70 Replace some more greps with FileChecks in tests
llvm-svn: 174006
2013-01-31 00:44:12 +00:00
Eli Bendersky a320e00e74 Rewrite this test properly with a FileCheck instead of greps
llvm-svn: 173997
2013-01-31 00:11:52 +00:00
Dan Gohman 6a61fccb96 Fix ConstantFold's folding of icmp instructions to recognize that,
for example, a one-past-the-end pointer from one global variable may
be equal to the base pointer of another global variable.

llvm-svn: 173995
2013-01-31 00:01:45 +00:00
Hal Finkel e1df90958d PPC QPX requires a 32-byte aligned stack
On systems which support the QPX vector instructions, the stack must be
32-byte aligned.

llvm-svn: 173993
2013-01-30 23:43:27 +00:00
Evan Cheng 9449ec956f Forgot the test case before.
llvm-svn: 173988
2013-01-30 22:57:00 +00:00
Hal Finkel efb305e54c Add definitions for the PPC a2q core marked as having QPX available
This is the first commit of a large series which will add support for the
QPX vector instruction set to the PowerPC backend. This instruction set is
used on the IBM Blue Gene/Q supercomputers.

llvm-svn: 173973
2013-01-30 21:17:42 +00:00
Manman Ren 81dcc62805 Linker: correctly link in dbg.declare
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate 
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}

With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28

Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.

llvm-svn: 173946
2013-01-30 17:42:15 +00:00
Eli Bendersky 2e2ce49e59 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien

llvm-svn: 173943
2013-01-30 16:30:19 +00:00
Logan Chien a436e4c7e4 Add missing header and test cases for r173939.
llvm-svn: 173941
2013-01-30 15:48:50 +00:00
Nadav Rotem 513bd8a73c InstCombine: canonicalize sext-and --> select
sext-not-and --> select.

Patch by Muhammad Tauqir Ahmad.

llvm-svn: 173901
2013-01-30 06:35:22 +00:00
Saleem Abdulrasool 26127bd746 build: add --with-python option
This adds a new --with-python option to allow configuration of the python binary
for building.  If not specified, $PATH will be searched for common python binary
names (python, python2, python3).  If specified, and the path is not executable,
it will attempt to search $PATH.

Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
Reviewed-by: Eric Christopher <echristo@gmail.com>, Daniel Dunbar <daniel@zuster.org>
llvm-svn: 173890
2013-01-30 04:07:37 +00:00
Jack Carter 718da0b53b This patch implements runtime ARM specific
setting of ELF header e_flags.

Contributer: Jack Carter
 
llvm-svn: 173885
2013-01-30 02:24:33 +00:00
Jack Carter 7f378104b6 This patch implements runtime Mips specific
setting of ELF header e_flags.

Contributer: Jack Carter
 
llvm-svn: 173884
2013-01-30 02:16:36 +00:00
Jack Carter 1bd90ff6cc This patch reworks how llvm targets set
and update ELF header e_flags.

Currently gathering information such as symbol, 
section and data is done by collecting it in an 
MCAssembler object. From MCAssembler and MCAsmLayout 
objects ELFObjectWriter::WriteObject() forms and 
streams out the ELF object file.

This patch just adds a few members to the MCAssember 
class to store and access the e_flag settings. It 
allows for runtime additions to the e_flag by 
assembler directives. The standalone assembler can 
get to MCAssembler from getParser().getStreamer().getAssembler().

This patch is the generic infrastructure and will be
followed by patches for ARM and Mips for their target 
specific use.

Contributer: Jack Carter
 
llvm-svn: 173882
2013-01-30 02:09:52 +00:00
Akira Hatanaka 4385564e97 [mips] Test case for r173862.
Patch by Sasa Stankovic.

llvm-svn: 173863
2013-01-30 00:28:15 +00:00
Renato Golin 5e9d55eca0 Adding simple cast cost to ARM
Changing ARMBaseTargetMachine to return ARMTargetLowering intead of
the generic one (similar to x86 code).

Tests showing which instructions were added to cast when necessary
or cost zero when not. Downcast to 16 bits are not lowered in NEON,
so costs are not there yet.

llvm-svn: 173849
2013-01-29 23:31:38 +00:00
Michael J. Spencer 54b24e1000 [MC][COFF] Delay handling symbol aliases when writing
Fixes PR14447 and PR9034. Patch by Nico Rieck!

llvm-svn: 173839
2013-01-29 22:10:07 +00:00
Pekka Jaaskelainen f50ab84bb1 LoopVectorize: convert TinyTripCountVectorThreshold constant
to a command line switch.

llvm-svn: 173837
2013-01-29 21:42:08 +00:00
David Blaikie 9a7a7a9a6f Support artificial parameters in function types.
Provides the functionality for Clang change r172911 - I just had this still
lying around.

llvm-svn: 173820
2013-01-29 19:35:24 +00:00
Tim Northover a0edd3ee66 Fix 64-bit atomic operations in Thumb mode.
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.

llvm-svn: 173780
2013-01-29 09:06:13 +00:00
Craig Topper c048154b9b Merge SSE and AVX shuffle instructions in the comment printer.
llvm-svn: 173777
2013-01-29 07:54:31 +00:00
Bill Wendling f2955aa3f2 Convert getAttributes() to return an AttributeSetNode.
The AttributeSetNode contains all of the attributes. This removes one (hopefully
last) use of the Attribute class as a container of multiple attributes.

llvm-svn: 173761
2013-01-29 03:20:31 +00:00
Andrew Kaylor 6d8776a514 Add support for source and line information to IntelJITEventListener for object emitted by MCJIT.
llvm-svn: 173712
2013-01-28 19:52:37 +00:00
Bill Schmidt 2e4ae4e154 This patch addresses bug 15031.
The common code in the post-RA scheduler to break anti-dependencies on the
critical path contained a flaw.  In the reported case, an anti-dependency
between the overlapping registers %X4 and %R4 exists:

	%X29<def> = OR8 %X4, %X4
	%R4<def>, %X3<def,dead,tied3> = LBZU 1, %X3<kill,tied1>

The unpatched code breaks the dependency by replacing %R4 and its uses
with %R3, the first register on the available list.  However, %R3 and
%X3 overlap, so this creates two overlapping definitions on the same
instruction.

The fix is straightforward, preventing selection of a register that
overlaps any other defined register on the same instruction.

The test case is reduced from the bug report, and verifies that we no
longer produce "lbzu 3, 1(3)" when breaking this anti-dependency.

llvm-svn: 173706
2013-01-28 18:36:58 +00:00
Evgeniy Stepanov 6f85ef300d [msan] Mostly disable msan-handle-icmp-exact.
It is way too slow. Change the default option value to 0.
Always do exact shadow propagation for unsigned ICmp with constants, it is
cheap (under 1% cpu time) and required for correctness.

llvm-svn: 173682
2013-01-28 11:42:28 +00:00
Craig Topper 5c683972bc Fix 256-bit PALIGNR comment decoding to understand that it works on independent 256-bit lanes.
llvm-svn: 173674
2013-01-28 07:41:18 +00:00
Richard Osborne 038d24f90c [XCore] Add missing l2rus instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173634
2013-01-27 22:28:30 +00:00
Richard Osborne f2ecd40929 [XCore] Add missing l2r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173629
2013-01-27 21:26:02 +00:00
Richard Osborne 7fe8f63544 [XCore] Add missing 1r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173624
2013-01-27 20:46:21 +00:00
Richard Osborne 8f56317287 [XCore] Add missing 0r instructions.
These instructions are not targeted by the compiler but they are
needed for the MC layer.

llvm-svn: 173623
2013-01-27 20:42:57 +00:00
Benjamin Kramer 05cc93964a When the legalizer is splitting vector shifts, the result may not have the right shift amount type.
Fix that by adding a cast to the shift expander. This came up with vector shifts
on sse-less X86 CPUs.

   <2 x i64>       = shl <2 x i64> <2 x i64>
-> i64,i64         = shl i64 i64; shl i64 i64
-> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64

Now we cast the last two i64s to the right type. Fixes the crash in PR14668.

llvm-svn: 173615
2013-01-27 11:19:11 +00:00
Chandler Carruth 329b590e6e Re-revert r173342, without losing the compile time improvements, flat
out bug fixes, or functionality preserving refactorings.

llvm-svn: 173610
2013-01-27 06:42:03 +00:00
David Blaikie 9f4b70dde0 PR14566: Debug Info: Removing top level lexical blocks
This adds support for LLVM to accept metadata that doesn't include a top level
lexical block in a function. Specifically LLVM couldn't handle this when there
were file changes relating to these blocks. I've updated a few test cases to
ensure other functionality (such as inlining) isn't affected by this change, but
haven't pervasively updated all the test cases.

llvm-svn: 173592
2013-01-26 21:55:23 +00:00
Benjamin Kramer 6a93596538 X86: Decode PALIGN operands so I don't have to do it in my head.
llvm-svn: 173572
2013-01-26 13:31:37 +00:00
Benjamin Kramer 99c68dd964 X86: Do splat promotion later, so the optimizer can chew on it first.
This catches many cases where we can emit a more efficient shuffle for a
specific mask or when the mask contains undefs. Once the splat is lowered to
unpacks we can't do that anymore.

There is a possibility of moving the promotion after pshufb matching, but I'm
not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so
I avoided that for now.

llvm-svn: 173569
2013-01-26 11:44:21 +00:00
Benjamin Kramer 7268a05178 FileCheckize and merge some tests.
llvm-svn: 173568
2013-01-26 11:14:32 +00:00
Andrew Kaylor 9a8ff813f3 Add DIContext::getLineInfoForAddressRange() function and test. This function allows a caller to obtain a table of line information for a function using the function's address and size.
llvm-svn: 173537
2013-01-26 00:28:05 +00:00
NAKAMURA Takumi 8653bcf024 llvm/test/CMakeLists.txt: Add a dependency to llvm-rtdyld in check-llvm.
llvm-svn: 173528
2013-01-25 23:24:07 +00:00
Hal Finkel 4e5ca9e578 Initial implementation of PPCTargetTransformInfo
This provides a place to add customized operation cost information and
control some other target-specific IR-level transformations.

The only non-trivial logic in this checkin assigns a higher cost to
unaligned loads and stores (covered by the included test case).

llvm-svn: 173520
2013-01-25 23:05:59 +00:00
Andrew Kaylor d55d7019fc Add support for applying in-memory relocations to the .debug_line section and, in the case of ELF files, using symbol addresses when available for relocations to the .debug_info section. Also extending the llvm-rtdyld tool to add the ability to dump line number information for testing purposes.
llvm-svn: 173517
2013-01-25 22:50:58 +00:00
Reid Kleckner 1aa3784960 XFAIL close-stderr on win32
The test runner does not rewrite instances of /dev/null inside the
quoted sh command.  /dev/null does not exist, so opt will fail to open
it, and return a non-zero exit code.

llvm-svn: 173509
2013-01-25 22:12:54 +00:00
Reid Kleckner 0198e00318 Set the +x bit on two batch scripts
Cygwin git-svn will faithfully forward the svn properties all the way
down to the NTFS executable permission.  Without the +x bit, tests using
these scripts fail with "Access Denied".

llvm-svn: 173508
2013-01-25 22:12:50 +00:00
Reid Kleckner ab083f727b FileCheck-ify some grep tests
These tests in particular try to use escaped square brackets as an
argument to grep, which is failing for me with native win32 python.  It
appears the backslash is being lost near the CreateProcess*() call.

llvm-svn: 173506
2013-01-25 22:11:46 +00:00
Eli Bendersky 597fc1233a In this patch, we teach X86_64TargetMachine that it has a ILP32
(defined by the x32 ABI) mode, in which case its pointers are 32-bits
in size. This knowledge is also added to X86RegisterInfo that now
returns the appropriate registers in getPointerRegClass.

There are many outcomes to this change. In order to keep the patches
separate and manageable, we start by focusing on some simple testable
cases. The patch adds a test with passing a pointer to a function -
focusing on the difference between the two data models for x86-64.
Another test is added for handling of 'sret' arguments (and
functionality is added in X86ISelLowering to make it work).

A note on naming: the "x32 ABI" document refers to the AMD64
architecture (in LLVM it's distinguished by being is64Bits() in the
x86 subtarget) with two variations: the LP64 (default) data model, and
the ILP32 data model. This patch adds predicates to the subtarget
which are consistent with this naming scheme.

llvm-svn: 173503
2013-01-25 22:07:43 +00:00
Eli Bendersky 158ea095c0 Add back a RUN line removed by mistake by a previous commit
llvm-svn: 173502
2013-01-25 21:58:09 +00:00
Richard Osborne 6b86eec819 Add instruction encodings / disassembly support for l4r instructions.
llvm-svn: 173501
2013-01-25 21:55:32 +00:00
Eli Bendersky e6abe83258 Now that llvm-dwarfdump supports flags to specify which DWARF section to dump,
use them in tests that run llvm-dwarfdump. This is in order to make tests as
specific as possible.

llvm-svn: 173498
2013-01-25 21:44:53 +00:00
Hal Finkel 1a57ba57a2 Improve the !add TableGen test case.
Suggested by Sean Silva.

llvm-svn: 173481
2013-01-25 20:29:25 +00:00
Eli Bendersky 7a94daa170 Add command-line flags for DWARF dumping.
Flags for dumping specific DWARF sections added in lib/DebugInfo and
llvm-dwarfdump.

llvm-svn: 173480
2013-01-25 20:26:43 +00:00
Richard Osborne a19fa86a70 Add instruction encodings / disassembly support for l5r instructions.
llvm-svn: 173479
2013-01-25 20:20:07 +00:00
Evgeniy Stepanov fac8403249 [msan] Implement exact shadow propagation for relational ICmp.
Only for integers, pointers, and vectors of those. No floats.
Instrumentation seems very heavy, and may need to be replaced
with some approximation in the future.

llvm-svn: 173452
2013-01-25 15:31:10 +00:00
Hal Finkel c7d4dc13a4 Add an addition operator to TableGen
This adds an !add(a, b) operator to tablegen; this will be used
to cleanup the PPC register definitions.

llvm-svn: 173445
2013-01-25 14:49:08 +00:00
Silviu Baranga 3eb45a03af Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation.
llvm-svn: 173437
2013-01-25 10:39:49 +00:00
Andrew Trick e2c3f5c982 MIsched: Improve the interface to SchedDFS analysis (subtrees).
Allow the strategy to select SchedDFS. Allow the results of SchedDFS
to affect initialization of the scheduler state.

llvm-svn: 173425
2013-01-25 06:33:57 +00:00
Chandler Carruth ceff222dea Switch this code away from Value::isUsedInBasicBlock. That code either
loops over instructions in the basic block or the use-def list of the
value, neither of which are really efficient when repeatedly querying
about values in the same basic block.

What's more, we already know that the CondBB is small, and so we can do
a much more efficient test by counting the uses in CondBB, and seeing if
those account for all of the uses.

Finally, we shouldn't blanket fail on any such instruction, instead we
should conservatively assume that those instructions are part of the
cost.

Note that this actually fixes a bug in the pass because
isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my
next commit, but the fix for it would make this code suddenly take the
compile time hit I thought it already was taking, so I wanted to go
ahead and migrate this code to a faster & better pattern.

The bug in isUsedInBasicBlock was also causing other tests to test the
wrong thing entirely: for example we weren't actually disabling
speculation for floating point operations as intended (and tested), but
the test passed because we failed to speculate them due to the
isUsedInBasicBlock failure.

llvm-svn: 173417
2013-01-25 05:40:09 +00:00
Andrew Trick 44f750a3e5 MISched: Add SchedDFSResult to ScheduleDAGMI to formalize the
interface and allow other strategies to select it.

llvm-svn: 173413
2013-01-25 04:01:04 +00:00
Jack Carter 07c818d2da This patch implements parsing the .word
directive for the Mips assembler.

Contributer: Vladimir Medic
 
llvm-svn: 173407
2013-01-25 01:31:34 +00:00
Akira Hatanaka 28aed9ca85 [mips] Set flag neverHasSideEffects flag on some of the floating point instructions.
llvm-svn: 173401
2013-01-25 00:20:39 +00:00
Benjamin Kramer 1c4e323fdd Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed.
Original commit message:
Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173357
2013-01-24 16:44:25 +00:00
Benjamin Kramer 435eba09b7 ConstantFolding: Add a missing folding that leads to a miscompile.
We use constant folding to see if an intrinsic evaluates to the same value as a
constant that we know. If we don't take the undefinedness into account we get a
value that doesn't match the actual implementation, and miscompiled code.

This was uncovered by Chandler's simplifycfg changes.

llvm-svn: 173356
2013-01-24 16:28:28 +00:00
Chandler Carruth 321c6a7c50 Revert r173342 temporarily. It appears to cause a very late miscompile
of stage2 in a bootstrap. Still investigating....

llvm-svn: 173343
2013-01-24 13:24:24 +00:00
Chandler Carruth 5f4519309f Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173342
2013-01-24 12:39:29 +00:00
Chandler Carruth 01bffaad03 Address a large chunk of this FIXME by accumulating the cost for
unfolded constant expressions rather than checking each one
independently.

llvm-svn: 173341
2013-01-24 12:05:17 +00:00
Chandler Carruth 8a21005cca Switch the constant expression speculation cost evaluation away from
a cost fuction that seems both a bit ad-hoc and also poorly suited to
evaluating constant expressions.

Notably, it is missing any support for trivial expressions such as
'inttoptr'. I could fix this routine, but it isn't clear to me all of
the constraints its other users are operating under.

The core protection that seems relevant here is avoiding the formation
of a select instruction wich a further chain of select operations in
a constant expression operand. Just explicitly encode that constraint.

Also, update the comments and organization here to make it clear where
this needs to go -- this should be driven off of real cost measurements
which take into account the number of constants expressions and the
depth of the constant expression tree.

llvm-svn: 173340
2013-01-24 11:53:01 +00:00
Kostya Serebryany 87191f6221 [asan] adaptive redzones for globals (the larger the global the larger is the redzone)
llvm-svn: 173335
2013-01-24 10:35:40 +00:00
Reed Kotler a2d76bce1f The next phase of Mips16 hard float implementation.
Allow Mips16 routines to call Mips32 routines that have abi requirements
that either arguments or return values are passed in floating point 
registers. This handles only the pic case. We have not done non pic
for Mips16 yet in any form.

The libm functions are Mips32, so with this addition we have a complete
Mips16 hard float implementation.

We still are not able to complete mix Mip16 and Mips32 with hard float.
That will be the next phase which will have several steps. For Mips32
to freely call Mips16 some stub functions must be created.

llvm-svn: 173320
2013-01-24 04:24:02 +00:00
Benjamin Kramer d9c3dabbba ConstantFolding: Evaluate GEP indices in the index type.
This fixes some edge cases that we would get wrong with uint64_ts.
PR14986.

llvm-svn: 173289
2013-01-23 20:41:05 +00:00
Richard Osborne 54e311821f Add instruction encodings / disassembly support for l6r instructions.
llvm-svn: 173288
2013-01-23 20:08:11 +00:00
Benjamin Kramer e4c46fec73 Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone."
This causes crashes during the build of compiler-rt during selfhost. Add a
testcase for coverage.

llvm-svn: 173279
2013-01-23 17:52:29 +00:00
Bill Wendling 7c8f96a91b Add the heuristic to differentiate SSPStrong from SSPRequired.
The requirements of the strong heuristic are:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)

llvm-svn: 173231
2013-01-23 06:43:53 +00:00
Bill Wendling d154e283f2 Add the IR attribute 'sspstrong'.
SSPStrong applies a heuristic to insert stack protectors in these situations:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)

This patch implements the SSPString attribute to be equivalent to
SSPRequired. This will change in a subsequent patch.

llvm-svn: 173230
2013-01-23 06:41:41 +00:00
Nadav Rotem ab3e698ee9 Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards.
For example, this is the hot loop in BZIP:

  do {
    m = *--p;
    *p = ( ... );
  } while (--n);

llvm-svn: 173219
2013-01-23 01:35:00 +00:00
Richard Osborne 1a06479f46 Add instruction encodings / disassembly support for u10 / lu10 instructions.
llvm-svn: 173204
2013-01-22 22:55:04 +00:00
Michael Liao 3dffc5e2b7 Fix an issue of pseudo atomic instruction DAG schedule
- Add list of physical registers clobbered in pseudo atomic insts
  Physical registers are clobbered when pseudo atomic instructions are
  expanded. Add them in clobber list to prevent DAG scheduler to
  mis-schedule them after these insns are declared side-effect free.
- Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 173200
2013-01-22 21:47:38 +00:00
Kevin Enderby 81c944cadb Add a warning when there is a macro defintion that has named parameters but
the body does not use them and it appears the body has positional parameters.

This can cause unexpected results as in the added test case.  As the darwin
version of gas(1) which only supported positional parameters, happened to
ignore the named parameters.  Now that we want to support both styles of
macros we issue a warning in this specific case.

rdar://12861644

llvm-svn: 173199
2013-01-22 21:44:53 +00:00
Akira Hatanaka 88c0ec826c [mips] Implement MipsRegisterInfo::getRegPressureLimit.
llvm-svn: 173197
2013-01-22 21:34:25 +00:00
Kevin Enderby 0017d8a469 Have the integrated assembler give an error if $1 is used as an identifier in
an expression.  Currently this bug causes the line to be ignored in a
release build and an assert in a debug build.

rdar://13062484

llvm-svn: 173195
2013-01-22 21:09:20 +00:00
Eli Bendersky 583860339b Add forgotten test case for the x32 commit
llvm-svn: 173181
2013-01-22 18:52:39 +00:00
Benjamin Kramer fee7d21ae7 X86: Make sure we account for the FMA4 register immediate value, otherwise rip-rel relocations will be off by one byte.
PR15040.

llvm-svn: 173176
2013-01-22 18:05:59 +00:00
Dmitri Gribenko 44ee2e7a23 Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 173163
2013-01-22 14:39:21 +00:00
Evgeniy Stepanov c4415591ed [msan] Do not insert check on volatile store.
Volatile bitfields can cause valid stores of uninitialized bits.

llvm-svn: 173153
2013-01-22 12:30:52 +00:00
Michael Gottesman 469a465fa8 This test is only supposed to test that the objc-arc alias analysis
allows for gvn to perform certain optimizations. Thus the runline should
only contain -objc-arc-aa, not the full -objc-arc.

llvm-svn: 173126
2013-01-22 04:41:11 +00:00
Daniel Dunbar 34ea79f9a3 [MC/Mach-O] Load commands are supposed to 8-byte aligned on 64-bit.
llvm-svn: 173120
2013-01-22 03:42:49 +00:00
Andrew Trick 2d35fabc7d Remove target triple from an LSR test.
Manish already fixed this test to work with NoTTI.

llvm-svn: 173110
2013-01-22 00:57:16 +00:00
Paul Redmond 9d86a4a3b6 Transform (sub 0, (zext bool to A)) to (sext bool to A) and
(sub 0, (sext bool to A)) to (zext bool to A).

Patch by Muhammad Ahmad
Reviewed by Duncan Sands

llvm-svn: 173093
2013-01-21 21:57:20 +00:00
Richard Osborne 9d3ec06ef8 Add instruction encodings / disassembly support for u6 / lu6 instructions.
llvm-svn: 173086
2013-01-21 20:44:17 +00:00
Richard Osborne 6e58c6d86d Add instruction encoding / disassembly support for ru6 / lru6 instructions.
llvm-svn: 173085
2013-01-21 20:42:16 +00:00
Richard Osborne 4e69724869 Add instruction encodings / disassembly support for l2rus instructions.
llvm-svn: 172987
2013-01-20 18:51:15 +00:00
Richard Osborne 9fbf57b26c Add instruction encodings / disassembly support for l3r instructions.
llvm-svn: 172986
2013-01-20 18:37:49 +00:00
Richard Osborne f063fcee7a Add instruction encodings / disassembler support for 2rus instructions.
llvm-svn: 172985
2013-01-20 17:22:43 +00:00
Richard Osborne 3fb7395233 Add instruction encodings / disassembly support 3r instructions.
It is not possible to distinguish 3r instructions from 2r / rus instructions
using only the fixed bits. Therefore if an instruction doesn't match the
2r / rus format try to decode it as a 3r instruction before returning Fail.

llvm-svn: 172984
2013-01-20 17:18:47 +00:00
NAKAMURA Takumi 9439237063 llvm/test/CodeGen/X86/win_ftol2.ll: Add -cpu=generic to appease valgrind.
On valgrind the processor is reported;
  Host CPU: athlon-fx

llvm-svn: 172983
2013-01-20 15:40:02 +00:00
Nadav Rotem 9450fcfff1 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Nadav Rotem c42f90b1f4 LoopVectorizer: Implement a new heuristics for selecting the unroll factor.
We ignore the cpu frontend and focus on pipeline utilization. We do this because we
don't have a good way to estimate the loop body size at the IR level.

llvm-svn: 172964
2013-01-20 05:24:29 +00:00
Nadav Rotem 2169dbed2c Change the cpu type in the test.
llvm-svn: 172963
2013-01-20 05:20:56 +00:00
NAKAMURA Takumi 619ca0dc40 llvm/test/Other/close-stderr.ll: Mark this as XFAIL:valgrind. We got 127 instead of 1 here.
llvm-svn: 172956
2013-01-20 03:35:39 +00:00
David Blaikie a39a76efbc The last of PR14471 - emission of constant floats
llvm-svn: 172941
2013-01-20 01:18:01 +00:00
David Blaikie b085931026 Fix a latent bug exposed by recent static member debug info changes.
We weren't encoding boolean constants correctly due to modeling boolean as a
signed type & then sign extending an i1 up to a byte & getting 255.

llvm-svn: 172926
2013-01-19 23:00:25 +00:00
Benjamin Kramer d455ed85d1 LoopVectorizer: Emit memory checks into their own basic block.
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.

Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.

llvm-svn: 172902
2013-01-19 13:57:58 +00:00
Nadav Rotem 7b3120b9ae On Sandybridge split unaligned 256bit stores into two xmm-sized stores.
llvm-svn: 172894
2013-01-19 08:38:41 +00:00
Jakob Stoklund Olesen ac6cfa41d6 Remove some register allocation order dependencies.
llvm-svn: 172874
2013-01-19 00:03:32 +00:00
Nadav Rotem 7431211214 On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction.
llvm-svn: 172868
2013-01-18 23:10:30 +00:00
Eric Christopher e9ec2458e7 Split out DW_OP_addr for the split debug info DWARF5 proposal.
llvm-svn: 172857
2013-01-18 22:11:33 +00:00
Jack Carter c1b17ed2e1 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Support for Mips register information sections.

Mips ELF object files have a section that is dedicated
to register use info. Some of this information such as
the assumed Global Pointer value is used by the linker
in relocation resolution.

The register info file is .reginfo in o32 and .MIPS.options
in 64 and n32 abi files.

This patch contains the changes needed to create the sections,
but leaves the actual register accounting for a future patch.


Contributer: Jack Carter
 
llvm-svn: 172847
2013-01-18 21:20:38 +00:00
Jack Carter 86c2c564ff This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

Removal of redundant code and formatting fixes.

Contributers: Jack Carter/Vladimir Medic
 
llvm-svn: 172842
2013-01-18 20:15:06 +00:00
Daniel Dunbar 9585612876 [MC/Mach-O] Implement integrated assembler support for linker options.
- Also, fixup syntax errors in LangRef and missing newline in the MCAsmStreamer.

llvm-svn: 172837
2013-01-18 19:37:00 +00:00
NAKAMURA Takumi b72e763325 llvm/test/CodeGen/X86/Atomics-64.ll: Tweak for 2nd RUN not to overwrite %t. It sometimes causes spurious failure on lit win32.
Feel free to prune or suppress each output.

llvm-svn: 172823
2013-01-18 14:52:02 +00:00
Daniel Dunbar eec0f32eea [MC/Mach-O] Add support for linker options in Mach-O files.
llvm-svn: 172779
2013-01-18 01:26:07 +00:00
Daniel Dunbar 16004b8324 [MC/Mach-O] Add AsmParser support for .linker_option directive.
llvm-svn: 172778
2013-01-18 01:25:48 +00:00
Bill Wendling da29e00578 Reverting r171325 & r172363. This was causing a mis-compile on the self-hosted LTO build bots.
Okay, here's how to reproduce the problem:

1) Build a Release (or Release+Asserts) version of clang in the normal way.

2) Using the clang & clang++ binaries from (1), build a Release (or
   Release+Asserts) version of the same sources, but this time enable LTO ---
   specify the `-flto' flag on the command line.

3) Run the ARC migrator tests:

    $ arcmt-test --args -triple x86_64-apple-darwin10 -fsyntax-only -x objective-c++ ./src/tools/clang/test/ARCMT/cxx-rewrite.mm

You'll see that the output isn't correct (the whitespace is off).

The mis-compile is in the function `RewriteBuffer::RemoveText' in the
clang/lib/Rewrite/Core/Rewriter.cpp file. When that function and RewriteRope.cpp
are compiled with LTO and the `arcmt-test' executable is regenerated, you'll see
the error. When those files are not LTO'ed, then the output of the `arcmt-test'
is fine.

It is *really* hard to get a testcase out of this. I'll file a PR with what I
have currently.

--- Reverse-merging r172363 into '.':
U    include/llvm/Analysis/MemoryBuiltins.h
U    lib/Analysis/MemoryBuiltins.cpp

--- Reverse-merging r171325 into '.':
U    test/Transforms/InstCombine/objsize.ll
G    include/llvm/Analysis/MemoryBuiltins.h
G    lib/Analysis/MemoryBuiltins.cpp

llvm-svn: 172756
2013-01-17 21:28:46 +00:00
Bill Schmidt 94b8cdbf55 Restore reverted test case, this time with REQUIRES: asserts
llvm-svn: 172747
2013-01-17 19:46:51 +00:00
Bill Schmidt b400204fd8 Remove bad test case
llvm-svn: 172746
2013-01-17 19:39:36 +00:00
Bill Schmidt dee1ef8f53 This patch fixes PR13626 by providing i128 support in the return
calling convention.  128-bit integers are now properly returned
in GPR3 and GPR4 on PowerPC.

llvm-svn: 172745
2013-01-17 19:34:57 +00:00
Jyotsna Verma 9b60c1d171 Add indexed load/store instructions for offset validation check.
This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902

llvm-svn: 172737
2013-01-17 18:42:37 +00:00
Bill Schmidt 6b2940b01e This patch fixes the PPC calling convention to handle returns of
_Complex float and _Complex long double, by simply increasing the
number of floating point registers available for return values.

The test case verifies that the correct registers are loaded.

llvm-svn: 172733
2013-01-17 17:45:19 +00:00
Elena Demikhovsky f6a30e05d5 Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Eric Christopher 4c7765f166 Fix the assembly and dissassembly of DW_FORM_sec_offset. Found this by
changing both the string of the dwo_name to be correct and the type of
the statement list.

Testcases all around.

llvm-svn: 172699
2013-01-17 03:00:04 +00:00
Eric Christopher 1826617133 Add the DW_AT_GNU_addr_base for the skeleton cu. Add support for
emitting the dwarf32 version of DW_FORM_sec_offset and correct
disassembler support.

llvm-svn: 172698
2013-01-17 02:59:59 +00:00
Jack Carter 2a74a87b71 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172685
2013-01-17 00:28:20 +00:00
Daniel Dunbar d77d9fb04d [IR] Add 'Append' and 'AppendUnique' module flag behaviors.
llvm-svn: 172659
2013-01-16 21:38:56 +00:00
Michael Gottesman 00dfc68c2d Added test for r172599 which fixes bugzilla://14584,rdar://11744105.
llvm-svn: 172656
2013-01-16 21:07:18 +00:00
Eric Christopher 69fc38f02f Make this test X86 only.
llvm-svn: 172652
2013-01-16 20:31:35 +00:00
Eric Christopher 45008a5688 Move this to X86.
llvm-svn: 172651
2013-01-16 20:31:32 +00:00
Eric Christopher ce26df829f Add testcase missed yesterday from Paul Robinson.
llvm-svn: 172646
2013-01-16 19:53:47 +00:00
Daniel Dunbar 0ec72bbc4d [Linker] Change module flag linking to be more extensible.
- Instead of computing a bunch of buckets of different flag types, just do an
   incremental link resolving conflicts as they arise.

 - This also has the advantage of making the link result deterministic and not
   dependent on map iteration order.

llvm-svn: 172634
2013-01-16 18:39:23 +00:00
Kevin Enderby e82ada6983 We want the dwarf AT_producer for assembly source files to match clang's
AT_producer.  Which includes clang's version information so we can tell
which version of the compiler was used.

This is the first of two steps to allow us to do that.  This is the llvm-mc
change to provide a method to set the AT_producer string.  The second step,
coming soon to a clang near you, will have the clang driver pass the value
of getClangFullVersion() via an flag when invoking the integrated assembler
on assembly source files.

rdar://12955296

llvm-svn: 172630
2013-01-16 17:46:23 +00:00
Peter Collingbourne a51c6ed608 Introduce llvm::sys::getProcessTriple() function.
In r143502, we renamed getHostTriple() to getDefaultTargetTriple()
as part of work to allow the user to supply a different default
target triple at configure time.  This change also affected the JIT.
However, it is inappropriate to use the default target triple in the
JIT in most circumstances because this will not necessarily match
the current architecture used by the process, leading to illegal
instruction and other such errors at run time.

Introduce the getProcessTriple() function for use in the JIT and
its clients, and cause the JIT to use it.  On architectures with a
single bitness, the host and process triples are identical.  On other
architectures, the host triple represents the architecture of the
host CPU, while the process triple represents the architecture used
by the host CPU to interpret machine code within the current process.
For example, when executing 32-bit code on a 64-bit Linux machine,
the host triple may be 'x86_64-unknown-linux-gnu', while the process
triple may be 'i386-unknown-linux-gnu'.

This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple
platforms.

Differential Revision: http://llvm-reviews.chandlerc.com/D254

llvm-svn: 172627
2013-01-16 17:27:22 +00:00
Benjamin Kramer b7050f0a7c Move test that depends on the x86 target into a target-specific directory.
Should fix the arm buildbot (which only builds the arm target).

llvm-svn: 172611
2013-01-16 13:25:56 +00:00
Alexey Samsonov 1345d35e40 ASan: wrap mapping scale and offset in a struct and make it a member of ASan passes. Add test for non-default mapping scale and offset. No functionality change
llvm-svn: 172610
2013-01-16 13:23:28 +00:00
Benjamin Kramer 1f25d24a8f Remove triple from this test, it makes it fail when X86 TTI is missing.
Without a triple opt falls back to NoTTI which comes closer to LSR's pre-TTI behavior.

llvm-svn: 172609
2013-01-16 13:19:59 +00:00
Jack Carter 5619f91bf7 reverting 172579
llvm-svn: 172594
2013-01-16 01:29:10 +00:00
Jack Carter e0c1e1a47e Akira,
Hope you are feeling better.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172579
2013-01-16 00:07:45 +00:00
Eric Christopher 962c9089d9 Split address information for DWARF5 split dwarf proposal. This involves
using the DW_FORM_GNU_addr_index and a separate .debug_addr section which
stays in the executable and is fully linked.

Sneak in two other small changes:

a) Print out the debug_str_offsets.dwo section.
b) Change form we're expecting the entries in the debug_str_offsets.dwo
   section to take from ULEB128 to U32.

Add tests for all of this in the fission-cu.ll test.

llvm-svn: 172578
2013-01-15 23:56:56 +00:00
Nadav Rotem 7df850924d Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero.
llvm-svn: 172576
2013-01-15 23:43:14 +00:00
Shuxin Yang e822745202 1. Hoist minus sign as high as possible in an attempt to reveal
some optimization opportunities (in the enclosing supper-expressions).

   rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "-0.0 - X" has only one reference.

   rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "0.0 - X" has only one reference, and
        the instruction is marked "noSignedZero".

2. Eliminate negation (The compiler was already able to handle these
    opt if the 0.0s are replaced with -0.0.)

   rule 3: (0.0 - X) * (0.0 - Y) => X * Y
   rule 4: (0.0 - X) * C => X * -C
   if the expr is flagged "noSignedZero".

3. 
  Rule 5: (X*Y) * X => (X*X) * Y
   if X!=Y and the expression is flagged with "UnsafeAlgebra".

   The purpose of this transformation is two-fold:
    a) to form a power expression (of X).
    b) potentially shorten the critical path: After transformation, the
       latency of the instruction Y is amortized by the expression of X*X,
       and therefore Y is in a "less critical" position compared to what it
      was before the transformation. 

4. Remove the InstCombine code about simplifiying "X * select".
   
   The reasons are following:
    a) The "select" is somewhat architecture-dependent, therefore the
       higher level optimizers are not able to precisely predict if
       the simplification really yields any performance improvement
       or not.

    b) The "select" operator is bit complicate, and tends to obscure
       optimization opportunities. It is btter to keep it as low as
       possible in expr tree, and let CodeGen to tackle the optimization.

llvm-svn: 172551
2013-01-15 21:09:32 +00:00
Daniel Dunbar c36547d422 [IR] Add verification for module flags with the "require" behavior.
llvm-svn: 172549
2013-01-15 20:52:06 +00:00
Evgeniy Stepanov 701d2b861e [msan] Temporarily remove ICmpEQ tests.
They are failing on the bots.

llvm-svn: 172540
2013-01-15 17:12:04 +00:00
Evgeniy Stepanov d14e47b146 [msan] Fix handling of equality comparison of pointer vectors.
Also improve test coveration of the handling of relational comparisons.

llvm-svn: 172539
2013-01-15 16:44:52 +00:00
Renato Golin 51c25b0818 Pattern-matched variables in post-inc-icmpzero.ll
Test was failing for clang-native-arm-cortex-a9 build-bot configuration.
The reason for the failure was the test was using hardcoded names.
The attached patch fixes this failure by replacing the hard-coded variables
names with pattern-matched variable names.

Patch by Manish Verma, ARM

llvm-svn: 172534
2013-01-15 15:22:45 +00:00
Daniel Dunbar 25c4b5718b [IR] Add verifier support for llvm.module.flags.
- Also, update the LangRef documentation on module flags to match the
   implementation.

llvm-svn: 172498
2013-01-15 01:22:53 +00:00
Jack Carter f238510c43 This patch fixes a Mips specific bug where
we need to generate a N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.

The bug was exposed by the SingleSourcetest case 
DuffsDevice.c.

Contributer: Jack Carter
llvm-svn: 172496
2013-01-15 01:08:02 +00:00
Shuxin Yang 320f52a4b0 This change is to implement following rules under the condition C_A and/or C_R
---------------------------------------------------------------------------
 C_A: reassociation is allowed
 C_R: reciprocal of a constant C is appropriate, which means 
    - 1/C is exact, or 
    - reciprocal is allowed and 1/C is neither a special value nor a denormal.
 -----------------------------------------------------------------------------

 rule1:  (X/C1) / C2 => X / (C2*C1)  (if C_A)
                     => X * (1/(C2*C1))  (if C_A && C_R)
 rule 2:  X*C1 / C2 => X * (C1/C2)  if C_A
 rule 3: (X/Y)/Z = > X/(Y*Z)  (if C_A && at least one of Y and Z is symbolic value)
 rule 4: Z/(X/Y) = > (Z*Y)/X  (similar to rule3)

 rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
 rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
 rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)

llvm-svn: 172488
2013-01-14 22:48:41 +00:00
Chad Rosier 5c118fd2ec [ms-inline asm] Extend support for parsing Intel bracketed memory operands that
have an arbitrary ordering of the base register, index register and displacement.
rdar://12527141

llvm-svn: 172484
2013-01-14 22:31:35 +00:00
Bill Schmidt d006c6938b This patch addresses an incorrect transformation in the DAG combiner.
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form.  The combiner first sees a 64-bit load that can be converted into a
pre-increment form.  The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword.  This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.

However, this transformation is not valid, as the replacement load is not
a pre-increment load.  The pre-increment load produces an extra result,
which feeds a subsequent add instruction.  The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add.  Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.

So the patch simply disables this transformation for any load with more than
two result values.

llvm-svn: 172480
2013-01-14 22:04:38 +00:00
Andrew Trick d4e1b5e291 SCEVExpander fix. RAUW needs to update the InsertedExpressions cache.
Note that this bug is only exposed because LTO fails to use TTI.

Fixes self-LTO of clang. rdar://13007381.

llvm-svn: 172462
2013-01-14 21:00:37 +00:00
Michael Gottesman c99ee6b336 Added bugzilla PR number to test case.
llvm-svn: 172369
2013-01-13 22:17:22 +00:00
Michael Gottesman f15c0bb495 Fixed an infinite loop in the block escape in analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other.
A test case is provided as well.

llvm-svn: 172368
2013-01-13 22:12:06 +00:00
Benjamin Kramer bcd14a0f26 X86: Add patterns for X86ISD::VSEXT in registers.
Those can occur when something between the sextload and the store is on the same
chain and blocks isel. Fixes PR14887.

llvm-svn: 172353
2013-01-13 11:37:04 +00:00
Nadav Rotem 40e45eeae2 Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16).
llvm-svn: 172348
2013-01-13 07:56:29 +00:00
Benjamin Kramer 5ea0349ef5 When lowering an inreg sext first shift left, then right arithmetically.
Shifting right two times will only yield zero. Should fix
SingleSource/UnitTests/SignlessTypes/factor.

llvm-svn: 172322
2013-01-12 19:06:44 +00:00
Michael Gottesman 556ff61122 Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV => objc_autorelease but were not updating the InstructionClass to IC_Autorelease.
llvm-svn: 172288
2013-01-12 01:25:19 +00:00
Michael Gottesman c9656faf1e Fixed a bug where we were tail calling objc_autorelease causing an object to not be placed into an autorelease pool.
The reason that this occurs is that tail calling objc_autorelease eventually
tail calls -[NSObject autorelease] which supports fast autorelease. This can
cause us to violate the semantic gaurantees of __autoreleasing variables that
assignment to an __autoreleasing variables always yields an object that is
placed into the innermost autorelease pool.

The fix included in this patch works by:

1. In the peephole optimization function OptimizeIndividualFunctions, always
remove tail call from objc_autorelease.
2. Whenever we convert to/from an objc_autorelease, set/unset the tail call
keyword as appropriate.

*NOTE* I also handled the case where objc_autorelease is converted in
OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I
will be removing that in a later patch and I wanted to make sure that the tree
is in a consistent state vis-a-vis ARC always.

Additionally some test cases are provided and all tests that have tail call marked
objc_autorelease keywords have been modified so that tail call has been removed.

*NOTE* One test fails due to a separate bug that I am going to commit soon. Thus
I marked the check line TMP: instead of CHECK: so make check does not fail.

llvm-svn: 172287
2013-01-12 01:25:15 +00:00
Jack Carter 873c724b4a This patch tackles the problem of parsing Mips
register names in the standalone assembler llvm-mc.

Registers such as $A1 can represent either a 32 or
64 bit register based on the instruction using it.
In addition, based on the abi, $T0 can represent different
32 bit registers.


The problem is resolved by the Mips specific AsmParser 
td definitions changing to work together. Many cases of
RegisterClass parameters are now RegisterOperand.


Contributer: Vladimir Medic
llvm-svn: 172284
2013-01-12 01:03:14 +00:00
Nadav Rotem dbe5c72d03 PPC: Implement efficient lowering of sign_extend_inreg.
llvm-svn: 172269
2013-01-11 22:57:48 +00:00
Preston Gurd 99c6990457 Update patch for the pad short functions pass for Intel Atom (only).
Adds a check for -Oz, changes the code to not re-visit BBs,
and skips over DBG_VALUE instrs.

Patch by Andy Zhang.

llvm-svn: 172258
2013-01-11 22:06:56 +00:00
Nadav Rotem e55aa3c848 ARM Cost Model: Modify the target independent cost model to ask
the target if it supports the different CAST types. We didn't do this
on X86 because of the different register sizes and types, but on ARM
this makes sense.

llvm-svn: 172245
2013-01-11 19:54:13 +00:00
Eric Christopher 0cb6fd930e For inline asm:
- recognize string "{memory}" in the MI generation
- mark as mayload/maystore when there's a memory clobber constraint.

PR14859.

Patch by Krzysztof Parzyszek

llvm-svn: 172228
2013-01-11 18:12:39 +00:00
Tim Northover 3a51aab390 Simplify writing floating types to assembly.
This removes previous special cases for each floating-point type in favour of a
shared codepath.

llvm-svn: 172189
2013-01-11 10:36:13 +00:00
Nadav Rotem 853fe0acb9 ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor.
We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use
another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out
of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector
zext/sext/trunc operations.

llvm-svn: 172178
2013-01-11 07:11:59 +00:00
Michael Gottesman 5284bd013f Converted test dont-tce-tail-marked-call.ll to use FileCheck.
llvm-svn: 172172
2013-01-11 04:16:35 +00:00
Michael Gottesman 93a0d49c7e This commit is a 4x squash commit consisting of 4x functions converted to use FileCheck instead of grep.
Messages:
Converted test case trivial_codegen_tailcall.ll to use FileCheck.
Converted test return_constant.ll to use FileCheck instead of grep.
Converted test reorder_load.ll to use FileCheck instead of grep.
Converted test intervening-inst.ll to use FileCheck instead of grep.

llvm-svn: 172171
2013-01-11 04:12:53 +00:00
Shuxin Yang c5c730b0e0 PR14904: Segmentation fault running pass 'Recognize loop idioms'
The root cause is mistakenly taking for granted that 
    "dyn_cast<Instruction>(a-Value)"
return a non-NULL instruction.

llvm-svn: 172145
2013-01-10 23:32:01 +00:00
Evan Cheng 098d7b76b0 CastInst::castIsValid should return true if the dest type is the same as
Value's current type. The casting is trivial even for aggregate type.

llvm-svn: 172143
2013-01-10 23:22:53 +00:00
NAKAMURA Takumi e46e8225f4 llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; Globals doesn't have leading underscore in symbol on linux.
llvm-svn: 172139
2013-01-10 23:02:48 +00:00
Michael J. Spencer d857c1c9bf [llvm-objdump] Emit addresses with the correct number of leading 0's.
llvm-svn: 172130
2013-01-10 22:40:50 +00:00
Peter Collingbourne f7d65c43d0 [msan] Change va_start/va_copy shadow memset alignment to 8.
This fixes va_start/va_copy of a va_list field which happens to not
be laid out at a 16-byte boundary.

Differential Revision: http://llvm-reviews.chandlerc.com/D276

llvm-svn: 172128
2013-01-10 22:36:33 +00:00
Evan Cheng c8444b159a PR14896: Handle memcpy from constant string where the memcpy size is larger than the string size.
llvm-svn: 172124
2013-01-10 22:13:27 +00:00
Chad Rosier a4bc9437a2 [ms-inline asm] Add support for calling functions from inline assembly.
Part of rdar://12991541

llvm-svn: 172121
2013-01-10 22:10:27 +00:00
Owen Anderson dbf0ca523d Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc.
llvm-svn: 172117
2013-01-10 22:06:52 +00:00
Nadav Rotem 6eae65cfac LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals.
PR14878

llvm-svn: 172079
2013-01-10 17:34:39 +00:00
Joey Gouly 5fad3e9ad6 Fix a copy/paste error in the IR Linker, casting an ArrayType instead of a VectorType.
llvm-svn: 172054
2013-01-10 10:49:36 +00:00
Joey Gouly 58bf951dec Fix TryToShrinkGlobalToBoolean in GlobalOpt, so that it does not discard address spaces.
llvm-svn: 172051
2013-01-10 10:31:11 +00:00
Manman Ren 207bcbacca Stack Alignment: throw error if we can't satisfy the minimal alignment
requirement when creating stack objects in MachineFrameInfo.

Add CreateStackObjectWithMinAlign to throw error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. Same is true for CreateVariableSizedObject.
Will not emit error in CreateSpillStackObject or CreateStackObject.

As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not have miscompile since
later optimizations which look at the object's alignment will have the correct
information.

rdar://12713765

llvm-svn: 172027
2013-01-10 01:10:10 +00:00
Nadav Rotem b1791a75cd ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor.
llvm-svn: 172010
2013-01-09 22:29:00 +00:00
Evan Cheng 5652a8df32 Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y).
It cahced XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR changed the XOR node.

rdar://12968664

llvm-svn: 171999
2013-01-09 20:56:40 +00:00
Benjamin Kramer 130fcde3e5 LICM: Hoist insertvalue/extractvalue out of loops.
Fixes PR14854.

llvm-svn: 171984
2013-01-09 18:12:03 +00:00
Adhemerval Zanella 1ae2248e14 PowerPC: EH adjustments
This patch adjust the r171506 to make all DWARF enconding pc-relative
for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT
(since the eh_frame will not generate PIC-relative relocation) and also
adds the emission of stubs created by the TTypeEncoding.

llvm-svn: 171979
2013-01-09 17:08:15 +00:00
Nadav Rotem 3f5825c6c1 add -march to the test
llvm-svn: 171956
2013-01-09 07:04:23 +00:00
Nadav Rotem 977e0be4a0 Efficient lowering of vector sdiv when the divisor is a splatted power of two constant.
PR 14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.

Patch by Zvi Rackover.

llvm-svn: 171953
2013-01-09 05:14:33 +00:00
Andrew Trick 9f0b95f260 MIsched: add an ILP window property to machine model.
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

llvm-svn: 171946
2013-01-09 03:36:49 +00:00
Nadav Rotem 4c66f87e8e ARM Cost Model: Add a basic vectorization unrolling test.
llvm-svn: 171931
2013-01-09 01:29:07 +00:00
Nadav Rotem 30a65bc39e Remove the -licm pass from the loop vectorizer test because the loop vectorizer does it now.
llvm-svn: 171930
2013-01-09 01:20:59 +00:00
Nadav Rotem b696c36fcd Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM.
llvm-svn: 171928
2013-01-09 01:15:42 +00:00
Shuxin Yang f0537ab681 Consider expression "0.0 - X" as the negation of X if
- this expression is explicitly marked no-signed-zero, or
  - no-signed-zero of this expression can be derived from some context.

llvm-svn: 171922
2013-01-09 00:13:41 +00:00
Tim Northover 90fb75d859 Specify complete triple for fp128 tests.
This avoids FileCheck failing over different comment characters in
assembly (notably powerpc64 on Linux vs Darwin) and should fix David's
build-bot.

llvm-svn: 171886
2013-01-08 19:36:33 +00:00
Jack Carter c3dd91c4d7 This patch produces the correct addend value for
an R_MIPS_GPREL16 relocation.


Contributer: Jack Carter
llvm-svn: 171882
2013-01-08 19:01:28 +00:00
Jack Carter 9e28cd3fad This patch produces the correct pointer size
value in the 64 bit .eh_frame section.

It doesn't however allow exception handling to work
yet since it depends on the correct relocation model
being set in the ELF header flags.


Contributer: Jack Carter
llvm-svn: 171881
2013-01-08 18:53:20 +00:00
Preston Gurd a01daace88 Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879
2013-01-08 18:27:24 +00:00
Tim Northover 7bb9992cce Allow the asm printer to print fp128 values properly.
llvm-svn: 171866
2013-01-08 16:56:23 +00:00
Bill Wendling 76c6521ba1 Make sure we don't emit instructions before a landingpad instruction.
PR14782

llvm-svn: 171846
2013-01-08 10:51:32 +00:00
Eric Christopher 95de6bd469 Add the C testcase to this file.
Suggested by Dave Blaikie.

llvm-svn: 171839
2013-01-08 03:03:14 +00:00
Eric Christopher 72a529566c Remove the llvm-local DW_TAG_vector_type tag and add a test to
make sure that vector types do work.

llvm-svn: 171833
2013-01-08 01:53:52 +00:00
David Blaikie 5c0b298b91 Mark artificial types as such in the annotated debug output.
llvm-svn: 171826
2013-01-08 00:31:02 +00:00
Nadav Rotem 5a197c06f3 LoopVectorizer: Add support for floating point reductions
llvm-svn: 171812
2013-01-07 23:13:00 +00:00
Eli Bendersky 3e76eb2dbc Add some additional tests for the .bundle_lock align_to_end feature that didn't
make into the last commit.

Also, update the test-generation script to generate an exhaustive test for
align_to_end as well, and include the generated test.

llvm-svn: 171811
2013-01-07 23:12:59 +00:00
Nadav Rotem c60d7d96f5 LoopVectorizer: When we vectorizer and widen loops we process many elements at once. This is a good thing, except for
small loops. On small loops post-loop that handles scalars (and runs slower) can take more time to execute than the
rest of the loop. This patch disables widening of loops with a small static trip count.

llvm-svn: 171798
2013-01-07 21:54:51 +00:00
Eli Bendersky 802b62871e Add the align_to_end option to .bundle_lock in the MC implementation of aligned
bundling. The document describing this feature and the implementation has also
been updated:

https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/aligned-bundling-support-in-llvm

llvm-svn: 171797
2013-01-07 21:51:08 +00:00
Shuxin Yang df0e61e793 This change is to implement following rules:
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
  o. X/C1 * C2 -> X/(C1/C2)   (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp)

     Let MDC denote multiplication or dividion with one & only one operand being a constant
  o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
     (so long as the constant-folding doesn't yield any denormal or special value)

llvm-svn: 171793
2013-01-07 21:39:23 +00:00
Eric Christopher 2cbd5767ad Add support for separating strings for the split debug info DWARF5
proposal. This leaves the strings in the skeleton die as strp,
but in all dwo files they're accessed now via DW_FORM_GNU_str_index.

Add support for dumping these sections and modify the fission-cu.ll
testcase to have the correct strings and form. Fix a small bug
in the fixed form sizes routine that involved out of array accesses
for the table and add a FIXME in the extractFast routine to fix
this up.

llvm-svn: 171779
2013-01-07 19:32:41 +00:00
Bill Schmidt 9b1e3e25dc This patch addresses bug 14678 by fixing two problems in medium code model
code generation.  Variables addressed through a GlobalAlias were not being
handled, and variables with available_externally linkage were treated
incorrectly.  The patch contains two new tests to verify the correct code
generation for these cases.

llvm-svn: 171778
2013-01-07 19:29:18 +00:00
Quentin Colombet 3b2db0bcd3 When code size is the priority (Oz, MinSize attribute), help llvm
turning a code like this:

if (foo)
   free(foo)

into that:
free(foo)

Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.

Some restrictions apply on P and FB to be sure that this code motion
is profitable. Namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
   branch to S.
3. P's successors are FB and S.

Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, so as FB
(simplifycfg will do the job).

llvm-svn: 171762
2013-01-07 18:37:41 +00:00
David Blaikie 8a9b6a3681 Make test/DebugInfo/member-pointers.ll portable by removing the TargetData
llvm-svn: 171759
2013-01-07 17:52:49 +00:00
Chandler Carruth 26c59fa870 Switch the SCEV expander and LoopStrengthReduce to use
TargetTransformInfo rather than TargetLowering, removing one of the
primary instances of the layering violation of Transforms depending
directly on Target.

This is a really big deal because LSR used to be a "special" pass that
could only be tested fully using llc and by looking at the full output
of it. It also couldn't run with any other loop passes because it had to
be created by the backend. No longer is this true. LSR is now just
a normal pass and we should probably lift the creation of LSR out of
lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done
this, or updated all of the tests to use opt and a triple, because
I suspect someone more familiar with LSR would do a better job. This
change should be essentially without functional impact for normal
compilations, and only change behvaior of targetless compilations.

The conversion required changing all of the LSR code to refer to the TTI
interfaces, which fortunately are very similar to TargetLowering's
interfaces. However, it also allowed us to *always* expect to have some
implementation around. I've pushed that simplification through the pass,
and leveraged it to simplify code somewhat. It required some test
updates for one of two things: either we used to skip some checks
altogether but now we get the default "no" answer for them, or we used
to have no information about the target and now we do have some.

I've also started the process of removing AddrMode, as the TTI interface
doesn't use it any longer. In some cases this simplifies code, and in
others it adds some complexity, but I think it's not a bad tradeoff even
there. Subsequent patches will try to clean this up even further and use
other (more appropriate) abstractions.

Yet again, almost all of the formatting changes brought to you by
clang-format. =]

llvm-svn: 171735
2013-01-07 14:41:08 +00:00
David Tweed 3f90937535 Fix a mistaken commit that included some debugging code.
llvm-svn: 171734
2013-01-07 13:41:55 +00:00
David Tweed a11edf0ce3 There was a switch fall-through in the parser for textual LLVM that caused
bogus comparison operands to default to eq/oeq. Fix that, fix a couple of
tests that accidentally passed and test for bogus comparison opeartors
explicitly.

llvm-svn: 171733
2013-01-07 13:32:38 +00:00
Silviu Baranga a055aab506 Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same the trasnformation as before to merge them.
llvm-svn: 171730
2013-01-07 12:31:25 +00:00
Chandler Carruth 7383bfd67e Switch BBVectorize to directly depend on having a TTI analysis.
This could be simplified further, but Hal has a specific feature for
ignoring TTI, and so I preserved that.

Also, I needed to use it because a number of tests fail when switching
from a null TTI to the NoTTI nonce implementation. That seems suspicious
to me and so may be something that you need to look into Hal. I worked
it by preserving the old behavior for these tests with the flag that
ignores all target info.

llvm-svn: 171722
2013-01-07 10:22:36 +00:00
David Blaikie 5d3249b554 PR14759: Debug info support for C++ member pointers.
This works fine with GDB for member variable pointers, but GDB's support for
member function pointers seems to be quite unrelated to
DW_TAG_ptr_to_member_type. (see GDB bug 14998 for details)

llvm-svn: 171698
2013-01-07 05:51:15 +00:00
Craig Topper 4f1c7256f9 Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.

Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.

llvm-svn: 171668
2013-01-06 20:39:29 +00:00
Evan Cheng 3fb03e23a4 Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch.
llvm-svn: 171665
2013-01-06 19:00:15 +00:00
Andrew Trick f950ce8e38 Fix a crash in LSR replaceCongruentIVs.
Indirect branch in the preheader crashes replaceCongruentIVs.
Fixes rdar://12910141.

llvm-svn: 171653
2013-01-06 05:59:39 +00:00
Michael J. Spencer c445408710 [Object][ELF] Fix incorrect size of members for the 64 version of Elf_Phdr_Impl.
llvm-svn: 171650
2013-01-06 03:57:11 +00:00
Michael J. Spencer 209565db2d [objdump] Add --private-headers, -p.
This currently prints the ELF program headers.

llvm-svn: 171649
2013-01-06 03:56:49 +00:00
David Blaikie e05754576b Include access modifiers in subprogram metadata IR comment.
Based on code review feedback in r171604 from Chandler Carruth &
Eric Christopher.

llvm-svn: 171636
2013-01-05 21:39:33 +00:00
David Blaikie 800a916f99 Emit DW_TAG_formal_parameter for unnamed parameters.
This change essentially reverts r87069 which came without a test case. It
causes no regressions in the GDB 7.5 test suite & fixes 25 xfails (commit
to the test suite to follow). If anyone can present a test case that
demonstrates why this check is necessary I'd be happy to account for it in one
way or another.

llvm-svn: 171609
2013-01-05 07:43:02 +00:00
Craig Topper 92a70b1e65 Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
llvm-svn: 171608
2013-01-05 07:39:25 +00:00
Nadav Rotem 478b6a47ec Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603
2013-01-05 05:42:48 +00:00
Nadav Rotem f19d515316 Fix a typo. Remove the duplicated test.
llvm-svn: 171584
2013-01-05 01:17:46 +00:00
Nadav Rotem e9f5bfd5e9 iLoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS.
PR14803.

llvm-svn: 171583
2013-01-05 01:15:47 +00:00
Nadav Rotem 6d9dafe3ff Force a fixed unroll count on the target independent tests.
This should fix clang-native-arm-cortex-a9. Thanks Renato.

llvm-svn: 171582
2013-01-05 00:58:48 +00:00
Andrew Trick 18021a45aa tabs-to-spaces
llvm-svn: 171550
2013-01-04 23:11:35 +00:00
Paul Redmond 874f01e956 Do not vectorize loops with subtraction reductions
Since subtraction does not commute the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.

Disabling for now.

llvm-svn: 171537
2013-01-04 22:10:16 +00:00
Eric Christopher cad9b53c02 Add a name for the anonymous type we're creating for subrange
types and a FIXME for what we should be doing. Should solve the
immediacy of PR12069 where our debug info is crashing another
tool.

llvm-svn: 171536
2013-01-04 21:51:53 +00:00
Preston Gurd e36b685a94 The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524
2013-01-04 20:54:54 +00:00
Michael J. Spencer bae14cef80 [Object][ELF] Add a maximum alignment. This is used by createELFObjectFile to create a properly aligned reader.
llvm-svn: 171520
2013-01-04 20:36:28 +00:00
Akira Hatanaka b13b33359b [mips] MipsTargetLowering::getSetCCResultType should return a vector type if
vectors are being compared.

llvm-svn: 171517
2013-01-04 20:06:01 +00:00
Manman Ren fe5a61edbe Memory Dependence Analysis: fix a miscompile that uses DT to approxmiate the
reachablity.

We conservatively approximate the reachability analysis by saying it is not
reachable if there is a single path starting from "From" and the path does not
reach "To".

rdar://12801584

llvm-svn: 171512
2013-01-04 19:19:47 +00:00
Adhemerval Zanella 9b0b781395 PowerPC: Fix eh_frame relocation for PIC
This patch fixes the PPC eh_frame definitions for the personality and 
frame unwinding for PIC objects. It makes PIC build correctly creates
relative relocations in the '.rela.eh_frame' segments and thus avoiding
a text relocation that generates a DT_TEXTREL segments in link phase.

llvm-svn: 171506
2013-01-04 19:08:13 +00:00
Nadav Rotem e1d5c4b8b9 LoopVectorizer:
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.

llvm-svn: 171469
2013-01-04 17:48:25 +00:00
Nadav Rotem c616a5408a Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message:
Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

llvm-svn: 171468
2013-01-04 17:35:21 +00:00
Elena Demikhovsky 5f2f06d2d9 Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

llvm-svn: 171467
2013-01-03 08:48:33 +00:00
Michael Gottesman 820aac1c78 Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks."
This reverts commit r171461 since it breaks the following tests:

Clang :: Analysis/outofbound-notwork.c
Clang :: Analysis/string-fail.c
Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp
Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp
Clang :: CXX/temp/temp.param/p14.cpp
Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp
Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c
Clang :: CodeGen/blocks-2.c
Clang :: CodeGen/libcalls-d.c
Clang :: CodeGen/libcalls-ld.c
Clang :: CodeGenCXX/conversion-function.cpp
Clang :: CodeGenCXX/debug-info-limit-type.cpp
Clang :: CodeGenCXX/inheriting-constructor.cpp
Clang :: FixIt/fixit-errors.c
Clang :: FixIt/fixit-pmem.cpp
Clang :: Modules/namespaces.cpp
Clang :: PCH/changed-files.c
Clang :: PCH/pr4489.c
Clang :: PCH/source-manager-stack.c
Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp
Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp
Clang :: SemaTemplate/instantiate-function-1.mm

llvm-svn: 171466
2013-01-03 08:18:30 +00:00
Craig Topper 7c27cc9fd0 Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
llvm-svn: 171461
2013-01-03 06:40:20 +00:00
Nadav Rotem d554a517c0 LoopVectorizer: Test the unrolling flag.
llvm-svn: 171446
2013-01-03 01:47:31 +00:00
Michael J. Spencer e0219f78d3 [Object] Temporarily disable these tests.
They are failing because archives create unaligned ELF files. The recent
Endian change added a __builtin_unreachable() when this happens. I will be
committing a fix for this soon.

llvm-svn: 171438
2013-01-03 01:24:32 +00:00
Jakob Stoklund Olesen 725d57682b Fix PR14732 by handling all kinds of IMPLICIT_DEF live ranges.
Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.

RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.

The PR14732 test case does tricks with PHI nodes that causes a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.

llvm-svn: 171435
2013-01-03 00:47:51 +00:00
Nadav Rotem 4897392360 Avoid vectorization when the function has the "noimplicitflot" attribute.
llvm-svn: 171429
2013-01-02 23:54:43 +00:00
Eric Christopher da4b2195fc Extend the dumping infrastructure to deal with additional
sections for debug info. These are some of the dwo sections from the
DWARF5 split debug info proposal. Update the fission-cu.ll testcase
to show what we should be able to dump more of now.

Work in progress: Ultimately the relocations will be gone for the
dwo section and the strings will be a different form (as well as
the rest of the sections will be included).

llvm-svn: 171428
2013-01-02 23:52:13 +00:00
Tom Stellard 567f886eb0 DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:

1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.

2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.

llvm-svn: 171420
2013-01-02 22:13:01 +00:00
Kevin Enderby 726e0ea6eb Adds missing aliases for fcom and fcomp instructions without arguments.
Patch by Michael M Kuperstein!

llvm-svn: 171414
2013-01-02 21:20:15 +00:00
Nadav Rotem c8d7047fa9 AVX: Fix a bug in WidenMaskArithmetic.
llvm-svn: 171397
2013-01-02 17:40:39 +00:00
Dmitri Gribenko 86fb558d9a Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

While there, FileCheck'ize tests.

llvm-svn: 171344
2013-01-01 14:04:36 +00:00
Dmitri Gribenko d7beca87f5 Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

My previous regex was not good enough to find these.

llvm-svn: 171343
2013-01-01 13:57:25 +00:00
Nadav Rotem b1615b1ac4 Make opt grab the triple from the module and use it to initialize the target machine.
llvm-svn: 171341
2013-01-01 08:00:32 +00:00
Nuno Lopes d896a400f1 recommit r171298 (add support for PHI nodes to ObjectSizeOffsetVisitor). Hopefully with bugs corrected now.
llvm-svn: 171325
2012-12-31 20:45:10 +00:00
Benjamin Kramer af463573cb Revert "add support for PHI nodes to ObjectSizeOffsetVisitor"
This reverts r171298. Breaks clang selfhost.

llvm-svn: 171318
2012-12-31 19:51:10 +00:00
Jakub Staszak c48bbe7170 Add extra CHECK to make sure that 'or' instruction was replaced.
Also add an assert to avoid confusion in the code where is known that C1 <= C2.

llvm-svn: 171310
2012-12-31 18:26:42 +00:00
Rafael Espindola c8288c103d Fix bits check in ELFObjectFile::isSectionZeroInit().
Fixes PR14723.

Patch by Sami Liedes!

llvm-svn: 171309
2012-12-31 18:20:51 +00:00
Rafael Espindola 21bd841d27 Dump sections. Extracted from a patch by Sami Liedes.
llvm-svn: 171304
2012-12-31 16:29:44 +00:00
Rafael Espindola 144af2cb4d Print a header above the symbols. Extracted from a patch by Sami Liedes.
llvm-svn: 171302
2012-12-31 16:05:21 +00:00
Nuno Lopes 7ab7c02d23 add support for PHI nodes to ObjectSizeOffsetVisitor
llvm-svn: 171298
2012-12-31 13:52:36 +00:00
Chris Lattner f5cca68c2c Fix LICM's memory promotion optimization to preserve TBAA tags when
promoting a store in a loop.  This was noticed when working on PR14753,
but isn't directly related.

llvm-svn: 171281
2012-12-31 08:37:17 +00:00
Chris Lattner eeefe1bc07 teach instcombine to preserve TBAA tag when merging two stores, part of
PR14753

llvm-svn: 171279
2012-12-31 08:10:58 +00:00
Jakub Staszak ea2b9b9d67 Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1
if C1 and C2 differ only with one bit.
Fixes PR14708.

llvm-svn: 171270
2012-12-31 00:34:55 +00:00
Hal Finkel 6dbdd4307b Support ppcf128 in SelectionDAG::getConstantFP
Fixes pr14751.

Patch by Kai; Thanks!

llvm-svn: 171261
2012-12-30 19:03:32 +00:00
Nadav Rotem 0b37f14371 LoopVectorizer: Fix a bug in the code that updates the loop exiting block.
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725

llvm-svn: 171251
2012-12-30 07:47:00 +00:00
Dmitri Gribenko 56bf2e1830 Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 171250
2012-12-30 02:33:22 +00:00
Dmitri Gribenko 10c4b4d249 Add a check to the test Analysis/ScalarEvolution/2010-09-03-RequiredTransitive.ll
This test did not test anything at all (except for opt crashing, but that was
not the reason why it was added).

llvm-svn: 171248
2012-12-30 01:42:34 +00:00
Dmitri Gribenko b137c9e551 Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 171246
2012-12-30 01:28:40 +00:00
NAKAMURA Takumi 5a495a5c96 llvm/test/Transforms/GVN/null-aliases-nothing.ll: Fix a RUN line not to emit ModuleID.
Larry Evans reported it fails if source tree contains "load", like "download".

llvm-svn: 171243
2012-12-30 00:33:26 +00:00
Chandler Carruth 86ed53089f Fix a stunning oversight in the inline cost analysis. It was never
propagating one of the values it simplified to a constant across
a myriad of instructions. Notably, ptrtoint instructions when we had
a constant pointer (say, 0) didn't propagate that, blocking a massive
number of down-stream optimizations.

This was uncovered when investigating why we fail to inline and delete
the boilerplate in:

  void f() {
    std::vector<int> v;
    v.push_back(1);
  }

It turns out most of the efforts I've made thus far to improve the
analysis weren't making it far purely because of this. After this is
fixed, the store-to-load forwarding patch enables LLVM to optimize the
above to an empty function. We still can't nuke a second push_back, but
for different reasons.

There is a very real chance this will cause somewhat noticable changes
in inlining behavior, so please let me know if you see regressions (or
improvements!) because of this patch.

llvm-svn: 171196
2012-12-28 14:43:42 +00:00
Chandler Carruth 753e21d057 Teach the inline cost analysis about calls that can be simplified and
how to propagate constants through insert and extract value
instructions.

With the recent improvements to instsimplify, this allows inline cost
analysis to constant fold through intrinsic functions, including notably
the with.overflow intrinsic math routines which often show up inside of
STL abstractions. This is yet another piece in the puzzle of breaking
down the code for:

  void f() {
    std::vector<int> v;
    v.push_back(1);
  }

But it still isn't enough. There are a pile of bugs in inline cost still
blocking this.

llvm-svn: 171195
2012-12-28 14:23:32 +00:00
Chandler Carruth f6182155f6 Teach instsimplify to use the constant folder where appropriate for
constant folding calls. Add the initial tests for this which show that
now instsimplify can simplify blindingly obvious code patterns expressed
with both intrinsics and library calls.

llvm-svn: 171194
2012-12-28 14:23:29 +00:00
Nadav Rotem 3da9ac72fa AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend.
llvm-svn: 171178
2012-12-28 05:45:24 +00:00
Alexey Samsonov 29dd7f2090 [ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before apperaring as llvm.lifetime arguments
llvm-svn: 171153
2012-12-27 08:50:58 +00:00
Nadav Rotem 2a054b4475 On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.

PR14657.

llvm-svn: 171148
2012-12-27 08:15:45 +00:00
Eric Christopher 3bf29fda91 For the dwarf5 split debug info code split out the string section
per compile unit/skeleton compile unit. Update tests accordingly.

llvm-svn: 171133
2012-12-27 02:14:01 +00:00
Eric Christopher c8a88ee691 FileCheck-ize.
llvm-svn: 171132
2012-12-27 02:13:58 +00:00
Eric Christopher d6152aabbb FileCheck-ize.
llvm-svn: 171131
2012-12-27 02:13:55 +00:00
Eric Christopher 5a6acfa4c8 Right now all of the relocations are 32-bit dwarf, and the relocation
information doesn't return an addend for Rel relocations. Go ahead
and use this information to fix relocation handling inside dwarfdump
for 32-bit ELF REL.

llvm-svn: 171126
2012-12-27 01:07:07 +00:00
Nadav Rotem 5350cd314b If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.
PR14719.

llvm-svn: 171124
2012-12-26 23:30:53 +00:00
Nadav Rotem 3f7c4f36ba LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1
llvm-svn: 171114
2012-12-26 19:08:17 +00:00
Evgeniy Stepanov 5eb5bf8b46 [msan] Raise alignment of origin stores/loads when possible.
Origin alignment is as high as the alignment of the corresponding application
location, but never less than 4.

llvm-svn: 171110
2012-12-26 11:55:09 +00:00
NAKAMURA Takumi 40aa3285f4 llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083.
llvm-svn: 171084
2012-12-26 03:19:30 +00:00
NAKAMURA Takumi 334f685328 llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082.
llvm-svn: 171083
2012-12-26 03:08:55 +00:00
Hal Finkel 30e95a8ebb BBVectorize: Use VTTI to compute costs for intrinsics vectorization
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.

Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.

llvm-svn: 171079
2012-12-26 01:36:57 +00:00
Hal Finkel b44f890133 LoopVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171076
2012-12-25 23:21:29 +00:00
Hal Finkel 2a456112ec BBVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171075
2012-12-25 22:36:08 +00:00
Hal Finkel 2ebe6d08cd Loosen scheduling restrictions on the PPC dcbt intrinsic
As with the prefetch intrinsic to which it maps, simply have dcbt
marked as reading from and writing to its arguments instead of having
unmodeled side effects. While this might cause unwanted code motion
(because aliasing checks don't really capture cache-line sharing),
it is more important that prefetches in unrolled loops don't block
the scheduler from rearranging the unrolled loop body.

llvm-svn: 171073
2012-12-25 18:51:18 +00:00
Hal Finkel 1b5ff08d43 Expand PPC64 atomic load and store
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.

llvm-svn: 171072
2012-12-25 17:22:53 +00:00
Evgeniy Stepanov f19c086d1e [msan] Fix handling of vectors of pointers.
VectorType::getInteger() can not be used with them, because pointer size
depends on the target.

llvm-svn: 171070
2012-12-25 16:04:38 +00:00
Evgeniy Stepanov ec8371283b [msan] Fix handling of select with vector condition.
llvm-svn: 171069
2012-12-25 14:56:21 +00:00
Benjamin Kramer a9f265ee98 Harden test so it's not affected by changes to compare lowering.
This only failed on hosts that don't have SSE41.

llvm-svn: 171066
2012-12-25 13:23:23 +00:00
Benjamin Kramer 81b5a8fd2e X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity.
llvm-svn: 171064
2012-12-25 13:09:08 +00:00
Benjamin Kramer df4af41b9b X86: Custom lower <2 x i64> eq and ne when SSE41 is not available.
pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack.
Small speedup on loop-vectorized viterbi (-march=core2).

llvm-svn: 171063
2012-12-25 12:54:19 +00:00
Nick Lewycky fb43258080 Fix typo "Makre" -> "Make".
llvm-svn: 171043
2012-12-24 19:55:47 +00:00
NAKAMURA Takumi 1b18db7ea3 llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple.
llvm-svn: 171029
2012-12-24 11:14:06 +00:00
Nadav Rotem dc0ad92b64 Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned.
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.

llvm-svn: 171024
2012-12-24 09:40:33 +00:00
Nadav Rotem 5f7c12cfbd LoopVectorizer: When checking for vectorizable types, also check
the StoreInst operands.

PR14705.

llvm-svn: 171023
2012-12-24 09:14:18 +00:00
Nadav Rotem bd5d1d832a LoopVectorizer: Fix an endless loop in the code that looks for reductions.
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.

llvm-svn: 171008
2012-12-24 01:22:06 +00:00
Nadav Rotem cf9999d9d5 CostModel: Change the default target-independent implementation for finding
the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but ops that are
marked as custom are higher.

llvm-svn: 171002
2012-12-23 17:31:23 +00:00
Nadav Rotem aa92ea4f12 We are not ready to estimate the cost of integer expansions based on the number of parts. This test is too noisy.
llvm-svn: 170999
2012-12-23 09:11:07 +00:00
Nadav Rotem 2cade68025 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.

llvm-svn: 170995
2012-12-23 07:23:55 +00:00
Benjamin Kramer 76268ac682 X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available.
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.

llvm-svn: 170985
2012-12-22 16:07:56 +00:00
Benjamin Kramer b2f0a2bd4b X86: Emit vector sext as shuffle + sra if vpmovsx is not available.
Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better
than scalarized loads. Fixes PR14590.

llvm-svn: 170984
2012-12-22 11:34:28 +00:00
Nadav Rotem d5aae980cb In some cases, due to scheduling constraints we copy the EFLAGS.
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.

rdar://12896831

llvm-svn: 170961
2012-12-21 23:48:49 +00:00
Akira Hatanaka d6b694f036 [mips] Fix encoding of BAL instruction. Also, fix assembler test case which
was not catching the error.

llvm-svn: 170953
2012-12-21 23:13:59 +00:00
Benjamin Kramer b4688f84bd try to unbreak ppc buildbots.
llvm-svn: 170913
2012-12-21 18:11:45 +00:00
Benjamin Kramer 82d1c371e2 X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization.
Part of PR14667.

llvm-svn: 170908
2012-12-21 17:46:58 +00:00
Tom Stellard a8b0351720 R600: Expand vec4 INT <-> FP conversions
llvm-svn: 170901
2012-12-21 16:33:24 +00:00
Evgeniy Stepanov 4fbc0d08bf [msan] Remove unreachable blocks before instrumenting a function.
llvm-svn: 170883
2012-12-21 11:18:49 +00:00
Nadav Rotem 6d4fdd6d2c Improve the X86 cost model for loads and stores.
llvm-svn: 170830
2012-12-21 01:33:59 +00:00
Reed Kotler 93f778d2bd Add test case for r170674
llvm-svn: 170823
2012-12-21 00:55:10 +00:00
Nadav Rotem e7785686a5 Fix a bug in the code that checks if we can vectorize loops while using dynamic
memory bound checks.  Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:

for ( k=1 ; k<n ; k++ )
  x[k] = x[k-1] + y[k];

llvm-svn: 170811
2012-12-21 00:07:35 +00:00
Eric Christopher 6e47b725ff Move these files over to the debug info directory.
llvm-svn: 170810
2012-12-21 00:03:42 +00:00
Bob Wilson 7bba4f8957 Revert "Adding support for llvm.arm.neon.vaddl[su].* and"
This reverts r170694.  The operations can be represented in IR without
adding any new intrinsics.

llvm-svn: 170765
2012-12-20 21:09:38 +00:00
Nadav Rotem 2ababf68d7 LoopVectorize: Fix a bug in the scalarization of instructions.
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.

This fixes External/SPEC/CINT95/099_go/099_go

llvm-svn: 170756
2012-12-20 20:24:40 +00:00
Evan Cheng ddc0cb6dc5 On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr,
are more expensive than the non-flag setting variant. Teach thumb2 size
reduction pass to avoid generating them unless we are optimizing for size.

rdar://12892707

llvm-svn: 170728
2012-12-20 19:59:30 +00:00
Eli Bendersky 4cfb5b9e64 Change Lit error redirection to FileCheck to a more common syntax since it
can potentially cause some bots to fail.

llvm-svn: 170726
2012-12-20 19:54:02 +00:00
Eli Bendersky f658e92724 Add a largish auto-generated test for the aligned bundling feature, along with
the script generating it. The test should never be modified manually. If anyone
needs to change it, please change the script and re-run it.

The script is placed into utils/testgen - I couldn't think of a better place,
and after some discussion on IRC this looked like a logical location.

llvm-svn: 170720
2012-12-20 19:16:57 +00:00
Eli Bendersky 4c4f11eb0d Tests for the aligned bundling support added in r170718
llvm-svn: 170719
2012-12-20 19:07:30 +00:00
Rafael Espindola 642c7cd56e Simplify the testcase a bit.
I checked that it would still crash llc before the corresponding fix.

llvm-svn: 170709
2012-12-20 17:47:27 +00:00
James Molloy 4f6fb953a7 Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call.
Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage).

llvm-svn: 170704
2012-12-20 16:04:27 +00:00
Renato Golin 6b2ea4a48f Adding support for llvm.arm.neon.vaddl[su].* and
llvm.arm.neon.vsub[su].* intrinsics.

Patch by Pete Couperus <pjcoup@gmail.com>

llvm-svn: 170694
2012-12-20 13:52:11 +00:00
Reed Kotler d019dbf75e fix most of remaining issues with large frames.
these patches are tested a lot by test-suite but
make check tests are forthcoming once the next
few patches that complete this are committed.
with the next few patches the pass rate for mips16 is
near 100%

llvm-svn: 170656
2012-12-20 04:07:42 +00:00
Akira Hatanaka f423672117 [mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy
physical register $r1 to $r0.

GNU disassembler recognizes an "or" instruction as a "move", and this change
makes the disassembled code easier to read.

Original patch by Reed Kotler.

llvm-svn: 170655
2012-12-20 04:06:06 +00:00
Bob Wilson 3365b80290 Do not introduce vector operations in functions marked with noimplicitfloat.
<rdar://problem/12879313>

llvm-svn: 170630
2012-12-20 01:36:20 +00:00
Eric Christopher 3c5a1914b6 Split out abbreviations for the skeleton info from the rest of
the abbreviations. Part of implementing split dwarf.

llvm-svn: 170589
2012-12-19 22:02:53 +00:00
Evan Cheng eae6d2ccea LLVM sdisel normalize bit extraction of the form:
((x & 0xff00) >> 8) << 2
to
 (x >> 6) & 0x3fc

This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevents the ARM backend from using the bit
extraction instructions. And worse since the mask materialization may require
an addition instruction. This comes up fairly frequently when the result of 
the bit twiddling is used as memory address. e.g.

 = ptr[(x & 0xFF0000) >> 16]

We want to generate:
  ubfx   r3, r1, #16, #8
  ldr.w  r3, [r0, r3, lsl #2]

vs.
  mov.w  r9, #1020
  and.w  r2, r9, r1, lsr #14
  ldr    r2, [r0, r2]

Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.

Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.

This is a slight win for blowfish, rijndael, etc.

rdar://12870177

llvm-svn: 170581
2012-12-19 20:16:09 +00:00
Roman Divacky e3d323052f Remove edis - the enhanced disassembler. Fixes PR14654.
llvm-svn: 170578
2012-12-19 19:55:47 +00:00
Paul Redmond 5917f4c715 Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
2012-12-19 19:47:13 +00:00
Benjamin Kramer c5071466d4 PowerPC: Expand VSELECT nodes.
There's probably a better expansion for those nodes than the default for
altivec, but this is better than crashing. VSELECTs occur in loop vectorizer
output.

llvm-svn: 170551
2012-12-19 15:49:14 +00:00
Benjamin Kramer ae0bb61053 Make TargetLowering::getTypeConversion more resilient against odd illegal MVTs.
- An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist)
- Return the scalar value when an MVT is scalarized (v1i64 -> i64)

Fixes PR14639ff.

llvm-svn: 170546
2012-12-19 14:34:28 +00:00
Evgeniy Stepanov d7571cd4bc [msan] Heuristically instrument unknown intrinsics.
This changes adds shadow and origin propagation for unknown intrinsics
by examining the arguments and ModRef behaviour. For now, only 3 classes
of intrinsics are handled:
- those that look like simple SIMD store
- those that look like simple SIMD load
- those that don't have memory effects and look like arithmetic/logic/whatever
  operation on simple types.

llvm-svn: 170530
2012-12-19 11:22:04 +00:00
Elena Demikhovsky 14a4af0e66 Optimized load + SIGN_EXTEND patterns in the X86 backend.
llvm-svn: 170506
2012-12-19 07:50:20 +00:00
Nadav Rotem 33360d8ae9 After reducing the size of an operation in the DAG we zero-extend the reduced
bitwidth op back to the original size. If we reduce ANDs then this can cause
an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits
are equal or smaller than the size of the reduced operation.

llvm-svn: 170505
2012-12-19 07:39:08 +00:00
Craig Topper 63f5921776 Teach SimplifySetCC that comparing AssertZext i1 against a constant 1 can be rewritten as a compare against a constant 0 with the opposite condition.
llvm-svn: 170495
2012-12-19 06:12:28 +00:00
Shuxin Yang 37a1efe1c6 rdar://12801297
InstCombine for unsafe floating-point add/sub.

llvm-svn: 170471
2012-12-18 23:10:12 +00:00
Jakub Staszak 338863a546 Reverse order of checking SSE level when calculating compare cost, so we check
AVX2 before AVX.

llvm-svn: 170464
2012-12-18 22:57:56 +00:00
Quentin Colombet 23b404d5ad Disable ARM partial flag dependency optimization at -Oz
To not over constrain the scheduler for ARM in thumb mode, some optimizations  for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency).

Disables this check when code size is the priority, i.e., code is compiled with -Oz.

llvm-svn: 170462
2012-12-18 22:47:16 +00:00
Andrew Trick ec2564818c MISched: add dependence to ExitSU to model live-out latency.
llvm-svn: 170454
2012-12-18 20:53:01 +00:00
Benjamin Kramer f0e5d2f032 LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
	movdqa	%xmm0, %xmm1
	movhlps	%xmm1, %xmm1            ## xmm1 = xmm1[1,1]
	paddw	%xmm0, %xmm1
	pshufd	$1, %xmm1, %xmm0        ## xmm0 = xmm1[1,0,0,0]
	paddw	%xmm1, %xmm0
	phaddw	%xmm0, %xmm0
	pextrb	$0, %xmm0, %edx

instead of
	pextrb	$2, %xmm0, %esi
	pextrb	$0, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$4, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$6, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$8, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$10, %xmm0, %edi
	pextrb	$14, %xmm0, %edx
	addb	%sil, %dil
	pextrb	$12, %xmm0, %esi
	addb	%dil, %sil
	addb	%sil, %dl

llvm-svn: 170439
2012-12-18 18:40:20 +00:00
Hal Finkel 943f76d1b3 Check multiple register classes for inline asm tied registers
A register can be associated with several distinct register classes.
For example, on PPC, the floating point registers are each associated with
both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code
would fail when provided with a floating point register and an f64 operand
because it would happen to find the register in the F4RC class first and
return that. From the F4RC class, SDAG would extract f32 as the register
type and then assert because of the invalid implied conversion between
the f64 value and the f32 register.

Instead, search all register classes. If a register class containing the
the requested register has the requested type, then return that register
class. Otherwise, as before, return the first register class found that
contains the requested register.

llvm-svn: 170436
2012-12-18 17:50:58 +00:00
Nadav Rotem cb23342876 Rename the test so that we can add additional vectors-of-pointers tests
into the same file in the future.

llvm-svn: 170414
2012-12-18 05:50:54 +00:00
Nadav Rotem a5024fc3e1 SROA: Replace calls to getScalarSizeInBits to DataLayout's API because
getScalarSizeInBits could not handle vectors of pointers.

llvm-svn: 170412
2012-12-18 05:23:31 +00:00
NAKAMURA Takumi ad0c80b8e6 llvm/test/MC/ELF/comp-dir.s: Appease MSYS Bash.
llvm-svn: 170410
2012-12-18 05:08:12 +00:00
Eric Christopher 906da23229 Add support for passing -main-file-name all the way through to
the assembler.

Part of PR14624

llvm-svn: 170390
2012-12-18 00:31:01 +00:00
Chandler Carruth d75be9b4fb Add a triple to this test -- it has to be an ELF platform...
llvm-svn: 170374
2012-12-17 21:44:50 +00:00
Chandler Carruth 10700aad85 Prepare LLVM to fix PR14625, exposing a hook in MCContext to manage the
compilation directory.

This defaults to the current working directory, just as it always has,
but now an assembler can choose to override it with a custom directory.
I've taught llvm-mc about this option and added a test case.

llvm-svn: 170371
2012-12-17 21:32:42 +00:00
Chandler Carruth e3f4119b06 Fix another SROA crasher, PR14601.
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.

llvm-svn: 170353
2012-12-17 18:48:07 +00:00
Tim Northover 5edabc131a Teach MachO which sections contain code
llvm-svn: 170349
2012-12-17 17:59:32 +00:00
Richard Osborne 459e35c261 Add instruction encodings / disassembly support for l2r instructions.
llvm-svn: 170345
2012-12-17 16:28:02 +00:00
Chandler Carruth 21eb4e96c2 Teach the rewriting of memcpy calls to support subvector copies.
This also cleans up a bit of the memcpy call rewriting by sinking some
irrelevant code further down and making the call-emitting code a bit
more concrete.

Previously, memcpy of a subvector would actually miscompile (!!!) the
copy into a single vector element copy. I have no idea how this ever
worked. =/ This is the memcpy half of PR14478 which we probably weren't
noticing previously because it didn't actually assert.

The rewrite relies on the newly refactored insert- and extractVector
functions to do the heavy lifting, and those are the same as used for
loads and stores which makes the test coverage a bit more meaningful
here.

llvm-svn: 170338
2012-12-17 14:51:24 +00:00
Richard Osborne 51bf1b269a Add instruction encodings for PEEK and ENDIN.
Previously these were marked with the wrong format.

llvm-svn: 170334
2012-12-17 14:23:54 +00:00
Chandler Carruth cacda256a1 Fix a secondary bug I introduced while fixing the first part of PR14478.
The first half of fixing this bug was actually in r170328, but was
entirely coincidental. It did however get me to realize the nature of
the bug, and adapt the test case to test more interesting behavior. In
turn, that uncovered the rest of the bug which I've fixed here.

This should fix two new asserts that showed up in the vectorize nightly
tester.

llvm-svn: 170333
2012-12-17 14:03:01 +00:00
Richard Osborne 041071c558 Add instruction encodings / disassembly support for rus instructions.
llvm-svn: 170330
2012-12-17 13:50:04 +00:00
Richard Osborne e405e58639 Add instruction encodings for ZEXT and SEXT.
Previously these were marked with the wrong format.

llvm-svn: 170327
2012-12-17 13:20:37 +00:00
Richard Osborne 3a0d5cc314 Add instruction encodings / disassembly support for 2r instructions.
llvm-svn: 170323
2012-12-17 12:29:31 +00:00
Richard Osborne 016967e4ff Add instruction encodings / disassembly support for 0r instructions.
llvm-svn: 170322
2012-12-17 12:26:29 +00:00
Craig Topper f924a58af1 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
llvm-svn: 170304
2012-12-17 05:02:29 +00:00
Chandler Carruth ccca504f3a Fix the first part of PR14478: memset now works.
PR14478 highlights a serious problem in SROA that simply wasn't being
exercised due to a lack of vector input code mixed with C-library
function calls. Part of SROA was written carefully to handle subvector
accesses via memset and memcpy, but the rewriter never grew support for
this. Fixing it required refactoring the subvector access code in other
parts of SROA so it could be shared, and then fixing the splat formation
logic and using subvector insertion (this patch).

The PR isn't quite fixed yet, as memcpy is still broken in the same way.
I'm starting on that series of patches now.

Hopefully this will be enough to bring the bullet benchmark back to life
with the bb-vectorizer enabled, but that may require fixing memcpy as
well.

llvm-svn: 170301
2012-12-17 04:07:37 +00:00
Richard Osborne c5287b8889 Add tests for disassembly of 1r XCore instructions.
llvm-svn: 170295
2012-12-16 18:06:30 +00:00
Reed Kotler aee4d5d194 This patch is needed to make c++ exceptions work for mips16.
Mips16 is really a processor decoding mode (ala thumb 1) and in the same
program, mips16 and mips32 functions can exist and can call each other.

If a jal type instruction encounters an address with the lower bit set, then
the processor switches to mips16 mode (if it is not already in it). If the
lower bit is not set, then it switches to mips32 mode.

The linker knows which functions are mips16 and which are mips32.
When relocation is performed on code labels, this lower order bit is
set if the code label is a mips16 code label.

In general this works just fine, however when creating exception handling
tables and dwarf, there are cases where you don't want this lower order
bit added in.

This has been traditionally distinguished in gas assembly source by using a
different syntax for the label.

lab1:      ; this will cause the lower order bit to be added
lab2=.     ; this will not cause the lower order bit to be added

In some cases, it does not matter because in dwarf and debug tables
the difference of two labels is used and in that case the lower order
bits subtract each other out.

To fix this, I have added to mcstreamer the notion of a debuglabel.
The default is for label and debug label to be the same. So calling
EmitLabel and EmitDebugLabel produce the same result.

For various reasons, there is only one set of labels that needs to be
modified for the mips exceptions to work. These are the "$eh_func_beginXXX" 
labels.

Mips overrides the debug label suffix from ":" to "=." .

This initial patch fixes exceptions. More changes most likely
will be needed to DwarfCFException to make all of this work
for actual debugging. These changes will be to emit debug labels in some
places where a simple label is emitted now.

Some historical discussion on this from gcc can be found at:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html
http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html 

llvm-svn: 170279
2012-12-16 04:00:45 +00:00
Benjamin Kramer b16ccde7a4 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

llvm-svn: 170273
2012-12-15 16:47:44 +00:00
Chandler Carruth c50394fcfa Add a corollary test for PR14572. We got this code path correct already.
llvm-svn: 170271
2012-12-15 09:31:54 +00:00
Chandler Carruth 067edd342f Relax an overly aggressive assert to fix PR14572.
The alloca width is based on the alloc size, not the type size.

llvm-svn: 170270
2012-12-15 09:26:06 +00:00
Reed Kotler 5fdeb21249 This code implements most of mips16 hardfloat as it is done by gcc.
In this case, essentially it is soft float with different library routines.
The next step will be to make this fully interoperational with mips32 floating
point and that requires creating stubs for functions with signatures that
contain floating point types.

I have a more sophisticated design for mips16 hardfloat which I hope to
implement at a later time that directly does floating point without the need
for function calls.

The mips16 encoding has no floating point instructions so one needs to
switch to mips32 mode to execute floating point instructions.

llvm-svn: 170259
2012-12-15 00:20:05 +00:00
Kevin Enderby 06aa3eb8ce Make sure the alternate PC+imm syntax of LDR instruction with a small
immediate generates the narrow version.  Needed when doing round-trip
assemble/disassemble testing using the alternate syntax that specifies
'pc' directly.

llvm-svn: 170255
2012-12-14 23:04:25 +00:00
Michael Ilseman e2754dc887 Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself
llvm-svn: 170248
2012-12-14 22:08:26 +00:00
Nadav Rotem 8487537bdb TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them.
llvm-svn: 170245
2012-12-14 21:20:37 +00:00
Nadav Rotem aa3e2a907e Fix a crash in ValueTracking on vectors of pointers.
llvm-svn: 170240
2012-12-14 20:43:49 +00:00
Bill Schmidt a4f898448c This patch removes some nondeterminism from direct object file output
for TLS dynamic models on 64-bit PowerPC ELF.  The default sort routine
for relocations only sorts on the r_offset field; but with TLS, there
can be two relocations with the same r_offset.  For PowerPC, this patch
sorts secondarily on descending r_type, which matches the behavior
expected by the linker.

llvm-svn: 170237
2012-12-14 20:28:38 +00:00
Shuxin Yang f8e9a5a061 rdar://12753946
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"

llvm-svn: 170226
2012-12-14 18:46:06 +00:00
Bill Schmidt 9f0b4ec0f5 This patch improves the 64-bit PowerPC InitialExec TLS support by providing
for a wider range of GOT entries that can hold thread-relative offsets.
This matches the behavior of GCC, which was not documented in the PPC64 TLS
ABI.  The ABI will be updated with the new code sequence.

Former sequence:

  ld 9,x@got@tprel(2)
  add 9,9,x@tls

New sequence:

  addis 9,2,x@got@tprel@ha
  ld 9,x@got@tprel@l(9)
  add 9,9,x@tls

Note that a linker optimization exists to transform the new sequence into
the shorter sequence when appropriate, by replacing the addis with a nop
and modifying the base register and relocation type of the ld.

llvm-svn: 170209
2012-12-14 17:02:38 +00:00
Evgeniy Stepanov 49175b237d [msan] Origin stores and loads do not need explicit alignment.
Origin address is always 4 byte aligned, and the access type is always i32.

llvm-svn: 170199
2012-12-14 13:43:11 +00:00
David Blaikie 37fefc3f8d Debug Info: add support to mark member variables as artificial
This is the LLVM portion of r170154.

llvm-svn: 170156
2012-12-13 22:43:07 +00:00
NAKAMURA Takumi 38d2b2442f Revert r170020, "Simplify negated bit test", for now.
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.

This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.

llvm-svn: 170128
2012-12-13 14:28:16 +00:00
Akira Hatanaka cf9a61b6ee [mips] Do not copy GOT address to register $gp if the function being called has
internal linkage.

llvm-svn: 170092
2012-12-13 03:17:29 +00:00
Eli Bendersky f9be4c8b47 Make this Lit config file a bit slimmer
llvm-svn: 170083
2012-12-13 02:03:46 +00:00
Evan Cheng bf0baa9de7 Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039
llvm-svn: 170078
2012-12-13 01:34:32 +00:00
Quentin Colombet c0dba2035a Take into account minimize size attribute in the inliner.
Better controls the inlining of functions when the caller function has MinSize attribute.
Basically, when the caller function has this attribute, we do not "force" the inlining
of callee functions carrying the InlineHint attribute (i.e., functions defined with
inline keyword)

llvm-svn: 170065
2012-12-13 01:05:25 +00:00
Nadav Rotem 36510f7194 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Jakub Staszak c6ecd7deba Fix typo, which prevent test from being check.
llvm-svn: 170025
2012-12-12 21:10:56 +00:00
Jakub Staszak a3619d31d8 unHECKify test fixed by Jacob in r159003.
llvm-svn: 170023
2012-12-12 20:58:42 +00:00
David Majnemer 5226aa94ce Simplify negated bit test
llvm-svn: 170020
2012-12-12 20:48:54 +00:00
Evan Cheng b7d3d03bf9 Fix a logic bug in inline expansion of memcpy / memset with an overlapping
load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.

llvm-svn: 170018
2012-12-12 20:43:23 +00:00
Jakub Staszak 0a74fc8d6c unHECKify test. It was fixed by Chris in 2009.
llvm-svn: 170017
2012-12-12 20:43:00 +00:00
Bill Schmidt 25ffd19502 The ordering of two relocations on the same instruction is apparently not
predictable when compiled on at least one non-PowerPC host.  Source of
nondeterminism not apparent.  Restrict the test to build on PowerPC hosts
for now while looking into the issue further.

llvm-svn: 170016
2012-12-12 20:29:20 +00:00
Jakub Staszak aee9cca331 Fix typo in test-case.
llvm-svn: 170015
2012-12-12 20:29:06 +00:00
Jakub Staszak 67bf76ebbc Fix typo.
llvm-svn: 170006
2012-12-12 19:47:04 +00:00
Nadav Rotem d0bb22bba3 LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Bill Schmidt 24b8dd6eb7 This patch implements local-dynamic TLS model support for the 64-bit
PowerPC target.  This is the last of the four models, so we now have 
full TLS support.

This is mostly a straightforward extension of the general dynamic model.
I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the
register copy following ADDI_TLSLD_L; otherwise everything above the
ADDIS_DTPREL_HA appeared dead and was removed.

As before, there are new test cases to test the assembly generation, and
the relocations output during integrated assembly.  The expected code
gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll.

There are a couple of things I think can be done more efficiently in the
overall TLS code, so there will likely be a clean-up patch forthcoming;
but for now I want to be sure the functionality is in place.

Bill

llvm-svn: 170003
2012-12-12 19:29:35 +00:00
Alexey Samsonov 3d43b63a6e Improve debug info generated with enabled AddressSanitizer.
When ASan replaces <alloca instruction> with
<offset into a common large alloca>, it should also patch
llvm.dbg.declare calls and replace debug info descriptors to mark
that we've replaced alloca with a value that stores an address
of the user variable, not the user variable itself.

See PR11818 for more context.

llvm-svn: 169984
2012-12-12 14:31:53 +00:00
NAKAMURA Takumi be230b8fdb llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in CHECK-NOT lines.
Found by Alexander Zinenko, thanks!

llvm-svn: 169978
2012-12-12 13:34:20 +00:00
NAKAMURA Takumi cae5321a3b llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, s/test_/Test/g, not to mismatch "CHECK(-NOT): test".
llvm-svn: 169977
2012-12-12 13:34:14 +00:00
NAKAMURA Takumi 69d1405e48 llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/
llvm-svn: 169957
2012-12-12 01:41:01 +00:00
NAKAMURA Takumi 01ac65af00 llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple.
llvm-svn: 169956
2012-12-12 01:40:56 +00:00
Manman Ren 82751a105c DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504

llvm-svn: 169951
2012-12-12 01:13:50 +00:00
Evan Cheng 04e5518783 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.

llvm-svn: 169944
2012-12-12 00:42:09 +00:00
Shuxin Yang 81b3678564 - Fix a problematic way in creating all-the-1 APInt.
- Propagate "exact" bit of [l|a]shr instruction.

llvm-svn: 169942
2012-12-12 00:29:03 +00:00
Michael Ilseman bb6f691b01 Added a slew of SimplifyInstruction floating-point optimizations, many of which take advantage of fast-math flags. Test cases included.
fsub X, +0 ==> X
  fsub X, -0 ==> X, when we know X is not -0
  fsub +/-0.0, (fsub -0.0, X) ==> X
  fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X
  fsub nnan ninf X, X ==> 0.0
  fadd nsz X, 0 ==> X
  fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0
    where nnan and ninf have to occur at least once somewhere in this expression
  fmul X, 1.0 ==> X

llvm-svn: 169940
2012-12-12 00:27:46 +00:00
Nadav Rotem f707bf4ca3 PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion.
llvm-svn: 169916
2012-12-11 21:30:14 +00:00
Tom Stellard 75aadc2813 Add R600 backend
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX

llvm-svn: 169915
2012-12-11 21:25:42 +00:00
Bill Schmidt c56f1d34bc This patch implements the general dynamic TLS model for 64-bit PowerPC.
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:

     Instruction                            Relocation            Symbol
  addis ra,r2,x@got@tlsgd@ha           R_PPC64_GOT_TLSGD16_HA       x
  addi  r3,ra,x@got@tlsgd@l            R_PPC64_GOT_TLSGD16_L        x
  bl __tls_get_addr(x@tlsgd)           R_PPC64_TLSGD                x
                                       R_PPC64_REL24           __tls_get_addr
  nop
  <use address in r3>

The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation.  This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr.  Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation.  So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.

Most of the code is pretty straightforward.  I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call.  Something in the 
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations.  This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().

Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.

Comments welcome!

Thanks,
Bill

llvm-svn: 169910
2012-12-11 20:30:11 +00:00
Nadav Rotem e266efb70b Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
llvm-svn: 169904
2012-12-11 18:58:10 +00:00
NAKAMURA Takumi e55382ea55 llvm/test/TableGen: Remove XFAIL:vg_leak in dozen of tests, according to llvm-x86_64-linux-vg_leak.
llvm-svn: 169862
2012-12-11 13:14:16 +00:00
Hao Liu 14390f031c revert the test change
llvm-svn: 169823
2012-12-11 06:25:18 +00:00
Hao Liu 0eb50fb49d A newbie try a test commit
llvm-svn: 169821
2012-12-11 06:22:54 +00:00
Nadav Rotem dbb3328194 Fix PR14565. Don't if-convert loops that have switch statements in them.
llvm-svn: 169813
2012-12-11 04:55:10 +00:00
Chad Rosier d4c0c6cb22 Add a triple to this test.
llvm-svn: 169803
2012-12-11 00:51:36 +00:00
Chandler Carruth b27041c50b Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

llvm-svn: 169802
2012-12-11 00:36:57 +00:00
Paul Redmond c4550d4967 move X86-specific test
This test case uses -mcpu=corei7 so it belongs in CodeGen/X86

Reviewed by: Nadav

llvm-svn: 169801
2012-12-11 00:36:43 +00:00
Chad Rosier df42cf39ab Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

llvm-svn: 169796
2012-12-11 00:18:02 +00:00
Eric Christopher c8a310edc1 Refactor out the abbreviation handling into a separate class that
controls each of the abbreviation sets (only a single one at the
moment) and computes offsets separately as well for each set
of DIEs.

No real function change, ordering of abbreviations for the skeleton
CU changed but only because we're computing in a separate order. Fix
the testcase not to care.

llvm-svn: 169793
2012-12-10 23:34:43 +00:00
Evan Cheng 79e2ca90bc Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

llvm-svn: 169791
2012-12-10 23:21:26 +00:00
Arnold Schwaighofer edd62b14e5 Optimistically analyse Phi cycles
Analyse Phis under the starting assumption that they are NoAlias. Recursively
look at their inputs.
If they MayAlias/MustAlias there must be an input that makes them so.

Addresses bug 14351.

llvm-svn: 169788
2012-12-10 23:02:41 +00:00
Eli Bendersky 7352e4ffcb Add a test for explicitly exercising the mc-relax-all flag.
llvm-svn: 169764
2012-12-10 20:36:01 +00:00
Eric Christopher cdf218d606 Use the somewhat semantic term "split dwarf" it more matches what's
going on and makes a lot of the terminology in comments make more sense.

llvm-svn: 169758
2012-12-10 19:51:21 +00:00
Nadav Rotem 7b5b55c195 Add support for reverse induction variables. For example:
while (i--)
 sum+=A[i];

llvm-svn: 169752
2012-12-10 19:25:06 +00:00
Hal Finkel 66859ae0f6 Use GetUnderlyingObjects in misched
misched used GetUnderlyingObject in order to break false load/store
dependencies, and the -enable-aa-sched-mi feature similarly relied on
GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis.
Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so
(especially due to LSR) all of these mechanisms failed for
induction-variable-dependent loads and stores inside loops.

This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects
(which will recurse through phi and select instructions) in misched.

Andy reviewed, tested and simplified this patch; Thanks!

llvm-svn: 169744
2012-12-10 18:49:16 +00:00
Craig Topper d8005db486 Teach DAG combine to handle vector add/sub with vectors of all 0s.
llvm-svn: 169727
2012-12-10 08:12:29 +00:00
Chandler Carruth e45f4658a3 Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores.
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.

In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.

The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.

llvm-svn: 169719
2012-12-10 00:54:45 +00:00
Paul Redmond 2adb13c100 LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Benjamin Kramer 054ab17c17 Drop the address space limit for tests in the makefile build.
The limit seems to break newer pythons (see PR13598) so just drop it for now.
Eventually lit should learn to set limits for its children instead of a global
limit in the makefile.

If some PPC bots fail after this change: That's a good thing, they actually run
clang tests now.

llvm-svn: 169695
2012-12-09 10:34:22 +00:00
Shuxin Yang 95de7c37e2 - Re-enable population count loop idiom recognization
- fix a bug which cause sigfault.
- add two testing cases which was causing crash

llvm-svn: 169687
2012-12-09 03:12:46 +00:00
Craig Topper a183ddb0fe Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
llvm-svn: 169684
2012-12-08 22:49:19 +00:00
Chandler Carruth 91e47532fe Revert the patches adding a popcount loop idiom recognition pass.
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.

This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.

llvm-svn: 169683
2012-12-08 22:18:29 +00:00
Nadav Rotem ad0b5fbe8c When we use the BLEND instruction that uses the MSB as a mask, we can remove
the VSRI instruction before it since it does not affect the MSB.

Thanks Craig Topper for suggesting this.

llvm-svn: 169638
2012-12-07 21:43:11 +00:00
Matthew Curtis 7a93811e8b In hexagon convertToHardwareLoop, don't deref end() iterator
In particular, check if MachineBasicBlock::iterator is end() before
using it to call getDebugLoc();

See also this thread on llvm-commits:
   http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html

llvm-svn: 169634
2012-12-07 21:03:15 +00:00
Nadav Rotem 481e50efe0 X86: Prefer using VPSHUFD over VPERMIL because it has better throughput.
llvm-svn: 169624
2012-12-07 19:01:13 +00:00
Eli Bendersky 2ccd044c22 Add separate statistics for Data and Inst fragments emitted during relaxation.
Also fixes a test that was overly-sensitive to the exact order of statistics
emitted.

llvm-svn: 169619
2012-12-07 17:59:21 +00:00
Tim Northover 5cc3dc86bb Added Mapping Symbols for ARM ELF
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code.  This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).

Patch based on work by Greg Fitzgerald.

llvm-svn: 169609
2012-12-07 16:50:23 +00:00
David Tweed cfeb8fc49a The test unconditionally assumes a particular cpu has a backend build in the target.
Buildbots for some hosts may choose to build only their own backend in order to
maximise testing-turnaround time. Move the test into a prefixed directory so
lit's standard "backend specific" suppression can be done.

llvm-svn: 169604
2012-12-07 15:57:45 +00:00
Chandler Carruth 80d3e56c73 Add support to ValueTracking for determining that a pointer is non-null
by virtue of inbounds GEPs that preclude a null pointer.

This is a very common pattern in the code generated by std::vector and
other standard library routines which use allocators that test for null
pervasively. This is one step closer to teaching Clang+LLVM to be able
to produce an empty function for:

  void f() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
  }

Which is related to getting them to completely fold SmallVector
push_back sequences into constants when inlining and other optimizations
make that a possibility.

llvm-svn: 169573
2012-12-07 02:08:58 +00:00
Dmitri Gribenko 1c704355cf Fix typos in CHECK lines.
Patch by Alexander Zinenko.

llvm-svn: 169547
2012-12-06 21:24:47 +00:00
Nadav Rotem ac450eb59e Fix a bug in the code that merges consecutive stores. Previously we did not
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.

llvm-svn: 169516
2012-12-06 17:34:13 +00:00
Evgeniy Stepanov 4f220d96c5 [msan] Do not store origin for clean values.
Instead of unconditionally storing origin with every application store,
only do this when the shadow of the stored value is != 0.

This change also delays instrumentation of stores until after the walk over
function's instructions, because adding new basic blocks confuses InstVisitor.

We only keep 1 origin value per 4 bytes of application memory. This change
fixes the bug when a store of a single clean byte wiped the origin for the
whole 4-byte area.

Since stores of uninitialized values are relatively uncommon, this change
improves performance of track-origins mode by 5% median and by up to 47% on
specs.

llvm-svn: 169490
2012-12-06 11:41:03 +00:00
Bill Wendling 28fe9e7a36 Handle non-default array bounds.
Some languages, e.g. Ada and Pascal, allow you to specify that the array bounds
are different from the default (1 in these cases). If we have a lower bound
that's non-default, then we emit the lower bound. We also calculate the correct
upper bound in those cases.

llvm-svn: 169484
2012-12-06 07:38:10 +00:00
Craig Topper 216bcd522b Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions.
llvm-svn: 169482
2012-12-06 07:31:16 +00:00
Evan Cheng 16846051db Properly fix the tes.
llvm-svn: 169464
2012-12-06 02:29:29 +00:00
NAKAMURA Takumi 1eccd286fd llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here.
llvm-svn: 169463
2012-12-06 02:22:58 +00:00
Chad Rosier 9f5c68af4c [arm fast-isel] Make the fast-isel implementation of memcpy respect alignment.
rdar://12821569

llvm-svn: 169460
2012-12-06 01:34:31 +00:00
Evan Cheng 5213139f48 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555

llvm-svn: 169459
2012-12-06 01:28:01 +00:00
Richard Smith 0ab6c5d9d8 PR10867: Analogue of r169441 for when using external 'sh'. And actually run the test!
llvm-svn: 169446
2012-12-05 23:15:33 +00:00
Richard Smith 69c87b0914 PR10867. lit would interpret
RUN: a
  RUN: b || true

as "a && (b || true)" in Tcl mode, and as "(a && b) || true" in sh mode.
Everyone seems to (quite reasonably) write tests assuming the Tcl behavior,
so use that in sh mode too.

llvm-svn: 169441
2012-12-05 22:54:26 +00:00
Andrew Trick fda7a8832d RegisterPressureTracker: fix findUseBetween to handle DebugValue
llvm-svn: 169427
2012-12-05 21:37:50 +00:00
Andrew Trick 7f7cee39ab RegisterPresssureTracker: Track live physical register by unit.
This is much simpler to reason about, more efficient, and
fixes some corner cases involving implicit super-register defs.
Fixed rdar://12797931.

llvm-svn: 169425
2012-12-05 21:37:42 +00:00
Nadav Rotem 0a471ea66c Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero.
llvm-svn: 169423
2012-12-05 21:21:26 +00:00
David Sehr 05176cad21 Correct ARM NOP encoding
The encoding of NOP in ARMAsmBackend.cpp is missing a trailing zero, which
causes the emission of a coprocessor instruction rather than "mov r0, r0"
as indicated in the comment.  The test also checks for the wrong encoding.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/157919.html

llvm-svn: 169420
2012-12-05 21:01:27 +00:00
Justin Holewinski fb711156ae [NVPTX] Fix crash with unnamed struct arguments
Patch by Eric Holk

llvm-svn: 169418
2012-12-05 20:50:28 +00:00
Michael J. Spencer 0c6ec48d0b Add dump of Win64 EH unwind data.
The new command line option -unwind-info dumps the Win64 EH unwind
data to the console. This is a nice feature if you need to debug
generated EH data (e.g. from LLVM). Includes a test case.

Initial patch by João Matos, extensions and rework by Kai Nacke.

llvm-svn: 169415
2012-12-05 20:12:35 +00:00
David Sehr 1fa8d37efc Test commit.
llvm-svn: 169410
2012-12-05 19:47:56 +00:00
Jyotsna Verma 90295156d8 Use multiclass to define store instructions with base+immediate offset
addressing mode and immediate stored value.

llvm-svn: 169408
2012-12-05 19:32:03 +00:00
Kevin Enderby 168ffb36a5 Added a option to the disassembler to print immediates as hex.
This is for the lldb team so most of but not all of the values are
to be printed as hex with this option.  Some small values like the
scale in an X86 address were requested to printed in decimal
without the leading 0x.

There may be some tweaks need to places that may still be in
decimal that they want in hex.  Specially for arm.  I made my best
guess.  Any tweaks from here should be simple.

I also did the best I know now with help from the C++ gurus
creating the cleanest formatImm() utility function and containing
the changes.  But if someone has a better idea to make something
cleaner I'm all ears and game for changing the implementation.

rdar://8109283

llvm-svn: 169393
2012-12-05 18:13:19 +00:00
Evgeniy Stepanov 8b51bab495 [msan] Instrument bswap intrinsic.
llvm-svn: 169383
2012-12-05 14:39:55 +00:00
Evgeniy Stepanov 474cb3b3b5 [msan] Change linkage type of __msan_track_origins.
LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the
current module.

llvm-svn: 169377
2012-12-05 12:49:41 +00:00
Elena Demikhovsky cd3c1c4a16 Simplified BLEND pattern matching for shuffles.
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.

llvm-svn: 169366
2012-12-05 09:24:57 +00:00
Shuxin Yang ada92f5018 fix a typo
llvm-svn: 169345
2012-12-05 00:33:16 +00:00