Commit Graph

30331 Commits

Author SHA1 Message Date
Filipe Cabecinhas 1c299d05e6 [BitcodeReader] Don't allow INSERTVAL/EXTRACTVAL with 0 indices
This would trigger an assertion later.

Bug found with AFL fuzz.

llvm-svn: 237494
2015-05-16 00:33:12 +00:00
Jingyue Wu 154eb5aa1d Add a speculative execution pass
Summary:
This is a pass for speculative execution of instructions for simple if-then (triangle) control flow. It's aimed at GPUs, but could perhaps be used in other contexts. Enabling this pass gives us a 1.0% geomean improvement on Google benchmark suites, with one benchmark improving 33%.

Credit goes to Jingyue Wu for writing an earlier version of this pass.

Patched by Bjarke Roune. 

Test Plan:
This patch adds a set of tests in test/Transforms/SpeculativeExecution/spec.ll
The pass is controlled by a flag which defaults to having the pass not run.

Reviewers: eliben, dberlin, meheff, jingyue, hfinkel

Reviewed By: jingyue, hfinkel

Subscribers: majnemer, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9360

llvm-svn: 237459
2015-05-15 17:54:48 +00:00
James Molloy 1675b4a57f Revert "Canonicalize min/max expressions correctly."
This reverts r237453 - it was causing timeouts on some bots. Reverting
while I investigate (it's probably InstCombine fighting itself...)

llvm-svn: 237458
2015-05-15 17:45:09 +00:00
Jingyue Wu 80a96d299a [SLSR] handle (B | i) * S
Summary:
Consider (B | i) * S as (B + i) * S if B and i have no bits set in
common.

Test Plan: @or in slsr-mul.ll

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9788

llvm-svn: 237456
2015-05-15 17:07:48 +00:00
James Molloy cfb0443af6 Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them.
The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for
most NEON datatypes - the big exclusion being v2i64.

llvm-svn: 237455
2015-05-15 16:15:57 +00:00
James Molloy 6edf0b4cd4 Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
  %2 = sext i32 %a to i64
  %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
  %2 = select i1 %1, i32 %a, i32 0
  %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237453
2015-05-15 16:10:59 +00:00
Simon Atanasyan eeb2fa9877 [llvm-readobj] Teach llvm-readobj to print PT_MIPS_ABIFLAGS program header
llvm-svn: 237451
2015-05-15 15:59:22 +00:00
Nemanja Ivanovic ce6211f7ff NFC - Test case invokes llc on a file rather than redirected from a file.
This has caused some local failures. Updating the test case to be more
like the majority of the similar test cases.
Committing on behalf of Hubert Tong (hstong@ca.ibm.com).

llvm-svn: 237449
2015-05-15 15:29:53 +00:00
James Molloy c0661aeaf8 [DependenceAnalysis] Fix for PR21585: collectUpperBound triggers asserts
collectUpperBound hits an assertion when the back edge count is wider then the desired type.

If that happens, truncate the backedge count.

Patch by Philip Pfaffe!

llvm-svn: 237439
2015-05-15 12:17:22 +00:00
Toma Tabacu a3d056fd4c [mips] [IAS] Fix expansion of negative 32-bit immediates for LI/DLI.
Summary:
To maintain compatibility with GAS, we need to stop treating negative 32-bit immediates as 64-bit values when expanding LI/DLI.
This currently happens because of sign extension.

To do this we need to choose the 32-bit value expansion for values which use their upper 33 bits only for sign extension (i.e. no 0's, only 1's).

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8662

llvm-svn: 237428
2015-05-15 09:42:11 +00:00
Sanjoy Das 2c2661456e [PlaceSafepoints] Fix a bug that came in with rL236672.
Transfer the calling convention from the invoke being replaced by
PlaceStatepoints to the new invoke to gc.statepoint created.  Add a test
case that would have caught this issue.

llvm-svn: 237414
2015-05-15 00:26:21 +00:00
Sanjoy Das 8045810c58 [PlaceSafepoints] Fix a bug that came in with rL236672.
rL236672 would generate all invoke statepoints with deopt args set to a
list containing the single element "0", instead of an empty list.

Also add a test case that would have caught this.

llvm-svn: 237413
2015-05-15 00:26:15 +00:00
Akira Hatanaka 60f21fdd50 Fix the check strings in a test case committed in r212455.
The access size (8, in this case) was missing in the function name that was
being checked.

llvm-svn: 237410
2015-05-15 00:12:26 +00:00
Jingyue Wu ca32190379 [ValueTracking] refactor: extract method haveNoCommonBitsSet
Summary:
Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in
InstCombine and SeparateConstOffsetFromGEP.

This patch also makes SeparateConstOffsetFromGEP more precise by passing
DominatorTree to computeKnownBits.

Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions

Reviewers: broune, meheff, majnemer

Reviewed By: majnemer

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9734

llvm-svn: 237407
2015-05-14 23:53:19 +00:00
Wei Mi bf727ba371 Add another InstCombine pass after LoopUnroll.
This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll.

Differential Revision: http://reviews.llvm.org/D9777

llvm-svn: 237395
2015-05-14 22:02:54 +00:00
Brendon Cahoon 7c8a3b0ef6 [Hexagon] Generate hardware loop for a vectorized loop
The induction variable in the vectorized loop wasn't
recognized properly, so a hardware loop wasn't generated.

Differential Revision: http://reviews.llvm.org/D9722

llvm-svn: 237388
2015-05-14 20:36:19 +00:00
Andrea Di Biagio 2905999c22 [ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'.
Function 'ConstantFoldScalarCall' (in ConstantFolding.cpp) works under the
wrong assumption that a call to 'convert.from.fp16' returns a value of
type 'float'.
However, intrinsic 'convert.from.fp16' can be overloaded; for example, we
can call 'convert.from.fp16.f64' to convert from half to double; etc.

Before this patch, the following example would have triggered an assertion
failure in opt (with -constprop):

```
define double @foo() {
entry:
  %0 = call double @llvm.convert.from.fp16.f64(i16 0)
  ret double %0
}
```

This patch fixes the problem in ConstantFolding.cpp. When folding a call to
convert.from.fp16, we perform a different kind of conversion based on the call
return type.

Added test 'Transform/ConstProp/convert-from-fp16.ll'.

Differential Revision: http://reviews.llvm.org/D9771

llvm-svn: 237377
2015-05-14 18:01:48 +00:00
Brendon Cahoon 485bea74ad [Hexagon] Remove dead constant assignment in hardware loop pass
After converting a loop to a hardware loop, the pass should remove
any unnecessary instructions from the old compare-and-branch
code. This patch removes a dead constant assignment that was
used in the compare instruction.

Differential Revision: http://reviews.llvm.org/D9720

llvm-svn: 237373
2015-05-14 17:31:40 +00:00
Toma Tabacu e625b5fc02 [mips] [IAS] Enforce .set nomacro.
Summary: When used, ".set nomacro" causes warning messages to be reported when we expand pseudo-instructions to multiple machine instructions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9564

llvm-svn: 237366
2015-05-14 14:51:32 +00:00
Brendon Cahoon 9376e9998e [Hexagon] Check for underflow/wrap in hardware loop pass
If the loop trip count may underflow or wrap, the compiler should
not generate a hardware loop since the trip count will be
incorrect.

llvm-svn: 237365
2015-05-14 14:15:08 +00:00
Toma Tabacu 772155cbc6 [mips] [IAS] Emit .set macro/nomacro.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9563

llvm-svn: 237363
2015-05-14 13:42:10 +00:00
Vasileios Kalintiris 70b744e4a1 [mips] Do not place users of $ra in the delay slot of call instructions.
Summary:
When we are trying to fill the delay slot of a call instruction, we must avoid
filler instructions that use the $ra register. This fixes the test
MultiSource/Applications/JM/lencod when we enable the forward delay slot filler.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9670

llvm-svn: 237362
2015-05-14 13:17:56 +00:00
Artyom Skrobov a70dfe18d3 Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases
No longer breaks SPEC2000/2006

llvm-svn: 237361
2015-05-14 12:59:46 +00:00
Adam Nemet 938d3d63d6 New Loop Distribution pass
Summary:
This implements the initial version as was proposed earlier this year
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html).
Since then Loop Access Analysis was split out from the Loop Vectorizer
and was made into a separate analysis pass.  Loop Distribution becomes
the second user of this analysis.

The pass is off by default and can be enabled
with -enable-loop-distribution.  There is currently no notion of
profitability; if there is a loop with dependence cycles, the pass will
try to split them off from other memory operations into a separate loop.

I decided to remove the control-dependence calculation from this first
version.  This and the issues with the PDT are actively discussed so it
probably makes sense to treat it separately.  Right now I just mark all
terminator instruction required which keeps identical CFGs for each
distributed loop.  This seems to be working pretty well for 456.hmmer
where even though there is an empty if-then block in the distributed
loop initially, it gets completely removed.

The pass keeps DominatorTree and LoopInfo updated.  I've tested this
with -loop-distribute-verify with the testsuite where we distribute ~90
loops.  SimplifyLoop is violated in some cases and I have a FIXME
covering this.

Reviewers: hfinkel, nadav, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8831

llvm-svn: 237358
2015-05-14 12:05:18 +00:00
Toma Tabacu ec1de82213 [mips] [IAS] Warn when LA is used with a 64-bit symbol.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9295

llvm-svn: 237356
2015-05-14 10:53:40 +00:00
Elena Demikhovsky d5b3e376d2 AVX-512: Added i1 type handling for calling conventions.
i1 type is a legal type on AVX-512 and can be passed as parameter or return value.
i1 is promoted to i8 on return and to i32 for call arguments (i8 is also promoted to i32 here).
The result code is similar to the previous X86 targets, where i1 is allways promoted to i8.

llvm-svn: 237350
2015-05-14 09:04:45 +00:00
Andy Ayers 9e5c851419 Don't omit the constant when computing a cross-section relative relocation.
Differential Revision: http://reviews.llvm.org/D9692

llvm-svn: 237327
2015-05-14 01:10:41 +00:00
Ahmed Bougacha 6402ad27c0 [CodeGen] Use standard -not gnueabi- naming for f16 libcalls on Darwin.
Other targets probably should as well.  Since r237161, compiler-rt has
both, but I don't see why anything other than gnueabi would use a
gnueabi naming scheme.

llvm-svn: 237324
2015-05-14 01:00:51 +00:00
Alex Lorenz a22b250c6f YAML: Implement block scalar parsing.
This commit implements the parsing of YAML block scalars.
Some code existed for it before, but it couldn't parse block
scalars.

This commit adds a new yaml node type to represent the block
scalar values. 

This commit also deletes the 'spec-09-27' and 'spec-09-28' tests
as they are identical to the test file 'spec-09-26'.

This commit introduces 3 new utility functions to the YAML scanner
class: `skip_s_space`, `advanceWhile` and `consumeLineBreakIfPresent`.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D9503

llvm-svn: 237314
2015-05-13 23:10:51 +00:00
Douglas Katzman 6dc1397298 [X86] Fix PR23271 - RIP-relative decoding bug in disassembler.
Differential Revision: http://reviews.llvm.org/D9110

llvm-svn: 237310
2015-05-13 22:44:52 +00:00
Justin Bogner d0ceebf160 InstrProf: Fix display of large numbers in llvm-cov
llvm-cov was truncating numbers that were larger than a particular
fixed width, which is as confusing as it is useless. Instead, we use
engineering notation with SI prefix for magnitude.

llvm-svn: 237307
2015-05-13 22:41:48 +00:00
Sanjoy Das ba74e645d8 [PlaceSafepoints] New attributes for patchable statepoints.
Summary:
This patch teaches the PlaceSafepoints pass about two `CallSite`
function attributes:

 * "statepoint-id": if the string value of this attribute can be parsed
   as an integer, then it is propagated to the ID parameter of the
   statepoint created.

 * "statepoint-num-patch-bytes": if the string value of this attribute
   can be parsed as an integer, then it is propagated to the `num patch
   bytes` parameter of the statepoint created.

This change intentionally does not assert on a malformed value for these
attributes, given that they're not "official" attributes.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9735

llvm-svn: 237286
2015-05-13 20:11:31 +00:00
Jingyue Wu c74e33bffe [NaryReassociate] avoid running forever
Avoid running forever by checking we are not reassociating an expression into
the same form.

Tested with @avoid_infinite_loops in nary-add.ll

llvm-svn: 237269
2015-05-13 18:12:24 +00:00
Brendon Cahoon d11c92a41c [Hexagon] Generate loop1 instruction for nested loops
loop1 is for the outer loop and loop0 is for the inner loop.

Differential Revision: http://reviews.llvm.org/D9680

llvm-svn: 237266
2015-05-13 17:56:03 +00:00
Diego Novillo ffc84e378a Add function entry counts from sample profiles.
This patch uses the new function profile metadata "function_entry_count"
to annotate entry counts from sample profiles.

In a sampling profile, the total samples collected at the function entry
are an approximation for the number of times that function was invoked.

llvm-svn: 237265
2015-05-13 17:04:29 +00:00
Diego Novillo 2567f3d0fb Add function entry count metadata.
Summary:
This adds three Function methods to handle function entry counts:
setEntryCount() and getEntryCount().

Entry counts are stored under the MD_prof metadata node with the name
"function_entry_count". They are unsigned 64 bit values set by profilers
(instrumentation and sample profiler changes coming up).

Added documentation for new profile metadata and tests.

Reviewers: dexonsmith, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9628

llvm-svn: 237260
2015-05-13 15:13:45 +00:00
Brendon Cahoon 254e656862 [Hexagon] Generate hardware loop when loop has a critical edge
The hardware loop pass should try to generate a hardware loop
instruction when the original loop has a critical edge.

Differential Revision: http://reviews.llvm.org/D9678

llvm-svn: 237258
2015-05-13 14:54:24 +00:00
Jozef Kolek 6fec325d10 [mips][microMIPSr6] Implement CLO and CLZ instructions
This patch implements CLO and CLZ instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8553

llvm-svn: 237257
2015-05-13 14:18:11 +00:00
Silviu Baranga 780a3b3be7 Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006
llvm-svn: 237256
2015-05-13 14:03:18 +00:00
Toma Tabacu d0a7ff2ed7 [mips] [IAS] Unify common functionality of LA and LI.
Summary: A side-effect of this is that LA gains proper handling of unsigned and positive signed 16-bit immediates and more accurate error messages.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9290

llvm-svn: 237255
2015-05-13 13:56:16 +00:00
Artyom Skrobov b526681e08 [AArch64] Codegen VMAX/VMIN for safe math cases
llvm-svn: 237247
2015-05-13 12:01:09 +00:00
Michael Kuperstein c3434b390d Reverting r237234, "Use std::bitset for SubtargetFeatures"
The buildbots are still not satisfied.
MIPS and ARM are failing (even though at least MIPS was expected to pass).

llvm-svn: 237245
2015-05-13 10:28:46 +00:00
Toma Tabacu 9fb2ff71ca [mips] [IAS] Merge the micromips-expressions.s test into expr1.s. NFC.
Summary: Also did some minor reformatting in the resulting test.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9702

llvm-svn: 237242
2015-05-13 09:53:53 +00:00
Sergey Dmitrouk 46c4f02848 [DebugInfo] Debug locations for constant SD nodes
Several updates for [DebugInfo] Add debug locations to constant SD nodes (r235989).
Includes:

 *  re-enabling the change (disabled recently);
 *  missing change for FP constants;
 *  resetting debug location of constant node if it's used more than at one place
    to prevent emission of wrong locations in case of coalesced constants;
 *  a couple of additional tests.

Now all look ups in CSEMap are wrapped by additional method.

Comment in D9084 suggests that debug locations aren't useful for "target constants",
so there might be one more change related to this API (namely, dropping debug
locations for getTarget*Constant methods).

Differential Revision: http://reviews.llvm.org/D9604

llvm-svn: 237237
2015-05-13 08:58:03 +00:00
Michael Kuperstein aba4a34ef2 Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first two times this was committed (r229831, r233055), it caused several buildbot failures. 
At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed.

llvm-svn: 237234
2015-05-13 08:27:08 +00:00
Elena Demikhovsky 1b2f2f1b37 AVX-512: fixed a bug in encoding of VPSRAQ instrcution,
added a bunch of encoding tests.

llvm-svn: 237232
2015-05-13 07:35:05 +00:00
Jingyue Wu 4b6125d788 [SLSR] handles non-canonicalized Mul candidates
such as (2 + B) * S.

Tested by @non_canonicalized in slsr-mul.ll

llvm-svn: 237216
2015-05-13 00:03:17 +00:00
Sanjoy Das a1d39ba940 [Statepoints] Support for "patchable" statepoints.
Summary:
This change adds two new parameters to the statepoint intrinsic, `i64 id`
and `i32 num_patch_bytes`.  `id` gets propagated to the ID field
in the generated StackMap section.  If the `num_patch_bytes` is
non-zero then the statepoint is lowered to `num_patch_bytes` bytes of
nops instead of a call (the spill and reload code remains unchanged).
A non-zero `num_patch_bytes` is useful in situations where a language
runtime requires complete control over how a call is lowered.

This change brings statepoints one step closer to patchpoints.  With
some additional work (that is not part of this patch) it should be
possible to get rid of `TargetOpcode::STATEPOINT` altogether.

PlaceSafepoints generates `statepoint` wrappers with `id` set to
`0xABCDEF00` (the old default value for the ID reported in the stackmap)
and `num_patch_bytes` set to `0`.  This can be made more sophisticated
later.

Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9546

llvm-svn: 237214
2015-05-12 23:52:24 +00:00
Saleem Abdulrasool ee13fbe848 CodeGen: ignore DEBUG_VALUE nodes in KILL tagging
DEBUG_VALUE nodes do not take part in code generation.  Ignore them when
performing KILL updates.  Addresses PR23486.

llvm-svn: 237211
2015-05-12 23:36:18 +00:00
Chandler Carruth 942fba95e2 Revert r237175: [X86] Always return the sret parameter in eax/rax ...
This commit broke an x86 test and the bots have been broken for well
over an hour now so I'm just reverting.

llvm-svn: 237210
2015-05-12 23:34:27 +00:00
Bjorn Steinbrink 2833966a3c CVP: Improve handling of Selects used as incoming PHI values
Summary:
If the branch that leads to the PHI node and the Select instruction
depend on correlated conditions, we might be able to directly use the
corresponding value from the Select instruction as the incoming value
for the PHI node, allowing later removal of the select instruction.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9051

llvm-svn: 237201
2015-05-12 22:31:47 +00:00
Philip Reames 311f710654 [RewriteStatepointsForGC] Extend base pointer to handle more cases w/vectors
When relocating a pointer, we need to determine a base pointer for the derived pointer being relocated. We have limited support for handling a pointer extracted from a vector; the current code only handled the case where the entire vector was known to contain base pointers. This patch extends the reasoning to handle chains of insertelements where the indices are constants. This case turns out to be fairly common in vectorized code. We can now handle vectors which contains mixtures of base and derived pointers provided the insertelements use constant indices.

Note that this doesn't solve the general problem. To handle variable indexed insertelements, we'd need to scalarize and introduce conditional branching based on the index. Alternatively, we could eagerly scalarize, but the code structure doesn't currently make either fix easy. The patch also doesn't handle shufflevector or other vector manipulation for much the same reasons. I plan to defer this work until I have a motivating test case.

Differential Revision: http://reviews.llvm.org/D9676

llvm-svn: 237200
2015-05-12 22:19:52 +00:00
Arnold Schwaighofer 6a8c5f6403 MergeFunctions: Two different sized allocas are *not* the same
llvm-svn: 237193
2015-05-12 21:42:22 +00:00
Reid Kleckner b465563b46 [X86] Always return the sret parameter in eax/rax, even on 32-bit
Summary:
This rule was always in the old SysV i386 ABI docs and the new ones that
H.J. Lu has put together, but we never noticed:

  EAX   scratch register; also used to return integer and pointer values
        from functions; also stores the address of a returned struct or union

Fixes PR23491.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9715

llvm-svn: 237175
2015-05-12 20:56:32 +00:00
Philip Reames 5708cca7ab [PlaceSafepoints] Remove dependence on LoopSimplify
As a step towards getting rid of internal pass manager hack entirely, remove the need for loop simplify to run in the inner pass manager. The new code does produce slightly different loop structures, so this isn't technically NFC.

Differential Revision: http://reviews.llvm.org/D9585

llvm-svn: 237172
2015-05-12 20:43:48 +00:00
Sundeep Kushwaha 83a4039c17 [PATCH] [HEXAGON] Add a test program to verify calling convention
for large struct return by value.

Differential Revision: http://reviews.llvm.org/D9709

llvm-svn: 237170
2015-05-12 20:13:10 +00:00
Pat Gavlin c7dc6d6ee7 [Statepoints] Split the calling convention and statepoint flags operand to STATEPOINT into two separate operands.
Differential Revision: http://reviews.llvm.org/D9623

llvm-svn: 237166
2015-05-12 19:50:19 +00:00
Jozef Kolek 38bb81db85 [mips][microMIPSr6] Implement SELEQZ and SELNEZ instructions
This patch implements SELEQZ and SELNEZ instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8497

llvm-svn: 237158
2015-05-12 17:39:32 +00:00
Michael Zolotukhin 8c68171fef Reimplement heuristic for estimating complete-unroll optimization effects.
Summary:
This patch reimplements heuristic that tries to estimate optimization beneftis
from complete loop unrolling.

In this patch I kept the minimal changes - e.g. I removed code handling
branches and folding compares. That's a promising area, but now there
are too many questions to discuss before we can enable it.

Test Plan: Tests are included in the patch.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8816

llvm-svn: 237156
2015-05-12 17:20:03 +00:00
Petar Jovanovic e0de8f4efb [Mips] Return false for isFPCloseToIncomingSP()
On Mips, frame pointer points to the same side of the frame as the stack
pointer. This function is used to decide where to put register scavenging
spill slot. So far, it was put on the wrong side of the frame, and thus it
was too far away from $fp when frame was larger than 2^15 bytes.

Patch by Vladimir Radosavljevic.

http://reviews.llvm.org/D8895

llvm-svn: 237153
2015-05-12 17:14:05 +00:00
Tom Stellard 28d13a4b12 R600/SI: add pass to mark CF live ranges as non-spillable
Spilling can insert instructions almost anywhere, and this can mess
up control flow lowering in a multitude of ways, due to instruction
reordering. Let's sort this out the easy way: never spill registers
involved with control flow, i.e. saved EXEC masks.

Unfortunately, this does not work at all with optimizations disabled,
as the register allocator ignores spill weights. This should be
addressed in a future commit.

The test was reduced from the "stacks" shader of [1]. Some issues
trigger the machine verifier while another one is checked manually.

[1] http://madebyevan.com/webgl-path-tracing/

v2: only insert pass with optimizations enabled, merge test runs.

Patch by: Grigori Goronzy

llvm-svn: 237152
2015-05-12 17:13:02 +00:00
Sunil Srivastava d79dfcbc37 Changed renaming of local symbols by inserting a dot vefore the numeric suffix.
One code change and several test changes to match that
details in http://reviews.llvm.org/D9481

llvm-svn: 237150
2015-05-12 16:47:30 +00:00
Keith Walker ea9483f847 [DWARF] Add CIE header fields address_size and segment_size when generating dwarf-4
The DWARF-4 specification added 2 new fields in the CIE header called
address_size and segment_size.
Create these 2 new fields when generating dwarf-4 CIE entries, print out
the new fields when dumping the CIE and update tests

Differential Revision: http://reviews.llvm.org/D9558

llvm-svn: 237145
2015-05-12 15:25:08 +00:00
Tom Stellard 8f96dfc9ea R600/SI: Remove M0Reg register class
It is no longer used.

llvm-svn: 237142
2015-05-12 15:00:52 +00:00
Tom Stellard 381a94a764 R600/SI: Remove explicit m0 operand from DS instructions
Instead add m0 as an implicit operand.  This helps avoid spills
of the m0 register in some cases.

llvm-svn: 237141
2015-05-12 15:00:49 +00:00
Tom Stellard 7b222e110f R600/SI: Make sendmsg test more strict
We want to make sure that the m0 copies are being cse'd.

llvm-svn: 237134
2015-05-12 14:18:16 +00:00
Elena Demikhovsky fae20d3565 AVX-512, X86: Added lowering for shift operations for SKX.
The other changes in the LowerShift() are not functional,
just to make the code more convenient.
So, the functional changes for SKX only.

llvm-svn: 237129
2015-05-12 13:25:46 +00:00
John Brawn 70605f7d22 [ARM] Use AEABI aligned function variants
AEABI defines aligned variants of memcpy etc. that can be faster than
the default version due to not having to do alignment checks. When
emitting target code for these functions make use of these aligned
variants if possible. Also convert memset to memclr if possible.

Differential Revision: http://reviews.llvm.org/D8060

llvm-svn: 237127
2015-05-12 13:13:38 +00:00
Igor Laevsky 87ef5eaf46 Reverse ordering of base and derived pointer during safepoint lowering.
According to the documentation in StackMap section for the safepoint we should have:
"The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated."
But before this change we emitted them in reverse order - derived pointer first, base pointer second.

llvm-svn: 237126
2015-05-12 13:12:14 +00:00
Vasileios Kalintiris b48c905613 [mips][FastISel] Handle calls with non legal types i8 and i16.
Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel.

Based on a patch by Reed Kotler.

Test Plan:
"Make check" test forthcoming.
Test-suite passes at O0/O2 and with mips32 r1/r2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6770

llvm-svn: 237121
2015-05-12 12:29:17 +00:00
Vasileios Kalintiris c07f848785 [mips][FastISel] Simplify callabi.ll by using multiple check prefixes.
Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9635

llvm-svn: 237119
2015-05-12 12:17:11 +00:00
Vasileios Kalintiris 32cd69a2eb [mips][FastISel] Allow computation of addresses from constant expressions.
Summary:
Try to compute addresses when the offset from a memory location is a constant
expression.

Based on a patch by Reed Kotler.

Test Plan:
Passes test-suite for -O0/O2 and mips 32 r1/r2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D6767

llvm-svn: 237117
2015-05-12 12:08:31 +00:00
Elena Demikhovsky 6f30dc18d3 AVX-512: asm parser errors check
I reverted the error check that was removed in 236416.
I put the it in a separate file.

llvm-svn: 237107
2015-05-12 09:47:23 +00:00
Elena Demikhovsky c1ac5d7bd5 AVX-512: select operation for i1 vectors
like: select i1 %cond, <16 x i1> %a, <16 x i1> %b.
I added pseudo-CMOV patterns to resolve the "select".
Added tests for KNL and SKX.

llvm-svn: 237106
2015-05-12 09:36:52 +00:00
Michael Kuperstein 6f5ff6905c [X86] DAGCombine should not assume arbitrary vector types are simple
The X86-specific DAGCombine for stores should not assume vector types are always simple.
This fixes PR23476.

Differential Revision: http://reviews.llvm.org/D9659

llvm-svn: 237097
2015-05-12 07:33:07 +00:00
Eric Christopher 824f42f209 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

llvm-svn: 237079
2015-05-12 01:26:05 +00:00
Ahmed Bougacha b61696656e [MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.
This fixes another miscompile introduced by r235232: when there was a
dependency on the memcpy destination other than the memset, we would
ignore it, because we only looked at the source dependency.

It was a mistake to use SrcDepInfo.  Instead, just use DepInfo.

llvm-svn: 237066
2015-05-11 23:09:46 +00:00
Andrew Kaylor cc14f387e8 [WinEH] Handle nested landing pads that return directly to the parent function.
Differential Revision: http://reviews.llvm.org/D9684

llvm-svn: 237063
2015-05-11 23:06:02 +00:00
Andrew Kaylor 762a6bea1f [WinEH] Update exception numbering to give handlers their own base state.
Differential Revision: http://reviews.llvm.org/D9512

llvm-svn: 237014
2015-05-11 19:41:19 +00:00
Sanjoy Das 89c5491a72 [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers
Summary:
In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for
each relocated pointer, and the gc_relocate has the same type with the
pointer. During the creation of gc_relocate intrinsic, llvm requires to
mangle its type. However, llvm does not support mangling of all possible
types. RewriteStatepointsForGC will hit an assertion failure when it
tries to create a gc_relocate for pointer to vector of pointers because
mangling for vector of pointers is not supported.

This patch changes the way RewriteStatepointsForGC pass creates
gc_relocate. For each relocated pointer, we erase the type of pointers
and create an unified gc_relocate of type i8 addrspace(1)*. Then a
bitcast is inserted to convert the gc_relocate to the correct type. In
this way, gc_relocate does not need to deal with different types of
pointers and the unsupported type mangling is no longer a problem. This
change would also ease further merge when LLVM erases types of pointers
and introduces an unified pointer type.

Some minor changes are also introduced to gc_relocate related part in
InstCombineCalls, CodeGenPrepare, and Verifier accordingly.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9592

llvm-svn: 237009
2015-05-11 18:49:34 +00:00
Pirama Arumuga Nainar af171e7720 [X86] Updates to X86 backend for f16 promotion
Summary:
r235215 adds support for f16 to be considered as a load/store type and
promote f16 operations to f32.

This patch has miscellaneous fixes for the X86 backend so all f16
operations are handled:
1. Set loadextaction for f16 vectors to expand.
2. Handle FP_EXTEND in a switch statement when handling v2f32
3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or
(store (SINT_TO_FP )) to a FILD.

Tests included.

Reviewers: ab, srhines, delena

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9092

llvm-svn: 237004
2015-05-11 17:14:39 +00:00
Adam Nemet 0fd96dddf5 [Testsuite] Renumber metadata in ScopedNoAliasAA test to match CHECK lines
Summary:
Now it's much easier to follow what's happening in this test.

Also removed some unused metadata entries.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9601

llvm-svn: 236981
2015-05-11 09:10:14 +00:00
Elena Demikhovsky 3822af0715 AVX-512: Changed CC parameter in "cmp" intrinsic
from i8 to i32 according to the Intel Spec

by Igor Breger (igor.breger@intel.com)

llvm-svn: 236979
2015-05-11 09:03:14 +00:00
Hal Finkel f0d68d788b [InstCombine/PowerPC] Fix single-precision QPX load/store replacement
The QPX single-precision load/store intrinsics have implied
truncation/extension from/to the declared value type of <4 x double> to the
memory type of <4 x float>. When we can prove the alignment of the pointer
argument, and thus replace the intrinsic with a regular load or store, we need
to load or store the correct data type (<4 x float>) instead of (<4 x double>).

llvm-svn: 236973
2015-05-11 06:37:03 +00:00
Elena Demikhovsky 0d7e9364d1 AVX-512: Added SKX instructions and intrinsics:
{add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae

By Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236971
2015-05-11 06:05:05 +00:00
David Majnemer 176fd7c4af Make buildbots happy
llvm-svn: 236970
2015-05-11 05:33:27 +00:00
David Majnemer 7536460c0f [InstCombine] Canonicalize single element array store
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9591

llvm-svn: 236969
2015-05-11 05:04:27 +00:00
David Majnemer 58fb038b1b [InstCombine] Canonicalize single element array load
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9596

llvm-svn: 236968
2015-05-11 05:04:22 +00:00
Elena Demikhovsky f40342d6a2 AVX-512: fixed UINT_TO_FP operation for 512-bit types.
llvm-svn: 236955
2015-05-10 14:23:52 +00:00
Elena Demikhovsky 75d1489326 AVX-512: fixed a bug in i1 vectors lowering
llvm-svn: 236947
2015-05-10 10:33:32 +00:00
NAKAMURA Takumi 7f52d6d82c llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/
llvm-svn: 236928
2015-05-09 05:59:00 +00:00
James Y Knight fca02be3c1 Fix MergeConsecutiveStore for non-byte-sized memory accesses.
The bug showed up as a compile-time assertion failure:
  Assertion `NumBits >= MIN_INT_BITS && "bitwidth too small"' failed
when building msan tests on x86-64.

Prior to r236850, this bug was masked due to a bogus alignment check,
which also accidentally rejected non-byte-sized accesses. Afterwards,
an invalid ElementSizeBytes == 0 got further into the function, and
triggered the assertion failure.

It would probably be a good idea to allow it to handle merging stores
of unusual widths as well, but for now, to un-break it, I'm just
making the minimal fix.

Differential Revision: http://reviews.llvm.org/D9626

llvm-svn: 236927
2015-05-09 03:13:37 +00:00
Pete Cooper d54fb89901 [Fast-ISel] Don't mark the first use of a remat constant as killed.
When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to
mark the vreg containing 1000 as killed.  Given that we go bottom up in fast-isel, a later
use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill.

However, rematerialised constant expressions aren't generated bottom up.  The local value save area
grows downwards.  This means that if you remat 2 constant expressions which both use 1000 then the
first will kill it, then the second, which is *lower* in the BB will read a killed register.

This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset.

Note that this commit only makes kill flag generation conservative.  There's nothing else obviously wrong with
the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions.

However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely.

llvm-svn: 236922
2015-05-09 00:51:03 +00:00
Arnold Schwaighofer f54b73d681 ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

llvm-svn: 236916
2015-05-08 23:52:00 +00:00
Hans Wennborg ae0254dabc Switch lowering: cluster adjacent fall-through cases even at -O0
It's cheap to do, and codegen is much faster if cases can be merged
into clusters.

llvm-svn: 236905
2015-05-08 21:23:39 +00:00
Pete Cooper e4bb07ecff [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register.  This is done via updateValueMap,.

However, its possible that the source register we are rewriting *to* to also have uses.  If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.

This code checks for the case where the to register is also used, and if so it clears all kill on the from register.  This is conservative, but better that always clearing kills on the from register.

llvm-svn: 236897
2015-05-08 20:46:54 +00:00
Brendon Cahoon bece8edcdd [Hexagon] Generate more hardware loops
Refactored parts of the hardware loop pass to generate
more. Also, added more tests.

Differential Revision: http://reviews.llvm.org/D9568

llvm-svn: 236896
2015-05-08 20:18:21 +00:00
Sanjoy Das 14f5080aa1 [BasicAA] Fix zext & sext handling
Summary:

There are several unhandled edge cases in BasicAA's GetLinearExpression
method. This changes fixes outstanding issues, including zext / sext of
a constant with the sign bit set, and the refusal to decompose zexts or
sexts of wrapping arithmetic.

Test Plan: Unit tests added in //q.ext.ll//.

Patch by Nick White.

Reviewers: hfinkel, sanjoy

Reviewed By: hfinkel, sanjoy

Subscribers: sanjoy, llvm-commits, hfinkel

Differential Revision: http://reviews.llvm.org/D6682

llvm-svn: 236894
2015-05-08 18:58:55 +00:00
Pete Cooper 7f7c9f1dad [X86] Fast-ISel was incorrectly always killing the source of a truncate.
A trunc from i32 to i1 on x86_64 generates an instruction such as

%vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9

However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one.
Otherwise, we are killing the source of the truncate which could be used later in the program.

llvm-svn: 236890
2015-05-08 18:29:42 +00:00
Pat Gavlin cc0431d1c0 Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware.
This changes the shape of the statepoint intrinsic from:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args)

to:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args)

This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back.

In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation.

Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops.

Differential Revision: http://reviews.llvm.org/D9501

llvm-svn: 236888
2015-05-08 18:07:42 +00:00
Jingyue Wu 78b0d51cd0 [NoTTI] reject negative scale in addressing mode
Summary:
I noticed this bug when deubging a WIP on LSR. I wonder whether and how we
should add a regression test for this.

Test Plan: no tests failed.

Reviewers: atrick

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9536

llvm-svn: 236887
2015-05-08 18:07:24 +00:00
Pete Cooper 85b1c48b20 Clear kill flags on all used registers when sinking instructions.
The test here was sinking the AND here to a lower BB:

	%vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8
	TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8

which meant that vreg8 was read after it was killed.

This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND.

llvm-svn: 236886
2015-05-08 17:54:32 +00:00
Pete Cooper eaef4bcd40 Remove duplicate cmake target I added in r236792.
Thanks to Daniel Jasper for pointing out the mistake.

llvm-svn: 236881
2015-05-08 16:59:53 +00:00
Brendon Cahoon df43e68629 [Hexagon] Update AnalyzeBranch, etc target hooks
Improved the AnalyzeBranch, InsertBranch, and RemoveBranch
functions in order to handle more of our branch instructions.
This requires changes to analyzeCompare and PredicateInstructions.
Specifically, we've added support for new value compare jumps,
improved handling of endloop, added more compare instructions,
and improved support for predicate instructions.

Differential Revision: http://reviews.llvm.org/D9559

llvm-svn: 236876
2015-05-08 16:16:29 +00:00
Andrea Di Biagio 84e22b9096 [X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask.
The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes
where the mask node is a load from constant pool, and the constant pool node
is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching
it how to also look through X86ISD::WrapperRIP.

This helps function combineX86ShufflesRecusively to combine more shuffle
sequences containing PSHUFB nodes if we are in RIPRel PIC mode.

Before this change, llc (with -relocation-model=pic -march=x86-64) was unable
to decode a pshufb where the mask was loaded from a constant pool. For example,
the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its
operand, so instead of generating a single 'movaps' the backend always
generated a sub-optimal 'movdqa + pshufb' sequence.

Added test x86-fold-pshufb.ll.

llvm-svn: 236863
2015-05-08 15:11:07 +00:00
Jozef Kolek 8abad7bacc [mips][microMIPSr6] Implement ALUIPC and AUIPC instructions
This patch implements ALUIPC and AUIPC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8441

llvm-svn: 236858
2015-05-08 14:25:11 +00:00
James Y Knight 38514d2177 Fix test added in r236850 for OSX builders.
Need to specify triple so that llvm emits the asm syntax that the
test expected.

llvm-svn: 236855
2015-05-08 14:04:54 +00:00
Jozef Kolek 9ce6e0a926 [mips][microMIPSr6] Implement ADDIUPC and LWPC instructions
This patch implements ADDIUPC and LWPC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8415

llvm-svn: 236852
2015-05-08 13:52:04 +00:00
James Y Knight 284e7b3d6c Fix alignment checks in MergeConsecutiveStores.
1) check whether the alignment of the memory is sufficient for the
*merged* store or load to be efficient.

Not doing so can result in some ridiculously poor code generation, if
merging creates a vector operation which must be aligned but isn't.

2) DON'T check that the alignment of each load/store is equal. If
you're merging 2 4-byte stores, the first *might* have 8-byte
alignment, but the second certainly will have 4-byte alignment. We do
want to allow those to be merged.

llvm-svn: 236850
2015-05-08 13:47:01 +00:00
Vasileios Kalintiris 42544d6472 [mips] Emit the .insn directive for empty basic blocks.
Summary:
In microMIPS, labels need to know whether they are on code or data. This is
indicated with STO_MIPS_MICROMIPS and can be inferred by being followed
by instructions. For empty basic blocks, we can ensure this by emitting the
.insn directive after the label.

Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the
exception handling regression tests under: SingleSource/Regression/C++/EH

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9530

llvm-svn: 236815
2015-05-08 09:10:15 +00:00
Simon Atanasyan 40e7eb166a [llvm-readobj/obj2yaml/yaml2obj] Support MIPS machine ELF header flags
llvm-svn: 236807
2015-05-08 07:04:59 +00:00
Eric Christopher 81fa35ca24 Now that we have a soft-float attribute, use it instead of the
hard coded command line option for the Mips soft float tests.

llvm-svn: 236801
2015-05-08 00:57:22 +00:00
Pete Cooper fb13c57669 Add yaml-bench to the list of tools make check needs to run
llvm-svn: 236792
2015-05-07 22:53:11 +00:00
NAKAMURA Takumi 723b449df5 [CMake] llvm/test/YAMLParser requires yaml-bench. This fixes r236754.
llvm-svn: 236787
2015-05-07 22:24:58 +00:00
Pete Cooper ba593ad3f3 Clear kill flags in tail duplication.
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite.
Otherwise we might be killing a register which was used in other BBs.

For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated.

	%vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10
	%vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10

	The copy here is inserted after the add and so needs vreg10 to be live.

llvm-svn: 236782
2015-05-07 21:48:26 +00:00
Ismail Pazarbasi d6291964fb When checking msan.module_ctor, use CHECK-LABEL instead of CHECK
llvm-svn: 236781
2015-05-07 21:47:25 +00:00
Ismail Pazarbasi e5048e153a MSan: Use `createSanitizerCtor` to create ctor, and call `__msan_init`
Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8781

llvm-svn: 236779
2015-05-07 21:41:52 +00:00
Ismail Pazarbasi 2d4ae9f0d5 TSan: Use `createSanitizerCtor` to create ctor, and call `__tsan_init`
Reviewers: kcc, dvyukov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8779

llvm-svn: 236778
2015-05-07 21:41:23 +00:00
Ismail Pazarbasi 09c3709e75 ASan: Use `createSanitizerCtor` to create ctor, and call `__asan_init`
Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8778

llvm-svn: 236777
2015-05-07 21:40:46 +00:00
Pete Cooper f52123b454 [AArch64] Fix sext/zext folding in address arithmetic.
We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there.

Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use.

llvm-svn: 236764
2015-05-07 19:21:36 +00:00
Sergey Dmitrouk 9a6caea967 Disable r235989 "Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes""
Will be re-enabled with missing changes for ConstantFPSDNode and
fixes for wrong locations due to constant coalescing.

llvm-svn: 236758
2015-05-07 18:33:50 +00:00
Nemanja Ivanovic f3c94b1e3c Add VSX Scalar loads and stores to the PPC back end
This patch corresponds to review:
http://reviews.llvm.org/D9440

It adds a new register class to the PPC back end to contain single precision
values in VSX registers. Additionally, it adds scalar loads and stores for
VSX registers.

llvm-svn: 236755
2015-05-07 18:24:05 +00:00
Alex Lorenz e4bcfbf5dc YAML: Enable the YAMLParser tests.
This commit enables the tests located in test/YAMLParser directory.
Those tests were never actually enabled, as llvm-lit didn't pick up the
files with the 'data' extension. The commit renames those test files to files
with the 'test' extension so that llvm-lit would find them.

This commit also modifies yaml-bench so that it returns an error status
if an error occurred during parsing. It also adds the '-use-color'
command line option to yaml-bench (to make sure that file check matches
the error messages in the output stream).

This commit modifies some of the renamed tests so that they wouldn't
fail. It gets rid of XFAILs and uses the 'not' command instead for
some of the tests that have to fail during parsing. This commit
also adds some 'FIXME' comments to a couple of tests that are
supposed to fail but currently pass because of various bugs
in the implementation of the yaml parser.

Reviewers: Justin Bogner

Differential Revision: http://reviews.llvm.org/D9448

llvm-svn: 236754
2015-05-07 18:08:46 +00:00
Diego Novillo de5b8016ab Fix information loss in branch probability computation.
Summary:
This addresses PR 22718. When branch weights are too large, they were
being clamped to the range [1, MaxWeightForBB]. But this clamping is
only applied to edges that go outside the range, so it distorts the
relative branch probabilities.

This patch changes the weight calculation to scale every branch so the
relative probabilities are preserved. The scaling is done differently
now. First, all the branch weights are added up, and if the sum exceeds
32 bits, it computes an integer scale to bring all the weights within
the range.

The patch fixes an existing test that had slightly wrong branch
probabilities due to the previous clamping. It now gets branch weights
scaled accordingly.

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9442

llvm-svn: 236750
2015-05-07 17:22:06 +00:00
Jozef Kolek cf98462818 [mips][microMIPSr6] Implement JIALC and JIC instructions
This patch implements JIALC and JIC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8389

llvm-svn: 236748
2015-05-07 17:12:23 +00:00
Michael Zolotukhin de63aace8a Populate list of vectorizable functions for Accelerate library.
Summary:
This patch adds majority of supported by Accelerate library functions to the
list of vectorizable functions.

The full list of available vector functions could be found here:
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html

Test Plan: Unit tests are added.

Reviewers: hfinkel, aschwaighofer, nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9543

llvm-svn: 236747
2015-05-07 17:11:51 +00:00
Sanjay Patel a9f6d3505d [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)
Finish the job that was abandoned in D6958 following the refactoring in
http://reviews.llvm.org/rL230221:

1. Uncomment the intrinsic def for the AVX r_Int instruction.
2. Add missing r_Int entries to the load folding tables; there are already
   tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I
   haven't added any more in this patch.
3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ).

So instead of this:

  movaps	%xmm0, %xmm1
  rcpss	%xmm1, %xmm1
  movss	%xmm1, %xmm0

We should now get:

  rcpss	%xmm0, %xmm0

And instead of this:

  vsqrtss	%xmm0, %xmm0, %xmm1
  vblendps	$1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3]

We should now get:

  vsqrtss	%xmm0, %xmm0, %xmm0


Differential Revision: http://reviews.llvm.org/D9504

llvm-svn: 236740
2015-05-07 15:48:53 +00:00
Simon Atanasyan 04d9e653ed [obj2yaml/yaml2obj] Add SHT_MIPS_ABIFLAGS section support
This change adds support for the SHT_MIPS_ABIFLAGS section
reading/writing to the obj2yaml and yaml2obj tools.

llvm-svn: 236738
2015-05-07 15:40:48 +00:00
Simon Atanasyan c914de2770 [llvm-readobj] Print .MIPS.abiflags section content
This change adds new flag -mips-abi-flags to the llvm-readobj. This flag
forces printing of .MIPS.abiflags section content.

https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking#10.2.1._.MIPS.abiflags

llvm-svn: 236737
2015-05-07 15:40:35 +00:00
Simon Atanasyan 67bdc799a7 [llvm-readobj/obj2yaml/yaml2obj] Support more MIPS ELF header flags
llvm-svn: 236728
2015-05-07 14:04:44 +00:00
Elena Demikhovsky 29792e9a80 AVX-512: Added all forms of FP compare instructions for KNL and SKX.
Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 236714
2015-05-07 11:24:42 +00:00
Toma Tabacu 506cfd0b2b [mips] Add the SoftFloat MipsSubtarget feature.
Summary: This will enable the IAS to reject floating point instructions if soft-float is enabled.

Reviewers: dsanders, echristo

Reviewed By: dsanders

Subscribers: jfb, llvm-commits, mpf

Differential Revision: http://reviews.llvm.org/D9053

llvm-svn: 236713
2015-05-07 10:29:52 +00:00
NAKAMURA Takumi 386c22c0ff llvm/test/CodeGen/X86/llc-override-mcpu-mattr.ll: Tweak not to be affected by x64 Calling Convention.
llvm-svn: 236710
2015-05-07 10:18:28 +00:00
Mehdi Amini 2668a487a7 Update InstCombine to transform aggregate loads into scalar loads.
Summary:
One step further getting aggregate loads and store being optimized
properly. This will only handle struct with one element at this point.

Test Plan: Added unit tests for the new supported cases.

Reviewers: chandlerc, joker-eph, joker.eph, majnemer

Reviewed By: majnemer

Subscribers: pete, llvm-commits

Differential Revision: http://reviews.llvm.org/D8339

Patch by Amaury Sechet.

From: Amaury Sechet <amaury@fb.com>
llvm-svn: 236695
2015-05-07 05:52:40 +00:00
Philip Reames 7a738dd94c [JumpThreading] Simplify comparisons when simplifying branches
If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question.

In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time.

Differential Revision: http://reviews.llvm.org/D9312

llvm-svn: 236684
2015-05-07 00:19:14 +00:00
Akira Hatanaka 3058d0f080 Let llc and opt override "-target-cpu" and "-target-features" via command line
options.

This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.

Differential Revision: http://reviews.llvm.org/D9537

llvm-svn: 236677
2015-05-06 23:54:14 +00:00
Sanjoy Das abe1c685ac [IRBuilder] Add a CreateGCStatepointInvoke.
Renames the original CreateGCStatepoint to CreateGCStatepointCall, and
moves invoke creating functionality from PlaceSafepoints.cpp to
IRBuilder.cpp.

This changes the labels generated for PlaceSafepoints/invokes.ll so use
a regex there to make the basic block labels more resilient.

llvm-svn: 236672
2015-05-06 23:53:09 +00:00
Justin Bogner 367a9f28c1 InstrProf: Give coverage its own errors instead of piggy backing on instrprof
Since the coverage mapping reader and the instrprof reader were
emitting a shared set of error codes, the error messages you'd get
back from llvm-cov were ambiguous about what was actually wrong. Add
another error category to fix this.

I've also improved the wording on a couple of the instrprof errors,
for consistency.

llvm-svn: 236665
2015-05-06 23:19:35 +00:00
Duncan P. N. Exon Smith 538ef562bd Bitcode: Set LastDL after writing DebugLocs
Somehow I dropped this in r233585, and we haven't had `DEBUG_LOC_AGAIN`
records since.  Add it back.  Also tests that the output assembly looks
okay.

Fixes PR23436.

llvm-svn: 236661
2015-05-06 22:51:12 +00:00
Pete Cooper 27483915e8 Handle dead defs in the if converter.
We had code such as this:
  r2 = ...
  t2Bcc

label1:
  ldr ... r2

label2;
  return r2<dead, def>

The if converter was transforming this to
   r2<def> = ...
   return [pred] r2<dead,def>
   ldr <r2, kill>
   return

which fails the machine verifier because the ldr now reads from a dead def.

The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list.  The caller then clears the dead flag from the def is the value is live.

llvm-svn: 236660
2015-05-06 22:51:04 +00:00
Zachary Turner c007aa41b6 A few fixes for llvm-symbolizer on Windows.
Specifically, this patch correctly respects the -demangle option,
and additionally adds a hidden --relative-address option allows
input addresses to be relative to the module load address instead
of absolute addresses into the image.

llvm-svn: 236653
2015-05-06 22:26:30 +00:00
Pete Cooper 54085cdc7b Fix incorrect kill flags in fastisel.
If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use.

Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere.

llvm-svn: 236650
2015-05-06 22:09:29 +00:00
Pete Cooper d31583ddfb [x86] Fix register class of folded load index reg.
When folding a load in to another instruction, we need to fix the class of the index register
Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier.

llvm-svn: 236644
2015-05-06 21:37:19 +00:00
Tim Northover e4310fe946 CodeGen: move over-zealous assert into actual if statement.
It's quite possible to encounter an insertvalue instruction that's more deeply
nested than the value we're looking for, but when that happens we really
mustn't compare beyond the end of the index array.

Since I couldn't see any guarantees about what comparisons std::equal makes, we
probably need to directly check the size beforehand. In practice, I suspect
most std::equal implementations would probably bail early, which would be OK.
But just in case...

rdar://20834485

llvm-svn: 236635
2015-05-06 20:07:38 +00:00
Duncan P. N. Exon Smith 653c1099b4 DwarfDebug: Emit number of bytes in .debug_loc entry directly
Emit the number of bytes in a `.debug_loc` entry directly.  The old code
created temp labels (expensive), emitted the difference between them,
and then emitted one on each side of the relevant bytes.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`
(the optimized version of ld64's `-save-temps` when linking the
`verify-uselistorder` executable in an LTO bootstrap).  I've hacked
`MCContext::Allocate()` to just call `malloc()` instead of using the
`BumpPtrAllocator` so that the heap profile is easier to read.  As far
as peak memory is concerned, `MCContext::Allocate()` is equivalent to a
leak, since it only gets freed at process teardown.

In my heap profile, this patch drops memory usage of
`DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB
(2.7%) at peak memory.  Some of that must be noise from `SmallVector`
(or other) allocations -- peak memory only dropped from 1160 MB down to
1100 MB -- but this nevertheless shaves 5% off the top.)

llvm-svn: 236629
2015-05-06 19:11:20 +00:00
Diego Novillo 14f94de1ee Allow 0-weight branches in BranchProbabilityInfo.
Summary:
When computing branch weights in BPI, we used to disallow branches with
weight 0. This is a minor nuisance, because a branch with weight 0 is
different to "don't have information". In the context of
instrumentation, it may mean "never executed", in the context of
sampling, it means "never or seldom executed".

In allowing 0 weight branches, I ran into issues with the switch
expansion code in selection DAG. It is currently hardwired to not handle
branches with weight 0. To maintain the current behaviour, I changed it
to use 1 when it finds 0, but perhaps the algorithm needs changes to
tolerate branches with weight zero.

Reviewers: hansw

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9533

llvm-svn: 236617
2015-05-06 17:55:11 +00:00
Wei Mi 062c74484d [X86] Disable loop unrolling in loop vectorization pass when VF is 1.
The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture,
by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce
the cost of overflow check, memory boundary check and extra prologue/epilogue code when
regular unroller will unroll the loop another time. Disable it when VF==1 remove the
unnecessary cost on x86. The same can be done for other platforms after verifying
interleaving/memory bound checking to be not perf critical on those platforms.

Differential Revision: http://reviews.llvm.org/D9515

llvm-svn: 236613
2015-05-06 17:12:25 +00:00
Pawel Bylica 3b0adaf6b0 Readd the regression test from r236584. Calling convention fixed to linux.
llvm-svn: 236610
2015-05-06 16:43:21 +00:00
Pete Cooper d927c6eaf8 [ARM] Fast-Isel was incorrectly selecting <2 x double> adds.
With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add.

However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it."

This commit disables SelectBinaryFPOp for any vector types.  There's already a FIXME to try handle neon.  Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64'

llvm-svn: 236609
2015-05-06 16:39:17 +00:00
Bill Schmidt 5fe2e25f7c [PPC64LE] Adjust vector splats during VSX swap optimization
The initial code drop for VSX swap optimization permitted the
optimization only when all operations in a web of related computation
are lane-insensitive.  For some lane-sensitive operations, we can
still permit the optimization provided that we make adjustments to
those operations.  This patch adds special handling for vector splats
so that their presence doesn't kill the optimization.

Vector splats are lane-sensitive since they identify by number a
vector element to be used as the source of a splat.  When swap
optimizations take place, the desired vector element will move to the
opposite doubleword of the quadword vector.  We thus replace the index
I by (I + N/2) % N, where N is the number of elements in the vector.

A new test case is added to test that swap optimization succeeds when
vector splats are present, and that the proper input element is used
as the source of the splat.

An ancillary change removes SH_BUILDVEC as one of the kinds of special
handling that may be required by VSX swap optimization.  From
experience with GCC, I had expected to need some modifications for
vector build operations, but I did not find that to be the case.

llvm-svn: 236606
2015-05-06 15:40:46 +00:00
Artyom Skrobov 3f8eae92a4 [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too
llvm-svn: 236590
2015-05-06 11:44:10 +00:00
Pawel Bylica b25491faf4 Revert regression test from r236584.
Temporary remove a regression test added in r236584. It fails on Windows.

llvm-svn: 236586
2015-05-06 10:41:46 +00:00
Pawel Bylica 9f1fb9d1ef SelectionDAG: Handle out-of-bounds index in extract vector element
Summary: This patch correctly handles undef case of EXTRACT_VECTOR_ELT node where the element index is constant and not less than vector size.

Test Plan:
CodeGen for X86 test included.
Also one incorrect regression test fixed.

Reviewers: qcolombet, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9250

llvm-svn: 236584
2015-05-06 10:19:14 +00:00
Ahmed Bougacha e8d0c4ccea [ARM][FastISel] Use TST #1 instead of CMP #0 for select.
Since r234249, i1 are sext instead of zext; because of that, doing
"CMP rN, #0; IT EQ/NE" isn't correct anymore.

"TST #1" is the conservatively correct alternative - the tradeoff being
that it doesn't have a 16-bit encoding -, so use that instead.

llvm-svn: 236569
2015-05-06 04:14:02 +00:00
Sanjoy Das 77b16b78e9 [Statepoints] Remove broken test case.
statepoint-indirect-return.ll breaks on linux systems.  Delete the test
case to make the bots green while I figure out what the right fix is.

llvm-svn: 236568
2015-05-06 02:51:46 +00:00
Sanjoy Das c6bf3e9f12 [StatepointLowering] Don't create temporary instructions. NFCI.
Summary:
Instead of creating a temporary call instruction and lowering that, use
SelectionDAGBuilder::lowerCallOperands.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9480

llvm-svn: 236563
2015-05-06 02:36:20 +00:00
Pete Cooper d0dae3e577 [X86 fast-isel] Constrain the index reg class to not include SP.
The index reg on instructions with complex address modes is a GPR64_NOSP.  Constrain it to appease the machine verifier.

llvm-svn: 236557
2015-05-05 23:41:53 +00:00
Pete Cooper ce9ad757c7 Fix IfConverter to handle regmask machine operands.
Note, this is a recommit of r236515 after fixing an error in r236514.  The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error.  r236515 itself ran 'make check' without errors.

Original commit message follows:

A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236550
2015-05-05 22:09:41 +00:00
David Majnemer ac256cfed2 [Inliner] Discard empty COMDAT groups
COMDAT groups which have become rendered unused because of inline are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

llvm-svn: 236539
2015-05-05 20:14:22 +00:00
Peter Collingbourne 85a0e23bc8 Thumb2SizeReduction: Check the correct set of registers for LDMIA.
The register set for LDMIA begins at offset 3, not 4. We were previously
missing the short encoding of this instruction in the case where the base
register was the first register in the register set.

Also clean up some dead code:

- The isARMLowRegister check is redundant with what VerifyLowRegs does;
  replace with an assert.
- Remove handling of LDMDB instruction, which has no short encoding (and
  does not appear in ReduceTable).

Differential Revision: http://reviews.llvm.org/D9485

llvm-svn: 236535
2015-05-05 20:07:10 +00:00
Ulrich Weigand 9958c489bb [DAGCombiner] Account for getVectorIdxTy() when narrowing vector load
This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert
the element number from getVectorIdxTy() to PtrTy before doing pointer
arithmetic on it.  This is needed on z, where element numbers are i32
but pointers are i64.

Original patch by Richard Sandiford.

llvm-svn: 236530
2015-05-05 19:34:10 +00:00
Ulrich Weigand af2c618e2b [DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BE
For little-endian, the function would convert (extract_vector_elt (load X), Y)
to X + Y*sizeof(elt).  For big-endian it would instead use
X + sizeof(vec) - Y*sizeof(elt).  The big-endian case wasn't right since
vector index order always follows memory/array order, even for big-endian.
(Note that the current handling has to be wrong for Y==0 since it would
access beyond the end of the vector.)

Original patch by Richard Sandiford.

llvm-svn: 236529
2015-05-05 19:33:37 +00:00
Ulrich Weigand 2693c0a491 [LegalizeVectorTypes] Allow single loads and stores for more short vectors
When lowering a load or store for TypeWidenVector, the type legalizer
would use a single load or store if the associated integer type was legal.
E.g. it would load a v4i8 as an i32 if i32 was legal.

This patch extends that behavior to promoted integers as well as legal ones.
If the integer type for the full vector width is TypePromoteInteger,
the element type is going to be TypePromoteInteger too, and it's still
better to use a single promoting load or truncating store rather than N
individual promoting loads or truncating stores.  E.g. if you have a v2i8
on a target where i16 is promoted to i32, it's better to load the v2i8 as
an i16 rather than load both i8s individually.

Original patch by Richard Sandiford.

llvm-svn: 236528
2015-05-05 19:32:57 +00:00
Ulrich Weigand c1708b2618 [SystemZ] Add vector intrinsics
This adds intrinsics to allow access to all of the z13 vector instructions.
Note that instructions whose semantics can be described by standard LLVM IR
do not get any intrinsics.

For each instructions whose semantics *cannot* (fully) be described, we
define an LLVM IR target-specific intrinsic that directly maps to this
instruction.

For instructions that also set the condition code, the LLVM IR intrinsic
returns the post-instruction CC value as a second result.  Instruction
selection will attempt to detect code that compares that CC value against
constants and use the condition code directly instead.

Based on a patch by Richard Sandiford.

llvm-svn: 236527
2015-05-05 19:31:09 +00:00
Ulrich Weigand 5211f9ff4d [SystemZ] Mark v1i128 and v1f128 as unsupported
The ABI specifies that <1 x i128> and <1 x fp128> are supposed to be
passed in vector registers.  We do not yet support those types, and
some infrastructure is missing before we can do so.

In order to prevent accidentally generating code violating the ABI,
this patch adds checks to detect those types and error out if user
code attempts to use them.

llvm-svn: 236526
2015-05-05 19:30:05 +00:00
Ulrich Weigand cd2a1b5341 [SystemZ] Handle sub-128 vectors
The ABI allows sub-128 vectors to be passed and returned in registers,
with the vector occupying the upper part of a register.  We therefore
want to legalize those types by widening the vector rather than promoting
the elements.

The patch includes some simple tests for sub-128 vectors and also tests
that we can recognize various pack sequences, some of which use sub-128
vectors as temporary results.  One of these forms is based on the pack
sequences generated by llvmpipe when no intrinsics are used.

Signed unpacks are recognized as BUILD_VECTORs whose elements are
individually sign-extended.  Unsigned unpacks can have the equivalent
form with zero extension, but they also occur as shuffles in which some
elements are zero.

Based on a patch by Richard Sandiford.

llvm-svn: 236525
2015-05-05 19:29:21 +00:00
Ulrich Weigand 49506d78e7 [SystemZ] Add CodeGen support for scalar f64 ops in vector registers
The z13 vector facility includes some instructions that operate only on the
high f64 in a v2f64, effectively extending the FP register set from 16
to 32 registers.  It's still better to use the old instructions if the
operands happen to fit though, since the older instructions have a shorter
encoding.

Based on a patch by Richard Sandiford.

llvm-svn: 236524
2015-05-05 19:28:34 +00:00
Ulrich Weigand 80b3af7ab3 [SystemZ] Add CodeGen support for v4f32
The architecture doesn't really have any native v4f32 operations except
v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32
elements being used.  Even so, using vector registers for <4 x float>
and scalarising individual operations is much better than generating
completely scalar code, since there's much less register pressure.
It's also more efficient to do v4f32 comparisons by extending to 2
v2f64s, comparing those, then packing the result.

This particularly helps with llvmpipe.

Based on a patch by Richard Sandiford.

llvm-svn: 236523
2015-05-05 19:27:45 +00:00
Ulrich Weigand cd808237b2 [SystemZ] Add CodeGen support for v2f64
This adds ABI and CodeGen support for the v2f64 type, which is natively
supported by z13 instructions.

Based on a patch by Richard Sandiford.

llvm-svn: 236522
2015-05-05 19:26:48 +00:00
Ulrich Weigand ce4c109585 [SystemZ] Add CodeGen support for integer vector types
This the first of a series of patches to add CodeGen support exploiting
the instructions of the z13 vector facility.  This patch adds support
for the native integer vector types (v16i8, v8i16, v4i32, v2i64).

When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
  (except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.

The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.

However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.

These alignment rules are not only implemented at the C language level
(implemented in clang), but also at the LLVM IR level.  This is done
by selecting a different DataLayout string depending on whether the
vector ABI is in effect or not.

Based on a patch by Richard Sandiford.

llvm-svn: 236521
2015-05-05 19:25:42 +00:00
Ulrich Weigand a8b04e1cbc [SystemZ] Add z13 vector facility and MC support
This patch adds support for the z13 processor type and its vector facility,
and adds MC support for all new instructions provided by that facilily.

Apart from defining the new instructions, the main changes are:

- Adding VR128, VR64 and VR32 register classes.
- Making FP64 a subclass of VR64 and FP32 a subclass of VR32.
- Adding a D(V,B) addressing mode for scatter/gather operations
- Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields.
  Until now all immediate operands have been the same width as the
  underlying field (hence the assert->return change in decode[SU]ImmOperand).

In addition, sys::getHostCPUName is extended to detect running natively
on a z13 machine.

Based on a patch by Richard Sandiford.

llvm-svn: 236520
2015-05-05 19:23:40 +00:00
Pete Cooper 05b84d4168 Revert "Fix IfConverter to handle regmask machine operands."
This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515).

This is to get the bots green while i investigate the failures.

llvm-svn: 236517
2015-05-05 18:49:05 +00:00
Pete Cooper 6ebc207703 Fix IfConverter to handle regmask machine operands.
A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236515
2015-05-05 18:31:36 +00:00
Daniel Berlin 3459d6ead5 Update BasicAliasAnalysis to understand that nothing aliases with undef values.
It got this in some cases (if one of them was an identified object), but not in all cases.

This caused stores to undef to block load-forwarding in some cases, etc.

Added test to Transforms/GVN to verify optimization occurs as expected.

llvm-svn: 236511
2015-05-05 18:10:49 +00:00
Reid Kleckner 0738a9c02e Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236360.

This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.

llvm-svn: 236508
2015-05-05 17:44:16 +00:00
Quentin Colombet 61b305edfd [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507
2015-05-05 17:38:16 +00:00
Lang Hames cd68eba3b9 [Orc] Reapply r236465 with fixes for the MSVC bots.
llvm-svn: 236506
2015-05-05 17:37:18 +00:00
Kit Barton d4eb73c00e This patch adds ABI support for v1i128 data type.
It adds v1i128 to the appropriate register classes and checks parameter passing
and return values.

This is related to http://reviews.llvm.org/D9081, which will add instructions
that exploit the v1i128 datatype.

Phabricator review: http://reviews.llvm.org/D9475

llvm-svn: 236503
2015-05-05 16:10:44 +00:00
Daniel Sanders eda60d217b [mips] Generate code for insert/extract operations when using the N64 ABI and MSA.
Summary:
When using the N64 ABI, element-indices use the i64 type instead of i32.
In many cases, we can use iPTR to account for this but additional patterns
and pseudo's are also required.

This fixes most (but not quite all) failures in the test-suite when using
N64 and MSA together.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9342

llvm-svn: 236494
2015-05-05 10:32:24 +00:00
Daniel Sanders 4160c802d9 [mips][msa] Test basic operations for the N32 ABI too.
Summary:
This required adding instruction aliases for dneg.

N64 will be enabled shortly but requires additional bugfixes.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9341

llvm-svn: 236489
2015-05-05 08:48:35 +00:00
Lang Hames ac31a1f141 [Orc] Revert r236465 - It broke the Windows bots.
Looks like the usual missing explicit move-constructor issue with MSVC. I should
have a fix shortly.

llvm-svn: 236472
2015-05-04 23:30:01 +00:00
Reid Kleckner 9dad227b85 [X86] Fix assertion while DAG combining offsets and ExternalSymbols
ExternalSymbol nodes do not contain offsets, unlike GlobalValue nodes.

llvm-svn: 236471
2015-05-04 23:22:36 +00:00
Lang Hames a68970dfd5 [Orc] Refactor the compile-on-demand layer to make module partitioning lazy,
and avoid cloning unused decls into every partition.

Module partitioning showed up as a source of significant overhead when I
profiled some trivial test cases. Avoiding the overhead of partitionging
for uncalled functions helps to mitigate this.

This change also means that it is no longer necessary to have a
LazyEmittingLayer underneath the CompileOnDemand layer, since the
CompileOnDemandLayer will not extract or emit function bodies until they are
called.

llvm-svn: 236465
2015-05-04 22:03:10 +00:00
Matthias Braun 219144e6a7 Lit: Allow overriding llvm tool paths+arguments, make -D an alias for --param
These changes allow usages where you want to pass an additional
commandline option to all invocations of a specific llvm tool. Example:

> llvm-lit -Dllc=llc -enable-misched -verify-machineinstrs

Differential Revision: http://reviews.llvm.org/D9487

llvm-svn: 236461
2015-05-04 21:36:36 +00:00
Sanjay Patel ec2d7358b9 zap windows line endings; NFC
llvm-svn: 236460
2015-05-04 21:27:27 +00:00
Tim Northover 851ff69b42 CodeGen: match up correct insertvalue indices when assessing tail calls.
When deciding whether a value comes from the aggregate or inserted value of an
insertvalue instruction, we compare the indices against those of the location
we're interested in. One of the lists needs reversing because the input data is
backwards (so that modifications take place at the end of the SmallVector), but
we were reversing both before leading to incorrect results.

Should fix PR23408

llvm-svn: 236457
2015-05-04 20:41:51 +00:00
Keno Fischer d71a17710b Respect object format choice on Darwin
Summary:
The object format can be set to something other than MachO, e.g.
to use ELF-on-Darwin for MCJIT. This already works on Windows, so
there's no reason it shouldn't on Darwin.

Reviewers: lhames, grosbach

Subscribers: rafael, grosbach, t.p.northover, llvm-commits

Differential Revision: http://reviews.llvm.org/D6185

llvm-svn: 236455
2015-05-04 20:03:01 +00:00
Elena Demikhovsky d41e506342 AVX-512: added a test for encoding
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236421
2015-05-04 12:59:15 +00:00
Elena Demikhovsky 60eb9db7bb AVX-512: added calling convention for i1 vectors in 32-bit mode.
Fixed some bugs in extend/truncate for AVX-512 target.
Removed VBROADCASTM (masked broadcast) node, since it is not used any more.

llvm-svn: 236420
2015-05-04 12:40:50 +00:00
Elena Demikhovsky 52266388f8 AVX-512: added integer "add" and "sub" instructions with saturation for SKX
with intrinsics and tests

by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236418
2015-05-04 12:35:55 +00:00
Elena Demikhovsky 9ebd877a10 AVX-512: enabled tests for AVX512F set
llvm-svn: 236416
2015-05-04 11:09:41 +00:00
Elena Demikhovsky 2557a22be7 AVX-512: Added VPACK* instructions forms for KNL and SKX
and their intrinsics
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236414
2015-05-04 09:14:02 +00:00
Elena Demikhovsky 1b60ed7069 Masked gather and scatter intrinsics - enabled codegen for KNL.
llvm-svn: 236394
2015-05-03 07:12:25 +00:00
Simon Pilgrim 017ca19384 [DAGCombiner] Enabled vector float/double -> int constant folding
llvm-svn: 236387
2015-05-02 13:04:07 +00:00
Simon Pilgrim e170a4f5fa Line ending fix
llvm-svn: 236386
2015-05-02 11:50:47 +00:00
Simon Pilgrim 7d6df82dd1 [SSE] Added vector int (i32 and i64) -> float/double conversion tests
llvm-svn: 236385
2015-05-02 11:42:47 +00:00
Simon Pilgrim 6e3b7bad11 [SSE] Added vector float/double -> i32 and i64 conversion tests
llvm-svn: 236384
2015-05-02 11:18:47 +00:00
David Blaikie 72d03efa6d DebugInfo: Use low_pc relative debug_ranges under fission when the CU has a low_pc
Seems we were setting the base address on the wrong DwarfCompileUnit
object so it wasn't being used when generating the ranges.

llvm-svn: 236377
2015-05-02 02:31:49 +00:00
Eric Christopher a2d44dee73 Rework test to use FileCheck by making sure we have no xmm registers
with numbers.

llvm-svn: 236373
2015-05-02 01:06:17 +00:00
Reid Kleckner 83d89fa546 Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236359. Things are still broken despite testing. :(

llvm-svn: 236360
2015-05-01 22:50:14 +00:00
Reid Kleckner 51476acd77 Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236340.

llvm-svn: 236359
2015-05-01 22:40:25 +00:00
Colin LeMahieu bb0d7cbee1 [Hexagon] r236351 fix does not work on builder configurations yet.
llvm-svn: 236358
2015-05-01 22:39:20 +00:00
Quentin Colombet 0de2346859 [AArch64][FastISel] Variant of the logical instructions that use two input
registers cannot write on SP.

rdar://problem/20748715

llvm-svn: 236352
2015-05-01 21:34:57 +00:00
Colin LeMahieu b662565475 [Hexagon] Adding expression MC emission and removing XFAIL from test that hits this code path.
llvm-svn: 236348
2015-05-01 21:14:21 +00:00
Quentin Colombet 9df2fa261b [AArch64][FastISel] Fix the setting of kill flags for MUL -> UMULH sequences.
rdar://problem/20748715

llvm-svn: 236346
2015-05-01 20:57:11 +00:00
Zachary Turner e5cb269352 [llvm-pdbdump] Support dynamic load address and external symbols.
This patch adds the --load-address command line option to
llvm-pdbdump, which dumps all addresses assuming the module has
loaded at the specified address.

Additionally, this patch adds an option to llvm-pdbdump to support
dumping of public symbols (i.e. symbols with external linkage).

llvm-svn: 236342
2015-05-01 20:24:26 +00:00
Reid Kleckner 2747d3d55a Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236339, it breaks the win32 clang-cl self-host.

llvm-svn: 236340
2015-05-01 20:14:04 +00:00
Reid Kleckner 4856fc61b4 [WinEH] Add an EH registration and state insertion pass for 32-bit x86
This pass is responsible for constructing the EH registration object
that gets linked into fs:00, which is all it does in this change. In the
future, it will also insert stores to update the EH state number.

I considered keeping this functionality in WinEHPrepare, but it's pretty
separable and X86 specific. It has conceptually very little to do with
the task of WinEHPrepare, which is currently outlining.  WinEHPrepare is
also in theory useful on ARM, but this logic is pretty x86 specific.

Reviewers: andrew.w.kaylor, majnemer

Differential Revision: http://reviews.llvm.org/D9422

llvm-svn: 236339
2015-05-01 20:04:54 +00:00
Peter Collingbourne d27d3a151f ARM: Align functions containing Thumb-2 jump tables to 4 bytes.
Functions with jump tables need an alignment of 4 because they use the ADR
instruction, which aligns the PC to 4 bytes before adding an offset.

Differential Revision: http://reviews.llvm.org/D9424

llvm-svn: 236327
2015-05-01 18:05:59 +00:00
James Y Knight 35e04e84fa [Sparc] Repair fixups in little endian mode.
Differential Revision: http://reviews.llvm.org/D9434

llvm-svn: 236324
2015-05-01 17:13:02 +00:00
Toma Tabacu 00e9867988 [mips] [IAS] Fix error messages for using LI with 64-bit immediates.
Summary:
LI should never accept immediates larger than 32 bits.
The additional Is32BitImm boolean also paves the way for unifying the functionality that LA and LI have in common.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9289

llvm-svn: 236313
2015-05-01 12:19:27 +00:00
Toma Tabacu a2861db834 [mips] [IAS] Slightly improve shift instruction generation in expandLoadImm.
Summary:
Generate one DSLL32 of 0 instead of two consecutive DSLL of 16.
In order to do this I had to change createLShiftOri's template argument from a bool to an unsigned.

This also gave me the opportunity to rewrite the mips64-expansions.s test, as it was testing the same cases multiple times and skipping over other cases.
It was also somewhat unreadable, as the CHECK lines were grouped in a huge block of text at the beginning of the file.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8974

llvm-svn: 236311
2015-05-01 10:26:47 +00:00
Simon Pilgrim 9fb06bca67 [SelectionDAG] Unary vector constant folding integer legality fixes
This patch fixes issues with vector constant folding not correctly handling scalar input operands if they require implicit truncation - this was tested with llvm-stress as recommended by Patrik H Hagglund.

The patch ensures that integer input scalars from a build vector are correctly truncated before folding, and that constant integer scalar results are promoted to a legal type before inclusion in the new folded build vector.

I have added another crash test case and also a test for UINT_TO_FP / SINT_TO_FP using an non-truncated scalar input, which was failing before this patch.

Differential Revision: http://reviews.llvm.org/D9282

llvm-svn: 236308
2015-05-01 08:20:04 +00:00
Tom Stellard aa798340c3 R600/SI: Add VCC as an implict def of SI_KILL
When SI_KILL has a register operand, its lowered form writes to vcc.

llvm-svn: 236307
2015-05-01 03:44:09 +00:00
Tom Stellard 0b7feb1cb7 R600/SI: Fix verifier errors from the SIAnnotateControlFlow pass
This pass was generating 'Instruction does not dominate all uses!'
errors for programs which had loops with a condition variable that
depended on the result of a phi instruction from outside of the loop.

The pass was inserting new phi nodes outside of the loop which used values
defined inside the loop.

http://bugs.freedesktop.org/show_bug.cgi?id=90056

llvm-svn: 236306
2015-05-01 03:44:08 +00:00
Quentin Colombet 65b5b01d56 [ARM][TEST] Strengthen test against smarter reg alloc.
Follow-up of r236247.

rdar://problem/20770899

llvm-svn: 236296
2015-05-01 00:45:55 +00:00
Pete Cooper 2127b00cd5 [ARM] optimizeSelect should clear kill flags.
If we move an instruction from one block down to a MOVC and predicate it,
then the original instruction could be moved in to a loop.  In this case,
its invalid for any kill flags to remain on there.

Fails with -verfy-machineinstrs.

rdar://problem/20752113

llvm-svn: 236290
2015-04-30 23:57:47 +00:00
Pete Cooper 451755d370 Commute the internal flag on MachineOperands.
When commuting a thumb instruction in the size reduction pass, thumb
instructions are represented as a bundle and so some operands may be marked
as internal.  The internal flag has to move with the operand when commuting.

This test is sensitive to register allocation so can't specifically check that
this error was happening, but so long as it continues to pass with -verify then
hopefully its still ok.

rdar://problem/20752113

llvm-svn: 236282
2015-04-30 23:14:14 +00:00
Davide Italiano cd2514dca6 [Object] Teach Object and llvm-objdump about ".hidden"
Differential Revision:	http://reviews.llvm.org/D9416
Reviewed by:	rafael

llvm-svn: 236279
2015-04-30 23:08:53 +00:00
Quentin Colombet 329fa890ba [AArch64] Fix bad register class constraint in fast-isel for TST instruction.
rdar://problem/20748715

llvm-svn: 236273
2015-04-30 22:27:20 +00:00
Pete Cooper 5111881cfc Don't always apply kill flag in thumb2 ABS pseudo expansion.
The expansion for t2ABS was always setting the kill flag on the rsb instruction.
It should instead only be set on rsb if it was set on the original ABS instruction.

rdar://problem/20752113

llvm-svn: 236272
2015-04-30 22:15:59 +00:00
Reid Kleckner 60d5232be2 [X86] Use 4 byte preferred aggregate alignment on Win32
This helps reduce the frequency of stack realignment prologues in 32-bit
X86 Windows code. Before this change and the corresponding clang change,
we would take the max of the type preferred alignment and the explicit
alignment on the alloca.

If you don't override aggregate alignment in datalayout, you get a
default of 8. This dates back to 2007 / r34356, and changing it seems
prohibitively difficult at this point.

llvm-svn: 236270
2015-04-30 22:11:59 +00:00
Matthias Braun e48484c64f InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits
When optimizing demanded bits of the operands of an Add we have to
remove the nsw/nuw flags as we have no guarantee anymore that we don't
wrap.  This is legal here because the top bit is not demanded.  In fact
this operaion was already performed but missed in the case of an Add
with a constant on the right side.  To fix this this patch refactors the
code to unify the code paths in SimplifyDemandedUseBits() handling of
Add/Sub:

- The transformation of Add->Or is removed from the simplify demand
  code because the equivalent transformation exists in
  InstCombiner::visitAdd()
- KnownOnes/KnownZero are not adjusted for Add x, C anymore as
  computeKnownBits() already performs these computations.
- The simplification of the operands is unified. In this new version
  constant on the right side of a Sub are shrunk now as I could not find
  a reason why not to do so.
- The special case for clearing nsw/nuw in ShrinkDemandedConstant() is
  not necessary anymore as the caller does that already.

Differential Revision: http://reviews.llvm.org/D9415

llvm-svn: 236269
2015-04-30 22:05:30 +00:00
Andrea Di Biagio 737a361006 Fix comment in test. NFC.
llvm-svn: 236262
2015-04-30 21:22:28 +00:00
Andrea Di Biagio c84b5bdd69 Fix for PR23103. Correctly propagate the 'IsUndef' flag to the register operands of a commuted instruction.
Revision 220239 exposed a latent bug in method
'TargetInstrInfo::commuteInstruction'. When commuting the operands of a machine
instruction, method 'commuteInstruction' didn't correctly propagate the
'IsUndef' flag to the register operands of the new (commuted) instruction.

Before this patch, the following instruction:
  %vreg4<def> = VADDSDrr  %vreg14, %vreg5<undef>; FR64:%vreg4,%vreg14,%vreg5

was wrongly converted by method 'commuteInstruction' into:
  %vreg4<def> = VADDSDrr  %vreg5, %vreg14<undef>; FR64:%vreg4,%vreg5,%vreg14

The correct instruction should have been:
  %vreg4<def> = VADDSDrr  %vreg5<undef>, %vreg14; FR64:%vreg4,%vreg5,%vreg14

This patch fixes the problem in method 'TargetInstrInfo::commuteInstruction'.
When swapping the operands of a machine instruction, we now make sure that
'IsUndef' flags are correctly set.
Added test case 'pr23103.ll'.

Differential Revision: http://reviews.llvm.org/D9406

llvm-svn: 236258
2015-04-30 21:03:29 +00:00
Kevin Enderby 8972e48bc8 For llvm-objdump, with the -archive-headers and -macho options, use the -non-verbose
option to print the archive headers using raw numeric values.  Also add the -archive-member-offsets
for use with these to also trigger printing of the offset of the archive member from the start
of the archive.

llvm-svn: 236252
2015-04-30 20:30:42 +00:00
Pete Cooper 4d8d2ec3eb Don't rewrite jumps to empty BBs to landing pads.
In the test case here, the 'unreachable' BB was removed by BranchFolding because its empty.

It then rewrote the jump from 'entry' to jump to its fallthrough, which was a landing pad.

This results in 'entry' jumping to 2 different landing pads, which fails the machine verifier.

rdar://problem/20750162

llvm-svn: 236248
2015-04-30 18:58:23 +00:00
Quentin Colombet 0a905042cd [ARM] Do not generate invalid encoding for stack adjust, even if this is just
temporary.

Because of that:
1. The machine verifier was complaining on such code.
2. The generate code worked just because the thumb reduction size pass fixed the
opcode.

rdar://problem/20749824

llvm-svn: 236247
2015-04-30 18:52:49 +00:00
Tim Northover 03b99f66d7 AArch64: add BFC alias for the BFI/BFM instructions.
Unlike 32-bit ARM, AArch64 can use wzr/xzr to implement this without the need
for a separate instruction.

rdar://18679590

llvm-svn: 236245
2015-04-30 18:28:58 +00:00
Jan Vesely 808fff585b Reinstate revisions r234755, r234759, r234760
changes:
  Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO
  Add location to getConstant
  Drop comment about the ops being turned into expand

llvm-svn: 236240
2015-04-30 17:15:56 +00:00
Rafael Espindola bda1980917 Write sections mostly in one pass.
During ELF writing, there is no need to further relax the sections, so we
should not be creating fragments. This patch avoids doing so in all cases
but debug section compression (that is next).

Also, the ELF format is fairly simple to write. We can do a single pass over
the sections to write them out and compute the section header table.

llvm-svn: 236235
2015-04-30 14:21:49 +00:00
Rafael Espindola fc337022c7 Don't check for offsets in tests where it is not relevant.
llvm-svn: 236233
2015-04-30 13:57:06 +00:00
Rafael Espindola e740409d52 Check the entire content of the comdat group.
llvm-svn: 236230
2015-04-30 13:08:09 +00:00
Daniel Sanders 4d532b5617 [mips] Sorted instructions in mips64r6 disassembly tests. NFC.
llvm-svn: 236223
2015-04-30 10:52:42 +00:00
Daniel Sanders 811f21493c [mips][mips64r6] Sorted instructions in test. NFC.
llvm-svn: 236221
2015-04-30 10:23:48 +00:00
Daniel Sanders 59f89aa8ed [mips][msa] Rename main check prefix to 'ALL' in basic operations tests. NFC
Summary:
The majority of the checks are subtarget independent. The few that aren't
will be corrected shortly.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9340

llvm-svn: 236220
2015-04-30 09:57:37 +00:00
Daniel Sanders fa159165be [mips][msa] Use CHECK-LABEL where missing, and remove checks matching the .size directive. NFC.
Summary: 

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9339

llvm-svn: 236219
2015-04-30 09:56:30 +00:00
Daniel Sanders 90b059d555 [mips] Add missing signext attributes to MSA basic operations tests. NFC.
Summary:
This doesn't make much difference to MIPS32, but it will simplify a
MIPS64r6 bugfix which will follow shortly by removing unnecessary
sign-extension of parameters.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9338

llvm-svn: 236216
2015-04-30 09:24:09 +00:00
Simon Pilgrim ecf5875bd5 [SSE] Fix for MUL v16i8 on pre-SSE41 targets (PR23369).
Sign extension of i8 to i16 was placing the unpacked bytes in the lower byte instead of the upper byte.

llvm-svn: 236209
2015-04-30 08:23:16 +00:00
Sanjoy Das 08e95b4703 [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.
Summary:
Optimizing these well are especially interesting for IRCE since it
"clamps" values by generating this sort of pattern through SCEV
expressions.

Depends on D9352.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9353

llvm-svn: 236203
2015-04-30 04:56:04 +00:00
Sanjoy Das a8c178f280 [InstCombine] Add a new formula for SMIN.
Summary:
After this change `MatchSelectPattern` recognizes the following form
of SMIN:

  Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9352

llvm-svn: 236202
2015-04-30 04:56:00 +00:00
Filipe Cabecinhas f8a16a952d Don't overflow GCTable
Summary: Bug found with AFL fuzz.

Reviewers: rafael, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9361

llvm-svn: 236200
2015-04-30 04:09:41 +00:00
Owen Anderson d8a029c81b Semantically revert r236031, which is not a good idea for in-order targets.
At the least it should be guarded by some kind of target hook.
It also introduced catastrophic compile time and code quality
regressions on some out of tree targets (test case still being
reduced/sanitized).

Sanjay agreed with reverting this patch until these issues can be
resolved.

llvm-svn: 236199
2015-04-30 04:06:32 +00:00
Hans Wennborg 680a60f0d0 XFAIL test/CodeGen/Generic/MachineBranchProb.ll on Hexagon (PR23377)
llvm-svn: 236196
2015-04-30 01:59:04 +00:00
Filipe Cabecinhas 9a19e56306 Make sure Op->getType() is a PointerType before we cast<> it.
Bug found with AFL fuzz.

llvm-svn: 236193
2015-04-30 01:13:31 +00:00
Hans Wennborg 4b828d35fd Switch lowering: use profile info to build weight-balanced binary search trees
This will cause hot nodes to appear closer to the root.

The literature says building the tree like this makes it a near-optimal (in
terms of search time given key frequencies) binary search tree. In LLVM's case,
we can do up to 3 comparisons in each leaf node, so it might be better to opt
for lower tree height in some cases; that's something to look into in the
future.

Differential Revision: http://reviews.llvm.org/D9318

llvm-svn: 236192
2015-04-30 00:57:37 +00:00
Filipe Cabecinhas bad0779f63 Make sure we don't resize(0) when we get a fwdref with Idx == UINT_MAX
Make it an error instead.

Bug found with AFL fuzz.

llvm-svn: 236190
2015-04-30 00:52:42 +00:00
Ahmed Bougacha ace0593b42 Flip r236172 testcase RUN option ordering for BSD sed(1). NFC.
llvm-svn: 236186
2015-04-30 00:07:34 +00:00
Pete Cooper 46361a1ea1 Change x86 CMOVE_F to read it source, not write it.
This was breaking sqlite with the machine verifier because operand 0 was a def according to tablegen, but didn't have the 'isDef' flag set.

Looking at the ISA, its clear that this operand is a source as writing to st(0) is implicit.  So move the operand to the correct place in the td file.

rdar://problem/20751584

llvm-svn: 236183
2015-04-29 23:51:33 +00:00
Reid Kleckner bcda1cd45a [WinEH] Start EH preparation for 32-bit x86, it uses no arguments
32-bit x86 MSVC-style exceptions are functionaly similar to 64-bit, but
they take no arguments. Instead, they implicitly use the value of EBP
passed in by the caller as a pointer to the parent's frame. In LLVM, we
can represent this as llvm.frameaddress(1), and feed that into all of
our calls to llvm.framerecover.

The next steps are:
- Add an alloca to the fs:00 linked list of handlers
- Add something like llvm.sjlj.lsda or generalize it to store in the
  alloca
- Move state number calculation to WinEHPrepare, arrange for
  FunctionLoweringInfo to call it
- Use the state numbers to insert explicit loads and stores in the IR

llvm-svn: 236172
2015-04-29 22:49:54 +00:00
Rafael Espindola 88d1f632cf Write the section header string table directly to the output stream.
Instead of accumulating the content in a fragment first, just write it
to the output stream.

Also put it first in the section table, so that we never have to worry
about its index being >= SHN_LORESERVE.

llvm-svn: 236145
2015-04-29 20:25:24 +00:00
Douglas Katzman 9cb88b73c6 Make Sparc assembler accept parenthesized constant expressions.
Differential Revision: http://reviews.llvm.org/D9087

llvm-svn: 236137
2015-04-29 18:48:29 +00:00
Zoran Jovanovic 387ce30685 [mips][microMIPSr6] Implement MUL, MUH, MULU and MUHU instructions
Differential Revision: http://reviews.llvm.org/D8894

llvm-svn: 236131
2015-04-29 17:23:22 +00:00
Reid Kleckner c695471365 [X86] Avoid mangling frameescape labels
x86 Windows uses the '_' prefix for all global symbols, and this was
mistakenly being applied to frameescape labels, which are not externally
visible global symbols. They use the private global prefix 'L'.

The *right* way to fix this is probably to stop masquerading this label
as an ExternalSymbol and create a new SDNode type. These labels are not
"external", and we know they will be resolved by assembly time. Having a
custom SDNode type would allow us to do better X86 address mode
matching, so it's probably worth doing eventually.

llvm-svn: 236123
2015-04-29 16:46:01 +00:00
Duncan P. N. Exon Smith a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Zoran Jovanovic cca29e8f6e [mips][microMIPSr6] Implement SUB and SUBU instructions
Differential Revision: http://reviews.llvm.org/D8764

llvm-svn: 236118
2015-04-29 16:22:46 +00:00
Zoran Jovanovic 5f34d44354 [mips][microMIPSr6] Implement ADD, ADDU and ADDIU instructions
Differential Revision: http://reviews.llvm.org/D8704

llvm-svn: 236111
2015-04-29 15:11:07 +00:00
James Y Knight c09bdfa4cb Sparc: Prefer reg+reg address encoding when only one register used.
Reg+%g0 is preferred to Reg+imm0 by the manual, and is what GCC produces.

Futhermore, reg+imm is invalid for the (not yet supported) "alternate
address space" instructions.

Differential Revision: http://reviews.llvm.org/D8753

llvm-svn: 236107
2015-04-29 14:54:44 +00:00
Vasileios Kalintiris 1249e74648 Mips fast-isel - handle functions which return i8 or i6 .
Summary: Allow Mips fast-isel to handle functions which return i8/i16 signed/unsigned.

Test Plan:
Make check tests are forthcoming.
Already passes test-suite at O0/O2 for Mips 32 r1/r2

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6765

llvm-svn: 236103
2015-04-29 14:17:14 +00:00
Rafael Espindola cad91323dc Don't constrain the section order in tests that don't depend on it.
llvm-svn: 236102
2015-04-29 13:55:07 +00:00
Daniel Sanders 301f937765 [mips] Correct 128-bit shifts on 64-bit targets.
Summary:
The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now
accounts for both cases.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits, mohit.bhakkad, sagar

Differential Revision: http://reviews.llvm.org/D9337

llvm-svn: 236099
2015-04-29 12:28:58 +00:00
Filipe Cabecinhas d8a1bcd0ad Check that we have a valid PointerType element type before calling get()
Same as r236073 but for PointerType.

Bug found with AFL fuzz.

llvm-svn: 236079
2015-04-29 02:27:28 +00:00
Filipe Cabecinhas 1351cba720 Turn an assert into report_fatal_error since it's reachable based on user input
Bug found with AFL fuzz.

llvm-svn: 236076
2015-04-29 01:58:31 +00:00
Filipe Cabecinhas f15fb032ef Make sure that isValidElementType(Type) before calling {Array,Struct}Type::get(Type)
Bug found with AFL fuzz.

llvm-svn: 236073
2015-04-29 01:27:01 +00:00
Tim Northover e18d662201 ARM: fix peephole optimisation of TST
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.

rdar://20721342

llvm-svn: 236050
2015-04-28 22:03:55 +00:00
Andrew Kaylor 046f7b42f2 [WinEH] Split blocks at calls to llvm.eh.begincatch
Differential Revision: http://reviews.llvm.org/D9311

llvm-svn: 236046
2015-04-28 21:54:14 +00:00
James Y Knight e8da8096ec Sparc: Add alternate aliases for conditional branch instructions.
llvm-svn: 236042
2015-04-28 21:27:31 +00:00
Sanjay Patel 2fbc4e5c49 transform fadd chains to increase parallelism
This is a compromise: with this simple patch, we should always handle a chain of exactly 3
operations optimally, but we're not generating the optimal balanced binary tree for a longer
sequence.

In general, this transform will reduce the dependency chain for a sequence of instructions
using N operands from a worst case N-1 dependent operations to N/2 dependent operations. 
The optimal balanced binary tree would reduce the chain to log2(N).

The trade-off for not dealing with longer sequences is: (1) we have less complexity in the
compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we
don't need to worry about the increased register pressure required to parallelize longer
sequences. It also seems unlikely that we would ever encounter really long strings of
dependent ops like that in the wild, but I'm not sure how to verify that speculation.
FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math
and this patch.

We can extend this patch to cover other associative operations such as fmul, fmax, fmin, 
integer add, integer mul.

This is a partial fix for:
https://llvm.org/bugs/show_bug.cgi?id=17305

and if extended:
https://llvm.org/bugs/show_bug.cgi?id=21768
https://llvm.org/bugs/show_bug.cgi?id=23116

The issue also came up in:
http://reviews.llvm.org/D8941

Differential Revision: http://reviews.llvm.org/D9232

llvm-svn: 236031
2015-04-28 21:03:22 +00:00
Filipe Cabecinhas b435d0f439 Relax an assert when there's a type mismatch in forward references
Summary:
We don't seem to need to assert here, since this function's callers expect
to get a nullptr on error. This way we don't assert on user input.

Bug found with AFL fuzz.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9308

llvm-svn: 236027
2015-04-28 20:18:47 +00:00
Tom Stellard 96301d2455 R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr
Fixes a crash with basically any OpenGL application using the radeonsi
driver.

Patch by: Michel Dänzer

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 236004
2015-04-28 17:37:03 +00:00
Justin Holewinski 3d2a976197 [NVPTX] Handle addrspacecast constant expressions in aggregate initializers
We need to track if an AddrSpaceCast expression was seen when
generating an MCExpr for a ConstantExpr.  This change introduces a
custom lowerConstant method to the NVPTX asm printer that will create
NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode
the information that a given symbol needs to be casted to a generic
address.

llvm-svn: 236000
2015-04-28 17:18:30 +00:00
Elena Demikhovsky 1f7b3644d3 Fixed crash of variable shift inst on AVX2
https://llvm.org/bugs/show_bug.cgi?id=22955

llvm-svn: 235993
2015-04-28 14:46:35 +00:00
Toma Tabacu 7dea2e3982 [mips] [IAS] Do not generate redundant ORi in createLShiftOri.
Summary: If the immediate is 0, the ORi is pointless.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8969

llvm-svn: 235990
2015-04-28 14:06:35 +00:00
Sergey Dmitrouk 842a51bad8 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235989
2015-04-28 14:05:47 +00:00
Rafael Espindola effdc7e981 Use CIE version 4 for dwarf4.
According to http://www.dwarfstd.org/doc/DWARF4.pdf appendix F the CIE
version for dwarf 4 is 4.

llvm-svn: 235988
2015-04-28 13:55:31 +00:00
Daniel Jasper 48e93f7181 Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

llvm-svn: 235987
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk adb4c69d5c [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235977
2015-04-28 11:56:37 +00:00
Elena Demikhovsky ae51853924 AVX-512: Added "pandn" intrinsics set
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 235971
2015-04-28 08:12:42 +00:00
David Blaikie 2a661cd062 [opaque pointer type] Encode the pointee type in the bitcode for 'cmpxchg'
As a space optimization, this instruction would just encode the pointer
type of the first operand and use the knowledge that the second and
third operands would be of the pointee type of the first. When typed
pointers go away, this assumption will no longer be available - so
encode the type of the second operand explicitly and rely on that for
the third.

Test case added to demonstrate the backwards compatibility concern,
which only comes up when the definition of the second operand comes
after the use (hence the weird basic block sequence) - at which point
the type needs to be explicitly encoded in the bitcode and the record
length changes to accommodate this.

llvm-svn: 235966
2015-04-28 04:30:29 +00:00
Ahmed Bougacha 190528703f [MC] Use LShr for constant evaluation of ">>" on ELF/arm64--darwin.
This matches other assemblers and is less unexpected (e.g. PR23227).
On ELF, I tried binutils gas v2.24 and nasm 2.10.09, and they both
agree on LShr.  On COFF, I couldn't get my hands on an assembler yet,
so don't change the behavior.  For now, don't change it on non-AArch64
Darwin either, as the other assembler is gas v1.38, which does an AShr.

llvm-svn: 235963
2015-04-28 01:37:11 +00:00
Hans Wennborg 67c03759e4 Switch lowering: Take branch weight into account when ordering for fall-through
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.

Ordering by branch weight is most important, putting a fall-through
block last is secondary.

llvm-svn: 235942
2015-04-27 23:35:22 +00:00
Rafael Espindola a7c3163cdf Use CIE version 1 for .eh_frame.
According to

http://www.linuxbase.org/betaspecs/lsb/LSB-Core-generic/LSB-Core-generic/ehframechpt.html

we should always use 1.

llvm-svn: 235923
2015-04-27 22:04:24 +00:00
Ahmed Bougacha c004c60c0a [AArch64] Also combine vector selects fed by non-i1 SETCCs.
After legalization, scalar SETCC has an i32 result type on AArch64.
The i1 requirement seems too conservative, replace it with an assert.

This also means that we now can run after legalization. That should also
be fine, since the ops legalizer runs again after each combine, and
all types created all have the same sizes as the (legal) inputs.

Exposed by r235917; while there, robustize its tests (bsl also uses the
register it defines).

llvm-svn: 235922
2015-04-27 21:43:12 +00:00
Ahmed Bougacha 89bba61c84 [AArch64] Don't assert when combining (v3f32 select (setcc f64)).
When the setcc has f64 operands, we can't build a vector setcc mask
to feed a vselect, because f64 doesn't divide v3f32 evenly.
Just bail out when that happens.

llvm-svn: 235917
2015-04-27 21:01:20 +00:00
Hans Wennborg ba6d2568f9 Switch lowering: order bit tests by branch weight.
llvm-svn: 235912
2015-04-27 20:21:17 +00:00
Bill Schmidt fe723b9a6d [PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations
This patch adds a new SSA MI pass that runs on little-endian PPC64
code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors
without alignment constraints are accomplished for little-endian using
lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional
xxswapd instructions hurts performance in comparison with big-endian
code, but they are necessary in the general case to support correct
semantics.

However, the general case does not apply to most vector code. Many
vector instructions are lane-insensitive; they do not "care" which
lanes the parallel computations are performed within, provided that
the resulting data is stored into the correct locations. Thus this
pass looks for computations that perform only lane-insensitive
operations, and remove the unnecessary swaps from loads and stores in
such computations.

Future improvements will allow computations using certain
lane-sensitive operations to also be optimized in this manner, by
modifying the lane-sensitive operations to account for the permuted
order of the lanes. However, this patch only adds the infrastructure
to permit this; no lane-sensitive operations are optimized at this
time.

This code is heavily exercised by the various vectorizing applications
in the projects/test-suite tree. For the time being, I have only added
one simple test case to demonstrate what the pass is doing. Although
it is quite simple, it provides coverage for much of the code,
including the special case handling of copies and subreg-to-reg
operations feeding the swaps. I plan to add additional tests in the
future as I fill in more of the "special handling" code.

Two existing tests were affected, because they expected the swaps to
be present, but they are now removed.

llvm-svn: 235910
2015-04-27 19:57:34 +00:00
Zachary Turner 20dbd0d0de Make llvm-symbolizer work on Windows.
Differential Revision: http://reviews.llvm.org/D9234
Reviewed By: Alexey Samsonov

llvm-svn: 235900
2015-04-27 17:19:51 +00:00
Elena Demikhovsky a480ef5494 AVX-512: added calling conventions for i1 vectors.
Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724

llvm-svn: 235889
2015-04-27 15:11:19 +00:00
Brendon Cahoon 55bdeb7bc7 [Hexagon] Use constant extenders to fix up hardware loops
Use a loop instruction with a constant extender for a hardware
loop instruction that is too far away from the start of the loop.
This is cheaper than changing the SA register value.

Differential Revision: http://reviews.llvm.org/D9262

llvm-svn: 235882
2015-04-27 14:16:43 +00:00
Toma Tabacu d9d344b485 [mips] [IAS] Improve warning for using AT with .set noat.
Summary:
Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name.

I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8479

llvm-svn: 235881
2015-04-27 14:05:04 +00:00
Vasileios Kalintiris 7a6b18783f Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel.""
This reapplies r235194, which was reverted in r235495 because it was causing a
failure in our out-of-tree buildbots for MIPS. With the sign-extension patch
in r235718, this patch doesn't cause any problem any more.

llvm-svn: 235878
2015-04-27 13:28:05 +00:00
Elena Demikhovsky d1084c5b3f AVX-512: Extend/Truncate operations for SKX,
SETCC for bit-vectors

llvm-svn: 235875
2015-04-27 12:57:59 +00:00
Toma Tabacu 217116e684 [MC] [IAS] Add support for the \@ .macro pseudo-variable.
Summary:
When used, it is substituted with the number of .macro instantiations we've done up to that point in time.
So if this is the 1st time we've instantiated a .macro (any .macro, regardless of name), \@ will instantiate to 0, if it's the 2nd .macro instantiation, it will instantiate to 1 etc.

It can only be used inside a .macro definition, an .irp definition or an .irpc definition (those last 2 uses are undocumented).

Reviewers: echristo, rafael

Reviewed By: rafael

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D9197

llvm-svn: 235862
2015-04-27 10:50:29 +00:00
Pawel Bylica c25918a043 Constfold insertelement to undef when index is out-of-bounds
Summary:
This patch adds constant folding of insertelement instruction to undef value when index operand is constant and is not less than vector size or is undef.

InstCombine does not support this case, but I'm happy to add it there also if this change is accepted.

Test Plan: Unittests and regression tests for ConstProp pass.

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9287

llvm-svn: 235854
2015-04-27 09:30:49 +00:00
Simon Pilgrim 4f683c264a [X86][SSE] Add v16i8/v32i8 multiplication support
Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized.

The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result.

Differential Revision: http://reviews.llvm.org/D9115

llvm-svn: 235837
2015-04-27 07:55:46 +00:00
Philip Reames 63294cbb6a [RewriteStatepointsForGC] Exclude constant values from being considered live at a safepoint
There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code.

To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work.

Differential Revision: http://reviews.llvm.org/D9236

llvm-svn: 235821
2015-04-26 19:48:03 +00:00
Philip Reames 2e78fa49a8 Don't Place Entry Safepoints Before the llvm.frameescape() Intrinsic
llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block.

Patch by: Swaroop.Sridhar@microsoft.com
Differential Revision: http://reviews.llvm.org/D8910

llvm-svn: 235820
2015-04-26 19:41:23 +00:00
Matt Arsenault 957bfc7458 R600: Remove / merge redundant testcases
llvm-svn: 235813
2015-04-26 00:53:33 +00:00
Sanjay Patel c1d20a36fb [x86] instcombine more cases of insertps into a shufflevector
This is a follow-on to D8833 (insertps optimization when the zero mask is not used).

In this patch, we check for the case where the zmask is used, but both input vectors
to the insertps intrinsic are the same operand or the zmask overrides the destination
lane. This lets us replace the 2nd shuffle input operand with the zero vector.

Differential Revision: http://reviews.llvm.org/D9257

llvm-svn: 235810
2015-04-25 20:55:25 +00:00
Sanjay Patel 3eb5146b3c add SSE run to check non-AVX codegen
llvm-svn: 235809
2015-04-25 20:41:51 +00:00
Simon Pilgrim aedd3c5160 line endings fix
llvm-svn: 235800
2015-04-25 12:12:43 +00:00
Duncan P. N. Exon Smith c8d987b121 Linker: Copy over function metadata attachments
Update `lib/Linker` to handle `Function` metadata attachments.  The
attachments stick with the function body.

llvm-svn: 235786
2015-04-24 22:07:31 +00:00
Duncan P. N. Exon Smith 3d4cd756b6 IR: Add assembly/bitcode support for function metadata attachments
Add serialization support for function metadata attachments (added in
r235783).  The syntax is:

    define @foo() !attach !0 {

Metadata attachments are only allowed on functions with bodies.  Since
they come before the `{`, they're not really part of the body; since
they require a body, they're not really part of the header.  In
`LLParser` I gave them a separate function called from `ParseDefine()`,
`ParseOptionalFunctionMetadata()`.

In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by
instructions.  Instruction metadata attachments are included in a
special "attachment" block at the end of a `Function`.  The attachment
records are laid out like this:

    InstID (KindID MetadataID)+

Note that these records always have an odd number of fields.  The new
code takes advantage of this to recognize function attachments (which
don't need an instruction ID):

    (KindID MetadataID)+

This means we can use the same attachment block already used for
instructions.

This is part of PR23340.

llvm-svn: 235785
2015-04-24 22:04:41 +00:00
Hans Wennborg 86ac630585 SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.

Test case by Mark Lacey.

llvm-svn: 235771
2015-04-24 20:57:56 +00:00
Reid Kleckner cfbfe6f29c [SEH] Implement GetExceptionCode in __except blocks
This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered
by copying the EAX value live into whatever basic block it is called
from. Obviously, this only works if you insert it late during codegen,
because otherwise mid-level passes might reschedule it.

llvm-svn: 235768
2015-04-24 20:25:05 +00:00
David Blaikie 445e3fbc54 [opaque pointer type] Add textual IR support for explicit type parameter to the invoke instruction
Same as r235145 for the call instruction - the justification, tradeoffs,
etc are all the same. The conversion script worked the same without any
false negatives (after replacing 'call' with 'invoke').

llvm-svn: 235755
2015-04-24 19:32:54 +00:00
Sundeep Kushwaha 5d41a6992d [PATCH] [Hexagon] Adding a test case for calling convention.
http://reviews.llvm.org/D9241

llvm-svn: 235754
2015-04-24 19:22:02 +00:00
David Blaikie b5865f2868 Revert changes to LTO test case since llvm-lto can't handle textual IR inputs
llvm-svn: 235738
2015-04-24 18:13:27 +00:00
David Blaikie f8aca46512 Skip extra LLVM IR assemble/disassemble steps in some tests
llvm-svn: 235736
2015-04-24 18:06:09 +00:00
David Blaikie 5ea1f7b744 [opaque pointer type] bitcode: add explicit callee type to invoke instructions
llvm-svn: 235735
2015-04-24 18:06:06 +00:00
Yaron Keren de2e2b0214 Teach AArch64\lit.local.cfg the new triple names windows-gnu and windows-msvc.
Tests were failing when built with -DLLVM_DEFAULT_TARGET_TRIPLE=i686-pc-windows-gnu.

llvm-svn: 235733
2015-04-24 17:14:16 +00:00
Duncan P. N. Exon Smith 085f80536f Linker: Update -override testcase to check callers
Check that `@main` is calling `@foo2` (the renamed internal function),
not the `@foo` with external linkage that's been pulled in from the
override file.

llvm-svn: 235730
2015-04-24 16:56:24 +00:00
Hans Wennborg ec679a8b3b Switch lowering: fix APInt overflow causing infinite loop / OOM
llvm-svn: 235729
2015-04-24 16:53:55 +00:00
Reid Kleckner 2c3ccaacb7 [WinEH] Split the landingpad BB instead of cloning it
This means we don't have to RAUW the landingpad instruction and
landingpad BB, which is a nice win.

llvm-svn: 235725
2015-04-24 16:22:19 +00:00
Filipe Cabecinhas ff1e234fb8 [BitcodeReader] Fix asserts when we read a non-vector type for insert/extract/shuffle
Added some additional checking for vector types + tests.

Bug found with AFL fuzz.

llvm-svn: 235710
2015-04-24 11:30:15 +00:00
Jingyue Wu 72fca6c89b Resurrect r235688
We should skip vector types which are not SCEVable.

test/CodeGen/NVPTX/sched2.ll passes

llvm-svn: 235695
2015-04-24 04:22:39 +00:00
Jingyue Wu 62af99b0db Revert r235688
Seems breaking builds

llvm-svn: 235690
2015-04-24 03:26:11 +00:00
Jingyue Wu 312fd0242d [NVPTX] Emits "generic()" depending on the original address space
Summary:
Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()"
on aggregates of addrspacecasts.

Test Plan: addrspacecast-gvar.ll

Reviewers: eliben, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9130

llvm-svn: 235689
2015-04-24 02:57:30 +00:00
Jingyue Wu 3daace5295 [NVPTX] enable NaryReassociate in NVPTX
Summary:
We run NaryReassociate right after SLSR because SLSR enables many
opportunities for NaryReassociate. For example, in nary-slsr.ll

  foo((a + b) + c);
  foo((a + b * 2) + c);
  foo((a + b * 3) + c);   // 2 muls and 6 adds

after SLSR:

  ab = a + b;
  foo(ab + c);
  ab2 = ab + b;
  foo(ab2 + c);
  ab3 = ab2 + b;
  foo(ab3 + c);           // 6 adds

after NaryReassociate:

  abc = (a + b) + c;
  foo(abc);
  ab2c = abc + b;
  foo(ab2c);
  ab3c = ab2c + b;
  foo(ab3c);              // 4 adds

Test Plan: nary-slsr.ll

Reviewers: jholewinski, eliben

Reviewed By: eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9066

llvm-svn: 235688
2015-04-24 02:54:06 +00:00
Matt Arsenault 5e10016f03 R600/SI: Fix verifier error when producing v_madmk_f32
Copy the kill flags when swapping the operands.

llvm-svn: 235687
2015-04-24 01:57:58 +00:00
Matthias Braun e1a67412cf R600/RegisterCoalescer: Enable more rematerialization/add missing testcase
This enables the rematerialization of some R600 MOV instructions in the
RegisterCoalescer and adds a testcase for r235668.

llvm-svn: 235675
2015-04-24 00:25:50 +00:00
Reid Kleckner 5c5facc2ce Re-commit "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
This reverts commit r235617.

r235649 should have addressed the problems.

llvm-svn: 235667
2015-04-23 23:22:33 +00:00
Hal Finkel 4dc8fcc224 [PowerPC] Support register name prefixes for vector registers
Match binutils by supporting the optional register name prefix for new vector
registers ("vs" for VSX registers and "q" for QPX registers).

llvm-svn: 235665
2015-04-23 23:16:22 +00:00
Hal Finkel d86e90abdd [PowerPC] Use sync inst alias when printing
So long as the choice between printing msync and sync is not ambiguous, we can
print 'sync 0' and just 'sync'.

llvm-svn: 235663
2015-04-23 23:05:08 +00:00
Tom Stellard ff5cf0e1fd R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands
llvm-svn: 235662
2015-04-23 22:59:24 +00:00
Hal Finkel fefcfffe68 [PowerPC] Add asm/disasm support for dcbt with hint
Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint
field specified (non-zero). Unforunately, the syntax for this instruction is
special in that it differs for server vs. embedded cores:
   dcbt ra, rb, th [server]
   dcbt th, ra, rb [embedded]
where th can be omitted when it is 0. dcbtst is the same. Thus we need to play
games in the parser and the printer to flip the operands around on the embedded
cores. We'll use the server syntax as the default (binutils currently uses the
embedded form by default, but IBM is changing that).

We also stop marking dcbtst as having unmodeled side effects (this is not
necessary, it is just a hint like dcbt -- noticed by inspection, so no separate
test case).

llvm-svn: 235657
2015-04-23 22:47:57 +00:00
Andrew Kaylor 20ae2a311f [WinEH] Ignore filter clauses while mapping landing pad blocks.
llvm-svn: 235656
2015-04-23 22:38:36 +00:00
Reid Kleckner e3af86e9d9 [WinEH] Replace more lpad value uses with undef
We were asserting on code like this:
  extern "C" unsigned long _exception_code();
  void might_crash(unsigned long);
  void foo() {
    __try {
      might_crash(0);
    } __except(1) {
      might_crash(_exception_code());
    }
  }

Gtest and many other libraries get the exception code from the __except
block. What's supposed to happen here is that EAX is live into the
__except block, and it contains the exception code. Eventually we'll
represent that as a use of the landingpad ehptr value, but for now we
can replace it with undef.

llvm-svn: 235649
2015-04-23 21:22:30 +00:00
Quentin Colombet 796d906e06 [MachineCopyPropagation] Handle undef flags conservatively so that we do not
remove copies that are useful after breaking some hardware dependencies.
In other words, handle this kind of situations conservatively by assuming reg2
is redefined by the undef flag.
reg1 = copy reg2
= inst reg2<undef>
reg2 = copy reg1
Copy propagation used to remove the last copy.
This is incorrect because the undef flag on reg2 in inst, allows next
passes to put whatever trashed value in reg2 that may help.
In practice we end up with this code:
reg1 = copy reg2
reg2 = 0
= inst reg2<undef>
reg2 = copy reg1

This fixes PR21743.

llvm-svn: 235647
2015-04-23 21:17:39 +00:00
Tom Stellard 8b0182af2f R600/SI: Fix indirect addressing with a negative constant offset
When the base register index of the vector plus the constant offset
was less than zero, we were passing the wrong base register to the indirect
addressing instruction.

In this case, we need to set the base register to v0 and then add
the computed (negative) index to m0.

llvm-svn: 235641
2015-04-23 20:32:01 +00:00
Peter Collingbourne 167668f8c8 Thumb2: When applying branch optimizations, visit branches in reverse order.
The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.

Differential Revision: http://reviews.llvm.org/D9185

llvm-svn: 235640
2015-04-23 20:31:35 +00:00
Peter Collingbourne cfee5b04bc ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.

This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.

Differential Revision: http://reviews.llvm.org/D9184

llvm-svn: 235639
2015-04-23 20:31:32 +00:00
Peter Collingbourne 6529523151 Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero.
This allows the constant island pass to lower these branches to cbn?z
instructions, resulting in a shorter instruction sequence.

Differential Revision: http://reviews.llvm.org/D9183

llvm-svn: 235638
2015-04-23 20:31:30 +00:00
Peter Collingbourne 78f1ecc59c ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets.
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.

Differential Revision: http://reviews.llvm.org/D9165

llvm-svn: 235637
2015-04-23 20:31:26 +00:00
Peter Collingbourne 1213918bf4 ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").

Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.

Differential Revision: http://reviews.llvm.org/D9138

llvm-svn: 235636
2015-04-23 20:31:22 +00:00
Adam Nemet e2b885c4bc [getUnderlyingOjbects] Analyze loop PHIs further to remove false positives
Specifically, if a pointer accesses different underlying objects in each
iteration, don't look through the phi node defining the pointer.

The motivating case is the underlyling-objects-2.ll testcase.  Consider
the loop nest:

  int **A;
  for (i)
    for (j)
       A[i][j] = A[i-1][j] * B[j]

This loop is transformed by Load-PRE to stash away A[i] for the next
iteration of the outer loop:

  Curr = A[0];          // Prev_0
  for (i: 1..N) {
    Prev = Curr;        // Prev = PHI (Prev_0, Curr)
    Curr = A[i];
    for (j: 0..N)
       Curr[j] = Prev[j] * B[j]
  }

Since A[i] and A[i-1] are likely to be independent pointers,
getUnderlyingObjects should not assume that Curr and Prev share the same
underlying object in the inner loop.

If it did we would try to dependence-analyze Curr and Prev and the
analysis of the corresponding SCEVs would fail with non-constant
distance.

To fix this, the getUnderlyingObjects API is extended with an optional
LoopInfo parameter.  This is effectively what controls whether we want
the above behavior or the original.  Currently, I only changed to use
this approach for LoopAccessAnalysis.

The other testcase is to guard the opposite case where we do want to
look through the loop PHI.  If we step through an array by incrementing
a pointer, the underlying object is the incoming value of the phi as the
loop is entered.

Fixes rdar://problem/19566729

llvm-svn: 235634
2015-04-23 20:09:20 +00:00
Jingyue Wu 3286ec1484 [NVPTX] run SeparateConstOffsetFromGEP before SLSR
Summary:
We pick this order because SeparateConstOffsetFromGEP may create more
opportunities for SLSR.

Test Plan:
reassociate-geps-and-slsr.ll
no performance regression on internal benchmarks

Reviewers: meheff

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9230

llvm-svn: 235632
2015-04-23 20:00:04 +00:00
Tom Stellard db04590717 R600/SI: Add missing -mcpu=SI to assembler test
llvm-svn: 235630
2015-04-23 19:33:55 +00:00
Tom Stellard d1f0f0268c R600/SI: Add assembler support for all CI and VI VOP1 instructions
llvm-svn: 235629
2015-04-23 19:33:54 +00:00
Tom Stellard 7130ef49cb R600/SI: Improve AsmParser support for forced e64 encoding
We can now force e64 encoding even when the operands would be legal
for e32 encoding.

llvm-svn: 235626
2015-04-23 19:33:48 +00:00
Reid Kleckner 909ea7e6b8 Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
We still have some "uses remain after removal" issues in -O0 builds.

This reverts commit r235557.

llvm-svn: 235617
2015-04-23 18:34:01 +00:00
Hal Finkel 7c5cb066d0 [PowerPC] Enable printing instructions using aliases
TableGen had been nicely generating code to print a number of instructions using
shorter aliases (and PowerPC has plenty of short mnemonics), but we were not
calling it. For some of the aliases we support in the parser, TableGen can't
infer the "inverse" alias relationship, so there is still more to do.

Thus, after some hours of updating test cases...

llvm-svn: 235616
2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar 745615ca00 [AArch64] Add nvcast patterns for v4f16 and v8f16
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants.  This patch adds nvcast rules with v4f16 and v8f16 values.

AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.

Reviewers: jmolloy, srhines, ab

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9201

llvm-svn: 235610
2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar b18815354d [AArch64] Handle vec4, vec8, vec16 *itofp for half
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.

Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors.  Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.

The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64.  I am
adding a test here for completeness.

Reviewers: aemerson, rengolin, ab, jmolloy, srhines

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9166

llvm-svn: 235609
2015-04-23 17:16:27 +00:00
Hans Wennborg 0867b151c9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

llvm-svn: 235608
2015-04-23 16:45:24 +00:00
Sanjay Patel f4b0f07430 use update_llc_test_checks.py to tighten checking; remove unnecessary CPU param
llvm-svn: 235604
2015-04-23 16:07:50 +00:00
Krzysztof Parzyszek 876a19d855 [Hexagon] Shrink-wrap stack frame (Hexagon-specific)
llvm-svn: 235603
2015-04-23 16:05:39 +00:00
Krzysztof Parzyszek a17cebd219 [Hexagon] Add testcases for stack alignment and variable-sized objects
llvm-svn: 235602
2015-04-23 15:12:49 +00:00
Aaron Ballman 0be238cebd Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
llvm-svn: 235597
2015-04-23 13:41:59 +00:00
Filipe Cabecinhas 6621cb7478 Be more strict about the operand for the array type in BitcodeReader
Summary: Bug found with AFL fuzz.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9016

llvm-svn: 235596
2015-04-23 13:38:21 +00:00