Commit Graph

16232 Commits

Author SHA1 Message Date
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Owen Anderson c272bab5a5 Define DAGOperand, an empty base class for RegisterClass and Operand. This allows one to write multiclasses that are polymorphic over both registers and non-register operands.
llvm-svn: 159162
2012-06-25 21:25:16 +00:00
Nuno Lopes 490096c8fb add CallSite/CallInst/InvokeInst::hasFnAttr()
llvm-svn: 159144
2012-06-25 16:16:58 +00:00
Rafael Espindola 540c3d23df If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136
2012-06-25 14:30:31 +00:00
Eli Bendersky f0ad3606c7 The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no longer represents what the function does. Therefore, the function is removed and its functionality is folded into the only place in the code-base where it was being used.
llvm-svn: 159133
2012-06-25 10:13:14 +00:00
Chandler Carruth 00bef3fff1 Just remove generic support for C++11 alignas -- GCC is already
advertising complete support w/o alignas implemented, and its
implementation of alignas in the latest versions is so convoluted as to
be unusable.

llvm-svn: 159125
2012-06-25 05:20:13 +00:00
Hal Finkel 3099ce9489 Allow controlling vectorization of boolean values separately from other integer types.
These are used as the result of comparisons, and often handled differently from larger integer types.

llvm-svn: 159111
2012-06-24 13:28:01 +00:00
NAKAMURA Takumi c707c253ad llvm/Support/IntegersSubset.h: Add a copy constructor on IntegersSubset to appease msvc.
msvc mis-infers ParentTy(RHS) to (const RangesCollectionTy &).

llvm-svn: 159101
2012-06-24 03:48:53 +00:00
NAKAMURA Takumi 37d96f0862 llvm/Support/IntegersSubset.h: Fix whitespace.
llvm-svn: 159100
2012-06-24 03:48:47 +00:00
Hal Finkel 4b06b1a0ee Allow BBVectorize to fuse compare instructions.
llvm-svn: 159088
2012-06-23 21:52:50 +00:00
Marshall Clow 78ade1dd08 Add relocation types for Hexagon processor; patch by Sidney Manning <sidneym@codeaurora.org>
llvm-svn: 159081
2012-06-23 14:46:18 +00:00
Hans Wennborg ac9fb36c31 Clean-up after r159077.
Remove temporary GlobalVariable constructors now that Clang has been
updated (r159078).

llvm-svn: 159079
2012-06-23 12:14:23 +00:00
Hans Wennborg cbe34b4cc9 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

llvm-svn: 159077
2012-06-23 11:37:03 +00:00
Stepan Dyatkovskiy 8e00efeace Optimized usage of new SwitchInst case values (IntegersSubset type) in Local.cpp, Execution.cpp and BitcodeWriter.cpp.
I got about 1% of compile-time improvement on my machines (Ubuntu 11.10 i386 and Ubuntu 12.04 x64).

llvm-svn: 159076
2012-06-23 10:58:58 +00:00
Jim Grosbach 3a8a0fa8e6 TableGen: AsmMatcher support for better operand diagnostics.
"Invalid operand" may be a completely correct diagnostic, but it's often
insufficiently specific to really help identify and fix the problem in
assembly source. Allow a target to specify a more-specific diagnostic kind
for each AsmOperandClass derived definition and use that to provide
more detailed diagnostics when an operant of that class resulted in a
match failure.

rdar://8987109

llvm-svn: 159050
2012-06-22 23:56:44 +00:00
Jakob Stoklund Olesen a127fc780a Remove ProcessImplicitDefs.h which was unused.
The ProcessImplicitDefs class can be local to its implementation file.

llvm-svn: 159041
2012-06-22 22:27:36 +00:00
Jakob Stoklund Olesen 4fa84ba8b9 Delete a boring statistic.
llvm-svn: 159030
2012-06-22 20:40:15 +00:00
Jakob Stoklund Olesen c61edda0ab Store live intervals in an IndexedMap.
It is both smaller and faster than DenseMap.

llvm-svn: 159029
2012-06-22 20:37:52 +00:00
Hal Finkel 8db5547252 Revert r158679 - use case is unclear (and it increases the memory footprint).
Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027
2012-06-22 20:27:13 +00:00
Evan Cheng f5bd6c6510 EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134
llvm-svn: 159023
2012-06-22 20:14:46 +00:00
Jakob Stoklund Olesen 37e797fedc Stop computing physreg live ranges.
Everyone is using on-demand regunit ranges now.

llvm-svn: 159018
2012-06-22 18:20:50 +00:00
Kaelyn Uhrain e209570e8f Remove a variable that is unused when assertions aren't enabled.
llvm-svn: 159011
2012-06-22 17:18:15 +00:00
Jakob Stoklund Olesen b1b3e4aa58 Remove LiveIntervals::trackingRegUnits().
With regunit liveness permanently enabled, this function would always
return true.

Also remove now obsolete code for checking physreg interference.

llvm-svn: 159006
2012-06-22 16:46:44 +00:00
Dmitri Gribenko 1fed006a4d Change comment into proper Doxygen member comment.
llvm-svn: 159000
2012-06-22 16:00:48 +00:00
Stepan Dyatkovskiy a6c8cc307b Fixed r158979.
Original message:
Performance optimizations:
- SwitchInst: case values stored separately from Operands List. It allows to make faster access to individual case value numbers or ranges.
- Optimized IntItem, added APInt value caching.
- Optimized IntegersSubsetGeneric: added optimizations for cases when subset is single number or when subset consists from single numbers only.

llvm-svn: 158997
2012-06-22 14:53:30 +00:00
Rafael Espindola ea59166190 Remove another duplicated variable. We only need one to tell us if the linker
knows dwarf or not.

llvm-svn: 158993
2012-06-22 13:32:49 +00:00
Rafael Espindola d7bdaf5795 Fix a FIXME: DwarfRequiresRelocationForSectionOffset is the same as
DwarfUsesRelocationsAcrossSections.

llvm-svn: 158992
2012-06-22 13:24:07 +00:00
Duncan Sands 83884a1042 Revert commit 158979 (dyatkovskiy) since it is causing several buildbots to
fail.  Original commit message:

Performance optimizations:
- SwitchInst: case values stored separately from Operands List. It allows to make faster access to individual case value numbers or ranges.
- Optimized IntItem, added APInt value caching.
- Optimized IntegersSubsetGeneric: added optimizations for cases when subset is single number or when subset consists from single numbers only.

On my machine these optimizations gave about 4-6% of compile-time improvement.

llvm-svn: 158986
2012-06-22 10:35:06 +00:00
Stepan Dyatkovskiy fcfa633bf8 Performance optimizations:
- SwitchInst: case values stored separately from Operands List. It allows to make faster access to individual case value numbers or ranges.
- Optimized IntItem, added APInt value caching.
- Optimized IntegersSubsetGeneric: added optimizations for cases when subset is single number or when subset consists from single numbers only.

On my machine these optimizations gave about 4-6% of compile-time improvement.

llvm-svn: 158979
2012-06-22 07:35:13 +00:00
Andrew Trick 9c302673b2 Use "NoItineraries" for processors with no itineraries.
This makes it explicit when ScoreboardHazardRecognizer will be used.
"GenericItineraries" would only make sense if it contained real
itinerary values and still required ScoreboardHazardRecognizer.

llvm-svn: 158963
2012-06-22 03:58:51 +00:00
Nick Lewycky 33da33676f Emit relocations for DW_AT_location entries on systems which need it. This is
a recommit of r127757. Fixes PR9493. Patch by Paul Robinson!

llvm-svn: 158957
2012-06-22 01:25:12 +00:00
Lang Hames b8650f106a Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Nuno Lopes 9792d68381 remove extractMallocCallFromBitCast, since it was tailor maded for its sole user. Update GlobalOpt accordingly.
llvm-svn: 158952
2012-06-22 00:25:01 +00:00
Nuno Lopes dc6085e52d Add support for invoke to the MemoryBuiltin analysid.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Nuno Lopes beee2f10e0 move some typedefs so that we don't polute the llvm namespace. this should appease the GCC buildbots
llvm-svn: 158924
2012-06-21 16:58:41 +00:00
Nuno Lopes 55fff83422 refactor the MemoryBuiltin analysis:
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
 - provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
Nadav Rotem 4e9012c2b1 Add a number of threshold arguments to the SRA pass.
A patch by Tom Stellard with minor changes.

llvm-svn: 158918
2012-06-21 13:44:31 +00:00
Jakob Stoklund Olesen 88ef957e74 Remove LiveIntervals::iterator.
Live intervals for regunits and virtual registers are stored separately,
and physreg live intervals are going away.

To visit the live ranges of all virtual registers, use this pattern
instead:

  for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
    unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
    if (MRI->reg_nodbg_empty(Reg))
      continue;

llvm-svn: 158879
2012-06-20 23:54:20 +00:00
Jakob Stoklund Olesen 1911a0203d Remove the RenderMachineFunction HTML output pass.
I don't think anyone has been using this functionality for a while, and
it is getting in the way of refactoring now.

llvm-svn: 158876
2012-06-20 23:47:58 +00:00
Andrew Trick 2ed08a7cc1 Restructure PopulateLoopsDFS::insertIntoLoop.
As Nadav pointed out the first implementation was obscure.

llvm-svn: 158862
2012-06-20 22:18:33 +00:00
Andrew Trick dbb7ae54b8 Add "extern template" declarations now that we use explicit instantiation.
This is supported by gcc and clang, but guarded by a macro for MSVC 2008.

The extern template declaration is not necessary but generally good
form. It can avoid extra instantiations of the template methods
defined inline.

The EXTERN_TEMPLATE_INSTANTIATION macro could probably be generalized to
handle multiple template parameters if someone thinks it's worthwhile.

llvm-svn: 158840
2012-06-20 20:17:20 +00:00
Jakob Stoklund Olesen 833308d785 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

llvm-svn: 158831
2012-06-20 18:00:57 +00:00
Chandler Carruth 5c0997f066 Remove 'static' from inline functions defined in header files.
There is a pretty staggering amount of this in LLVM's header files, this
is not all of the instances I'm afraid. These include all of the
functions that (in my build) are used by a non-static inline (or
external) function. Specifically, these issues were caught by the new
'-Winternal-linkage-in-inline' warning.

I'll try to just clean up the remainder of the clearly redundant "static
inline" cases on functions (not methods!) defined within headers if
I can do so in a reliable way.

There were even several cases of a missing 'inline' altogether, or my
personal favorite "static bool inline". Go figure. ;]

llvm-svn: 158800
2012-06-20 08:39:33 +00:00
Andrew Trick ff2ed7b687 A new algorithm for computing LoopInfo. Temporarily disabled.
-stable-loops enables a new algorithm for generating the Loop
forest. It differs from the original algorithm in a few respects:

- Not determined by use-list order.
- Initially guarantees RPO order of block and subloops.
- Linear in the number of CFG edges.
- Nonrecursive.

I didn't want to change the LoopInfo API yet, so the block lists are
still inclusive. This seems strange to me, and it means that building
LoopInfo is not strictly linear, but it may not be a problem in
practice. At least the block lists start out in RPO order now. In the
future we may add an attribute or wrapper analysis that allows other
passes to assume RPO order.

The primary motivation of this work was not to optimize LoopInfo, but
to allow reproducing performance issues by decomposing the compilation
stages. I'm often unable to do this with the current LoopInfo, because
the loop tree order determines Loop pass order. Serializing the IR
tends to invert the order, which reverses the optimization order. This
makes it nearly impossible to debug interdependent loop optimizations
such as LSR.

I also believe this will provide more stable performance results across time.

llvm-svn: 158790
2012-06-20 05:23:33 +00:00
Andrew Trick cda51d430d Move the implementation of LoopInfo into LoopInfoImpl.h.
The implementation only needs inclusion from LoopInfo.cpp and
MachineLoopInfo.cpp. Clients of the interface should only include the
interface. This makes the interface readable and speeds up rebuilds
after modifying the implementation.

llvm-svn: 158787
2012-06-20 03:42:09 +00:00
Nick Kledzik 18497e9242 Add permissions(), map_file_pages(), and unmap_file_pages() to llvm::sys::fs and add unit test. Unix is implemented. Windows side needs to be implemented.
llvm-svn: 158770
2012-06-20 00:28:54 +00:00
Chad Rosier 7369692790 Add an ensureMaxAlignment() function to MachineFrameInfo (analogous to
ensureAlignment() in MachineFunction).  Also, drop setMaxAlignment() in
favor of this new function.  This creates a main entry point to setting
MaxAlignment, which will be helpful for future work.  No functionality
change intended.

llvm-svn: 158758
2012-06-19 22:59:12 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Chad Rosier bb335c96f9 Typo. Patch by Cameron McInally <cameron.mcinally@nyu.edu>.
llvm-svn: 158754
2012-06-19 22:28:18 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Nuno Lopes f9abcb7ba9 revert r158660, since Chris has some issues with this patch (namely using code to reprent information only used by the compiler)
Original commit msg:
add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158688
2012-06-18 23:34:26 +00:00
David Blaikie 2417379c63 Don't copy a potentially-uninitialized variable.
Based on review discussion of r158638 with Chandler Carruth, Tobias von Koch, and Duncan Sands and a -Wmaybe-uninitialized warning from GCC.

llvm-svn: 158685
2012-06-18 22:31:28 +00:00
Hal Finkel 8eac009633 Allow up to 64 functional units per processor itinerary.
This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 158679
2012-06-18 21:08:18 +00:00
Marshall Clow d3e2a76ca4 Added accessors for getting coff_relocation info
llvm-svn: 158675
2012-06-18 19:47:16 +00:00
Nuno Lopes b7c941bad9 add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158660
2012-06-18 16:04:04 +00:00
Benjamin Kramer 23a9c3e090 Bring the return value of SmallVector::insert in line with std::vector::insert.
It always returns the iterator for the first inserted element, or the passed in
iterator if the inserted range was empty. Flesh out the unit test more and fix
all the cases it uncovered so far.

llvm-svn: 158645
2012-06-17 12:46:13 +00:00
Chandler Carruth b539b47793 Remove SmallMap, and the several files that were used to implement it.
We have SmallDenseMap now that has more correct and predictable
semantics, even though it is a more narrow abstraction.

llvm-svn: 158644
2012-06-17 12:07:42 +00:00
Benjamin Kramer 371b9b0e99 SmallVector: return a valid iterator for the rare case of inserting an empty range into a SmallVector.
Patch by Johannes Schaub!

llvm-svn: 158643
2012-06-17 11:52:22 +00:00
Chandler Carruth 4de807a552 Add a unit test for 'swap', and fix a pile of bugs in
SmallDenseMap::swap.

First, make it parse cleanly. Yay for uninstantiated methods.

Second, make the inline-buckets case work correctly. This is way
trickier than it should be due to the uninitialized values in empty and
tombstone buckets.

Finally fix a few typos that caused construction/destruction mismatches
in the counting unittest.

llvm-svn: 158641
2012-06-17 11:28:13 +00:00
Chandler Carruth a1be842f98 Add tests for *DenesMap for both key and value types' construction and
destruction and fix a bug in SmallDenseMap they caught.

This is kind of a poor-man's version of the testing that just adds the
addresses to a set on construction and removes them on destruction. We
check that double construction and double destruction don't occur.
Amusingly enough, this is enough to catch a lot of SmallDenseMap issues
because we spend a lot of time with fixed stable addresses in the inline
buffer.

The SmallDenseMap bug fix included makes grow() not double-destroy in
some cases. It also fixes a FIXME there, the code was pretty crappy. We
now don't have any wasted initialization, but we do move the entries in
inline bucket array an extra time. It's probably a better tradeoff, and
is much easier to get correct.

llvm-svn: 158639
2012-06-17 10:33:51 +00:00
Chandler Carruth 20dd838ae5 Introduce a SmallDenseMap container that re-uses the existing DenseMap
implementation.

This type includes an inline bucket array which is used initially. Once
it is exceeded, an array of 64 buckets is allocated on the heap. The
bucket count grows from there as needed. Some highlights of this
implementation:

- The inline buffer is very carefully aligned, and so supports types
  with alignment constraints.
- It works hard to avoid aliasing issues.
- Supports types with non-trivial constructors, destructors, copy
  constructions, etc. It works reasonably hard to minimize copies and
  unnecessary initialization. The most common initialization is to set
  keys to the empty key, and so that should be fast if at all possible.

This class has a performance / space trade-off. It tries to optimize for
relatively small maps, and so packs the inline bucket array densely into
the object. It will be marginally slower than a normal DenseMap in a few
use patterns, so it isn't appropriate everywhere.

The unit tests for DenseMap have been generalized a bit to support
running over different map implementations in addition to different
key/value types. They've then been automatically extended to cover the
new container through the magic of GoogleTest's typed tests.

All of this is still a bit rough though. I'm going to be cleaning up
some aspects of the implementation, documenting things better, and
adding tests which include non-trivial types. As soon as I'm comfortable
with the correctness, I plan to switch existing users of SmallMap over
to this class as it is already more correct w.r.t. construction and
destruction of objects iin the map.

Thanks to Benjamin Kramer for all the reviews of this and the lead-up
patches. That said, more review on this would really be appreciated. As
I've noted a few times, I'm quite surprised how hard it is to get the
semantics for a hashtable-based map container with a small buffer
optimization correct. =]

llvm-svn: 158638
2012-06-17 09:05:09 +00:00
Benjamin Kramer b9f84bb0ce Guard private fields that are unused in Release builds with #ifndef NDEBUG.
llvm-svn: 158608
2012-06-16 21:48:13 +00:00
Hal Finkel 16ddd4b66b Move the Metadata merging methods from GVN and make them public in MDNode.
There are other passes, BBVectorize specifically, that also need some of
this functionality.

llvm-svn: 158605
2012-06-16 20:33:37 +00:00
Benjamin Kramer 34d6a9e57a Merge the SmallBitVector and BitVector unit tests with gtest's typed test magic and bring SmallBitVector up to date.
llvm-svn: 158600
2012-06-16 10:51:07 +00:00
Chandler Carruth dea00d7c5a Add support to the alignment support header for conjuring a character
array of a suitable size and alignment for any of a number of different
types to be stored into the character array.

The mechanisms for producing an explicitly aligned type are fairly
complex because this operation is poorly supported on all compilers.
We've spent a fairly significant amount of time experimenting with
different implementations inside of Google, and the one using explicitly
expanded templates has been the most robust.

Credit goes to Nick Lewycky for writing the first 20 versions or so of
this logic we had inside of Google. I based this on the only one to
actually survive. In case anyone is worried, yes we are both explicitly
re-contributing and re-licensing it for LLVM. =]

Once the issues with actually specifying the alignment are finished, it
turns out that most compilers don't in turn align anything the way they
are instructed. Testing of this logic against both Clang and GCC
indicate that the alignment constraints are largely ignored by both
compilers! I've come up with and used a work-around by wrapping each
alignment-hinted type directly in a struct, and using that struct to
align the character array through a union. This elaborate hackery is
terrifying, but I've included testing that caught a terrifying number of
bugs in every other technique I've tried.

All of this in order to implement a poor C++98 programmers emulation of
C++11 unrestricted unions in classes such as SmallDenseMap.

llvm-svn: 158597
2012-06-16 08:52:57 +00:00
Chandler Carruth 144a2ac89d Lift the NumElements and NumTombstones members into the super class
rather than the base class. Add a pile of boilerplate to indirect around
this.

This is pretty ugly, but it allows the super class to change the
representation of these values, which will be key for doing
a SmallDenseMap.

Suggestions on better method structuring / naming are welcome, but keep
in mind that SmallDenseMap won't have an 'unsigned' member to expose
a reference to... =/

llvm-svn: 158586
2012-06-16 01:18:07 +00:00
Chandler Carruth d7291625cc Factor DenseMap into a base class that implements the hashtable logic,
and a derived class that provides the allocation and growth strategy.

This is the first (and biggest) step toward building a SmallDenseMap
that actually behaves exactly the same as DenseMap, and supports all the
same types and interface points with the same semantics.

llvm-svn: 158585
2012-06-16 01:05:01 +00:00
Marshall Clow 71757ef3ed Adding acessors to COFFObjectFile so that clients can get at the (non-generic) bits
llvm-svn: 158484
2012-06-15 01:08:25 +00:00
Rafael Espindola def1b09be2 Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and
globaldce. Globaldce was already removing linkonce globals, but globalopt was
not.

llvm-svn: 158476
2012-06-14 22:48:13 +00:00
Stepan Dyatkovskiy 8af777330e SmallMap, FlatArrayMap::copyFrom
Replaced memcpy with std::copy, since the first one may work improperly with non POD data.

llvm-svn: 158457
2012-06-14 16:59:43 +00:00
Chandler Carruth 72738f6341 Group the 'unsigned' members after the pointer to avoid 4 bytes of
padding on x86-64.

llvm-svn: 158421
2012-06-13 21:44:07 +00:00
Kay Tiong Khoo f294921e24 *typo: Cyles changed to Cycles
llvm-svn: 158404
2012-06-13 15:53:04 +00:00
Duncan Sands 318a89ddac When linearizing a multiplication, return at once if we see a factor of zero,
since then the entire expression must equal zero (similarly for other operations
with an absorbing element).  With this in place a bunch of reassociate code for
handling constants is dead since it is all taken care of when linearizing.  No
intended functionality change.

llvm-svn: 158398
2012-06-13 09:42:13 +00:00
Craig Topper 71dc02d659 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
llvm-svn: 158396
2012-06-13 07:18:53 +00:00
Jakob Stoklund Olesen 1c66b87f7d Eliminate struct TableGenBackend.
TableGen backends are simply written as functions now.

Patch by Sean Silva!

llvm-svn: 158389
2012-06-13 05:15:49 +00:00
Andrew Trick 5b90645abb sched: Avoid trivially redundant DAG edges. Take the one with higher latency.
llvm-svn: 158379
2012-06-13 02:39:00 +00:00
David Blaikie 5452aa5f47 Remove use of GNU extension to resolve Clang warning.
llvm-svn: 158364
2012-06-12 17:06:32 +00:00
Duncan Sands d7aeefebd6 Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x).  This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value.  It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights.  As a side-effect it
reduces the number of multiplies needed in some cases of large powers.  While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users.  This is progress towards
fixing PR13021.

llvm-svn: 158358
2012-06-12 14:33:56 +00:00
Argyrios Kyrtzidis c6dc4d75fd Satisfy C++ aliasing rules, per suggestion by Chandler.
llvm-svn: 158346
2012-06-12 01:06:16 +00:00
Argyrios Kyrtzidis 8d19c86c9a For llvm::sys::ThreadLocalImpl instead of malloc'ing the platform-specific
thread local data, embed them in the class using a uint64_t and make sure
we get compiler errors if there's a platform where this is not big enough.

This makes ThreadLocal more safe for using it in conjunction with CrashRecoveryContext.

Related to crash in rdar://11434201.

llvm-svn: 158342
2012-06-12 00:21:31 +00:00
Andrew Trick 3e465fb225 misched: When querying RegisterPressureTracker, always save current and max pressure.
llvm-svn: 158340
2012-06-11 23:42:23 +00:00
Jakob Stoklund Olesen e6aed139f0 Write llvm-tblgen backends as functions instead of sub-classes.
The TableGenBackend base class doesn't do much, and will be removed
completely soon.

Patch by Sean Silva!

llvm-svn: 158311
2012-06-11 15:37:55 +00:00
Jakob Stoklund Olesen f30fa58ebb Fix a problem with the reverse bundle iterators.
This showed up the first time rend() was called on a bundled instruction
in the Mips backend.

Also avoid dereferencing end() in bundle_iterator::operator++().

We still don't have a place to put unit tests for this stuff.

llvm-svn: 158310
2012-06-11 15:11:12 +00:00
Craig Topper 7afe343be5 Add intrinsics for immediate form of XOP vprot instructions. Use i128mem instead of f128mem for integer XOP instructions.
llvm-svn: 158291
2012-06-10 07:31:56 +00:00
Craig Topper 3352ba55b9 Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
llvm-svn: 158278
2012-06-09 16:46:13 +00:00
Benjamin Kramer df97aa1628 Hashing: Remove outdated comment. Support for reserved hash values was removed in r151865.
llvm-svn: 158276
2012-06-09 15:33:28 +00:00
Andrew Trick fc8ce08be3 Register pressure: added getPressureAfterInstr.
llvm-svn: 158256
2012-06-09 02:16:58 +00:00
Jakob Stoklund Olesen c26fbbfba5 Sketch a LiveRegMatrix analysis pass.
The LiveRegMatrix represents the live range of assigned virtual
registers in a Live interval union per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.

The important differences are:

- LiveRegMatrix tracks interference per register unit instead of per
  physical register. This makes interference checks cheaper and
  assignments slightly more expensive. For example, the ARM D7 reigster
  has 24 aliases, so we would check 24 physregs before assigning to one.
  With unit-based interference, we check 2 units before assigning to 2
  units.

- LiveRegMatrix caches regmask interference checks. That is currently
  duplicated functionality in RABasic and RAGreedy.

- LiveRegMatrix is a pass which makes it possible to insert
  target-dependent passes between register allocation and rewriting.
  Such passes could tweak the register assignments with interference
  checking support from LiveRegMatrix.

Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.

llvm-svn: 158255
2012-06-09 02:13:10 +00:00
Dmitri Gribenko dbeafa773a Convert comments to proper Doxygen comments.
llvm-svn: 158248
2012-06-09 00:01:45 +00:00
Andrew Trick ce679ad89d Removing strange "using" declarations form TargetInstrInfo.
I can't imagine why these were added. Trial and error.

llvm-svn: 158247
2012-06-08 23:56:26 +00:00
Jakob Stoklund Olesen 1224312f5b Reintroduce VirtRegRewriter.
OK, not really. We don't want to reintroduce the old rewriter hacks.

This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.

The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.

These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.

llvm-svn: 158244
2012-06-08 23:44:45 +00:00
Andrew Trick 423fa6faee TargetInstrInfo hooks implemented in codegen should be declared pure virtual.
llvm-svn: 158233
2012-06-08 21:52:38 +00:00
Andrew Trick 8cf028752f Sched itinerary fix: Avoid static initializers.
This fixes an accidental dependence on static initialization order that I introduced yesterday.

Thank you Lang!!!

llvm-svn: 158215
2012-06-08 18:25:47 +00:00
Andrew Trick a5d24ca453 Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency.
llvm-svn: 158164
2012-06-07 19:42:04 +00:00
Pete Cooper f8d60d36c2 Add internal read flags to MachineInstrBuilder and hook them into the MachineOperand flag of the same name
llvm-svn: 158137
2012-06-07 04:43:52 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Andrew Trick 05ff4667eb Move RegisterClassInfo.h.
Allow targets to access this API. It's required for RegisterPressure.

llvm-svn: 158102
2012-06-06 20:29:31 +00:00
Andrew Trick 88517f608c Move RegisterPressure.h.
Make it a general utility for use by Targets.

llvm-svn: 158097
2012-06-06 19:47:35 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Jakob Stoklund Olesen f3f7d6f6e2 Simplify LiveInterval::print().
Don't print out the register number and spill weight, making the TRI
argument unnecessary.

This allows callers to interpret the reg field. It can currently be a
virtual register, a physical register, a spill slot, or a register unit.

llvm-svn: 158031
2012-06-05 22:51:54 +00:00
Jakob Stoklund Olesen 12e03dae44 Add experimental support for register unit liveness.
Instead of computing a live interval per physreg, LiveIntervals can
compute live intervals per register unit. This makes impossible the
confusing situation where aliasing registers could have overlapping live
intervals. It should also make fixed interferernce checking cheaper
since registers have fewer register units than aliases.

Live intervals for regunits are computed on demand, using MRI use-def
chains and the new LiveRangeCalc class. Only regunits live in to ABI
blocks are precomputed during LiveIntervals::runOnMachineFunction().

The regunit liveness computations don't depend on LiveVariables.

llvm-svn: 158029
2012-06-05 22:02:15 +00:00
Jakob Stoklund Olesen 989b3b1516 Implement LiveRangeCalc::extendToUses() and createDeadDefs().
These LiveRangeCalc methods are to be used when computing a live range
from scratch.

llvm-svn: 158027
2012-06-05 21:54:09 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Lang Hames a59100cc08 Add a new intrinsic: llvm.fmuladd. This intrinsic represents a multiply-add
expression (a * b + c) that can be implemented as a fused multiply-add (fma)
if the target determines that this will be more efficient. This intrinsic
will be used to implement FP_CONTRACT support and an aggressive FMA formation
mode.

If your target has a fast FMA instruction you should override the
isFMAFasterThanMulAndAdd method in TargetLowering to return true.

llvm-svn: 158014
2012-06-05 19:07:46 +00:00
Jakob Stoklund Olesen 480bd86182 Remove dead function.
llvm-svn: 158005
2012-06-05 17:19:07 +00:00
Stepan Dyatkovskiy 696835bb03 IntegersSubsetMapping: added exclude operation, that allows to exclude subset of integers from current mapping.
llvm-svn: 157989
2012-06-05 07:57:36 +00:00
Stepan Dyatkovskiy f0a3c46739 IntegersSubsetMapping:
Changed type of Items collection: from std::vector to std::list.
Also some small fixes made in IntegersSubset.h, IntegersSubsetMapping.h and IntegersSubsetTest.cpp.

llvm-svn: 157987
2012-06-05 07:43:08 +00:00
Andrew Trick 73d7736b17 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick ed7c96d7d9 misched: Allow disabling scoreboard hazard checking for subtargets with a
valid itinerary but no pipeline stages.

An itinerary can contain useful scheduling information without specifying pipeline stages for each instruction.

llvm-svn: 157977
2012-06-05 03:44:32 +00:00
Andrew Trick 515f131786 whitespace
llvm-svn: 157976
2012-06-05 03:44:29 +00:00
Jakob Stoklund Olesen 345528944c Remove the last remat-related code from LiveIntervalAnalysis.
Rematerialization is handled by LiveRangeEdit now.

llvm-svn: 157974
2012-06-05 01:06:15 +00:00
Jakob Stoklund Olesen 9e27e2621a Stop using LiveIntervals::isReMaterializable().
It is an old function that does a lot more than required by
CalcSpillWeights, which was the only remaining caller.

The isRematerializable() function never actually sets the isLoad
argument, so don't try to compute that.

llvm-svn: 157973
2012-06-05 01:06:12 +00:00
Jakob Stoklund Olesen 188d830405 Delete dead code.
llvm-svn: 157963
2012-06-04 23:01:41 +00:00
Jakob Stoklund Olesen 11fb248aa6 Switch LiveIntervals member variable to LLVM naming standards.
No functional change.

llvm-svn: 157957
2012-06-04 22:39:14 +00:00
Roman Divacky e3f15c98d1 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Duncan Sands b8d3d74de8 getAllOnesValue also works for vectors of integers.
llvm-svn: 157915
2012-06-04 07:18:12 +00:00
NAKAMURA Takumi 2cf23dc272 IntRange: Restore the copy constuctor explicitly to appase buildbot.
llvm-svn: 157901
2012-06-03 15:42:12 +00:00
Craig Topper fd53b80219 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Stepan Dyatkovskiy 8a28bf5dd1 Added unittests for IntegersSubset and IntegersSubsetMapping.
- Fixed IntegersSubsetGeneric copy/assignment behaviour. 
- Fixed IntegersSubsetGeneric::getSize/getSingleValue methods.
- Fixed IntegersSubsetGeneric::verify method.

Also IntegersSubset.h and IntegersSubsetMapping.h headers was fixed.

llvm-svn: 157887
2012-06-02 13:47:12 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Benjamin Kramer 539df9ef0a Add move semantics to APInt.
llvm-svn: 157883
2012-06-02 08:39:08 +00:00
Stepan Dyatkovskiy cab9603622 Additional change for 157881. Forget to fix another IntegerSubset constructor.
llvm-svn: 157882
2012-06-02 08:03:34 +00:00
Stepan Dyatkovskiy f49b393c04 Small fix due to buildbot failures on mingw32. Fixed call of parent constructor for case when parent is template.
llvm-svn: 157881
2012-06-02 07:44:19 +00:00
Stepan Dyatkovskiy 9549f5894b PR1255: case ranges.
IntegersSubsetGeneric, IntegersSubsetMapping: added IntTy template parameter, that allows use either APInt or IntItem. This change allows to write unittest for these classes.

llvm-svn: 157880
2012-06-02 07:26:00 +00:00
Jakob Stoklund Olesen 83b0ac498a Remove the old register list functions from MCRegisterInfo.
These functions exposed the layout of the underlying data tables as
null-terminated uint16_t arrays.

Use the new MCSubRegIterator, MCSuperRegIterator, and MCRegAliasIterator
classes instead.

llvm-svn: 157855
2012-06-01 23:28:34 +00:00
Jakob Stoklund Olesen 54038d796c Switch all register list clients to the new MC*Iterator interface.
No functional change intended.

Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.

This makes it possible to do so without changing all clients (again).

llvm-svn: 157854
2012-06-01 23:28:30 +00:00
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Benjamin Kramer aa9b845979 Provide move semantics for (Small)BitVector.
CodeGen makes a lot of BitVector copies.

llvm-svn: 157826
2012-06-01 18:52:53 +00:00
Stepan Dyatkovskiy 66305749f1 PR1255: case ranges.
IntegersSubset devided into IntegersSubsetGeneric and into IntegersSubset itself. The first has no references to ConstantInt and works with IntItem only.
IntegersSubsetMapping also made generic. Here added second template parameter "IntegersSubsetTy" that allows to use on of two IntegersSubset types described below.

llvm-svn: 157815
2012-06-01 16:17:57 +00:00
Benjamin Kramer e7e6e9e0a4 Remove noisy semicolons.
llvm-svn: 157814
2012-06-01 15:40:53 +00:00
Stepan Dyatkovskiy bd7303b7f7 PR1255: case ranges.
IntItem cleanup. IntItemBase, IntItemConstantIntImp and IntItem merged into IntItem. All arithmetic operators was propogated from APInt. Also added comparison operators <,>,<=,>=. Currently you will find set of macros that propogates operators from APInt to IntItem in the beginning of IntegerSubset. Note that THESE MACROS WILL REMOVED after all passes will case-ranges compatible. Also note that these macros much smaller pain that something like this:
if (V->getValue().ugt(AnotherV->getValue()) { ... }

These changes made IntItem full featured integer object. It allows to make IntegerSubset class generic (move out all ConstantInt references inside and add unit-tests) in next commits.

llvm-svn: 157810
2012-06-01 10:06:14 +00:00
Eric Christopher 1cf3338bb4 Add support for enum forward declarations.
Part of rdar://11570854

llvm-svn: 157786
2012-06-01 00:22:32 +00:00
Benjamin Kramer 5168a72b26 IntrusiveRefCntPtr: Simplify operator= as suggested by Richard Smith.
This way the constructors do all the hard work. No intended functionality change.

llvm-svn: 157773
2012-05-31 22:25:25 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Jakob Stoklund Olesen fa9d7db17b Add a PrintRegUnit helper similar to PrintReg.
Reg-units are named after their root registers, and most units have a
single root, so they simply print as 'AL', 'XMM0', etc. The rare dual
root reg-units print as FPSCR~FPSCR_NZCV, FP0~ST7, ...

The printing piggybacks on the existing register name tables, so no
extra const data space is required.

llvm-svn: 157754
2012-05-31 17:18:29 +00:00
Jakob Stoklund Olesen 2be0a77ade Emit register unit root tables.
Each register unit has one or two root registers. The full set of
registers containing a given register unit can be computed as the union
of the root registers and their super-registers.

Provide an MCRegUnitRootIterator class to enumerate the roots.

llvm-svn: 157753
2012-05-31 17:18:26 +00:00
Craig Topper c1ac05dad5 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Benjamin Kramer 50b26ebb2b Teach SCEV's icmp simplification logic that a-b == 0 is equivalent to a == b.
This also required making recursive simplifications until
nothing changes or a hard limit (currently 3) is hit.

With the simplification in place indvars can canonicalize
loops of the form
for (unsigned i = 0; i < a-b; ++i)
into
for (unsigned i = 0; i != a-b; ++i)
which used to fail because SCEV created a weird umax expr
for the backedge taken count.

llvm-svn: 157701
2012-05-30 18:32:23 +00:00
Jakob Stoklund Olesen 161c648135 Add MCRegisterInfo::RegListIterator.
Also add subclasses MCSubRegIterator, MCSuperRegIterator, and
MCRegAliasIterator.

These iterators provide an abstract interface to the MCRegisterInfo
register lists so the internal representation can be changed without
changing all clients.

llvm-svn: 157695
2012-05-30 16:36:28 +00:00
Benjamin Kramer 547ea51a73 Mark insertq/extrq intrinsic readnone.
llvm-svn: 157688
2012-05-30 13:44:25 +00:00
Bob Wilson 33e5188c27 Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>
Besides adding the new insertPass function, this patch uses it to
enhance the existing -print-machineinstrs so that the MachineInstrs
after a specific pass can be printed.

Patch by Bin Zeng!

llvm-svn: 157655
2012-05-30 00:17:12 +00:00
Jakob Stoklund Olesen 45601d1742 Make DiffListIterator public to unbreak the gcc buildbots.
Apparently, a friend can't derive from a private class according to gcc.

llvm-svn: 157654
2012-05-30 00:05:03 +00:00
Jakob Stoklund Olesen 850ef99516 Use MCRegUnitIterator to compute regsOverlap().
The register unit lists are typically much shorter than the register
overlap lists, and the backing table for register units has better cache
locality because it is smaller.

This makes llc about 0.5% faster. The regsOverlap() function isn't that hot.

llvm-svn: 157651
2012-05-29 23:40:02 +00:00
Jakob Stoklund Olesen 7f381bd26d Emit register unit lists for each register.
Register units are already used internally in TableGen to compute
register pressure sets and overlapping registers. This patch makes them
available to the code generators.

The register unit lists are differentially encoded so they can be reused
for many related registers. This keeps the total size of the lists below
200 bytes for most targets. ARM has the largest table at 560 bytes.

Add an MCRegUnitIterator for traversing the register unit lists. It
provides an abstract interface so the representation can be changed in
the future without changing all clients.

llvm-svn: 157650
2012-05-29 23:40:00 +00:00
Douglas Gregor fd2d3c49c0 DenseMap's move assignment operator needs to return *this
llvm-svn: 157644
2012-05-29 20:33:09 +00:00
Benjamin Kramer ef479ea854 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Peter Collingbourne 913869be45 Add llvm.fabs intrinsic.
llvm-svn: 157594
2012-05-28 21:48:37 +00:00
Meador Inge e17b69a373 PR12696: Attribute bits above 1<<30 are not encoded in bitcode
Attribute bits above 1<<30 are now encoded correctly.  Additionally,
the encoding/decoding functionality has been hoisted to helper functions
in Attributes.h in an effort to help the encoding/decoding to stay in
sync with the Attribute bitcode definitions.

llvm-svn: 157581
2012-05-28 15:45:43 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem - the wrapper around APInt. Why not to use APInt item directly right now?
1. It will very difficult to implement case ranges as series of small patches. We got several large and heavy patches. Each patch will about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst you will need to changes at the same time all Readers,Writers and absolutely all passes that uses SwitchInst.
2. We can implement APInt pool inside and save memory space. E.g. we use several switches that works with 256 bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can be easyly easily replaced with APInt.
4. Currenly we can interpret IntItem both as ConstantInt and as APInt. It allows to provide SwitchInst methods that works with ConstantInt for non-updated passes.

Why I need it right now? Currently I need to update SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so peaces of code
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue()) {
  ...
}
will look awful. Much more better this way:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows to map individual subsets of integers to the BasicBlocks).
Since in future these classes will founded on APInt, it will possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Stepan Dyatkovskiy fdc233f30b SwitchInst: Due to bad readability case iterators definition was moved to the end of SwitchInst.
llvm-svn: 157575
2012-05-28 10:11:27 +00:00
Chris Lattner 3cb6f83ebb switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients.
llvm-svn: 157556
2012-05-28 01:47:44 +00:00
Chris Lattner 9a49ffdb47 add some helper methods to make the type more uniform.
llvm-svn: 157554
2012-05-28 01:29:59 +00:00
Chris Lattner ff9e08baf9 rdar://11542750 - llvm.trap should be marked no return.
llvm-svn: 157551
2012-05-27 23:20:41 +00:00
Benjamin Kramer 27f1429717 DenseMap: Use an early exit when there is nothing to do in DestroyAll().
llvm-svn: 157550
2012-05-27 22:53:10 +00:00
Benjamin Kramer 78eb6e91bd IntrusiveRefCntPtr: Use the same pattern as the other operator= overloads when using rvalue refs.
llvm-svn: 157546
2012-05-27 20:46:04 +00:00
Chris Lattner 9db8eed2b2 generalize this to allow any argument.
llvm-svn: 157542
2012-05-27 19:17:16 +00:00
Chris Lattner f39c278384 move some code around so that Verifier.cpp can get access to the intrinsic info table.
llvm-svn: 157540
2012-05-27 18:28:35 +00:00
Benjamin Kramer 1e165f0343 DenseMap: Provide a move ctor and move semantics for operator[] and FindAndConstruct.
The only missing part is insert(), which uses a pair of parameters and I haven't
figured out how to convert it to rvalue references. It's now possible to use a
DenseMap with std::unique_ptr values :)

llvm-svn: 157539
2012-05-27 17:38:30 +00:00
Benjamin Kramer fd91e72ee2 DenseMap: Factor destruction into a common helper method.
llvm-svn: 157538
2012-05-27 17:38:18 +00:00
Benjamin Kramer 15d4169e7e Move-enable IntrusiveRefCntPtr.
These tend to be copied around a lot, moving it instead saves a ton of memory
accesses.

llvm-svn: 157535
2012-05-27 16:22:08 +00:00
Benjamin Kramer 65e75666ff Add support for branch weight metadata to MDBuilder and use it in various places.
llvm-svn: 157515
2012-05-26 13:59:43 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Jakob Stoklund Olesen 49ea89ee2d Compress MCRegisterInfo register name tables.
Store (debugging) register names as offsets into a string table instead
of as char pointers.

llvm-svn: 157449
2012-05-25 00:21:41 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Jakob Stoklund Olesen 74fd80e8fc Don't put TGParser scratch results in the output.
Only fully expanded Records should go into RecordKeeper.

llvm-svn: 157431
2012-05-24 22:17:36 +00:00
Andrew Trick 61f1a278b8 misched: Added ScoreboardHazardRecognizer.
The Hazard checker implements in-order contraints, or interlocked
resources. Ready instructions with hazards do not enter the available
queue and are not visible to other heuristics.

The major code change is the addition of SchedBoundary to encapsulate
the state at the top or bottom of the schedule, including both a
pending and available queue.

The scheduler now counts cycles in sync with the hazard checker. These
are minimum cycle counts based on known hazards.

Targets with no itinerary (x86_64) currently remain at cycle 0. To fix
this, we need to provide some maximum issue width for all targets. We
also need to add the concept of expected latency vs. minimum latency.

llvm-svn: 157427
2012-05-24 22:11:09 +00:00
Justin Holewinski 907f7606f2 Remove the PTX back-end and all of its artifacts (triple, etc.)
This back-end was deprecated in favor of the NVPTX back-end.

NV_CONTRIB

llvm-svn: 157417
2012-05-24 21:38:21 +00:00
Owen Anderson 921082b883 Teach tblgen's set theory "sequence" operator to support an optional stride operand.
llvm-svn: 157416
2012-05-24 21:37:08 +00:00
Nuno Lopes 561dae0947 revert r156383: removal of TYPE_CODE_FUNCTION_OLD
Apparently LLVM only stopped emitting this after LLVM 3.0

llvm-svn: 157325
2012-05-23 15:19:39 +00:00
Eric Christopher c49643586b Add support for C++11 enum classes in llvm.
Part of rdar://11496790

llvm-svn: 157303
2012-05-23 00:09:20 +00:00
Nuno Lopes a2f6cecb6d add a new pass to instrument loads and stores for run-time bounds checking
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h

(a draft of this) patch reviewed by Andrew, thanks.

llvm-svn: 157261
2012-05-22 17:19:09 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Nuno Lopes 61b7fa2985 fix the quotient returned by sdivrem() for the case when LHS is negative and RHS is positive
based on a patch by Preston Briggs, with some modifications

llvm-svn: 157231
2012-05-22 01:09:48 +00:00
Pete Cooper 243efd7ac3 Added address space qualifier to intrinsic PointerType arguments.
llvm-svn: 157218
2012-05-21 23:21:28 +00:00
Stepan Dyatkovskiy e89dafd876 PR1255 (case ranges: work with ConstantRangesSet instead of ConstantInt) related changes for Execution and Verifier.
llvm-svn: 157183
2012-05-21 10:44:40 +00:00
Jakob Stoklund Olesen 29268b50f2 Give a small negative bias to giant edge bundles.
This helps compile time when the greedy register allocator splits live
ranges in giant functions. Without the bias, we would try to grow
regions through the giant edge bundles, usually to find out that the
region became too big and expensive.

If a live range has many uses in blocks near the giant bundle, the small
negative bias doesn't make a big difference, and we still consider
regions including the giant edge bundle.

Giant edge bundles are usually connected to landing pads or indirect
branches.

llvm-svn: 157174
2012-05-21 03:11:23 +00:00
Jakob Stoklund Olesen ef8431d3be Add a LiveRangeQuery class.
This class is meant to be the primary interface for examining a live
range in the vicinity on a given instruction. It avoids all the messy
dealings with iterators and early clobbers.

This is a more abstract interface to live ranges, hiding the
implementation as a vector of segments.

llvm-svn: 157141
2012-05-20 02:44:30 +00:00
Benjamin Kramer c671092b42 Disambiguate call to operator==.
clang++ and msvc happily had no problem with it but g++ refuses to compile.

llvm-svn: 157126
2012-05-19 19:32:11 +00:00
Benjamin Kramer 2dded79f1c ValueMap: Use DenseMap's find_as mechanism to reduce use list churn.
Otherwise just looking up a value in the map requires creating a VH, adding it to the use lists and destroying it again.

llvm-svn: 157124
2012-05-19 19:15:32 +00:00
Benjamin Kramer 1ed0fa452c Move CallbackVHs dtor inline, it can be devirtualized in many cases. Move the other virtual methods out of line as they are only called from within Value.cpp anyway.
llvm-svn: 157123
2012-05-19 19:15:25 +00:00
Benjamin Kramer f928534ef2 Provide move semantics for TinyPtrVector and for DenseMap's rehash function.
This makes DenseMap<..., TinyPtrVector<...>> as cheap as it always should've been!

llvm-svn: 157113
2012-05-19 13:28:54 +00:00
Jakob Stoklund Olesen e5bbe37950 Allow LiveRangeEdit to be created with a NULL parent.
The dead code elimination with callbacks is still useful.

llvm-svn: 157100
2012-05-19 05:25:46 +00:00
Eric Christopher b5cf66cda2 Actually support DW_TAG_rvalue_reference_type that we were trying
to generate out of the front end.

rdar://11479676

llvm-svn: 157094
2012-05-19 01:36:37 +00:00
Andrew Trick 7fa4e0fea6 SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond.
getUDivExpr attempts to simplify by checking for overflow.
isLoopEntryGuardedByCond then evaluates the loop predicate which
may lead to the same getUDivExpr causing endless recursion.

Fixes PR12868: clang 3.2 segmentation fault.

llvm-svn: 157092
2012-05-19 00:48:25 +00:00
Jakob Stoklund Olesen 3834dae65d Modernize naming convention for class members.
No functional change.

llvm-svn: 157079
2012-05-18 22:10:15 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
Nick Kledzik 846e6a6121 fix warnings when compiling with -Wshadow
llvm-svn: 157061
2012-05-18 18:39:06 +00:00
Evandro Menezes 2ca32b3873 [Hexagon] Clean up Hexagon ELF definition.
llvm-svn: 156996
2012-05-17 16:46:46 +00:00
Eric Christopher 7be1ac7e1d Grammar.
llvm-svn: 156955
2012-05-16 22:08:58 +00:00
Duncan Sands e51c8442dd I noticed that named metadata doesn't provide a direct way of getting at the
named metadata list, unlike all the other global objects (global variables,
functions, aliases), so add that for consistency.

llvm-svn: 156915
2012-05-16 12:25:43 +00:00
Daniel Dunbar cce6248ede [Support] Add a version of sys::fs::equivalent() that treats errors as false.
llvm-svn: 156864
2012-05-15 22:07:14 +00:00
Jim Grosbach 97609a84ec TableGen'erate mapping physical registers to encoding values.
Many targets always use the same bitwise encoding value for physical
registers in all (or most) instructions. Add this mapping to the
.td files and TableGen'erate the information and expose an accessor
in MCRegisterInfo.

patch by Tom Stellard.

llvm-svn: 156829
2012-05-15 17:35:57 +00:00
Jim Grosbach c3b0427921 Allow MCCodeEmitter access to the target MCRegisterInfo.
Add the MCRegisterInfo to the factories and constructors.

Patch by Tom Stellard <Tom.Stellard@amd.com>.

llvm-svn: 156828
2012-05-15 17:35:52 +00:00
Stepan Dyatkovskiy 395c893d33 Fixed one small stupid, but critical bug.
llvm-svn: 156810
2012-05-15 09:21:39 +00:00
Michael J. Spencer e27086b0e8 [Support/COFF] Make the order of members in symbol match the standard.
llvm-svn: 156785
2012-05-14 22:43:21 +00:00
Chad Rosier a968caf8e0 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776
2012-05-14 20:35:04 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Andrew Trick 31ee64d9dc Remove a stale forward declaration.
llvm-svn: 156770
2012-05-14 18:03:19 +00:00
Jakob Stoklund Olesen 77e7b8ede2 Remove the expensive BitVector::operator~().
Returning a temporary BitVector is very expensive. If you must, create
the temporary explicitly: Use BitVector(A).flip() instead of ~A.

llvm-svn: 156768
2012-05-14 15:46:27 +00:00
Jakob Stoklund Olesen 76680e9b4e Remove BitVector binops.
These operators were crazy slow, calling malloc to return a temporary
result. At the same time, they look very innocent when used in code.

If you need temporary BitVectors to compute your thing, create them
explicitly, and use the inplace logical operators. This makes the high
cost explicit in the code.

llvm-svn: 156767
2012-05-14 15:37:25 +00:00
Jakob Stoklund Olesen 2fad493fe4 Add BitVector::anyCommon().
The existing operation (A & B).any() is very slow.

llvm-svn: 156760
2012-05-14 15:01:19 +00:00
Stepan Dyatkovskiy 3dea421826 SwitchInst cosmetics: renamed "Hash" method to "hash"
llvm-svn: 156757
2012-05-14 08:26:31 +00:00
Bill Wendling ea857e1b9f Use ArrayRef instead of an explicit vector type.
llvm-svn: 156755
2012-05-14 07:53:40 +00:00
Stepan Dyatkovskiy 0beab5e1cd Recommited r156374 with critical fixes in BitcodeReader/Writer:
Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.

llvm-svn: 156704
2012-05-12 10:48:17 +00:00
Jay Foad ca0c499609 Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

llvm-svn: 156703
2012-05-12 08:30:16 +00:00
Michael J. Spencer a39c9bceeb Add doxygen comments.
llvm-svn: 156665
2012-05-11 23:34:39 +00:00
Michael J. Spencer 93303819ac [Support/StringRef] Add find_last_not_of and {r,l,}trim.
llvm-svn: 156652
2012-05-11 22:08:50 +00:00
Sirish Pande 83ccb6ce08 Hexagon V5 intrinsics support.
llvm-svn: 156631
2012-05-11 19:39:13 +00:00
Stepan Dyatkovskiy 05b46b3745 PR1255: ConstantRangesSet and CRSBuilder classes moved from include/llvm to include/llvm/Support.
llvm-svn: 156613
2012-05-11 10:34:23 +00:00
Jim Grosbach 3658412afc Tidy up. Trailing whitespace.
llvm-svn: 156601
2012-05-11 01:39:13 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Eric Christopher b6148ed72c Allow unique_file to take a mode for file permissions, but default
to user only read/write.

Part of rdar://11325849

llvm-svn: 156591
2012-05-11 00:07:44 +00:00
Dan Gohman ed7c24e2d9 Teach DeadStoreElimination to eliminate exit-block stores with phi addresses.
llvm-svn: 156558
2012-05-10 18:57:38 +00:00
Chad Rosier d84eaac673 Add Triple::getiOSVersion.
This new function provides a way to get the iOS version number from ios triples.
Part of rdar://11409204

llvm-svn: 156483
2012-05-09 17:23:48 +00:00
Hans Wennborg b7ef2fe8ae Introduce llvm-c function LLVMPrintModuleToFile.
This lets you save the textual representation of the LLVM IR to a file.
Before this patch it could only be printed to STDERR from llvm-c.

Patch by Carlo Kok!

llvm-svn: 156479
2012-05-09 16:54:17 +00:00
Nuno Lopes 01547b3ad2 change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

llvm-svn: 156473
2012-05-09 15:52:43 +00:00
Bill Wendling a3aeb980d2 Supply a C interface to the "LinkModules" method.
Patch by Andrew Wilkins!

llvm-svn: 156469
2012-05-09 08:55:40 +00:00
Kevin Enderby fe3d005ca5 Fix it so llvm-objdump -arch does accept x86 and x86-64 as valid arch names.
PR12731.  Patch by Meador Inge!

llvm-svn: 156444
2012-05-08 23:38:45 +00:00
Eric Christopher d666bb0dd8 Remove excess semi-colons to quiet warnings.
llvm-svn: 156416
2012-05-08 20:45:04 +00:00
Eric Christopher 137e966b65 Update comment.
llvm-svn: 156404
2012-05-08 18:55:57 +00:00
Nuno Lopes f7596c91af remove TYPE_CODE_FUNCTION_OLD type code. it is no longer in use and it was marked for removal in 3.0
llvm-svn: 156383
2012-05-08 16:16:20 +00:00
Stepan Dyatkovskiy 5eafce5c88 Rejected r156374: Ordinary PR1255 patch. Due to clang-x86_64-debian-fnt buildbot failure.
llvm-svn: 156377
2012-05-08 08:33:21 +00:00
Craig Topper 7daf897678 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy b6a4640163 Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.

llvm-svn: 156374
2012-05-08 06:36:08 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Jordy Rose 8a7a7d90d7 Constify (trivially) ImmutableSet::iterator::getVisitState().
This was probably intended all along.

llvm-svn: 156318
2012-05-07 19:24:40 +00:00
Jakob Stoklund Olesen 65a6dafc8d Add TRI::getCommonSuperRegClass().
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:

                                   PreA
                       SuperRC  ---------->  RCA

                          |                   |
                          |                   |
                     PreB |                   | SubA
                          |                   |
                          |                   |
                          V                   V

                         RCB    ----------> SubRC
                                   SubB

This can be used to coalesce copies like:

  %vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2

llvm-svn: 156317
2012-05-07 19:14:58 +00:00
John McCall 02d06b95aa Fix trivial typo in llvm_move.
llvm-svn: 156288
2012-05-07 06:00:23 +00:00
Craig Topper d4e1894ec1 Add SSE4A MOVNTSS/MOVNTSD instructions.
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Jim Grosbach b51ffd762c Tidy up. Whitespace.
llvm-svn: 156276
2012-05-07 02:25:53 +00:00
Chris Lattner 9322ba824c reapply my patch, with a fix for an off-by-one error. Turned out to be a lot
of work for a drive-by fix :)

llvm-svn: 156246
2012-05-05 22:17:32 +00:00
Chris Lattner 64f65d33df revert my patches, which are causing problems.
llvm-svn: 156245
2012-05-05 22:11:04 +00:00
Chris Lattner 339adf1725 add missing header <shame>
llvm-svn: 156244
2012-05-05 22:04:11 +00:00
Jim Grosbach 7ce129268e Nuke a few dead remnants of the CBE.
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Daniel Dunbar b57ddd4e29 [Support] Add sys::Process::GetRandomNumber().
- Primitive API, but we rarely have need for random numbers.

llvm-svn: 156237
2012-05-05 16:36:20 +00:00
Daniel Dunbar 407a85e7a6 [build] Add build check for ::arc4random().
llvm-svn: 156236
2012-05-05 16:36:16 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Chandler Carruth 6781821c01 Teach the code extractor how to extract a sequence of blocks from
RegionInfo's RegionNode. This mirrors the logic for automating the
extraction from a Loop.

llvm-svn: 156208
2012-05-04 21:33:30 +00:00
Chandler Carruth 8880325a92 Rename the Region::block_iterator to Region::block_node_iterator, and
add a new Region::block_iterator which actually iterates over the basic
blocks of the region.

The old iterator, now call 'block_node_iterator' iterates over
RegionNodes which contain a single basic block. This works well with the
GraphTraits-based iterator design, however most users actually want an
iterator over the BasicBlocks inside these RegionNodes. Now the
'block_iterator' is a wrapper which exposes exactly this interface.
Internally it uses the block_node_iterator to walk all nodes which are
single basic blocks, but transparently unwraps the basic block to make
user code simpler.

While this patch is a bit of a wash, most of the updates are to internal
users, not external users of the RegionInfo. I have an accompanying
patch to Polly that is a strict simplification of every user of this
interface, and I'm working on a pass that also wants the same simplified
interface.

This patch alone should have no functional impact.

llvm-svn: 156202
2012-05-04 20:55:23 +00:00
Justin Holewinski ae556d3ef7 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

llvm-svn: 156196
2012-05-04 20:18:50 +00:00
Chandler Carruth 14316fcf7d Factor the computation of input and output sets into a public interface
of the CodeExtractor utility. This allows speculatively computing input
and output sets to measure the likely size impact of the code
extraction.

These sets cannot be reused sadly -- we mutate the function prior to
forming the final sets used by the actual extraction.

The interface has been revamped slightly to make it easier to use
correctly by making the interface const and sinking the computation of
the number of exit blocks into the full extraction function and away
from the rest of this logic which just computed two output parameters.

llvm-svn: 156168
2012-05-04 11:20:27 +00:00
Chandler Carruth 0fde00150d Move the CodeExtractor utility to a dedicated header file / source file,
and expose it as a utility class rather than as free function wrappers.

The simple free-function interface works well for the bugpoint-specific
pass's uses of code extraction, but in an upcoming patch for more
advanced code extraction, they simply don't expose a rich enough
interface. I need to expose various stages of the process of doing the
code extraction and query information to decide whether or not to
actually complete the extraction or give up.

Rather than build up a new predicate model and pass that into these
functions, just take the class that was actually implementing the
functions and lift it up into a proper interface that can be used to
perform code extraction. The interface is cleaned up and re-documented
to work better in a header. It also is now setup to accept the blocks to
be extracted in the constructor rather than in a method.

In passing this essentially reverts my previous commit here exposing
a block-level query for eligibility of extraction. That is no longer
necessary with the more rich interface as clients can query the
extraction object for eligibility directly. This will reduce the number
of walks of the input basic block sequence by quite a bit which is
useful if this enters the normal optimization pipeline.

llvm-svn: 156163
2012-05-04 10:18:49 +00:00
Jakob Stoklund Olesen 796e5272ab Remove the SubRegClasses field from RegisterClass descriptions.
This information in now computed by TableGen.

llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Jakob Stoklund Olesen 3f6faaec70 Remove TargetRegisterClass::SuperRegClasses.
This manually enumerated list of super-register classes has been
superceeded by the automatically computed super-register class masks
available through SuperRegClassIterator.

llvm-svn: 156151
2012-05-04 03:30:28 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Jakob Stoklund Olesen 57c7050675 Add a SuperRegClassIterator class.
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.

llvm-svn: 156144
2012-05-04 01:48:29 +00:00
Chandler Carruth da7513a834 A pile of long over-due refactorings here. There are some very, *very*
minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140
2012-05-04 00:58:03 +00:00
Chandler Carruth 45a5b5ebe9 Add a FoldingSetVector datastructure which is analogous to a SetVector,
but using a FoldingSet underneath and with a largely compatible
interface to that of FoldingSet. This can be used anywhere a FoldingSet
would be natural, but iteration order is significant. The initial
intended use case is in Clang's template specialization lists to
preserve instantiation order iteration.

llvm-svn: 156131
2012-05-03 23:38:34 +00:00
Jakob Stoklund Olesen 2f460ae3b4 Use a shared implementation of getMatchingSuperRegClass().
TargetRegisterClass now gives access to the necessary tables.

llvm-svn: 156122
2012-05-03 22:49:04 +00:00
Jakob Stoklund Olesen 67dd612cdd Add TargetRegisterClass::getSuperRegIndices().
This is a pointer into one of the tables used by
getMatchingSuperRegClass(). It makes it possible to use a shared
implementation of that function.

llvm-svn: 156121
2012-05-03 22:49:00 +00:00
Chandler Carruth a46e62424b Factor the logic for testing whether a basic block is viable for code
extraction into a public interface. Also clean it up and apply it more
consistently such that we check for landing pads *anywhere* in the
extracted code, not just in single-block extraction.

This will be used to guide decisions in passes that are planning to
eventually perform a round of code extraction.

llvm-svn: 156114
2012-05-03 22:26:53 +00:00
Ted Kremenek 1233895097 Add rudimentary CMake logic for detecting Graphviz.
llvm-svn: 156108
2012-05-03 21:51:05 +00:00
Nuno Lopes d2b71e7fa9 add support for calloc to objectsize lowering
llvm-svn: 156102
2012-05-03 21:19:58 +00:00
Jakob Stoklund Olesen 673e085a1c Fix the type of SubClassMask.
llvm-svn: 156084
2012-05-03 18:17:32 +00:00
Jakob Stoklund Olesen f5bc1eb9eb Don't override subreg functions in targets without subregisters.
Some targets have no sub-registers at all. Use the TargetRegisterInfo
versions of composeSubRegIndices(), getSubClassWithSubReg(), and
getMatchingSuperRegClass() for those targets.

llvm-svn: 156075
2012-05-03 16:26:20 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Douglas Gregor 12c1cd33f4 Move llvm-tblgen's StringMatcher into the TableGen library so it can
be used by clang-tblgen.

llvm-svn: 156000
2012-05-02 17:32:48 +00:00
Anders Waldenborg 38ce8615a3 [llvm-c] Make a few function declarations proper prototypes
This avoids warnings when included in a application that
uses -Wstrict-prototypes. 

e.g: AsmPrinters.def:27:1: warning: function declaration isn't a prototype [-Wstrict-prototypes]

llvm-svn: 155997
2012-05-02 16:15:32 +00:00
John McCall 8647296fef Update SmallVector to support move semantics if the host does.
Note that support for rvalue references does not imply support
for the full set of move-related STL operations.

I've preserved support for an odd little thing in insert() where
we're trying to support inserting a new element from an existing
one.  If we actually want to support that, there's a lot more we
need to do:  insert can call either grow or push_back, neither of
which is safe against this particular use pattern.

llvm-svn: 155979
2012-05-02 05:39:15 +00:00
Sirish Pande 94212168fc Target independent Hexagon Packetizer fix.
llvm-svn: 155947
2012-05-01 21:28:30 +00:00
Benjamin Kramer 84b857e4e6 YAMLParser: get rid of global ctors & dtors.
llvm-svn: 155907
2012-05-01 10:19:59 +00:00
Bill Wendling b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Bill Wendling bf4b9afbeb Second attempt at PR12573:
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.

llvm-svn: 155817
2012-04-30 10:44:54 +00:00
Craig Topper 929ec4d778 Remove superfluous 'inline'
llvm-svn: 155799
2012-04-29 20:27:47 +00:00
Eli Bendersky 0e2ac5bcdb Fix some formatting, grammar and style issues and add a couple of missing comments.
llvm-svn: 155793
2012-04-29 12:40:47 +00:00
Benjamin Kramer 74a12a46a0 SmallVector: Don't rely on having an assignment operator around in push_back for POD-like types.
llvm-svn: 155791
2012-04-29 10:53:29 +00:00
Eli Bendersky 3b6e07cd1a Fix comments from copy-paste to a more relevant meaning
llvm-svn: 155790
2012-04-29 10:26:26 +00:00
Craig Topper ff40037eb8 Add constants for first and last integer vector types to be consistent with floating point.
llvm-svn: 155787
2012-04-29 07:25:46 +00:00
Craig Topper f6f724e108 Remove tab characters
llvm-svn: 155786
2012-04-29 07:07:36 +00:00
Craig Topper 356c9754e6 Mark the default cases of MVT::getVectorElementType and MVT:getVectorNumElements as unreachable to reduce code size.
llvm-svn: 155785
2012-04-29 07:06:58 +00:00
Jakob Stoklund Olesen 6053899aa0 Don't update spill weights when joining intervals.
We don't compute spill weights until after coalescing anyway.

llvm-svn: 155766
2012-04-28 19:19:11 +00:00
Jakob Stoklund Olesen 4fe0e1908e Spring cleaning - Delete dead code.
llvm-svn: 155765
2012-04-28 19:19:07 +00:00
Benjamin Kramer f819ae6092 If the __is_trivially_copyable type trait is available use it as the baseline for isPodLike.
This way we can enable the POD-like class optimization for a lot more classes,
saving ~120k of code in clang/i386/Release+Asserts when selfhosting.

llvm-svn: 155761
2012-04-28 16:22:31 +00:00
Benjamin Kramer eef0e27519 Use the most basic superclass of SmallVector in ArrayRef.
llvm-svn: 155760
2012-04-28 16:22:26 +00:00
Benjamin Kramer 31f2704a3d Reapply the SmallMap patch with a fix.
Comparing ~0UL with an unsigned will always return false when long is 64 bits long.

llvm-svn: 155568
2012-04-25 18:01:58 +00:00
Jakob Stoklund Olesen 01f201f484 Remove more dead code.
llvm-svn: 155566
2012-04-25 18:01:30 +00:00
Eric Christopher 4ff88c67e0 Revert "First implementation of:"
This reverts commit 76271a3366731d4c372fdebcd8d3437e6e09a61b.

as it's breaking the bots.

llvm-svn: 155562
2012-04-25 17:51:00 +00:00
Stepan Dyatkovskiy 7ce39cdb9f First implementation of:
- FlatArrayMap. Very simple map container that uses flat array inside.
- MultiImplMap. Map container interface, that has two modes, one for small amount of elements and one for big amount.
- SmallMap. SmallMap is DenseMap compatible MultiImplMap. It uses FlatArrayMap for small mode, and DenseMap for big mode. 

Also added unittests for new classes and update for ProgrammersManual.
For more details about new classes see ProgrammersManual and comments in sourcecode.

llvm-svn: 155557
2012-04-25 17:09:38 +00:00
Jakob Stoklund Olesen 60d660fe94 Simplify LiveIntervals::getApproximateInstructionCount().
This function is only used for a heuristic during -join-physregs. It
doesn't need floating point.

llvm-svn: 155554
2012-04-25 16:32:23 +00:00
Jakob Stoklund Olesen e64664e178 Remove a dead function.
llvm-svn: 155553
2012-04-25 16:32:20 +00:00
Andrew Trick aac706240f typo in declaration from earlier today
llvm-svn: 155519
2012-04-25 01:11:22 +00:00
Jim Grosbach 5117ef7453 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

llvm-svn: 155499
2012-04-24 22:40:08 +00:00
Andrew Trick 4d4b5469ab Fix a naughty header include that breaks "installed" builds.
llvm-svn: 155486
2012-04-24 20:36:19 +00:00
Stepan Dyatkovskiy 040978403a Related to PR1255. Let's begin. I'll commit classes that corresponds to our latest PR1255 discussion posts in llvm-commits.
Strategy.
0. Implement new classes. Classes doesn't affect anything. They still work with ConstantInt base values at this stage.
1. Fictitious replacement of current ConstantInt case values with ConstantRangesSet. Case ranges set will still hold single value, and ConstantInt *getCaseValue() will return it. But additionally implement new method in SwitchInst that allows to work with case ranges. Currenly I think it should be some wrapper that returns either single value or ConstantRangesSet object.
2. Step-by-step replacement of old "ConstantInt* getCaseValue()" with new alternative. Modify algorithms for all passes that works with SwitchInst. But don't modify LLParser and BitcodeReader/Writer. Still hold single value in each ConstantRangesSet object. On this stage some parts of LLVM will use old-style methods, and some ones new-style.
3. After all getCaseValue() usages will removed and whole LLVM and its clients will work in new style - modify LLParser, Reader and Writer. Remove getCaseValue().
4. Replace ConstantInt*-based case ranges set items with APInt ones.

Currently we are on Zero Stage: New classes.
ConstantRangesSet.
I selected ConstantArrays as case ranges set "holder" object (it is a temporary decision, I'll explain why below). The array items are may be ConstantVectors with single item, and ConstantVectors with two items (that means single number and range respectively).
The ConstantInt will used as basic value representation. It will replaced with APInt then. Of course ConstantArray and ConstantVector will go away after ConstantInt => APInt replacement.

New class mandatory features:
- bool isSatisfies(ConstantInt *V) method (need better name?). Returns true if the given value satisfies this case.
- Case's ranges and values enumeration. In some passes we need to analize each case (SwitchLowering for example).

Factory + unified clusterify.
I also propose to implement the factory that allows to build case object with user friendly way. I called it CRSBuilder by now.
Currenly I implemented the factory that allows add,remove pairs of range+successor. It also allows add existing ConstantRangesSet decompiling it to separated ranges. Factory can emit either clusters set (single case range + successor) or the set of "ConstantRangesSet + Successor" pairs.
So you can use it either as builder for new cases set for SwitchInst, or for clusterification of existing cases set.
Just call Factory.optimize() and it emits optimized and sorted clusters collection for you!
I tested clusterification on SelectionDAGBuilder - it works fine. Don't worry it was not included in this patch. Just new classes.
Factory is a template. There are two params: SuccessorClass and IsReadonly. So you can specify what successor you need (BB or MBB). And you can also restrict your factory to use values in read-only mode (SelectionDAGBuilder need IsReadonly=true). Read-only factory couldn't build the cases ranges.

llvm-svn: 155464
2012-04-24 18:31:10 +00:00
Andrew Trick 88639928bd misched: DAG builder support for tracking register pressure within the current scheduling region.
The DAG builder is a convenient place to do it. Hopefully this is more
efficient than a separate traversal over the same region.

llvm-svn: 155456
2012-04-24 17:56:43 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Eric Christopher 27deb265f9 Allow forward declarations to take a context. This helps the debugger
find forward declarations in the context that the actual definition
will occur.

rdar://11291658

llvm-svn: 155380
2012-04-23 19:00:11 +00:00
Chandler Carruth af0f8bf595 Temporarily revert r155364 until the upstream review can complete, per
the stated developer policy.

llvm-svn: 155373
2012-04-23 18:28:57 +00:00
Chandler Carruth 3c3bb55a85 Revert r155365, r155366, and r155367. All three of these have regression
test suite failures. The failures occur at each stage, and only get
worse, so I'm reverting all of them.

Please resubmit these patches, one at a time, after verifying that the
regression test suite passes. Never submit a patch without running the
regression test suite.

llvm-svn: 155372
2012-04-23 18:25:57 +00:00
Sirish Pande a3f8ba2439 Hexagon V5 (floating point) support.
llvm-svn: 155367
2012-04-23 17:49:40 +00:00
Sirish Pande 995c8dbfd2 Hexagon Packetizer's target independent fix.
llvm-svn: 155364
2012-04-23 17:49:09 +00:00
Sylvestre Ledru 3099f4bda8 Conflict with st_dev/st_ino identifiers under Debian GNU/Hurd
The problem is that the struct file_status on UNIX systems has two
members called st_dev and st_ino; those are also members of the
struct stat, and they are reserved identifiers which can also be
provided as #define (and this is the case for st_dev on Hurd).
The solution (attached) is to rename them, for example adding a
"fs_" prefix (= file status) to them.

Patch by Pino Toscano

llvm-svn: 155354
2012-04-23 16:37:23 +00:00
Bill Wendling 32854e2727 Add a flag to the struct type finder to collect only those types which have
names. This saves collecting types we normally don't care about.

llvm-svn: 155300
2012-04-21 23:59:16 +00:00
Chris Lattner e39f27ae75 stop hiding SmallVector's append that takes a count + element.
llvm-svn: 155297
2012-04-21 21:02:03 +00:00
Benjamin Kramer 0ab75fd27f Remove unused PointerLikeTypeTraits for IndexListEntry.
It set NumLowBitAvailable = 3 which may not be true on all platforms.  We only
ever use 2 bits (the default) so this assumption can be safely removed

Should fix PR12612.

llvm-svn: 155288
2012-04-21 16:05:27 +00:00
Bill Wendling 1bf41faca2 Revert r155241, which is causing some breakage.
llvm-svn: 155253
2012-04-20 23:11:38 +00:00