llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	5e740dbca6	R600/SI: Fix tests with triples in them Only set the triple from the command line options. Some of these were still testing SI features and using the old r600-- triple. llvm-svn: 238958	2015-06-03 20:04:05 +00:00
Colin LeMahieu	1ce7a11c9c	[Hexagon] Test doesn't work on all platforms. At any rate the uninitialized variable issue was fixed. Removing re-registering ASM backend. llvm-svn: 238949	2015-06-03 18:00:45 +00:00
Colin LeMahieu	a675077310	[Hexagon] Reapply 238772 OSABI was not correctly set, added empty_elf test to make sure it is. llvm-svn: 238947	2015-06-03 17:34:16 +00:00
Frederic Riss	28dbc5ab8b	Revert "[dsymutil] Accept a YAML debug map as input instead of a binary." This reverts commit r238941 while I figure out the bot issues. llvm-svn: 238943	2015-06-03 17:08:42 +00:00
Frederic Riss	063d674c21	[dsymutil] Accept a YAML debug map as input instead of a binary. To do this, the user needs to pass the new -y flag. As it wasn't tested before, the debug map YAML deserialization was completely buggy (mainly because the DebugMapObject has a dual mapping that allows searching by name and by address, but only the StringMap got populated). It's fixed and tested in this commit by augmenting some tests with a 2 stage dwarf link: a first llvm-dsymutil reads the debug map and pipes it into a second instance that does the actual link without touching the initial binary. llvm-svn: 238941	2015-06-03 16:57:16 +00:00
Frederic Riss	34238cfa24	[dsymutil] Replace -parse-only option with -dump-debug-map As the serialized debug map is becoming a first class citizen, a way to cleanly dump it is required. We used -parse-only combined with -v for that purpose before, but it dumps a lot of unrelated debug stuff. Dumping the debug map was the only use of the -parse-only flag anyway, so replace it with a more useful option. llvm-svn: 238940	2015-06-03 16:57:12 +00:00
Matthias Braun	125c9f5f7b	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 Recommitting after the revert in r238821, the buildbot still failed with the patch removed so there seems to be another reason for the breakage. llvm-svn: 238935	2015-06-03 16:30:24 +00:00
Asaf Badouh	402ebb34af	re-apply 238809 AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. CR: http://reviews.llvm.org/D9991 llvm-svn: 238923	2015-06-03 13:41:48 +00:00
Elena Demikhovsky	21de893377	AVX-512: VSHUFPD instruction selection - code improvements llvm-svn: 238918	2015-06-03 11:21:01 +00:00
Elena Demikhovsky	9e38086534	AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238917	2015-06-03 10:56:40 +00:00
Daniel Sanders	8b2354de81	Re-commit r238838, r238844 with fix for host/target endian mismatch and windows buildbot. The windows buildbot originally failed because the check expressions are evaluated as 64-bit values, even for 32-bit symbols. Fixed this by comparing bottom 32-bits of the expressions. The host/target endian mismatch issue is that it's invalid to read/write target values using a host pointer without taking care of endian differences between the target and host. Most (if not all) instances of reinterpret_cast<uint32_t*>() in the RuntimeDyld are examples of this bug. This has been fixed for Mips using the endian aware read/write functions. The original commits were: r238838: [mips] Add RuntimeDyld tests for currently supported O32 relocations. Reviewers: petarj, vkalintiris Reviewed By: vkalintiris Subscribers: vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10126 r238844: [mips][mcjit] Add support for R_MIPS_PC32. Summary: This allows us to resolve relocations for DW_EH_PE_pcrel TType encodings in the exception handling LSDA. Also fixed a nearby typo. Reviewers: petarj, vkalintiris Reviewed By: vkalintiris Subscribers: vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10127 llvm-svn: 238915	2015-06-03 10:27:28 +00:00
Rafael Espindola	58628425dc	This reverts commit r238838, r238844 and r238888. Trying to bring back a windows bot: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/1224/steps/ninja%20check%202/logs/FAIL%3A%20LLVM%3A%3AELF_O32_PIC_relocations.s llvm-svn: 238903	2015-06-03 05:39:59 +00:00
Rafael Espindola	cf8beece97	Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)" This reverts commit r238842. It broke -DBUILD_SHARED_LIBS=ON build. llvm-svn: 238900	2015-06-03 05:32:44 +00:00
Rafael Espindola	75d5b5495f	Fix the interpretation of a 0 st_name. The ELF spec is very clear: ----------------------------------------------------------------------------- If the value is non-zero, it represents a string table index that gives the symbol name. Otherwise, the symbol table entry has no name. -------------------------------------------------------------------------- In particular, a st_name of 0 most certainly doesn't mean that the symbol has the same name as the section. llvm-svn: 238899	2015-06-03 05:14:22 +00:00
Filipe Cabecinhas	da86b6d409	[BitcodeReader] Diagnose type mismatches with aliases Bug found with AFL fuzz. llvm-svn: 238895	2015-06-03 01:30:13 +00:00
Filipe Cabecinhas	7b3995885d	[Bitcode] Minimize the test to not conflict with others Source for the test: @bloom = global <3 x i32> <i32 0, i32 1, i32 42> Plus bit twiddling to set the vector numelts to 0 (in the bc file). llvm-svn: 238894	2015-06-03 01:30:08 +00:00
Filipe Cabecinhas	8e42190d20	[BitcodeReader] Check vector size before trying to create a VectorType Bug found with AFL fuzz llvm-svn: 238891	2015-06-03 00:05:30 +00:00
Daniel Sanders	664e7f2e2e	[mips] XFAIL ELF_O32_PIC_relocations.s for big-endian mips The test exposes pre-existing bugs when the endian of the host and target do not match. llvm-svn: 238888	2015-06-02 23:20:40 +00:00
Sanjoy Das	353a19e13c	[RewriteStatepointsForGC] Strip deref info after rewriting. Summary: Once a gc.statepoint has been rewritten to relocate live references, the SSA values represent physical pointers instead of logical references. Logical dereferencability does not imply physical dereferencability and after RewriteStatepointsForGC has run any attributes that imply dereferencability of the logical references need to be stripped. This current approach is conservative, and can be made more precise later if needed. For starters, we need to strip dereferencable attributes only from pointers that live in the GC address space. Reviewers: reames, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10105 llvm-svn: 238883	2015-06-02 22:33:37 +00:00
Sanjoy Das	513aadecac	[SelectionDAG] Fix PR23603. Summary: LLVM's MI level notion of invariant_load is different from LLVM's IR level notion of invariant_load with respect to dereferenceability. The IR notion of invariant_load only guarantees that all non-faulting invariant loads result in the same value. The MI notion of invariant load guarantees that the load can be legally moved to any location within its containing function. The MI notion of invariant_load is stronger than the IR notion of invariant_load -- an MI invariant_load is an IR invariant_load + a guarantee that the location being loaded from is dereferenceable throughout the function's lifetime. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10075 llvm-svn: 238881	2015-06-02 22:33:30 +00:00
Filipe Cabecinhas	62431b1d71	[IR/AsmWriter] Output escape sequences if the first character isdigit() If the first character in a metadata attachment's name is a digit, it has to be output using an escape sequence, otherwise it's not valid text IR. Removed an over-zealous assert from LLVMContext which didn't allow this. The rule should only apply to text IR. Actual names can have any sequence of non-NUL bytes. Also added some documentation on accepted names. Bug found with AFL fuzz. llvm-svn: 238867	2015-06-02 21:25:08 +00:00
Filipe Cabecinhas	436923ce35	CHECK-LABEL-ize test. NFC llvm-svn: 238866	2015-06-02 21:25:03 +00:00
Daniel Sanders	c95f3f8c95	[mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only. Summary: Following on from r209907 which made personality encodings indirect, do the same for TType encodings. This fixes the case where a try/catch block needs to generate references to, for example, std::exception in the .gcc_except_table. Previous attempts at committing this broke the buildbots due to bugs in IAS. These bugs have now been fixed so trying again. Reviewers: petarj Reviewed By: petarj Subscribers: srhines, joerg, tberghammer, llvm-commits Differential Revision: http://reviews.llvm.org/D9669 llvm-svn: 238863	2015-06-02 20:32:50 +00:00
Tim Northover	3f3a4d8503	AArch64: fix typo in SMIN far atomics and add tests llvm-svn: 238858	2015-06-02 18:37:20 +00:00
Duncan P. N. Exon Smith	694886989c	DebugInfo: Really support 2^16 arguments in a subprogram As a follow-up to r235955, actually support up to 65535 arguments in a subprogram. r235955 missed assembly support, having only tested the new limit via C++ unit tests. Code patch by Amjad Aboud. llvm-svn: 238854	2015-06-02 17:17:44 +00:00
Duncan P. N. Exon Smith	9ce58b1cfb	DebugInfo: Rename testcases from MD* to DI*, NFC As a follow-up to r236120, rename testcases to match the new names. llvm-svn: 238853	2015-06-02 17:13:25 +00:00
Daniel Sanders	f85028359d	[mips][mcjit] Add support for R_MIPS_PC32. Summary: This allows us to resolve relocations for DW_EH_PE_pcrel TType encodings in the exception handling LSDA. Also fixed a nearby typo. Reviewers: petarj, vkalintiris Reviewed By: vkalintiris Subscribers: vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10127 llvm-svn: 238844	2015-06-02 15:28:29 +00:00
Sanjay Patel	6f031d848e	make reciprocal estimate code generation more flexible by adding command-line options (2nd try) The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238842	2015-06-02 15:28:15 +00:00
Daniel Sanders	531063b274	[mips] Add RuntimeDyld tests for currently supported O32 relocations. Reviewers: petarj, vkalintiris Reviewed By: vkalintiris Subscribers: vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10126 llvm-svn: 238838	2015-06-02 15:01:25 +00:00
Elena Demikhovsky	8938f5acca	AVX-512: Implemented VRANGESD and VRANGESS instructions for SKX Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238834	2015-06-02 14:12:54 +00:00
Elena Demikhovsky	44a129c533	AVX-512: Shorten implementation of lowerV16X32VectorShuffle() using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines. Added matching of VALIGN instruction. llvm-svn: 238830	2015-06-02 13:43:18 +00:00
Vasileios Kalintiris	bb698c7d5f	[mips] Add support for dynamic stack realignment. Summary: With this change we are able to realign the stack dynamically, whenever it contains objects with alignment requirements that are larger than the alignment specified from the given ABI. We have to use the $fp register as the frame pointer when we perform dynamic stack realignment. In complex stack frames, with variably-sized objects, we reserve additionally the callee-saved register $s7 as the base pointer in order to reference locals. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8633 llvm-svn: 238829	2015-06-02 13:14:46 +00:00
Renato Golin	3a7bec86bd	Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs" This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot. Since self-hosting issues with Clang are hard to investigate, I'm taking the liberty to revert now, so we can investigate it offline. llvm-svn: 238821	2015-06-02 11:47:30 +00:00
Vladimir Sukharev	5f6f60d942	[AArch64] Add v8.1a atomic instructions Patch by: Tom Coxon Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8501 llvm-svn: 238818	2015-06-02 10:58:41 +00:00
Toma Tabacu	c15dd736f8	[mips] [IAS] Reformat mips-expansions.s. NFC. Summary: Make mips-expansions.s more readable by grouping the instructions with their respective CHECK's. This test is going to get a lot bigger soon and it will become essentially unreadable if the current formatting is kept. I've also made the comments more useful and accurate, and I've restricted the RUN lines to under 80 columns. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10089 llvm-svn: 238817	2015-06-02 10:34:10 +00:00
Daniel Sanders	f87702112d	[mips] Test both %dtprel_hi and %dtprel_lo instead of testing %dtprel_hi twice. The second %dtprel_hi is used on an addiu so it looks like a copy/paste error. llvm-svn: 238815	2015-06-02 10:09:08 +00:00
Daniel Sanders	4d652b8a2d	[mips] Expand tabs in test/MC/Mips/mips-relocations.s llvm-svn: 238814	2015-06-02 10:02:00 +00:00
Toma Tabacu	2969650ecd	[mips] [IAS] Add support for the .set softfloat/hardfloat directives. Summary: These directives are used to set the current value of the SoftFloat feature. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits, mpf Differential Revision: http://reviews.llvm.org/D9074 llvm-svn: 238813	2015-06-02 09:48:04 +00:00
Elena Demikhovsky	3425c932da	AVX-512: Implemented VFIXUPIMMSD and VFIXUPIMMSS instructions for KNL Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238811	2015-06-02 08:28:57 +00:00
Asaf Badouh	8d897dd05f	revert 238809 llvm-svn: 238810	2015-06-02 07:45:19 +00:00
Asaf Badouh	17de10f37e	AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. llvm-svn: 238809	2015-06-02 07:18:14 +00:00
Matthias Braun	e20dc1cd3a	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 llvm-svn: 238795	2015-06-01 23:27:08 +00:00
Matthias Braun	72b8f74813	AArch64: Use CMP;CCMP sequences for and/or/setcc trees. Previously CCMP/FCCMP instructions were only used by the AArch64ConditionalCompares pass for control flow. This patch uses them for SELECT like instructions as well by matching patterns in ISelLowering. PR20927, rdar://18326194 Differential Revision: http://reviews.llvm.org/D8232 llvm-svn: 238793	2015-06-01 22:31:17 +00:00
Matthias Braun	c1e029e93d	LiveRangeEdit: Fix liveranges not shrinking on subrange kill. For a dead instruction we may not only have a last-use in the main live range but also in a subregister range if subregisters are tracked. We need to partially rebuild live ranges in both cases. The testcase only broke when subregister liveness was enabled. I committed it in the current form because there is currently no flag to enable/disable subregister liveness. This fixes PR23720. llvm-svn: 238785	2015-06-01 21:26:26 +00:00
Frederic Riss	08462f7859	[dsymutil] Use YAMLIO to dump debug map. Doing so will allow us to also accept a YAML debug map as input, as using YAMLIO gives us the parsing for free. Being able to have textual debug maps will in turn allow much more control over the tests, because 1/ no need to check-in a binary containing the debug map and 2/ it will allow using the same objects/IR files with made-up debug-maps to test different scenarios. llvm-svn: 238781	2015-06-01 21:12:45 +00:00
Rafael Espindola	b5815b4738	Revert "[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath." This reverts commit r238748. It broke the msan bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4372/steps/check-llvm%20msan/logs/stdio llvm-svn: 238772	2015-06-01 19:20:47 +00:00
Owen Anderson	15d1805504	Teach the IR Sink pass to (conservatively) respect convergent annotations. llvm-svn: 238762	2015-06-01 17:20:31 +00:00
Vasileios Kalintiris	cbbf8e0a39	[mips][FastISel] Implement bswap. Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for mips32 r1/r2. Based on a patch by Reed Kotler. Test Plan: bswap1.ll test-suite Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7219 llvm-svn: 238760	2015-06-01 16:40:45 +00:00
Vasileios Kalintiris	bdb91b31f0	[mips][FastISel] Implement intrinsics memset, memcopy & memmove. Summary: Implement the intrinsics memset, memcopy and memmove in MIPS FastISel. Make some needed infrastructure fixes so that this can work. Based on a patch by Reed Kotler. Test Plan: memtest1.ll The patch passes test-suite for mips32 r1/r2 and at O0/O2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7158 llvm-svn: 238759	2015-06-01 16:36:01 +00:00
Vasileios Kalintiris	8fcb3986d0	[mips][FastISel] Implement srem/urem and sdiv/udiv instructions. Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel. Based on a patch by Reed Kotler. Test Plan: srem1.ll div1.ll test-suite at O0/O2 for mips32 r1/r2 Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7028 llvm-svn: 238757	2015-06-01 16:17:37 +00:00
Vasileios Kalintiris	127f894b55	[mips][FastISel] Implement the select statement for MIPS FastISel. Summary: Implement the LLVM IR select statement for MIPS FastISel. Based on a patch by Reed Kotler. Test Plan: "Make check" test included now. Passes test-suite at O2/O0 mips32 r1/r2. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6774 llvm-svn: 238756	2015-06-01 15:56:40 +00:00
Vasileios Kalintiris	7f680e156e	[mips][FastISel] Clobber HI0/LO0 registers in MUL instructions. Summary: The contents of the HI/LO registers are unpredictable after the execution of the MUL instruction. In addition to implicitly defining these registers in the MUL instruction definition, we have to mark those registers as dead too. Without this the fast register allocator is running out of registers when the MUL instruction is followed by another one that tries to allocate the AC0 register. Based on a patch by Reed Kotler. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D9825 llvm-svn: 238755	2015-06-01 15:48:09 +00:00
Rafael Espindola	7f7caf9167	Fix relocation selection for foo-. on mips. This handles only the 32 bit case. llvm-svn: 238751	2015-06-01 15:10:51 +00:00
Colin LeMahieu	a739a4b3c7	[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath. llvm-svn: 238748	2015-06-01 14:51:26 +00:00
Asaf Badouh	f6289f24f7	First commit test. llvm-svn: 238745	2015-06-01 13:56:00 +00:00
Elena Demikhovsky	67afb630e1	AVX-512: Optimized vector shuffle for v16f32 and v16i32 types. llvm-svn: 238743	2015-06-01 13:26:18 +00:00
Luke Cheeseman	4c476858cc	Removing committed assembly file. llvm-svn: 238742	2015-06-01 13:18:53 +00:00
Luke Cheeseman	85fd06d389	Re-commit of r238201 with fix for building with shared libraries. llvm-svn: 238739	2015-06-01 12:02:47 +00:00
Elena Demikhovsky	3582eb3b39	AVX-512: Implemented VRANGEPD and VRANGEPS instructions for SKX. Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238738	2015-06-01 11:05:34 +00:00
Elena Demikhovsky	0c41088ebf	AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types. I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more. llvm-svn: 238735	2015-06-01 09:49:53 +00:00
Elena Demikhovsky	75ede68793	AVX-512: added all forms of VPSHUFD and VPSHUFHW, VPSHUFLW including encodings. llvm-svn: 238729	2015-06-01 07:17:23 +00:00
Elena Demikhovsky	42c96d9c0a	AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX Implemented DAG lowering for all these forms. Added tests for encoding. by Igor Breger (igor.breger@intel.com) llvm-svn: 238728	2015-06-01 06:50:49 +00:00
Elena Demikhovsky	dd68d0cb0f	AVX-512: Fixed a bug in compress and expand intrinsics. By Igor Breger (igor.breger@intel.com) llvm-svn: 238724	2015-06-01 06:30:13 +00:00
David Majnemer	7666be70e4	[PHITransAddr] Don't translate unreachable values Unreachable values may use themselves in strange ways due to their dominance property. Attempting to translate through them can lead to infinite recursion, crashing LLVM. Instead, claim that we weren't able to translate the value. This fixes PR23096. llvm-svn: 238702	2015-06-01 00:15:08 +00:00
Keno Fischer	c2c6018cce	[DWARF] Fix a bug in line info handling This fixes a bug in the line info handling in the dwarf code, based on a problem I hit when implementing RelocVisitor support for MachO. Since addr+size will give the first address past the end of the function, we need to back up one line table entry. Fix this by looking up the end_addr-1, which is the last address in the range. Note that this also removes a duplicate output from the llvm-rtdyld line table dump. The relevant line is the end_sequence one in the line table and has an offset of the first address past the end of the range and hence should not be included. Also factor out the common functionality into a separate function. This comes up on MachO much more than on ELF, since MachO doesn't store the symbol size separately, hence making said situation always occur. Differential Revision: http://reviews.llvm.org/D9925 llvm-svn: 238699	2015-05-31 23:37:04 +00:00
Rafael Espindola	a82ce1d97a	For COFF and MachO, compute the gap between two symbols. Before r238028 we used to do this in O(N^2), now we do it in O(N log N). llvm-svn: 238698	2015-05-31 23:15:35 +00:00
Tim Northover	a603c4076c	ARM: recommit r237590: allow jump tables to be placed as constant islands. The original version didn't properly account for the base register being modified before the final jump, so caused miscompilations in Chromium and LLVM. I've fixed this and tested with an LLVM self-host (I don't have the means to build & test Chromium). The general idea remains the same: in pathological cases jump tables can be too far away from the instructions referencing them (like other constants) so they need to be movable. Should fix PR23627. llvm-svn: 238680	2015-05-31 19:22:07 +00:00
Davide Italiano	3dbd7ae0e3	Clarify how the binary file checked in was generated. llvm-svn: 238665	2015-05-30 22:43:36 +00:00
Keno Fischer	281b6941cf	Add RelocVisitor support for MachO This commit adds partial support for MachO relocations to RelocVisitor. A simple test case is added to show that relocations are indeed being applied and that using llvm-dwarfdump on MachO files no longer errors. Correctness is not yet tested, due to an unrelated bug in DebugInfo, which will be fixed with appropriate testcase in a followup commit. Differential Revision: http://reviews.llvm.org/D8148 llvm-svn: 238663	2015-05-30 19:44:53 +00:00
Chandler Carruth	cb58910ce8	[x86] Unify the horizontal adding used for popcount lowering taking the best approach of each. For vNi16, we use SHL + ADD + SRL pattern that seems easily the best. For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases there is a huge improvement with this in IACA's estimated throughput -- over 2x higher throughput!!!! -- but the measurements are too good to be true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks slightly faster, but I'm not sure I believe any of the measurements at this point. Both are the exact same uops though. Hard to be confident of anything past that. If anyone wants to collect very detailed (Agner-level) timings with the result of this patch, or with the i32 case replaced with SHL + ADD + SHL + ADD + SRL, I'd be very interested. Note that you'll need to test it on both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as I saw unique behavior in each of these buckets with IACA all of which should be checked against measured performance. But this patch is still a useful improvement by dropping duplicate work and getting the much nicer PSADBW lowering for v2i64. I'd still like to rephrase this in terms of generic horizontal sum. It's a bit lame to have a special case of that just for popcount. llvm-svn: 238652	2015-05-30 10:35:03 +00:00
Chandler Carruth	3bedf4407b	[x86] Update the order of instructions after I switched to a bitcast helper that skips creating a cast when it isn't necessary. It's really somewhat concerning that this was caused by the presence of a no-op bitcast, but... llvm-svn: 238642	2015-05-30 06:02:37 +00:00
Chandler Carruth	2599da3cfd	[x86] Restore the bitcasts I removed when refactoring this to avoid shifting vectors of bytes as x86 doesn't have direct support for that. This removes a bunch of redundant masking in the generated code for SSE2 and SSE3. In order to avoid the really significant code size growth this would have triggered, I also factored the completely repetitive logic for shifting and masking into two lambdas which in turn makes all of this much easier to read IMO. llvm-svn: 238637	2015-05-30 04:05:11 +00:00
Chandler Carruth	6ba9730a4e	[x86] Implement a faster vector population count based on the PSHUFB in-register LUT technique. Summary: A description of this technique can be found here: http://wm.ite.pl/articles/sse-popcount.html The core of the idea is to use an in-register lookup table and the PSHUFB instruction to compute the population count for the low and high nibbles of each byte, and then to use horizontal sums to aggregate these into vector population counts with wider element types. On x86 there is an instruction that will directly compute the horizontal sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily. Various tricks are used to get vNi32 and vNi16 from the vNi8 that the LUT computes. The base implementation of this, and most of the work, was done by Bruno in a follow up to D6531. See Bruno's detailed post there for lots of timing information about these changes. I have extended Bruno's patch in the following ways: 0) I committed the new tests with baseline sequences so this shows a diff, and regenerated the tests using the update scripts. 1) Bruno had noticed and mentioned in IRC a redundant mask that I removed. 2) I introduced a particular optimization for the i32 vector cases where we use PSHL + PSADBW to compute the low i32 popcounts, and PSHUFD + PSADBW to compute doubled high i32 popcounts. This takes advantage of the fact that to line up the high i32 popcounts we have to shift them anyways, and we can shift them by one fewer bit to effectively divide the count by two. While the PSHUFD based horizontal add is no faster, it doesn't require registers or load traffic the way a mask would, and provides more ILP as it happens on different ports with high throughput. 3) I did some code cleanups throughout to simplify the implementation logic. 4) I refactored it to continue to use the parallel bitmath lowering when SSSE3 is not available to preserve the performance of that version on SSE2 targets where it is still much better than scalarizing as we'll still do a bitmath implementation of popcount even in scalar code there. With #1 and #2 above, I analyzed the result in IACA for sandybridge, ivybridge, and haswell. In every case I measured, the throughput is the same or better using the LUT lowering, even v2i64 and v4i64, and even compared with using the native popcnt instruction! The latency of the LUT lowering is often higher than the latency of the scalarized popcnt instruction sequence, but I think those latency measurements are deeply misleading. Keeping the operation fully in the vector unit and having many chances for increased throughput seems much more likely to win. With this, we can lower every integer vector popcount implementation using the LUT strategy if we have SSSE3 or better (and thus have PSHUFB). I've updated the operation lowering to reflect this. This also fixes an issue where we were scalarizing horribly some AVX lowerings. Finally, there are some remaining cleanups. There is duplication between the two techniques in how they perform the horizontal sum once the byte population count is computed. I'm going to factor and merge those two in a separate follow-up commit. Differential Revision: http://reviews.llvm.org/D10084 llvm-svn: 238636	2015-05-30 03:20:59 +00:00
Chandler Carruth	c2e400de83	[x86] Restructure the parallel bitmath lowering of popcount into a separate routine, generalize it to work for all the integer vector sizes, and do general code cleanups. This dramatically improves lowerings of byte and short element vector popcount, but more importantly it will make the introduction of the LUT-approach much cleaner. The biggest cleanup I've done is to just force the legalizer to do the bitcasting we need. We run these iteratively now and it makes the code much simpler IMO. Other changes were minor, and mostly naming and splitting things up in a way that makes it more clear what is going on. The other significant change is to use a different final horizontal sum approach. This is the same number of instructions as the old method, but shifts left instead of right so that we can clear everything but the final sum with a single shift right. This seems likely better than a mask which will usually have to read the mask from memory. It is certainly fewer u-ops. Also, this will be temporary. This and the LUT approach share the need of horizontal adds to finish the computation, and we have more clever approaches than this one that I'll switch over to. llvm-svn: 238635	2015-05-30 03:20:55 +00:00
Filipe Cabecinhas	14e686774d	[BitcodeReader] Change an assert to a call to Error() It's reachable from user input. Bug found with AFL fuzz. llvm-svn: 238633	2015-05-30 00:17:20 +00:00
Reid Kleckner	e6531a5588	[WinEH] Adjust the 32-bit SEH prologue to better match reality It turns out that _except_handler3 and _except_handler4 really use the same stack allocation layout, at least today. They just make different choices about encoding the LSDA. This is in preparation for lowering the llvm.eh.exceptioninfo(). llvm-svn: 238627	2015-05-29 22:57:46 +00:00
Reid Kleckner	173a72524f	Disable FP elimination in funcs using 32-bit MSVC EH personalities The value in 'ebp' acts as an implicit argument to the outlined handlers, and is recovered with frameaddress(1). llvm-svn: 238619	2015-05-29 21:58:11 +00:00
Matthias Braun	165d467125	MachineCopyPropagation: Remove the copies instead of using KILL instructions. For some history here see the commit messages of r199797 and r169060. The original intent was to fix cases like: %EAX<def> = COPY %ECX<kill>, %RAX<imp-def> %RCX<def> = COPY %RAX<kill> where simply removing the copies would have RCX undefined as in terms of machine operands only the ECX part of it is defined. The machine verifier would complain about this so 169060 changed such COPY instructions into KILL instructions so some super-register imp-defs would be preserved. In r199797 it was finally decided to always do this regardless of super-register defs. But this is wrong, consider: R1 = COPY R0 ... R0 = COPY R1 getting changed to: R1 = KILL R0 ... R0 = KILL R1 It now looks like R0 dies at the first KILL and won't be alive until the second KILL, while in reality R0 is alive and must not change in this part of the program. As this only happens after register allocation there is not much code still performing liveness queries so the issue was not noticed. In fact I didn't manage to create a testcase for this, without unrelated changes I am working on at the moment. The fix is simple: As of r223896 the MachineVerifier allows reads from partially defined registers, so the whole transforming COPY->KILL thing is not necessary anymore. This patch also changes a similar (but more benign case as the def and src are the same register) case in the VirtRegRewriter. Differential Revision: http://reviews.llvm.org/D10117 llvm-svn: 238588	2015-05-29 18:19:25 +00:00
Nemanja Ivanovic	376e17364f	Add support for VSX FMA single-precision instructions to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9941 It adds the various FMA instructions introduced in the version 2.07 of the ISA along with the testing for them. These are operations on single precision scalar values in VSX registers. llvm-svn: 238578	2015-05-29 17:13:25 +00:00
Alex Lorenz	09b832cac5	MIR Serialization: use correct line and column numbers for LLVM IR errors. This commit translates the line and column numbers for LLVM IR errors from the numbers in the YAML block scalar to the numbers in the MIR file so that the MIRParser users can report LLVM IR errors with the correct line and column numbers. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D10108 llvm-svn: 238576	2015-05-29 17:05:41 +00:00
Reid Kleckner	1d3d4adbb9	[WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86 Small (really small!) C++ exception handling examples work on 32-bit x86 now. This change disables the use of .seh_* directives in WinException when CFI is not in use. It also uses absolute symbol references in the tables instead of imagerel32 relocations. Also fixes a cache invalidation bug in MMI personality classification. llvm-svn: 238575	2015-05-29 17:00:57 +00:00
Jingyue Wu	995dde2799	[NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast Summary: This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast from longer chains consisting of GEPs and BitCasts. For example, it can now optimize %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]* %1 = gep [10 x float]* %0, i64 0, i64 %i %2 = bitcast float* %1 to i32* %3 = load i32* %2 ; emits ld.u32 to %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)* %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32 Test Plan: @ld_int_from_global_float in access-non-generic.ll Reviewers: broune, eliben, jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10074 llvm-svn: 238574	2015-05-29 17:00:27 +00:00
Jingyue Wu	a84feb1727	[DependenceAnalysis] Extend unifySubscriptType for handling coupled subscript groups. Summary: In continuation to an earlier commit to DependenceAnalysis.cpp by jingyue (r222100), the type for all subscripts in a coupled group needs to be the same since constraints from one subscript may be propagated to another during testing. During testing, new SCEVs may be created and the operands for these need to be the same. This patch extends unifySubscriptType() to work on lists of subscript pairs, ensuring a common extended type for all of them. Test Plan: Added a test case to NonCanonicalizedSubscript.ll which causes dependence analysis to crash without this fix. All regression tests pass. Reviewers: spop, sebpop, jingyue Reviewed By: jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9698 llvm-svn: 238573	2015-05-29 16:58:08 +00:00
Colin LeMahieu	68d967d92e	[Hexagon] Disassembling, printing, and emitting instructions a whole-bundle at a time which is the semantic unit for Hexagon. Fixing tests to use the new format. Disabling tests in the direct object emission path for a followup patch. llvm-svn: 238556	2015-05-29 14:44:13 +00:00
Quentin Colombet	5f834c2260	Add a test for the MachineCopyPropagation change landed in r238518. llvm-svn: 238537	2015-05-29 01:40:00 +00:00
Ahmed Bougacha	eb4dbd8552	[TableGen][AsmMatcherEmitter] Only parse isolated tokens as registers. Fixes PR23455, where, when TableGen generates the matcher from the AsmString, it splits "cmp${cc}ss" into tokens, and the "ss" suffix is recognized as the SS register. I can't think of a situation where that's a feature, not a bug, hence: when a token isn't "isolated", i.e., it isn't both followed and preceded by separators, it shouldn't be parsed as a register. Differential Revision: http://reviews.llvm.org/D9844 llvm-svn: 238536	2015-05-29 01:03:37 +00:00
Ahmed Bougacha	0ea9d1e753	[IR] fptrunc-of-fptrunc isn't an EliminableCastPair. Double and single rounding can produce different results. This is the IR counterpart to r228911. llvm-svn: 238531	2015-05-29 00:04:30 +00:00
Chandler Carruth	39691c41bf	[x86] Move the vector popcount tests into non-ISA files, and instead organize them by the width of vector. This makes it a lot easier to see that we're covering all of the vector types but not doing so excessively. This also adds tests across the spectrum of SSE versions in addition to the AVX versions. If you're really tired of seeing the massive sprawl of scalarized code for this, don't worry, I'm just about to land Bruno's patch that dramatically improves the situation for SSSE3 and newer. llvm-svn: 238520	2015-05-28 22:46:48 +00:00
Alex Lorenz	78d7831b0f	MIR Serialization: print and parse machine function names. This commit introduces a serializable structure called 'llvm::yaml::MachineFunction' that stores the machine function's name. This structure will mirror the machine function's state in the future. This commit prints machine functions as YAML documents containing a YAML mapping that stores the state of a machine function. This commit also parses the YAML documents that contain the machine functions. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D9841 llvm-svn: 238519	2015-05-28 22:41:12 +00:00
David Majnemer	4e6438c534	Add testcase for r238503. llvm-svn: 238515	2015-05-28 22:12:27 +00:00
Reid Kleckner	fe4d491bd9	[WinEH] Start inserting state number stores for C++ EH This moves all the state numbering code for C++ EH to WinEHPrepare so that we can call it from the X86 state numbering IR pass that runs before isel. Now we just call the same state numbering machinery and insert a bunch of stores. It also populates MachineModuleInfo with information about the current function. llvm-svn: 238514	2015-05-28 22:00:24 +00:00
Rafael Espindola	bb35ebd189	Don't special case undefined symbol when deciding the symbol order. ELF has no restrictions on where undefined symbols go relative to other defined symbols. In fact, gas just sorts them together. Do the same. This was there since r111174 probably just because the MachO writer has it. llvm-svn: 238513	2015-05-28 21:59:34 +00:00
Andy Ayers	b63298e0c8	Revise test to run llc and llvm-mc separately. Differential Revision: http://reviews.llvm.org/D10066 llvm-svn: 238508	2015-05-28 21:49:50 +00:00
Wei Mi	e2538b5639	Enable exitValue rewrite only when the cost of expansion is low. The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost. Differential Revision: http://reviews.llvm.org/D9800 llvm-svn: 238507	2015-05-28 21:49:07 +00:00
Reid Kleckner	80956a0142	Disable x86 tail call optimizations that jump through GOT For x86 targets, do not do sibling call optimization when materializing the callee's address would require a GOT relocation. We can still do tail calls to internal functions, hidden functions, and protected functions, because they do not require this kind of relocation. It is still possible to get GOT relocations when the user explicitly asks for it with musttail or -tailcallopt, both of which are supposed to guarantee TCO. Based on a patch by Chih-hung Hsieh. Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk Subscribers: joerg, davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D9799 llvm-svn: 238487	2015-05-28 20:44:28 +00:00
Daniel Sanders	b34dab3d00	Revert r238427 - [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only. It caused a smaller number of failures than the previous attempt at committing but still caused a couple on the llvm-linux-mips builder. Reverting while I investigate the remainder. llvm-svn: 238483	2015-05-28 20:30:32 +00:00
Alexey Samsonov	6ecbd064e1	Object, ELF: Use error code instead of calling report_fatal_error() Make createELFObjectFile() return object_error::parse_failed on encountering invalid ELF file, instead of crashing the program. llvm-svn: 238481	2015-05-28 20:25:42 +00:00
Peter Collingbourne	450fbee6b2	Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing these as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create LDM/STM instructions with identical sets of virtual registers, let the register allocator pick arbitrary registers and order register lists when printing an MCInst. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. This is implemented by lowering the memcpy intrinsic to a series of SD-only MCOPY pseudo-instructions which perform a memory copy using a given number of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a little unusual, but it avoids the need to encode register lists in the SD, and we can take advantage of SD use lists to decide whether to use the _UPD variant of the instructions. Fixes PR9199. Differential Revision: http://reviews.llvm.org/D9508 llvm-svn: 238473	2015-05-28 20:02:45 +00:00
David Majnemer	dd04352558	[InstCombine] Fold IntToPtr and PtrToInt into preceding loads. Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast. Differential Revision: http://reviews.llvm.org/D9152 llvm-svn: 238452	2015-05-28 18:39:17 +00:00
Kai Nacke	3adf9b8d80	[mips] Add new format for dmtc2/dmfc2 for Octeon CPUs. Octeon CPUs use dmtc2 rt,imm16 and dmfc2 rt,imm16 for the crypto coprocessor. E.g. dmtc2 rt,0x4057 starts calculation of sha-1. I had to introduce a new decoding namespace to avoid a decoding conflict. Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D10083 llvm-svn: 238439	2015-05-28 16:23:16 +00:00

30331 Commits