Commit Graph

77 Commits

Author SHA1 Message Date
Tatsuyuki Ishi 826f385348 [Support] Use find() for faster StringRef::count (NFC)
While profiling InclusionRewriter, it was found that counting lines was
so slow that it took up 20% of the processing time. Surely, calling
memcmp() of size 1 on every substring in the window isn't a good idea.

Use StringRef::find() instead; in the case of N=1 it will forward to
memcmp which is much more optimal. For 2<=N<256 it will run the same
memcmp loop as we have now, which is still suboptimal but at least does
not regress anything.

Differential Revision: https://reviews.llvm.org/D133658
2022-10-27 10:22:25 +02:00
Kazu Hirata 1b97645e56 [ADT] Introduce StringRef::{starts,ends}_width{,_insensitive}
This patch introduces:

  StringRef::starts_with
  StringRef::starts_with_insensitive
  StringRef::ends_with
  StringRef::ends_with_insensitive

to be more compatible with std::string and std::string_view.

I'm planning to deprecate the existing functions in favor of the new
ones.

Differential Revision: https://reviews.llvm.org/D136030
2022-10-15 15:06:37 -07:00
Tatsuyuki Ishi 5699692dfc [Support] Add fast path for StringRef::find with needle of length 2.
InclusionRewriter on Windows (CRLF line endings) will exercise this in a
hot path. Calling memcmp repeatedly would be highly suboptimal for that
use case, so give it a specialized path.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D133660
2022-10-09 00:58:07 +00:00
Nathan James a13b61f7f0
[ADT] Add edit_distance_insensitive to StringRef
In some instances its advantageous to calculate edit distances without worrying about casing.
Currently to achieve this both strings need to be converted to the same case first, then edit distance can be calculated.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D126159
2022-06-05 12:03:09 +01:00
serge-sans-paille 5f290c090a Move STLFunctionalExtras out of STLExtras
Only using that change in StringRef already decreases the number of
preoprocessed lines from 7837621 to 7776151 for LLVMSupport

Perhaps more interestingly, it shows that many files were relying on the
inclusion of StringRef.h to have the declaration from STLExtras.h. This
patch tries hard to patch relevant part of llvm-project impacted by this
hidden dependency removal.

Potential impact:
- "llvm/ADT/StringRef.h" no longer includes <memory>,
  "llvm/ADT/Optional.h" nor "llvm/ADT/STLExtras.h"

Related Discourse thread:
https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
2022-01-24 14:13:21 +01:00
Kazu Hirata 262dd1e42d [llvm] Use range-based for loops (NFC) 2021-12-02 09:27:47 -08:00
Martin Storsjö 3eed57e7ef [ADT] Rename StringRef case insensitive methods for clarity
Rename functions with the `xx_lower()` names to `xx_insensitive()`.
This was requested during the review of D104218.

Test names and variables in llvm/unittests/ADT/StringRefTest.cpp
that refer to "lower" are renamed to "insensitive" correspondingly.

Unused function aliases with the former method names are left
in place (without any deprecation attributes) for transition purposes.

All references within the monorepo will be changed (with essentially
mechanical changes), and then the old names will be removed in a
later commit.

Also remove the superfluous method names at the start of doxygen
comments, for the methods that are touched here. (There are more
occurrances of this left in other methods though.) Also remove
duplicate doxygen comments from the implementation file.

Differential Revision: https://reviews.llvm.org/D104819
2021-06-25 00:22:00 +03:00
Benjamin Kramer 8f9d3b937c [StringRef] Use some trickery to avoid initializing the std::string returned by upper()/lower() 2020-05-21 16:03:09 +02:00
Benjamin Kramer b198f1f86c Make some static class members constexpr
This allows them to be ODR used in C++17 mode. NFC.
2020-04-22 12:25:01 +02:00
Ehud Katz 24b326cc61 [APFloat] Fix checked error assert failures
`APFLoat::convertFromString` returns `Expected` result, which must be
"checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is
set.
To mark an `Expected` result as "checked" we must consume the `Error`
within.
In many cases, we are only interested in knowing if an error occured,
without the need to examine the error info. This is achieved, easily,
with the `errorToBool()` API.
2020-01-09 09:42:32 +02:00
Ehud Katz f3f7dc3d29 [APFloat] Fix compilation warnings 2020-01-06 11:30:40 +02:00
Ehud Katz c5fb73c5d1 [APFloat] Add recoverable string parsing errors to APFloat
Implementing the APFloat part in PR4745.

Differential Revision: https://reviews.llvm.org/D69770
2020-01-06 10:09:01 +02:00
Alexandre Ganea 92b68c1937 [polly][Support] Un-break polly tests
Previously, the polly unit tests were stuck in a infinite loop.
There was an edge case in StringRef::count() introduced by 9f6b13e5cc, where an empty 'Str' would cause the function to never exit.
Also fixed usage in polly.
2020-01-01 17:29:04 -05:00
Johannes Doerfert 9f6b13e5cc [Support] Fix behavior of StringRef::count with overlapping occurrences, add tests
Summary:
Fix the behavior of StringRef::count(StringRef) to not count overlapping occurrences, as is stated in the documentation.
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44072

I added Krzysztof Parzyszek to review this change because a use of this function in HexagonInstrInfo::getInlineAsmLength might depend on the overlapping-behavior. I don't have enough domain knowledge to tell if this change could break anything there.

All other uses of this method in LLVM (besides the unit tests) only use single-character search strings. In those cases, search occurrences can not overlap anyway.

Patch by Benno (@Bensge)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70585
2019-12-24 18:30:41 -06:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Serguei Katkov 768d6dd087 Fix APFloat from string conversion for Inf
The method IEEEFloat::convertFromStringSpecials() does not recognize
the "+Inf" and "-Inf" strings but these strings are printed for
the double Infinities by the IEEEFloat::toString().

This patch adds the "+Inf" and "-Inf" strings to the list of recognized
patterns in IEEEFloat::convertFromStringSpecials().

Re-landing after fix.

Reviewers: sberg, bogner, majnemer, timshen, rnk, skatkov, gottesmm, bkramer, scanon, anna
Reviewed By: anna
Subscribers: mkazantsev, FlameTop, llvm-commits, reames, apilipenko
Differential Revision: https://reviews.llvm.org/D38030

llvm-svn: 321054
2017-12-19 04:27:39 +00:00
Francis Visoiu Mistrih 26d6fc1f0e [Support] Merge toLower / toUpper implementations
Merge the ones from StringRef and StringExtras.

llvm-svn: 319171
2017-11-28 14:22:27 +00:00
Zachary Turner 8bd42a1a98 [Support] Add StringRef::getAsDouble.
Differential Revision: https://reviews.llvm.org/D29918

llvm-svn: 295089
2017-02-14 19:06:37 +00:00
Chandler Carruth ecbe61966f Tweak the core loop in StringRef::find to avoid calling memcmp on every
iteration.

Instead, load the byte at the needle length, compare it directly, and
save it to use in the lookup table of lengths we can skip forward.

I also added an annotation to expect that the comparison fails so that
the loop gets laid out contiguously without the call to memcpy (and the
substantial register shuffling that the ABI requires of that call).

Finally, because this behaves especially badly with a needle length of
one (by calling memcmp with a zero length) special case that to directly
call memchr, which is what we should have been doing anyways.

This was motivated by the fact that there are a large number of test
cases in 'check-llvm' where FileCheck's performance is dominated by
calls to StringRef::find (in a release, no-asserts build). I'm working
on patches to generally improve matters there, but this alone was worth
a 12.5% improvement in one test case where FileCheck spent 92% of its
time in this routine.

I experimented a bunch with different minor variations on this theme,
for example setting the pointer *at* the last byte and indexing
backwards for the call to memcmp. That didn't improve anything on this
version and seemed more complex. I also tried other things to make the
loop flow more nicely and none worked. =/ It is a bit unfortunate, the
generated code here remains pretty gross, but I don't see any obvious
ways to improve it. At this point, most of my ideas would be really
elaborate:

1) While the remainder of the string is long enough, we could load
   a 16-byte or 32-byte vector at the address of the last byte and use
   palignr to rotate that and check the first 15- or 31-bytes at the
   front of the next segment, essentially pre-loading the first several
   bytes of the next iteration so we could quickly detect a mismatch in
   those bytes without an additional memory access. Down side would be
   the code complexity, having a fallback loop, and likely misaligned
   vector load. Plus it would make the common case of the last byte not
   matching somewhat slower (need some extraction from a vector).
2) While we have space, we could do an aligned load of a 16- or 32-byte
   vector that *contains* the end byte, and use any peceding bytes to
   have a more precise "no" test, and any subsequent bytes could be
   saved for the next iteration. This remove any unaligned load penalty,
   but still requires us to pay the overhead of vector extraction for
   the cases where we didn't need to do anything other than load and
   compare the last byte.
3) Try to walk from the last byte in a way that is more friendly to
   cache and/or memory pre-fetcher considering we have to poke the last
   byte anyways.

No idea if any of these are really worth pursuing though. They all seem
somewhat unlikely to yield big wins in practice and to be a lot of work
and complexity. So I settled here, which at least seems like a strict
improvement over the previous version.

llvm-svn: 289373
2016-12-11 07:46:21 +00:00
Zachary Turner 17412b03b2 [Support] Add StringRef::find_lower and contains_lower.
Differential Revision: https://reviews.llvm.org/D25299

llvm-svn: 286724
2016-11-12 17:17:12 +00:00
Zachary Turner d5d57635ba Speculative fix for build failures due to consumeInteger.
A recent patch added support for consumeInteger() and made
getAsInteger delegate to this function.  A few buildbots are
failing as a result with an assertion failure.  On a hunch,
I tested what happens if I call getAsInteger() on an empty
string, and sure enough it crashes the same way that the
buildbots are crashing.

I confirmed that getAsInteger() on an empty string did not
crash before my patch, so I suspect this to be the cause.

I also added a unit test for the empty string.

llvm-svn: 282170
2016-09-22 15:55:05 +00:00
Zachary Turner 65fd2fc7b4 [Support] Add StringRef::consumeInteger.
StringRef::getInteger() exists and treats the entire string as
an integer of the specified radix, failing if any invalid characters
are encountered or the number overflows.

Sometimes you might have something like "123456foo" and you want
to get the number 123456 and leave the string "foo" remaining.
This is similar to what would be possible by using the standard
runtime library functions strtoul et al and specifying an end
pointer.

This patch adds consumeInteger(), which does exactly that.  It
consumes as much as possible until an invalid character is found,
and modifies the StringRef in place so that upon return only
the portion of the StringRef after the number remains.

Differential Revision: https://reviews.llvm.org/D24778

llvm-svn: 282164
2016-09-22 15:05:19 +00:00
Colin LeMahieu 0143146514 [MCParser] Accept uppercase radix variants 0X and 0B
Differential Revision: http://reviews.llvm.org/D14781

llvm-svn: 263802
2016-03-18 18:22:07 +00:00
Chandler Carruth 233edd20a7 [ADT] Rewrite the StringRef::find implementation to be simpler, clearer,
and tremendously less reliant on the optimizer to fix things.

The code is always necessarily looking for the entire length of the
string when doing the equality tests in this find implementation, but it
previously was needlessly re-checking the size each time among other
annoyances.

By writing this so simply an ddirectly in terms of memcmp, it also is
about 8x faster in a debug build, which in turn makes FileCheck about 2x
faster in 'ninja check-llvm'. This saves about 8% of the time for
FileCheck-heavy parts of the test suite like the x86 backend tests.

llvm-svn: 247269
2015-09-10 11:17:49 +00:00
Chandler Carruth 4425c91dea [ADT] Fix a confusing interface spec and some annoying peculiarities
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).

The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.

The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.

Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.

I'll start using this in code in the next commit.

llvm-svn: 247249
2015-09-10 07:51:37 +00:00
Chandler Carruth 477121721b [ADT] Add a single-character version of the small vector split routine
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.

Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.

llvm-svn: 247244
2015-09-10 06:07:03 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Craig Topper 3ced27c835 Remove custom implementations of max/min in StringRef that was originally added to work an old gcc bug. I believe its been fixed by now.
llvm-svn: 216156
2014-08-21 04:31:10 +00:00
Craig Topper c10719f55d [C++11] Make use of 'nullptr' in the Support library.
llvm-svn: 205697
2014-04-07 04:17:22 +00:00
Ahmed Charles 56440fd820 Replace OwningPtr<T> with std::unique_ptr<T>.
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.

llvm-svn: 203083
2014-03-06 05:51:42 +00:00
Rui Ueyama 00e24e48b6 Add {start,end}with_lower methods to StringRef.
startswith_lower is ocassionally useful and I think worth adding.
endwith_lower is added for completeness.

Differential Revision: http://llvm-reviews.chandlerc.com/D2041

llvm-svn: 193706
2013-10-30 18:32:26 +00:00
Dmitri Gribenko 292c9200fc Added const qualifier to StringRef::edit_distance member function
Patch by Ismail Pazarbasi.

llvm-svn: 189162
2013-08-24 01:50:41 +00:00
Manman Ren 9c5e998043 Revert r185852.
llvm-svn: 185861
2013-07-08 20:27:34 +00:00
Manman Ren c6fe5bc77c StringRef: add DenseMapInfo for StringRef.
Remove the implementation in include/llvm/Support/YAMLTraits.h.
Added a DenseMap type DITypeHashMap in DebugInfo.h:
  DenseMap<std::pair<StringRef, unsigned>, MDNode*>

llvm-svn: 185852
2013-07-08 19:17:48 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nick Kledzik 35c79da3f8 Improve overflow detection in StringRef::getAsUnsignedInteger().
llvm-svn: 165038
2012-10-02 20:01:48 +00:00
Michael J. Spencer 93303819ac [Support/StringRef] Add find_last_not_of and {r,l,}trim.
llvm-svn: 156652
2012-05-11 22:08:50 +00:00
Chris Lattner 5e14666149 Don't die with an assertion if the Result bitwidth is already correct. This
fixes an assert reading "1239123123123123" when the result is already 64-bit.

llvm-svn: 155329
2012-04-23 00:27:54 +00:00
Chris Lattner 0a1bafed7b No need for "else if" after a return. Autosense "0o123" as octal in
StringRef::getAsInteger

llvm-svn: 155298
2012-04-21 22:03:05 +00:00
Michael J. Spencer cfa95f66a1 Make StringRef::getAsInteger work with all integer types. Before this change
it would fail with {,u}int64_t on x86-64 Linux.

This also removes code duplication.

llvm-svn: 152517
2012-03-10 23:02:54 +00:00
Chandler Carruth ca99ad3f0d Add generic support for hashing StringRef objects using the new hashing library.
llvm-svn: 152003
2012-03-04 10:55:27 +00:00
Duncan Sands 69d7a91334 Workaround a miscompilation by gcc-4.3 that showed up as a failure
of the StringRef.Split2 unittest on 32 bit machines.

llvm-svn: 151358
2012-02-24 09:01:34 +00:00
Duncan Sands 8570b29dfe Move the implementation of StringRef::split out of StringExtras.cpp
and into StringRef.cpp, which is where the other StringRef stuff is.

llvm-svn: 151054
2012-02-21 12:00:25 +00:00
Kaelyn Uhrain 7a9ccf4c09 Add function for computing the edit distance of two arrays.
Accomplished by moving the body of StringRef::edit_distance into
a separate function that accepts two ArrayRefs, and making
StringRef::edit_distance a wrapper around the new function.

llvm-svn: 150621
2012-02-15 22:13:07 +00:00
Benjamin Kramer e3b94d1b55 Fix a typo.
llvm-svn: 143890
2011-11-06 20:36:50 +00:00
Daniel Dunbar 3fa528d67c ADT/StringRef: Add ::lower() and ::upper() methods.
llvm-svn: 143880
2011-11-06 18:04:43 +00:00
Benjamin Kramer e664de33b1 Fix handling of the From parameter in StringRef::find.
Enable bounds checking to catch this kind of bug earlier.

llvm-svn: 142247
2011-10-17 20:49:40 +00:00
Benjamin Kramer 4d681d7dc4 Add a bad char heuristic to StringRef::find.
Based on Horspool's simplified version of Boyer-Moore. We use a constant-sized table of
uint8_ts to keep cache thrashing low, needles bigger than 255 bytes are uncommon anyways.

The worst case is still O(n*m) but we do a lot better on the average case now.

llvm-svn: 142061
2011-10-15 10:08:31 +00:00
Jakob Stoklund Olesen c874e2d8fb Fix a bug in compare_numeric().
Thanks to Alexandru Dura and Jonas Paulsson for finding it.

llvm-svn: 140859
2011-09-30 17:03:55 +00:00