Commit Graph

63 Commits

Author SHA1 Message Date
Artem Dergachev 44551cf693 [analyzer] Move taint API from ProgramState to a separate header. NFC.
It is now an inter-checker communication API, similar to the one that
connects MallocChecker/CStringChecker/InnerPointerChecker: simply a set of
setters and getters for a state trait.

Differential Revision: https://reviews.llvm.org/D59861

llvm-svn: 357326
2019-03-29 22:49:30 +00:00
Kristof Umann 2827349c9d [analyzer] Use the new infrastructure of expressing taint propagation, NFC
In D55734, we implemented a far more general way of describing taint propagation
rules for functions, like being able to specify an unlimited number of
source and destination parameters. Previously, we didn't have a particularly
elegant way of expressing the propagation rules for functions that always return
(either through an out-param or return value) a tainted value. In this patch,
we model these functions similarly to other ones, by assigning them a
TaintPropagationRule that describes that they "create a tainted value out of
nothing".

The socket C function is somewhat special, because for certain parameters (for
example, if we supply localhost as parameter), none of the out-params should
be tainted. For this, we added a general solution of being able to specify
custom taint propagation rules through function pointers.
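
A minimal sketch of the idea, not the checker's actual code: a rule lists source and destination arguments, and an optional function pointer can veto propagation for a particular call. `CallModel`, `PropagationRule`, `ReturnValueIndex`, `LocalOnly` and `notLocalOnly` are hypothetical names used only for illustration.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for the analyzer's view of one call: which argument
// indices hold tainted values, plus one flag that the custom hook inspects.
struct CallModel {
  std::set<unsigned> TaintedArgs;
  bool LocalOnly = false; // illustrative, e.g. a socket that never leaves the host
};

// Assumed convention: this index denotes the return value.
constexpr unsigned ReturnValueIndex = ~0u;

struct PropagationRule {
  std::vector<unsigned> SrcArgs; // empty: taint is created "out of nothing"
  std::vector<unsigned> DstArgs; // may contain ReturnValueIndex
  // Optional custom rule; returning false suppresses propagation.
  bool (*Filter)(const CallModel &) = nullptr;

  // Returns the destinations that become tainted after the call.
  std::vector<unsigned> apply(const CallModel &Call) const {
    bool Tainted = SrcArgs.empty();
    for (unsigned Src : SrcArgs)
      if (Call.TaintedArgs.count(Src)) {
        Tainted = true;
        break;
      }
    if (!Tainted)
      return {};
    if (Filter && !Filter(Call))
      return {};
    return DstArgs;
  }
};

// Illustrative hook: a local-only call produces no tainted values.
inline bool notLocalOnly(const CallModel &Call) { return !Call.LocalOnly; }
```

With `Filter` set to `notLocalOnly`, a rule with empty `SrcArgs` taints its destinations for an ordinary call and returns nothing for a call flagged `LocalOnly`.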

Patch by Gábor Borsik!

Differential Revision: https://reviews.llvm.org/D59055

llvm-svn: 355703
2019-03-08 15:47:56 +00:00
Kristof Umann 855478328b [analyzer] Fix taint propagation in GenericTaintChecker
The gets function has no SrcArgs. Because the default value for isTainted was
false, it didn't mark its DstArgs as tainted.
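
The failure mode can be illustrated with a small, self-contained sketch. The function names are hypothetical, and the actual fix in the checker may be shaped differently.

```cpp
#include <vector>

// Each element says whether one source argument is tainted (illustrative input).
// Old shape: with an empty source list the loop never runs and the result
// stays false, so the destinations are never marked.
inline bool sourcesTaintedOld(const std::vector<bool> &SrcTaint) {
  bool IsTainted = false;
  for (bool T : SrcTaint)
    if (T) {
      IsTainted = true;
      break;
    }
  return IsTainted;
}

// Fixed shape: a rule without sources acts as an unconditional taint source.
inline bool sourcesTaintedNew(const std::vector<bool> &SrcTaint) {
  if (SrcTaint.empty())
    return true;
  return sourcesTaintedOld(SrcTaint);
}
```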

Patch by Gábor Borsik!

Differential Revision: https://reviews.llvm.org/D58828

llvm-svn: 355396
2019-03-05 12:42:59 +00:00
Artem Dergachev 2a5fb1252e [analyzer] NFC: GenericTaintChecker: Revise rule specification mechanisms.
Provide a more powerful and at the same time more readable way of specifying
taint propagation rules for known functions within the checker.

Now it should be possible to specify an unlimited number of source and
destination parameters for taint propagation.

No functional change intended just yet.

Patch by Gábor Borsik!

Differential Revision: https://reviews.llvm.org/D55734

llvm-svn: 352572
2019-01-30 00:06:43 +00:00
Kristof Umann 058a7a450a [analyzer] Supply all checkers with a shouldRegister function
Introduce the boolean ento::shouldRegister##CHECKERNAME(const LangOptions &LO)
function very similarly to ento::register##CHECKERNAME. This will force every
checker to implement this function, but maybe it isn't that bad: I saw a lot of
ObjC or C++ specific checkers that should probably not register themselves based
on some LangOptions (mine too), but they do anyways.
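
As a rough illustration of the shape such a function could take, here is a sketch against a mock options struct. `MockLangOptions` and both checker names are hypothetical; the real `clang::LangOptions` has many more fields.

```cpp
// Mock of the few language flags this sketch needs.
struct MockLangOptions {
  bool CPlusPlus = false;
  bool ObjC = false;
};

// A C++-specific checker declines registration for other languages.
inline bool shouldRegisterMyCxxOnlyChecker(const MockLangOptions &LO) {
  return LO.CPlusPlus;
}

// A language-agnostic checker always agrees to register.
inline bool shouldRegisterMyGenericChecker(const MockLangOptions &) {
  return true;
}
```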

A big benefit of this is that every registry function now registers its checker
unconditionally: once it is called, registration is guaranteed.

This patch is a part of a greater effort to reinvent checker registration, more
info here: D54438#1315953

Differential Revision: https://reviews.llvm.org/D55424

llvm-svn: 352277
2019-01-26 14:23:08 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Artem Dergachev b68cb5498b [analyzer] GenericTaint: Fix formatting to prepare for incoming improvements.
Patch by Gábor Borsik!

Differential Revision: https://reviews.llvm.org/D54918

llvm-svn: 349698
2018-12-19 23:35:08 +00:00
Kristof Umann 76a21502fd [analyzer][NFC] Move CheckerRegistry from the Core directory to Frontend
ClangCheckerRegistry is a very non-obvious, poorly documented, weird concept.
It derives from CheckerRegistry, and is placed in lib/StaticAnalyzer/Frontend,
whereas its base is located in lib/StaticAnalyzer/Core. It was, from what I can
imagine, used to circumvent the problem that the registry functions of the
checkers are located in the clangStaticAnalyzerCheckers library, but that
library depends on clangStaticAnalyzerCore. However, clangStaticAnalyzerFrontend
depends on both of those libraries.

One can make the observation however, that CheckerRegistry has no place in Core,
it isn't used there at all! The only place where it is used is Frontend, which
is where it ultimately belongs.

This move implies that since
include/clang/StaticAnalyzer/Checkers/ClangCheckers.h only contained a single function:

class CheckerRegistry;

void registerBuiltinCheckers(CheckerRegistry &registry);

it had to be repurposed, as CheckerRegistry is no longer available to
clangStaticAnalyzerCheckers. It was renamed to BuiltinCheckerRegistration.h,
which actually describes it a lot better -- it does not contain the registration
functions for checkers, but only those generated by the tblgen files.

Differential Revision: https://reviews.llvm.org/D54436

llvm-svn: 349275
2018-12-15 16:23:51 +00:00
Adrian Prantl 9fc8faf9e6 Remove \brief commands from doxygen comments.
This is similar to the LLVM change https://reviews.llvm.org/D46290.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers in our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46320

llvm-svn: 331834
2018-05-09 01:00:01 +00:00
Henry Wong 29204c2dfa [analyzer] Move `TaintBugVisitor` from `GenericTaintChecker.cpp` to `BugReporterVisitors.h`.
Summary: `TaintBugVisitor` is a universal visitor, and many checkers rely on it, such as `ArrayBoundCheckerV2.cpp`, `DivZeroChecker.cpp` and `VLASizeChecker.cpp`. Moving `TaintBugVisitor` to `BugReporterVisitors.h` enables other checkers to track where a `tainted` value came from.

Reviewers: NoQ, george.karpenkov, xazax.hun

Reviewed By: george.karpenkov

Subscribers: szepet, rnkovacs, a.sidorin, cfe-commits, MTC

Differential Revision: https://reviews.llvm.org/D45682

llvm-svn: 330596
2018-04-23 14:41:17 +00:00
Henry Wong cb2ad24c5c [analyzer] Improves the logic of GenericTaintChecker identifying stdin.
Summary:
GenericTaintChecker can't recognize stdin in some cases. The reason is that `if (PtrTy->getPointeeType() == C.getASTContext().getFILEType())` does not hold when stdin is encountered.

My platform is ubuntu16.04 64bit, gcc 5.4.0, glibc 2.23. The definition of stdin is as follows:
```
__BEGIN_NAMESPACE_STD
/* The opaque type of streams.  This is the definition used elsewhere.  */
typedef struct _IO_FILE FILE;
__END_NAMESPACE_STD

  ...

/* The opaque type of streams.  This is the definition used elsewhere.  */
typedef struct _IO_FILE __FILE;   

  ...

/* Standard streams.  */
extern struct _IO_FILE *stdin;      /* Standard input stream.  */
extern struct _IO_FILE *stdout;     /* Standard output stream.  */
extern struct _IO_FILE *stderr;     /* Standard error output stream.  */
```

The AST for the type of stdin is as follows:
```
ElaboratedType 0xc911170 'struct _IO_FILE' sugar
`-RecordType 0xc911150 'struct _IO_FILE'
  `-CXXRecord 0xc923ff0 '_IO_FILE'
```

The AST for `C.getASTContext().getFILEType()` is as follows:
```
TypedefType 0xc932710 'FILE' sugar
|-Typedef 0xc9111c0 'FILE'
`-ElaboratedType 0xc911170 'struct _IO_FILE' sugar
  `-RecordType 0xc911150 'struct _IO_FILE'
      `-CXXRecord 0xc923ff0 '_IO_FILE'
```

So I think it's better to use `getCanonicalType()`.
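
The difference can be sketched with a toy type graph. This is a simplified model, not Clang's type system; `TypeNode`, `canonical`, `sameTypeNaive` and `sameTypeCanonical` are hypothetical names.

```cpp
// Toy type node: a sugar node points at the type it desugars to;
// a canonical node has no underlying type.
struct TypeNode {
  const char *Name;
  const TypeNode *Underlying; // nullptr for canonical types
};

// Strip every sugar layer, mirroring the idea behind getCanonicalType().
inline const TypeNode *canonical(const TypeNode *T) {
  while (T->Underlying)
    T = T->Underlying;
  return T;
}

// Comparing sugared nodes directly distinguishes 'FILE' from 'struct _IO_FILE'.
inline bool sameTypeNaive(const TypeNode *A, const TypeNode *B) {
  return A == B;
}

// Comparing canonical nodes treats both spellings as the same type.
inline bool sameTypeCanonical(const TypeNode *A, const TypeNode *B) {
  return canonical(A) == canonical(B);
}
```

Building a record node, an elaborated node on top of it and a typedef node on top of that reproduces the two ASTs above: the naive comparison of the typedef and the elaborated node fails, while the canonical comparison succeeds.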

Reviewers: zaks.anna, NoQ, george.karpenkov, a.sidorin

Reviewed By: zaks.anna, a.sidorin

Subscribers: a.sidorin, cfe-commits, xazax.hun, szepet, MTC

Differential Revision: https://reviews.llvm.org/D39159

llvm-svn: 326709
2018-03-05 15:41:15 +00:00
George Karpenkov d703ec94a9 [analyzer] introduce getSVal(Stmt *) helper on ExplodedNode, make sure the helper is used consistently
In most cases using
`N->getState()->getSVal(E, N->getLocationContext())`
is ugly, verbose, and also opens up more surface area for bugs if an
inconsistent location context is used.

This patch introduces a helper on an exploded node, and ensures
consistent usage of either `ExplodedNode::getSVal` or
`CheckContext::getSVal` across the codebase.
As a result, a large number of redundant lines is removed.
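
A small mock of the pattern, assuming heavily simplified stand-ins for the analyzer classes (none of the `Mock*` types below are the real Clang ones): the node-level helper always pairs the node's state with the node's own location context, so callers cannot pass a mismatched one.

```cpp
#include <map>
#include <utility>

struct MockStmt {};
struct MockLocationContext {};
using MockSVal = int; // stand-in for a symbolic value

struct MockProgramState {
  std::map<std::pair<const MockStmt *, const MockLocationContext *>, MockSVal>
      Env;
  // Verbose form: the caller must supply a matching location context.
  MockSVal getSVal(const MockStmt *S, const MockLocationContext *LC) const {
    auto It = Env.find(std::make_pair(S, LC));
    return It == Env.end() ? MockSVal{} : It->second;
  }
};

struct MockExplodedNode {
  const MockProgramState *State;
  const MockLocationContext *LC;
  // Helper: uses this node's own state and location context.
  MockSVal getSVal(const MockStmt *S) const { return State->getSVal(S, LC); }
};
```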

Differential Revision: https://reviews.llvm.org/D42155

llvm-svn: 322753
2018-01-17 20:27:29 +00:00
Artem Dergachev 3ef5deb3a7 [analyzer] In getSVal() API, disable auto-detection of void type as char type.
This is a follow-up from r314910. When a checker developer attempts to
dereference a location in memory through ProgramState::getSVal(Loc) or
ProgramState::getSVal(const MemRegion *), without specifying the second
optional QualType parameter for the type of the value he tries to find at this
location, the type is auto-detected from location type. If the location
represents a value beyond a void pointer, we thought that auto-detecting the
type as 'char' is a good idea. However, in most practical cases, the correct
behavior would be to specify the type explicitly, as it is available from other
sources, and the few cases where we actually need to take a 'char' are
workarounds rather than an intended behavior. Therefore, try to fail with an
easy-to-understand assertion when asked to read from a void pointer location.

Differential Revision: https://reviews.llvm.org/D38801

llvm-svn: 320451
2017-12-12 02:27:55 +00:00
Artem Dergachev eed7a3102c [analyzer] Support partially tainted records.
The analyzer's taint analysis can now reason about structures or arrays
originating from taint sources in which only certain sections are tainted.

In particular, it also benefits modeling functions like read(), which may
read tainted data into a section of a structure, but RegionStore is incapable of
expressing the fact that the rest of the structure remains intact, even if we
try to model read() directly.

Patch by Vlad Tsyrklevich!

Differential revision: https://reviews.llvm.org/D28445

llvm-svn: 304162
2017-05-29 15:42:56 +00:00
Anna Zaks 12d0c8d662 [analyzer] Extend taint propagation and checking to support LazyCompoundVal
A patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D28445

llvm-svn: 297326
2017-03-09 00:01:16 +00:00
Anna Zaks d4e43ae22a [analyzer] Add bug visitor for taint checker.
Add a bug visitor to the taint checker to make it easy to distinguish where
the tainted value originated. This is especially useful when the original
taint source is obscured by complex data flow.

A patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D30289

llvm-svn: 297324
2017-03-09 00:01:07 +00:00
Alexander Kornienko 9c10490efe Refactor: Simplify boolean conditional return statements in lib/StaticAnalyzer/Checkers
Summary: Use clang-tidy to simplify boolean conditional return values

Reviewers: dcoughlin, krememek

Subscribers: krememek, cfe-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D10021

llvm-svn: 256491
2015-12-28 13:06:58 +00:00
Devin Coughlin e39bd407ba [analyzer] Add generateErrorNode() APIs to CheckerContext.
The analyzer trims unnecessary nodes from the exploded graph before reporting
path diagnostics. However, in some cases it can trim all nodes (including the
error node), leading to an assertion failure (see
https://llvm.org/bugs/show_bug.cgi?id=24184).

This commit addresses the issue by adding two new APIs to CheckerContext to
explicitly create error nodes. Unless the client provides a custom tag, these
APIs tag the node with the checker's tag -- preventing it from being trimmed.
The generateErrorNode() method creates a sink error node, while
generateNonFatalErrorNode() creates an error node for a path that should
continue being explored.

The intent is that one of these two methods should be used whenever a checker
creates an error node.
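
The distinction between the two methods can be modeled with a self-contained mock. `MockNode` and `MockCheckerContext` are hypothetical and only imitate the behavior described above; they are not the CheckerContext implementation.

```cpp
#include <deque>

struct MockNode {
  bool IsSink;        // true: exploration of this path stops here
  bool HasCheckerTag; // tagged nodes survive graph trimming in this model
};

class MockCheckerContext {
  std::deque<MockNode> Nodes; // deque keeps element addresses stable
  bool PathEnded = false;

public:
  // Fatal error: creates a tagged sink node and ends the path.
  MockNode *generateErrorNode() {
    if (PathEnded)
      return nullptr; // callers must check for null
    PathEnded = true;
    Nodes.push_back({true, true});
    return &Nodes.back();
  }

  // Non-fatal error: creates a tagged node; the path keeps being explored.
  MockNode *generateNonFatalErrorNode() {
    if (PathEnded)
      return nullptr;
    Nodes.push_back({false, true});
    return &Nodes.back();
  }

  bool pathEnded() const { return PathEnded; }
};
```

In this model a checker would emit its report only when the returned pointer is non-null.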

This commit updates the checkers to use these APIs. These APIs
(unlike addTransition() and generateSink()) do not take an explicit Pred node.
This is because none of the error nodes in the checkers were created
with an explicit Pred node different from the default (the CheckerContext's Pred node).

It also changes generateSink() to require state and pred nodes (previously
these were optional) to reduce confusion.

Additionally, there were several cases where checkers did not check whether a
generated node could be null; we now explicitly check for null in these places.

This commit also includes a test case written by Ying Yi as part of
http://reviews.llvm.org/D12163 (that patch originally addressed this issue but
was reverted because it introduced false positive regressions).

Differential Revision: http://reviews.llvm.org/D12780

llvm-svn: 247859
2015-09-16 22:03:05 +00:00
Ted Kremenek 3a0678e33c [analyzer] Apply whitespace cleanups by Honggyu Kim.
llvm-svn: 246978
2015-09-08 03:50:52 +00:00
Aaron Ballman 8d3a7a56a9 Clarify pointer ownership semantics by hoisting the std::unique_ptr creation to the caller instead of hiding it in emitReport. NFC.
llvm-svn: 240400
2015-06-23 13:15:32 +00:00
Enrico Pertoso 4432d87578 Fixes a typo in a comment.
llvm-svn: 238910
2015-06-03 09:10:58 +00:00
Craig Topper 0dbb783c7b [C++11] Use 'nullptr'. StaticAnalyzer edition.
llvm-svn: 209642
2014-05-27 02:45:47 +00:00
Nuno Lopes fb744589bc remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playing with.
Apologies in advance if I removed someone's WIP code.

 ARCMigrate/TransProperties.cpp                  |    8 -----
 AST/MicrosoftMangle.cpp                         |    1 
 Analysis/AnalysisDeclContext.cpp                |    5 ---
 Analysis/LiveVariables.cpp                      |   14 ----------
 Index/USRGeneration.cpp                         |   10 -------
 Sema/Sema.cpp                                   |   33 +++++++++++++++++++++---
 Sema/SemaChecking.cpp                           |    3 --
 Sema/SemaDecl.cpp                               |   20 ++------------
 StaticAnalyzer/Checkers/GenericTaintChecker.cpp |    1 
 9 files changed, 34 insertions(+), 61 deletions(-)

llvm-svn: 204561
2014-03-23 17:12:37 +00:00
Aaron Ballman be22bcb180 [C++11] Replacing DeclBase iterators specific_attr_begin() and specific_attr_end() with iterator_range specific_attrs(). Updating all of the usages of the iterators with range-based for loops.
llvm-svn: 203474
2014-03-10 17:08:28 +00:00
Ahmed Charles b89843299a Replace OwningPtr with std::unique_ptr.
This compiles cleanly with lldb/lld/clang-tools-extra/llvm.

llvm-svn: 203279
2014-03-07 20:03:18 +00:00
Alexander Kornienko 4aca9b1cd8 Expose the name of the checker producing each diagnostic message.
Summary:
In clang-tidy we'd like to know the name of the checker producing each
diagnostic message. PathDiagnostic has BugType and Category fields, which are
both arbitrary human-readable strings, but we need to know the exact name of the
checker in the form that can be used in the CheckersControlList option to
enable/disable the specific checker.

This patch adds the CheckName field to the CheckerBase class, and sets it in
the CheckerManager::registerChecker() method, which gets them from the
CheckerRegistry.

Checkers that implement multiple checks have to store the names of each check
in the respective registerXXXChecker method.

Reviewers: jordan_rose, krememek

Reviewed By: jordan_rose

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2557

llvm-svn: 201186
2014-02-11 21:49:21 +00:00
Aaron Ballman f58070baed Switched FormatAttr to using an IdentifierArgument instead of a StringArgument since that is a more accurate modeling.
llvm-svn: 189851
2013-09-03 21:02:22 +00:00
David Blaikie 05785d1622 Include llvm::Optional in clang/Basic/LLVM.h
Post-commit CR feedback from Jordan Rose regarding r175594.

llvm-svn: 175679
2013-02-20 22:23:23 +00:00
David Blaikie 2fdacbc5b0 Replace SVal llvm::cast support to be well-defined.
See r175462 for another example/more details.

llvm-svn: 175594
2013-02-20 05:52:05 +00:00
Dmitri Gribenko f857950d39 Remove useless 'llvm::' qualifier from names like StringRef and others that are
brought into 'clang' namespace by clang/Basic/LLVM.h

llvm-svn: 172323
2013-01-12 19:30:44 +00:00
Chandler Carruth 3a02247dc9 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

llvm-svn: 169237
2012-12-04 09:13:33 +00:00
Benjamin Kramer ea70eb30a0 Pull the Attr iteration parts out of Attr.h, so including DeclBase.h doesn't pull in all the generated Attr code.
Required to pull some functions out of line, but this shouldn't have a perf impact.
No functionality change.

llvm-svn: 169092
2012-12-01 15:09:41 +00:00
Jordan Rose 0c153cb277 [analyzer] Use nice macros for the common ProgramStateTraits (map, set, list).
Also, move the REGISTER_*_WITH_PROGRAMSTATE macros to ProgramStateTrait.h.

This doesn't get rid of /all/ explicit uses of ProgramStatePartialTrait,
but it does get a lot of them.

llvm-svn: 167276
2012-11-02 01:54:06 +00:00
Jordan Rose e10d5a7659 [analyzer] Rename 'EmitReport' to 'emitReport'.
No functionality change.

llvm-svn: 167275
2012-11-02 01:53:40 +00:00
Benjamin Kramer e6f7008534 Remove trivial destructor from SVal.
This enables the faster SmallVector in clang and also allows clang's unused
variable warnings to be more effective. Fix the two instances that popped up.

The RetainCountChecker change actually changes functionality, it would be nice
if someone from the StaticAnalyzer folks could look at it.

llvm-svn: 160444
2012-07-18 19:08:44 +00:00
Jordan Rose 6cd16c5152 [analyzer] Guard against C++ member functions that look like system functions.
C++ method calls and C function calls both appear as CallExprs in the AST.
This was causing crashes for an object that had a 'free' method.
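
A sketch of the kind of guard this implies, using a hypothetical `MockCallee` description rather than the real AST classes: a callee is treated as the C library function only when it is not a C++ method.

```cpp
#include <string>

// Hypothetical summary of a callee, for illustration only.
struct MockCallee {
  std::string Name;
  bool IsCXXMethod;
};

// A method named 'free' on some object must not be modeled as C's free().
inline bool looksLikeCFree(const MockCallee &F) {
  return !F.IsCXXMethod && F.Name == "free";
}
```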

<rdar://problem/11822244>

llvm-svn: 160029
2012-07-10 23:13:01 +00:00
Benjamin Kramer 474261af7b Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157886
2012-06-02 10:20:41 +00:00
Anna Zaks b508d29b78 [analyzer] Don't crash even when the system functions are redefined.
(Applied changes to CStringAPI, Malloc, and Taint.)

This might almost never happen, but we should not crash even if it does.
This fixes a crash on the internal analyzer buildbot, where postgresql's
configure was redefining memmove (radar://11219852).
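
One defensive pattern that avoids this class of crash, sketched with a hypothetical `MockCall` type: check the argument count before indexing, since a redefined function may take fewer arguments than the modeled one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical call description: one integer per argument.
struct MockCall {
  std::vector<int> Args;
};

// Returns the requested argument, or nullptr when the call has fewer
// arguments than the modeled signature expects.
inline const int *getArgOrNull(const MockCall &Call, std::size_t Index) {
  return Index < Call.Args.size() ? &Call.Args[Index] : nullptr;
}
```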

llvm-svn: 154451
2012-04-10 23:41:11 +00:00
Anna Zaks 3705a1ee10 [analyzer] Change naming in bug reports "tainted" -> "untrusted"
llvm-svn: 151120
2012-02-22 02:35:58 +00:00
Dylan Noblesmith e27789991d Basic: import OwningPtr<> into clang namespace
llvm-svn: 149798
2012-02-05 02:12:40 +00:00
Ted Kremenek 49b1e38e4b Change references to 'const ProgramState *' to typedef 'ProgramStateRef'.
At this point this is largely cosmetic, but it opens the door to replace
ProgramStateRef with a smart pointer that more eagerly acts in the role
of reclaiming unused ProgramState objects.

llvm-svn: 149081
2012-01-26 21:29:00 +00:00
Anna Zaks bf740512ec [analyzer] Add more C taint sources/sinks.
llvm-svn: 148844
2012-01-24 19:32:25 +00:00
Anna Zaks 97bef5642e [analyzer] It's possible to have a non PointerType expression evaluate to a Loc value. When this happens, use the default type.
llvm-svn: 148631
2012-01-21 06:59:01 +00:00
David Blaikie e4d798f078 More dead code removal (using -Wunreachable-code)
llvm-svn: 148577
2012-01-20 21:50:17 +00:00
Anna Zaks 3b754b25bd [analyzer] Add socket API as a source of taint.
llvm-svn: 148518
2012-01-20 00:11:19 +00:00
Anna Zaks 7f6a6b7507 [analyzer] Refactor: prePropagateTaint ->
TaintPropagationRule::process().

Also remove the "should be a pointer argument" warning - should be
handled elsewhere.

llvm-svn: 148372
2012-01-18 02:45:13 +00:00
Anna Zaks 560dbe9ac9 [analyzer] Taint: warn when tainted data is used to specify a buffer
size (Ex: in malloc, memcpy, strncpy..)

(Maybe some of this could migrate to the CString checker. One issue
with that is that we might want to separate security issues from
regular API misuse.)

llvm-svn: 148371
2012-01-18 02:45:11 +00:00
Anna Zaks 5d324e509c [analyzer] Taint: add taint propagation rules for string and memory copy
functions.

llvm-svn: 148370
2012-01-18 02:45:07 +00:00
Anna Zaks 3666d2c160 [analyzer] Taint: generalize taint propagation to simplify adding more
taint propagation functions.

llvm-svn: 148266
2012-01-17 00:37:02 +00:00
Anna Zaks 0244cd7450 [analyzer] Taint: add system and popen as undesirable sinks for taint
data.

llvm-svn: 148176
2012-01-14 02:48:40 +00:00